0

I know there are plenty of examples of removing punctuation, but I'd like to know the most efficient way to do it. I have a list of words that I read from a txt file and split:

wordlist = open('Tyger.txt', 'r').read().split()

What is the fastest way to check each word and remove the punctuation? I could do it with a lot of code, but I know that wouldn't be the simplest way.

Thanks!!

4 Answers

2

I think the simplest way is to extract only the words, i.e. runs of word characters, in the first place.

import re

with open("Tyger.txt") as f:
    # raw string, so the backslash in \w reaches the regex engine untouched
    words = re.findall(r"\w+", f.read())
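
One thing to keep in mind with this pattern: `\w` matches letters, digits, and the underscore, so contractions are split at the apostrophe. The sketch below uses a made-up sample string (not taken from Tyger.txt) to show the behaviour. It also shows one possible alternative pattern that keeps apostrophes, which is an assumption about what you might want rather than part of the answer.

```python
import re

# Hypothetical sample line, chosen only to show the edge cases.
sample = "Don't stop: snake_case, 42 tygers!"

# \w+ keeps digits and underscores, and splits at the apostrophe.
words = re.findall(r"\w+", sample)
print(words)  # ['Don', 't', 'stop', 'snake_case', '42', 'tygers']

# ASCII letters and apostrophes only: contractions survive, digits are dropped.
letters_only = re.findall(r"[A-Za-z']+", sample)
print(letters_only)  # ["Don't", 'stop', 'snake', 'case', 'tygers']
```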
answered 2012-06-07T16:15:58.870
1

For example:

text = """
Tyger! Tyger! burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry? 
"""
import re
words = re.findall(r'\w+', text)

Or:

import string
ps = string.punctuation
# Python 3: str.maketrans builds the table (Python 2 used string.maketrans)
words = text.translate(str.maketrans(ps, ' ' * len(ps))).split()

The second one is much faster.

answered 2012-06-07T16:17:32.170
1

I'd go with something like this:

import re
with open("Tyger.txt") as f:
    # split on the listed punctuation marks, then rejoin the pieces with spaces
    print(" ".join(re.split(r"[-,!?.]", f.read())))

It removes only what you actually need to remove and avoids the extra overhead of over-matching.
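
For the word list in the question, the same "remove only what you need" idea can also be sketched without a regex. This assumes Python 3 and that only these five characters count as punctuation. The names `UNWANTED`, `delete_table`, and `cleaned` are made up for the example, and the inline string stands in for the file contents.

```python
# Characters to delete; an assumption matching the character class above.
UNWANTED = "-,!?."

# Three-argument str.maketrans: the third string lists characters to delete.
delete_table = str.maketrans("", "", UNWANTED)

# Stand-in for open('Tyger.txt').read().split()
wordlist = "Tyger! Tyger! burning bright In the forests of the night,".split()

# Strip the unwanted characters from each word.
cleaned = [word.translate(delete_table) for word in wordlist]
print(cleaned)
```

Unlike the split-and-join version, this deletes the characters outright instead of replacing them with spaces.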

answered 2012-06-07T16:18:47.543
1
>>> import re

>>> the_tyger
'\n    Tyger! Tyger! burning bright \n    In the forests of the night, \n    What immortal hand or eye \n    Could frame thy fearful symmetry? \n    \n    In what distant deeps or skies \n    Burnt the fire of thine eyes? \n    On what wings dare he aspire? \n    What the hand dare sieze the fire? \n    \n    And what shoulder, & what art. \n    Could twist the sinews of thy heart? \n    And when thy heart began to beat, \n    What dread hand? & what dread feet? \n    \n    What the hammer? what the chain? \n    In what furnace was thy brain? \n    What the anvil? what dread grasp \n    Dare its deadly terrors clasp? \n    \n    When the stars threw down their spears, \n    And watered heaven with their tears, \n    Did he smile his work to see? \n    Did he who made the Lamb make thee? \n    \n    Tyger! Tyger! burning bright \n    In the forests of the night, \n    What immortal hand or eye \n    Dare frame thy fearful symmetry? \n    '

>>> # the leading '-' is a literal hyphen; '&' is listed so the ampersands are removed too
>>> print(re.sub(r'[-"&,!?.]', '', the_tyger))

Prints:

Tyger Tyger burning bright 
In the forests of the night 
What immortal hand or eye 
Could frame thy fearful symmetry 

In what distant deeps or skies 
Burnt the fire of thine eyes 
On what wings dare he aspire 
What the hand dare sieze the fire 

And what shoulder  what art 
Could twist the sinews of thy heart 
And when thy heart began to beat 
What dread hand  what dread feet 

What the hammer what the chain 
In what furnace was thy brain 
What the anvil what dread grasp 
Dare its deadly terrors clasp 

When the stars threw down their spears 
And watered heaven with their tears 
Did he smile his work to see 
Did he who made the Lamb make thee 

Tyger Tyger burning bright 
In the forests of the night 
What immortal hand or eye 
Dare frame thy fearful symmetry 

Or, using a file:

>>> with open('tyger.txt', 'r') as WmBlake:
...    print(re.sub(r'[-"&,!?.]', '', WmBlake.read()))

And if you want to build a list of lines, do it like this:

>>> lines = []
>>> with open('tyger.txt', 'r') as WmBlake:
...    for line in WmBlake:  # one cleaned string per line of the file
...        lines.append(re.sub(r'[-"&,!?.]', '', line))
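
A caution about character classes like the ones above: a hyphen between two characters denotes a range, while a hyphen placed first or last is literal. The sketch below, using a made-up sample string, compares the two spellings.

```python
import re

# Hypothetical sample containing a hyphen, an ampersand, and apostrophes.
sample = "half-way & 'quoted', done!"

# '"-,' is a range from '"' (0x22) to ',' (0x2C): it includes & and ' but not '-'.
as_range = re.sub(r'["-,!?.]', '', sample)

# A leading '-' is literal, so only the listed characters are removed.
as_literal = re.sub(r'[-",!?.]', '', sample)

print(as_range)    # half-way  quoted done
print(as_literal)  # halfway & 'quoted' done
```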
answered 2012-06-07T16:37:33.900