python - Python anagram finder from a given file

Question

I have tried absolutely everything under the sun to figure this out and have gotten nowhere. I don't even know how to approach the problem. The instructions are as follows...

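The program asks the user for the name of a file containing a list of words. The word list is formatted with one word per line.
• For each word, find all anagrams of that word (some words have more than one).
• Output: report how many words have 0, 1, 2, etc. anagrams. Print the list of words that form the largest set of anagrams (if several sets share the maximum size, print all of them).
• You are expected to use good functional decomposition.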

I have been programming for less than a month, so please dumb everything down as much as you can. Thanks in advance.

score 3 · Accepted Answer

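I assume this is homework. As you know, an anagram is just a permutation of a word's letters. Take things slowly: learn how to compute the anagrams of one word before you learn how to do it for many words. The following interactive session shows how to compute the anagrams of a single word. You can go on from there.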

>>> # Learn how to calculate anagrams of a word
>>> 
>>> import itertools
>>> 
>>> word = 'fun'
>>> 
>>> # First attempt: anagrams are just permutations of all the characters in a word
>>> for permutation in itertools.permutations(word):
...     print(permutation)
... 
('f', 'u', 'n')
('f', 'n', 'u')
('u', 'f', 'n')
('u', 'n', 'f')
('n', 'f', 'u')
('n', 'u', 'f')
>>> 
>>> # Now, refine the above block to print actual words instead of tuples
>>> for permutation in itertools.permutations(word):
...     print(''.join(permutation))
... 
fun
fnu
ufn
unf
nfu
nuf
>>> # Note that a word with repeated characters, such as 'all',
>>> # has fewer distinct anagrams:
>>> word = 'all'
>>> for permutation in itertools.permutations(word):
...     print(''.join(permutation))
... 
all
all
lal
lla
lal
lla
>>> # Note that 'all', 'lal' and 'lla' are each repeated twice. We need to
>>> # eliminate the redundancy. One way is to use a set:
>>> word = 'all'
>>> anagrams = set()
>>> for permutation in itertools.permutations(word):
...     anagrams.add(''.join(permutation))
... 
>>> anagrams  # a set is unordered, so the display order may differ
{'lal', 'all', 'lla'}
>>> for anagram in anagrams:
...     print(anagram)
... 
lal
all
lla
>>> # How many anagrams does the word 'all' have?
>>> # Just count using the len() function:
>>> len(anagrams)
3
>>>

For your convenience, I have pasted the session above here.
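
The steps from the session can be wrapped into one small function. This is only a sketch: the name distinct_permutations is made up for illustration, and it returns every rearrangement of the letters, whether or not it is a real word. The number of permutations grows factorially with word length, so it is only practical for short words.

```python
import itertools


def distinct_permutations(word):
    """Return the set of distinct rearrangements of the letters in word."""
    # The set comprehension drops the duplicates caused by repeated letters.
    return {''.join(p) for p in itertools.permutations(word)}


print(sorted(distinct_permutations('all')))  # ['all', 'lal', 'lla']
print(len(distinct_permutations('fun')))     # 6
```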

Update

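Now, with Aaron's clarification: the problem at the lowest level is how to determine whether two words are anagrams of each other. The answer is "when they contain the same letters, each the same number of times." The easiest way (for me) is to sort the letters of each word and compare the results.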

def normalize(word):
    word = word.strip().lower() # sanitize it
    word = ''.join(sorted(word))
    return word

# normalize('top') ==> 'opt'
# Are 'top' and 'pot' anagrams? They are if their sorted letters are the same:
if normalize('top') == normalize('pot'):
    print('they are the same')
    # Do something
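
If sorting feels indirect, the same test can be written by counting letters. This is an alternative sketch, not the approach used in the rest of this answer; the name same_letters is hypothetical, and it assumes the same strip-and-lowercase cleanup as normalize.

```python
from collections import Counter


def same_letters(first, second):
    """Return True if both words use the same letters the same number of times."""
    # Same cleanup as normalize(): drop surrounding whitespace, ignore case.
    first = first.strip().lower()
    second = second.strip().lower()
    return Counter(first) == Counter(second)


print(same_letters('top', 'pot'))  # True
print(same_letters('all', 'la'))   # False
```

Sorting has one advantage for the next step: the sorted string can be used directly as a dictionary key.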

Now that we know how to compare two words, let's work on a list of words:

>>> import collections
>>> anagrams = collections.defaultdict(list)
>>> words = ['top', 'fun', 'dog', 'opt', 'god', 'pot']
>>> for word in words:
...     anagrams[normalize(word)].append(word)
... 
>>> anagrams
defaultdict(<class 'list'>, {'opt': ['top', 'opt', 'pot'], 'fnu': ['fun'], 'dgo': ['dog', 'god']})
>>> for k, v in anagrams.items():
...     print(k, '-', v)
... 
opt - ['top', 'opt', 'pot']
fnu - ['fun']
dgo - ['dog', 'god']

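In the session above, we use anagrams (a defaultdict, which is the same as a dict with default values) to store lists of words. The keys are the sorted letters, so anagrams['opt'] ==> ['top', 'opt', 'pot']. From there, you can tell which words have the most anagrams. The rest should be easy enough.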
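
For reference, here is one way the remaining pieces could be split into functions. Treat it as a sketch under a few assumptions: the function names are my own, the word file contains no duplicate words, and a word in a group of n words is counted as having n - 1 anagrams.

```python
import collections


def normalize(word):
    """Sorted, lowercased letters of a word: the key shared by its anagrams."""
    return ''.join(sorted(word.strip().lower()))


def read_words(filename):
    """Read one word per line, skipping blank lines."""
    with open(filename) as handle:
        return [line.strip() for line in handle if line.strip()]


def group_anagrams(words):
    """Map each sorted-letter key to the list of words that share it."""
    groups = collections.defaultdict(list)
    for word in words:
        groups[normalize(word)].append(word)
    return groups


def count_by_anagrams(groups):
    """Map a number of anagrams to how many words have that many."""
    counts = collections.Counter()
    for members in groups.values():
        # Every word in a group of n words has n - 1 anagrams.
        counts[len(members) - 1] += len(members)
    return counts


def largest_groups(groups):
    """Return every group whose size equals the maximum group size."""
    if not groups:
        return []
    biggest = max(len(members) for members in groups.values())
    return [members for members in groups.values() if len(members) == biggest]


words = ['top', 'fun', 'dog', 'opt', 'god', 'pot']
groups = group_anagrams(words)
print(sorted(count_by_anagrams(groups).items()))  # [(0, 1), (1, 2), (2, 3)]
print(largest_groups(groups))                     # [['top', 'opt', 'pot']]
```

To work from a file as the assignment asks, replace the hard-coded list with read_words(input('Word list file: ')).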
