0

Hello, I want to count single words and two-word sequences in an input text using Python. Example:

"what is your name ? what you want from me ?
 You know best way to earn money is Hardwork 
 what is your aim ?"

Output:

Single W.C.:
what   3
 is    3
 your  2
you    2

and so on...

Double W.C. :
what is 2
is your 2
your name 1
what you 1

and so on. Could you post a way to do this? I use the following code for the single-word count:

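ws = {}

# text is assumed to be an iterable of lines (for example, an open file).
for line in text:
    # Split the line into words; iterating over the string itself yields characters.
    for wrd in line.split():
        if wrd not in ws:
            ws[wrd] = 1
        else:
            ws[wrd] += 1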
4

4 Answers

3
from collections import Counter

s = "..."

words = s.split()
pairs = zip(words, words[1:])

single_words, double_words = Counter(words), Counter(pairs)

Output:

print("single W.C.")
# Sort by descending count.
for word, count in sorted(single_words.items(), key=lambda x: -x[1]):
    print(word, count)

print("double W.C.")
for pair, count in sorted(double_words.items(), key=lambda x: -x[1]):
    print(pair, count)
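For reference, here is a minimal end-to-end sketch of the same idea, run against the sample text from the question. It assumes Python 3 and plain whitespace splitting. `Counter.most_common()` is swapped in for the manual `sorted(...)` call, and the names `text`, `first` and `second` are only illustrative.

```python
from collections import Counter

# Sample input copied from the question.
text = ("what is your name ? what you want from me ?\n"
        " You know best way to earn money is Hardwork \n"
        " what is your aim ?")

words = text.split()

# Counter accepts any iterable, so the zip object can be passed directly.
single_words = Counter(words)
double_words = Counter(zip(words, words[1:]))

# most_common(n) returns the n highest (item, count) pairs, largest count first.
print("single W.C.")
for word, count in single_words.most_common(4):
    print(word, count)

print("double W.C.")
for (first, second), count in double_words.most_common(4):
    print(first, second, count)
```

Splitting on whitespace keeps "?" as a token and treats "You" and "you" as different words, so the counts are case-sensitive.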
answered at 2012-10-10T16:40:55.743
2
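import nltk
from nltk import bigrams

# text is the input string from the question; word_tokenize needs NLTK's tokenizer data installed.
tokens = nltk.word_tokenize(text)
# Lowercase everything and drop one-character tokens such as "?".
tokens = [token.lower() for token in tokens if len(token) > 1]
# Materialize the bigrams as a list so they can be iterated and counted more than once.
bi_tokens = list(bigrams(tokens))

print([(item, tokens.count(item)) for item in sorted(set(tokens))])
print([(item, bi_tokens.count(item)) for item in sorted(set(bi_tokens))])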
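If NLTK is not available, the n-gram step can be approximated with the standard library alone. The sketch below is not NLTK's implementation. `ngram_counts` is a hypothetical helper name, and it assumes the text has already been split into a list of words.

```python
from collections import Counter

def ngram_counts(words, n):
    """Count every run of n consecutive words (hypothetical helper, not part of NLTK)."""
    # Zip n staggered slices together: words[0:], words[1:], ..., words[n-1:].
    # zip stops at the shortest slice, so only complete n-grams are produced.
    return Counter(zip(*(words[i:] for i in range(n))))

# Shortened sample based on the question's text.
words = "what is your name ? what is your aim ?".split()

unigram_counts = ngram_counts(words, 1)   # keys are 1-tuples such as ("what",)
bigram_counts = ngram_counts(words, 2)
trigram_counts = ngram_counts(words, 3)

print(bigram_counts[("what", "is")])
print(trigram_counts[("what", "is", "your")])
```

Using a `Counter` here also avoids calling `list.count` once per distinct item, which rescans the whole token list each time.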
answered at 2012-10-10T16:41:11.067
0

This works. It uses defaultdict. Python 2.6

>>> from collections import defaultdict
>>> d = defaultdict(int)
>>> string = ("what is your name ? what you want from me ?\n"
    "You know best way to earn money is Hardwork\n what is your aim ?")
>>> l = string.split()
>>> for i in l:
    d[i]+=1

>>> d
defaultdict(<type 'int'>, {'me': 1, 'aim': 1, 'what': 3, 'from': 1, 'name': 1, 
    'You': 1, 'money': 1, 'is': 3, 'earn': 1, 'best': 1, 'Hardwork': 1, 'to': 1, 
    'way': 1, 'know': 1, 'want': 1, 'you': 1, 'your': 2, '?': 3})
>>> d2 = defaultdict(int)
>>> for i in zip(l[:-1], l[1:]):
    d2[i]+=1

>>> d2
defaultdict(<type 'int'>, {('You', 'know'): 1, ('earn', 'money'): 1, 
    ('is', 'Hardwork'): 1, ('you', 'want'): 1, ('know', 'best'): 1, 
    ('what', 'is'): 2, ('your', 'name'): 1, ('from', 'me'): 1, 
    ('name', '?'): 1, ('?', 'You'): 1, ('?', 'what'): 1, ('to', 'earn'): 1, 
    ('aim', '?'): 1, ('way', 'to'): 1, ('Hardwork', 'what'): 1, 
    ('money', 'is'): 1, ('me', '?'): 1, ('what', 'you'): 1, ('best', 'way'): 1,
    ('want', 'from'): 1, ('is', 'your'): 2, ('your', 'aim'): 1})
>>> 
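Note that the expected output in the question lists `you 2`, which only holds if case is ignored, whereas the session above counts 'You' and 'you' separately and also counts "?" as a word. A possible variation is sketched below. It assumes Python 3, and the `is_word` helper and the choice of `str.isalnum` as the filter are my own assumptions rather than anything stated in the question.

```python
from collections import Counter

# Sample input copied from the question.
text = ("what is your name ? what you want from me ?\n"
        " You know best way to earn money is Hardwork \n"
        " what is your aim ?")

# Lowercase first so that "You" and "you" are counted together.
tokens = text.lower().split()

def is_word(token):
    """Hypothetical filter: treat only purely alphanumeric tokens as words."""
    return token.isalnum()

single = Counter(t for t in tokens if is_word(t))

# Build pairs from the unfiltered tokens, then drop pairs that touch punctuation,
# so that no pair spans a "?" boundary.
double = Counter(
    (a, b) for a, b in zip(tokens, tokens[1:]) if is_word(a) and is_word(b)
)

print(single["you"], single["what"])
print(double[("what", "is")], double[("what", "you")])
```

Filtering after pairing, not before, avoids inventing pairs such as ('name', 'what') that never appear next to each other in the text.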
answered at 2012-10-10T16:44:21.217