python - How to organize data in Python

Question

I have a data.txt file like this:

{'wood', 'iron', 'gold', 'silver'}
{'tungsten', 'iron', 'gold', 'timber'}

I want to get two kinds of results, like the following:

#FIRST TYPE: sorted by item
gold: 33.3%
iron: 33.3%
silver: 16.7%
timber: 16.7%
tungsten: 16.7%

#SECOND TYPE: sorted by percentage
silver: 16.7%
timber: 16.7%
tungsten: 16.7%
gold: 33.3%
iron: 33.3%

Here is my code for this:

import collections
counter = collections.Counter()

keywords = []
with open("data.txt") as f:
     for line in f:
         if line.strip():
             for keyword in line.split(','):
                 keywords.append(keyword.strip())
     counter.update(keywords)

     for key in counter:
         print "%s: %.1f%s" %(key, (counter[key]*1.0 / len(counter))*100, '%')

However, my results are displayed like this:

'silver'}: 16.7%
'iron': 33.3%
....

I want to remove the curly braces and apostrophes from the results.

How can I change or rewrite my code to display the results I need? I look forward to your help!

score 2 · Accepted Answer

Dictionaries, Counters, and sets are not ordered. You need to convert the data to a list first and then sort that list.

For example:

for key, val in sorted(counter.items()):  #or with key=lambda x:x[0]
    print "%s: %.1f%s" % (key, float(val) * 100 / len(counter), "%")

prints the values sorted by key.

for key, val in sorted(counter.items(), key=lambda x: (x[1], x[0])):
    print "%s: %.1f%s" % (key, float(val) * 100 / len(counter), "%")

sorts by percentage (if two items have the same percentage, they are also sorted by name).
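The two sort orders above can be tried out with a small self-contained sketch. It assumes Python 3 syntax (the print function) rather than the Python 2 print statement used in this thread, and it hard-codes the counts from the question's two sample lines instead of reading data.txt. The helper name format_rows is made up for this illustration.

```python
from collections import Counter

# Counts taken from the two sample lines in the question.
counter = Counter({"wood": 1, "iron": 2, "gold": 2, "silver": 1,
                   "tungsten": 1, "timber": 1})


def format_rows(items, total):
    """Format (name, count) pairs as 'name: 33.3%' strings."""
    return ["%s: %.1f%%" % (name, count * 100.0 / total)
            for name, count in items]


total = len(counter)  # number of distinct keywords, as in the thread

# Tuples compare element by element, so plain sorted() orders by name.
by_name = format_rows(sorted(counter.items()), total)

# Sort by count first, then by name to break ties.
by_percentage = format_rows(
    sorted(counter.items(), key=lambda item: (item[1], item[0])), total)

print("\n".join(by_name))
print()
print("\n".join(by_percentage))
```

sorted() always returns a new list, so the Counter itself is left unchanged.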

Update

As for the parsing problem, you need to strip the { and } as well:

for line in f:
    if line.strip():
        for keyword in line.strip().strip('{}').split(','):
            # drop the surrounding whitespace first, then the quotes
            keyword = keyword.strip().strip("'")
            keywords.append(keyword)
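As a minimal sketch of this stripping approach, the function below parses one line in the format shown in the question. It assumes Python 3 and that keywords never contain commas or quotes themselves. The name parse_line is made up for this illustration.

```python
def parse_line(line):
    """Return the keywords from a line such as "{'wood', 'iron'}"."""
    body = line.strip().strip("{}")
    if not body:
        return []
    # Strip whitespace first: after split(','), every item but the first
    # starts with a space, which would shield the leading quote.
    return [part.strip().strip("'") for part in body.split(",")]


print(parse_line("{'wood', 'iron', 'gold', 'silver'}\n"))

# Without the whitespace strip, only the trailing quote is removed.
print(" 'iron'".strip("'"))
```

The second print shows why the order of the strip calls matters: str.strip only removes characters from the very ends of the string.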

If you are using a recent Python version, you can use ast.literal_eval instead (note that it accepts set literals such as these lines only from Python 3.2 on):

import ast
...
for line in f:
    stripped = line.strip()
    if stripped:
        # each line is parsed as a Python set literal
        for keyword in ast.literal_eval(stripped):
            keywords.append(keyword)

Note, however, that this removes duplicate keys on the same line. (From your example, that seems to be fine...)

Otherwise, you can do the following:

import ast
...
for line in f:
    stripped = line.strip()
    if stripped:
        # swap the braces for brackets so the line parses as a list
        for keyword in ast.literal_eval('[' + stripped[1:-1] + ']'):
            keywords.append(keyword)

which keeps the duplicates.
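The difference between the two ast.literal_eval variants can be checked with a short sketch. It assumes Python 3.2 or later, where ast.literal_eval accepts set literals. The sample line with a repeated keyword is hypothetical and does not come from the question's data.

```python
import ast

line = "{'iron', 'gold', 'iron'}"  # hypothetical line with a duplicate

# Parsed as a set literal: the repeated 'iron' collapses to one element.
as_set = ast.literal_eval(line)

# Braces swapped for brackets: parsed as a list, duplicates survive.
as_list = ast.literal_eval("[" + line[1:-1] + "]")

print(sorted(as_set))
print(as_list)
```

Which variant fits depends on whether a keyword repeated within one line should count once or several times.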

score 1 · Accepted Answer

Since dictionaries have no order, use sorted to sort the items by key or by percentage.

from collections import Counter

counter = Counter()
with open("abc") as f:
    for line in f:
        # strip {} and the newline, then split the line at ", "
        line = line.strip("{}\n").split(", ")
        # strip the single quotes around each keyword
        counter += Counter(x.strip("'") for x in line)

le = len(counter)
for key, val in sorted(counter.items()):
    print "%s: %.1f%s" % (key, (val * 1.0 / le) * 100, '%')

print

for key, val in sorted(counter.items(), key=lambda x: (x[1], x[0])):
    print "%s: %.1f%s" % (key, (val * 1.0 / le) * 100, '%')

Output:

gold: 33.3%
iron: 33.3%
silver: 16.7%
timber: 16.7%
tungsten: 16.7%
wood: 16.7%

silver: 16.7%
timber: 16.7%
tungsten: 16.7%
wood: 16.7%
gold: 33.3%
iron: 33.3%
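The answer above accumulates counts by adding a per-line Counter with +=, while the question calls Counter.update. The sketch below suggests that both give the same totals. It assumes Python 3 and uses the question's keywords already parsed into lists, so no file is read.

```python
from collections import Counter

# The question's two sample lines, already split into keywords.
lines = [["wood", "iron", "gold", "silver"],
         ["tungsten", "iron", "gold", "timber"]]

added = Counter()
updated = Counter()
for keywords in lines:
    added += Counter(keywords)  # add the counts of one line
    updated.update(keywords)    # count the same keywords in place

print(added == updated)
print(added["iron"], added["wood"], len(added))
```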

score 1 · Accepted Answer

The stray { and } appear because you are not stripping them off.
To do that, change your for loop as follows:

 for line in f:
     line = line.strip().strip('{}') # get rid of curly braces
     if line:
         ....

As far as printing goes:

# c is the Counter built from the file
print "Sorted by Percentage"
for k, v in sorted(c.items(), key=lambda x: x[1]):
    # .2% prints two decimals (33.33%); use .1% for 33.3% as in the question
    print '{0}: {1:.2%}'.format(k, float(v) / len(c))
print
print "Sorted by Name"
for k, v in sorted(c.items(), key=lambda x: x[0]):
    print '{0}: {1:.2%}'.format(k, float(v) / len(c))
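The % presentation type used with str.format above multiplies the value by 100 and appends the percent sign, so the argument is a fraction rather than a percentage. A small sketch, assuming Python 3 and using the share of gold from the question's data (2 of 6 distinct keywords):

```python
share = 2 / 6  # gold appears twice among six distinct keywords

# Precision .1 gives one decimal place, .2 gives two.
one_decimal = "{0}: {1:.1%}".format("gold", share)
two_decimals = "{0}: {1:.2%}".format("gold", share)

print(one_decimal)
print(two_decimals)
```

The one-decimal form matches the layout requested in the question.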
