python - How to format output in Python?

Question

I am having trouble formatting the output of some Python code. My code is as follows:

keys = ['(Lag)=(\d+\.?\d*)','\t','(Autocorrelation Index): (\d+\.?\d*)',       '(Autocorrelation Index): (\d+\.?\d*)',     '(Semivariance): (\d+\.?\d*)']

import re
string1 = ''.join(open("dummy.txt").readlines())
found = []
for key in keys:
    found.extend(re.findall(key, string1))
for result in found:
    print '%s  =  %s' % (result[0],result[1])
raw_input()

So far, I get the following output:

Lag = 1
Lag = 2
Lag = 3
Autocorrelation Index = #value
......
......
Semivariance = #value

But the output I want is:

 Lag        AutoCorrelation Index   AutoCorrelation Index   Semivariance
  1              #value                   #value               #value
  2              #value                   #value               #value
  3              #value                   #value               #value

It would be great if this output could go to a CSV file or a txt file!

I think this comes down to how the loop prints the results, but I am not good with loops.

Updated code (old version)

Based on @mutzmatron's answer:

keys = ['(Lag)=(\d+\.?\d*)',
    '(Autocorrelation Index): (\d+\.?\d*)',
    '(Semivariance): (\d+\.?\d*)']

import re
string1 = open("dummy.txt").readlines().join()
found = []
for key in keys:
    found.extend(re.findall(key, string1))
raw_input()
for result in found:
    print '%s  =  %s' % (result[0], result[1])

raw_input()

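It still does not run! I am using IDLE with Python 2.6, and I cannot read the error message because I do not know the command that pauses the prompt before it closes.

Original question

I am completely new to Python and I have a question. I am trying to process a large text file. This is just a portion of it: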

Band: WDRVI20((0.2*b4-b3)/((0.2*b4)+b3))
Basic Statistics:
  Min: -0.963805
  Max: 0.658219
  Mean: 0.094306
  Standard Deviation: 0.131797
Spatial Statistics, ***Lag=1***:
  Total Number of Observations (Pixels): 769995
  Number of Neighboring Pairs: 1538146
  Moran's I:
    ***Autocorrelation Index: 0.8482564597***
    Expected Value, if band is uncorrelated: -0.000001
    Standard Deviation of Expected Value (Normalized): 0.000806
    Standard Deviation of Expected Value (Randomized): 0.000806
    Z Significance Test (Normalized): 1052.029088
    Z Significance Test (Randomized): 1052.034915
  Geary's C:
    ***Autocorrelation Index: 0.1517324729***
    Expected Value, if band is uncorrelated: 1.000000
    Standard Deviation of Expected Value (Normalized): 0.000807
    Standard Deviation of Expected Value (Randomized): 0.000809
    Z Significance Test (Normalized): 1051.414163
    Z Significance Test (Randomized): 1048.752451
  ***Semivariance: 0.0026356529***
Spatial Statistics, Lag=2:
  Total Number of Observations (Pixels): 769995
  Number of Neighboring Pairs: 3068924
  Moran's I:
    Autocorrelation Index: 0.6230691635
    Expected Value, if band is uncorrelated: -0.000001
    Standard Deviation of Expected Value (Normalized): 0.000571
    Standard Deviation of Expected Value (Randomized): 0.000571
    Z Significance Test (Normalized): 1091.521976
    Z Significance Test (Randomized): 1091.528022
  Geary's C:
    Autocorrelation Index: 0.3769372504
    Expected Value, if band is uncorrelated: 1.000000
    Standard Deviation of Expected Value (Normalized): 0.000574
    Standard Deviation of Expected Value (Randomized): 0.000587
    Z Significance Test (Normalized): 1085.700399
    Z Significance Test (Randomized): 1061.931158
  Semivariance: 0.0065475488

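I need to extract the information marked between the *** stars (the Lag, Autocorrelation Index, and Semivariance values) and process it. It will probably need to be written to another text file or an Excel file. Can I do that? Any help would be appreciated.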

score 1 · Accepted Answer

Create a list of the keys (regular expressions) to search for. For example,

keys = ['(Lag)=(\d+\.?\d*)',
        '(Autocorrelation Index): (\d+\.?\d*)',
        '(Semivariance): (\d+\.?\d*)']

Then use regular expressions to search for them:

import re
string1 = ''.join(open(FILE).readlines())
found = []
for key in keys:
    found.extend(re.findall(key, string1))

for result in found:
    print '%s  =  %s' % (result[0], result[1])

This gives you a list of the entries you need, which you can then use for whatever you want to do next.

Result:

Lag  =  1
Autocorrelation Index  =  0.8482564597
Autocorrelation Index  =  0.1517324729
Semivariance  =  0.0026356529
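
For readers on Python 3, the same search might be sketched as follows. The sample text is a shortened copy of the Lag=1 section from the question, and the names `SAMPLE`, `KEYS`, and `find_pairs` are illustrative assumptions, not part of the original answer. Because each pattern has two capture groups, `re.findall` returns a list of `(name, value)` tuples.

```python
import re

# Shortened copy of the Lag=1 section from the question (illustrative input).
SAMPLE = """Spatial Statistics, Lag=1:
  Moran's I:
    Autocorrelation Index: 0.8482564597
  Geary's C:
    Autocorrelation Index: 0.1517324729
  Semivariance: 0.0026356529
"""

# Raw strings keep the backslashes in the patterns intact.
KEYS = [r'(Lag)=(\d+\.?\d*)',
        r'(Autocorrelation Index): (\d+\.?\d*)',
        r'(Semivariance): (\d+\.?\d*)']


def find_pairs(text, keys=KEYS):
    """Return (name, value) tuples for every key, in key order."""
    found = []
    for key in keys:
        found.extend(re.findall(key, text))
    return found


for name, value in find_pairs(SAMPLE):
    print('%s  =  %s' % (name, value))
```

Note that the results are grouped by key, not by position in the file, so with several lags all `Lag` entries come first.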

CSV

To output to CSV, use the csv module:

import csv
outfile = open('fileout.csv', 'wb')  # binary mode, as the Python 2 csv module expects
wrt = csv.writer(outfile)
wrt.writerows(found)
outfile.close()
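
On Python 3 the file is opened in text mode with `newline=''` instead, as the `csv` module documentation recommends. A minimal sketch under that assumption; the helper name `write_pairs`, the temporary output path, and the sample rows are hypothetical and only stand in for the `found` list built earlier.

```python
import csv
import os
import tempfile


def write_pairs(path, pairs):
    """Write (name, value) tuples to a CSV file, one pair per row."""
    # newline='' lets the csv module control the line endings itself.
    with open(path, 'w', newline='') as outfile:
        csv.writer(outfile).writerows(pairs)


# Illustrative rows in the same shape that re.findall returns.
found = [('Lag', '1'),
         ('Autocorrelation Index', '0.8482564597'),
         ('Semivariance', '0.0026356529')]

out_path = os.path.join(tempfile.gettempdir(), 'fileout.csv')
write_pairs(out_path, found)
```

The `with` block closes the file even if writing fails, so no explicit `close()` call is needed.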

score 1 · Accepted Answer

To format the data section by section, it is probably easiest to work on each segment, like this:

import csv
import re

keys = ['(Lag)=(\d+\.?\d*)',
        '(Autocorrelation Index): (\d+\.?\d*)',
        '(Semivariance): (\d+\.?\d*)']

string1 = ''.join(open("dummy.txt").readlines())

# Each chunk after a 'Spatial Statistics' header holds the values for one lag.
sections = string1.split('Spatial Statistics')

output = []
heads = []

for isec, sec in enumerate(sections):
    found = []
    for key in keys:
        found.extend(re.findall(key, sec))
    if len(found) == 0:
        continue  # e.g. the text before the first lag has no matches
    output.append([])
    for result in found:
        print '%s  =  %s' % (result[0], result[1])
        output[-1].append(result[1])
    # Take the column headers from the first section that has matches.
    if len(heads) == 0:
        heads = [result[0] for result in found]

fout = open('output.csv', 'wb')  # binary mode, as the Python 2 csv module expects
wrt = csv.writer(fout)
wrt.writerow(heads)
wrt.writerows(output)
fout.close()
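
The question also asks for an aligned plain-text table with one row per lag. A possible Python 3 sketch of that layout follows; the function names `parse_sections` and `format_table`, the column width of 24, and the shortened two-lag sample are assumptions made for illustration.

```python
import re

KEYS = [r'(Lag)=(\d+\.?\d*)',
        r'(Autocorrelation Index): (\d+\.?\d*)',
        r'(Semivariance): (\d+\.?\d*)']

# Shortened two-lag sample based on the data in the question.
SAMPLE = """Basic Statistics:
  Mean: 0.094306
Spatial Statistics, Lag=1:
    Autocorrelation Index: 0.8482564597
    Autocorrelation Index: 0.1517324729
  Semivariance: 0.0026356529
Spatial Statistics, Lag=2:
    Autocorrelation Index: 0.6230691635
    Autocorrelation Index: 0.3769372504
  Semivariance: 0.0065475488
"""


def parse_sections(text, keys=KEYS):
    """Return (heads, rows): column names and one list of values per lag."""
    heads, rows = [], []
    for sec in text.split('Spatial Statistics'):
        found = []
        for key in keys:
            found.extend(re.findall(key, sec))
        if not found:
            continue  # the text before the first lag has no matches
        if not heads:
            heads = [name for name, _ in found]
        rows.append([value for _, value in found])
    return heads, rows


def format_table(heads, rows, width=24):
    """Left-justify every cell to `width` characters, one line per row."""
    lines = [''.join(cell.ljust(width) for cell in row)
             for row in [heads] + rows]
    return '\n'.join(line.rstrip() for line in lines)


heads, rows = parse_sections(SAMPLE)
print(format_table(heads, rows))
```

The string returned by `format_table` can be written to a txt file with an ordinary `open(path, 'w')` and `write` call.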
