python - Check whether files share the same name, and save the line counts of the files with the same name

Question

I'm relatively new to Python, so I could really use your input.

I'm running a script that saves files with names in the following format:

201309030700__81.28.236.2.txt
201308240115__80.247.17.26.txt
201308102356__84.246.88.20.txt
201309030700__92.243.23.21.txt
201308030150__203.143.64.11.txt

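Each file contains a number of lines, and I want to count them and save the totals. For example, I want to go through these files, and when several files share the same date (the first part of the file name), I want to save their counts to the same output file. Given these two files: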

201309030700__81.28.236.2.txt has 10 lines
201309030700__92.243.23.21.txt has 8 lines

This should create a file named after the date 20130903 (the last 4 digits of the timestamp are the time). File created: 20130903.txt, containing 2 lines: 10 and 8.
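The grouping described above can be sketched roughly as follows. This is only a minimal sketch, assuming Python 3, file names that always start with a 12-digit timestamp, and two hypothetical helper names (`group_line_counts`, `write_counts`) that do not appear in the question.

```python
import os
from collections import defaultdict


def group_line_counts(src_dir):
    """Map each 8-digit date prefix to the line counts of its files."""
    counts_by_date = defaultdict(list)
    # Sorting the names keeps the order of the counts reproducible.
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue
        # '201309030700__81.28.236.2.txt' -> '201309030700' -> '20130903'
        stamp = name.split('_')[0]
        date = stamp[:8]
        with open(path) as f:
            counts_by_date[date].append(sum(1 for _ in f))
    return dict(counts_by_date)


def write_counts(counts_by_date, out_dir):
    """Write one '<date>.txt' per date, one line count per line."""
    os.makedirs(out_dir, exist_ok=True)
    for date, counts in counts_by_date.items():
        with open(os.path.join(out_dir, date + '.txt'), 'w') as f:
            for n in counts:
                f.write('%d\n' % n)
```

With the directory names used in the question, this might be called as `write_counts(group_line_counts('./results_1/'), './new/')`. Collecting the counts first means each output file is opened once instead of once per input file.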

I have the following code, but I'm not getting anywhere with it. Please help.

import os, os.path
asline = []
ipasline = []

def main():
    p = './results_1/'
    np = './new/'
    fd = os.listdir(p)
    run(fd)

def writeFile(fd, flines):
    fo = np+fd+'.txt'
    with open(fo, 'a') as f:    
        r = '%s\t %s\n' % (fd, flines)
        f.write(r)

def run(path):
    for root, dirs, files in os.walk(path):
       for cfile in files:
            stripFN = os.path.splitext(cfile)[0]
            fileDate = stripFN.split('_')[0]
            fileIP = stripFN.split('_')[-1]     
        if cfile.startswith(fileDate):
                hp = 0
                for currentFile in files.readlines()[1:]:
                    hp += 1
                    writeFile(fdate, hp)

I also played around with this snippet:

if not os.path.exists(os.path.join(p, y)):  
    os.mkdir(os.path.join(p, y))
    np = '%s%s' % (datetime.now().strftime(FORMAT), path)
if os.path.exists(os.path.join(p, m)):
    os.chdir(os.path.join(p, month, d))
    np = '%s%s' % (datetime.now().strftime(FORMAT), path)

FORMAT produces a value like this:

20130903

But I can't seem to get this to work.

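Edit: I changed the code as shown below. It does roughly what I want, but I'm probably doing something redundant, and I haven't yet taken into account that I'm processing a huge number of files, so it may not be the most efficient approach. Please take a look.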

import re, os, os.path


p = './results_1/'
np = './new/'
fd = os.listdir(p)
star = "*"


def writeFile(fd, flines):
    fo = './new/'+fd+'_v4.txt'
    with open(fo, 'a') as f:
        r = '%s\n' % (flines)
        f.write(r)

for f in fd:
    pathN = os.path.join(p, f)
    files = open(pathN, 'r')
    fileN = os.path.basename(pathN)
    stripFN = os.path.splitext(fileN)[0]
    fileDate = stripFN.split('_')[0]
    fdate = fileDate[0:8]
    lnum = len(files.readlines())
    writeFile(fdate, lnum)
    files.close()

At the moment, I write one new line to the output file for each input file's line count. However, I have sorted this out. I would appreciate any input, thank you very much.
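On the efficiency concern: `len(files.readlines())` loads every line of a file into a list before counting. A possible alternative, sketched here under the assumption that only the count is needed, is to iterate over the file object so that lines are read lazily. The name `count_lines` is hypothetical.

```python
def count_lines(path):
    """Count lines by iterating, without building a list of all lines."""
    with open(path, 'r') as f:
        # The file object yields one line at a time.
        return sum(1 for _ in f)
```

In the loop above this would replace the `open` / `readlines` / `close` sequence with `lnum = count_lines(pathN)`. Note also that `writeFile` opens its output in append mode (`'a'`), so running the script a second time adds the same counts to the existing files again.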

Edit 2: I'm now getting one output file per date, with the date used as the file name. The files look like this:

20130813.txt
20130819.txt
20130825.txt

Each file looks like this:

And each file goes on like that for another 200+ lines. Ideally, the best result would show how many times each value occurs, sorted with the smallest number first.

I tried something like this:

import sys
from collections import Counter

p = '.txt'
d = []
with open(p, 'r') as f:
    for x in f:
        x = int(x)
        d.append(x)
    d.sort()
    o = Counter(d)
    print o

Does this make sense?

Edit 3:

I'm using the following script to count the unique values, but I still can't sort the output by the unique value.

import os
from collections import Counter

p = './newR'
fd = os.listdir(p)

for f in fd:
    pathN = os.path.join(p, f)
    with open(pathN, 'r') as infile:
        fileN = os.path.basename(pathN)
        stripFN = os.path.splitext(fileN)[0]
        fileDate = stripFN.split('_')[0]
        counts = Counter(l.strip() for l in infile)
        for line, count in counts.most_common():
            print line, count

This gives the following results:

The output should look like this:

What is the most efficient way to do this?
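One way to get the ordering asked about, shown here only as a sketch: `Counter.most_common()` orders by how often a value occurs, whereas sorting `counts.items()` orders by the value itself. The sketch assumes every non-blank line holds a single integer, and the function names are hypothetical.

```python
from collections import Counter


def value_frequencies(lines):
    """Return (value, occurrences) pairs sorted by value, smallest first."""
    stripped = (line.strip() for line in lines)
    # Convert to int so that 8 sorts before 10; as strings, '10' < '8'.
    counts = Counter(int(s) for s in stripped if s)
    return sorted(counts.items())


def by_occurrences(pairs):
    """Re-sort the pairs by occurrence count, rarest first."""
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))
```

Inside the `with open(pathN, 'r') as infile:` block, `value_frequencies(infile)` could stand in for the `Counter(...)` and `most_common()` lines, since a file object is an iterable of lines.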

score 0 · Accepted Answer

The following code accomplished what I asked for in my original question.

import os, os.path, subprocess
from sys import stdout

p = './new/results/v4/TRACE_v4_results_ASN_mh60'
fd = os.listdir(p)

def writeFile(fd, flines):
    fo = './new/newR/'+fd+'_v4.txt'
    with open(fo, 'a') as f:    
        r = '%s\n' % (flines)
        f.write(r)

for pfiles in fd:  # fd is the directory listing built above
    pathN = os.path.join(p, pfiles)
    files = open(pathN, 'r')
    fileN = os.path.basename(pathN)
    stripFN = os.path.splitext(fileN)[0]
    fileDate = stripFN.split('_')[0]
    fdate = fileDate[0:8]  # keep YYYYMMDD, drop the HHMM part
    numlines = len(files.readlines()[1:])  # the first line is not counted
    writeFile(fdate, numlines)
    files.close()

It produced the following results:

20130813.txt
20130819.txt
20130825.txt

My sincere apologies if I have not followed the rules.
