python - Creating a CSV file from the movie reviews corpus in Python

Question

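import csv
from nltk.corpus import movie_reviews

negids = movie_reviews.fileids('neg')
posids = movie_reviews.fileids('pos')

# Write the full text of each negative review as one CSV row.
# 'neg_reviews.csv' is a placeholder output name.
with open('neg_reviews.csv', 'w', newline='', encoding='utf-8') as outfile:
    out_csv = csv.writer(outfile)
    for f in negids:
        out_csv.writerow([movie_reviews.raw(fileids=[f])])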

I am trying to read each file in the neg folder of the movie reviews corpus, and I want to insert the complete contents of each file as a single row in an Excel sheet.

score 0 · Accepted Answer

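import csv
import glob
import os

directory = input("INPUT Folder:")
output = input("OUTPUT Folder:")

txt_files = os.path.join(directory, '*.txt')

for txt_file in glob.glob(txt_files):
    with open(txt_file, newline='') as input_file:
        in_txt = csv.reader(input_file)
        filename = os.path.splitext(os.path.basename(txt_file))[0] + '.csv'

        # "book.csv" is reopened in write mode for every input file, so each
        # pass overwrites the previous one, and csv.reader yields one row
        # per line of text rather than one row per file.
        with open("book.csv", 'w', newline='') as output_file:
            out_csv = csv.writer(output_file)
            out_csv.writerows(in_txt)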

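I tried this code, but the problem is that each text file in the neg folder of the movie reviews corpus needs to become a single row of the CSV file. The neg folder contains about a thousand files, so the newly created CSV should have 1000 rows, with each row holding the complete text of one file. That is not what happens: the data from the last file overwrites the data from the previous files, and the last file's data is spread across multiple rows of the CSV file.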
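
One way to get the one-row-per-file layout described above is to open the output file once, outside the loop, and write each file's full text as a single row. This is a minimal sketch rather than the poster's code: the function name `texts_to_csv`, its parameters, the UTF-8 encoding, and the two-column layout (file name, full text) are assumptions made for illustration.

```python
import csv
import glob
import os


def texts_to_csv(input_dir, output_path):
    """Write every .txt file in input_dir as one row of output_path.

    Hypothetical helper: the name, parameters, and the two-column layout
    (file name, full text) are illustrative assumptions.
    Returns the number of rows written.
    """
    txt_files = sorted(glob.glob(os.path.join(input_dir, "*.txt")))
    # Open the output once so later files do not overwrite earlier ones.
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for txt_file in txt_files:
            with open(txt_file, encoding="utf-8") as in_file:
                text = in_file.read()  # whole file, newlines included
            # One list is one CSV row; the writer quotes embedded newlines.
            writer.writerow([os.path.basename(txt_file), text])
    return len(txt_files)
```

Because the text is passed to `writerow` as a single list element, line breaks inside a review stay inside one quoted field instead of starting new rows.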

score 0 · Accepted Answer

Use csv.DictReader.

import csv

output = []  # collected rows
with open('filename.csv', newline='') as f:
    data = csv.DictReader(f)
    print(data.fieldnames)
    for each in data:
        row = {}
        # check condition code here
        output.append(row)
print(output)

Then append the output data to a CSV file.
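
The append step mentioned above could look roughly like this with `csv.DictWriter`. It is a sketch under assumptions: the helper name `append_rows`, its parameters, and the choice to write the header only when the file is new or empty are illustrative and not part of the answer.

```python
import csv
import os


def append_rows(path, rows, fieldnames):
    """Append dict rows to a CSV file, adding a header if the file is new.

    Hypothetical helper; the name and parameters are illustrative assumptions.
    """
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # Mode "a" appends instead of truncating; newline="" is what the csv
    # module expects for files it writes.
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)
```

Opening with mode "a" is the design choice that matters here: mode "w" would truncate the file on every call, which is the overwrite behavior the question describes.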
