python - How to load data from an xlsx file using Python

Question

This is my xlsx file:

[image: screenshot of the xlsx file, not available]

I want to convert this data into a dict like the following:

{
    0:{
       'a':1,
       'b':100,
       'c':2,
       'd':10
    },
    1:{
       'a':8,
       'b':480,
       'c':3,
       'd':14
    }
...
}

Does anyone know a Python library that can do this, starting at row 124 and ending at row 141?

Thanks

score 1 · Accepted Answer

Suppose you have data like this:

a,b,c,d
1,2,3,4
2,3,4,5
...

One of many possible answers, as of 2014, is:

import pyexcel


r = pyexcel.SeriesReader("yourfile.xlsx")
# make a filter function
filter_func = lambda row_index: row_index < 124 or row_index > 141
# apply the filter on the reader
r.filter(pyexcel.filters.RowIndexFilter(filter_func))
# get the data
data = pyexcel.utils.to_records(r)
print(data)

The data is a list of dictionaries:

[{
   'a':1,
   'b':100,
   'c':2,
   'd':10
},
{
   'a':8,
   'b':480,
   'c':3,
   'd':14
}...
]
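The question asked for a dict keyed by row position (0, 1, ...) rather than a list. A minimal sketch of that last step, using hypothetical sample records shaped like the output above (the values are illustrative, not read from a real file):

```python
# Hypothetical sample records, shaped like the list of dictionaries shown above.
records = [
    {'a': 1, 'b': 100, 'c': 2, 'd': 10},
    {'a': 8, 'b': 480, 'c': 3, 'd': 14},
]

# enumerate() pairs each record with its zero-based position,
# and dict() turns those pairs into {0: {...}, 1: {...}}.
indexed = dict(enumerate(records))
print(indexed[1]['b'])
```

This only reshapes data already in memory, so it should apply to any reader that returns a list of row dictionaries.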

You can read the documentation here.

score 1 · Accepted Answer

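Options with xlrd:

(1) Your xlsx file doesn't look very large; save it as xls.

(2) Use xlrd plus the bolt-on beta-test module xlsxrd (find my e-mail address and ask for it). The combination reads data from xls files and xlsx files seamlessly (same API; it examines the file contents to determine whether the file is xls, xlsx, or an imposter).

In either case, something like the (untested) code below should do what you want: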

from xlrd import open_workbook
# from xlsxrd import open_workbook
# Choose one of the above (xlsxrd is the beta module from option 2)

# These could be function args in real-life code
column_map = {
    # The numbers are zero-relative column indexes
    'a': 1,
    'b': 2,
    'c': 4,
    'd': 6,
    }
first_row_index = 124 - 1  # spreadsheet rows are 1-based, xlrd indexes are 0-based
last_row_index = 141 - 1
file_path = 'your_file.xls'

# The action starts here
book = open_workbook(file_path)
sheet = book.sheet_by_index(0) # first worksheet
key0 = 0  # outer key of the result dict: 0, 1, 2, ...
result = {}
for row_index in range(first_row_index, last_row_index + 1):
    d = {}
    for key1, column_index in column_map.items():
        d[key1] = sheet.cell_value(row_index, column_index)
    result[key0] = d
    key0 += 1
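Option (2) mentions examining the file contents to tell xls from xlsx. How xlsxrd does this is not shown in the answer; the following is only a rough standard-library sketch of the general idea, and `sniff_spreadsheet` is a hypothetical helper name. It assumes an xls file starts with the OLE2 compound-document signature and that an xlsx file is a zip archive holding its workbook part at the conventional path `xl/workbook.xml`.

```python
import zipfile

# First 8 bytes of an OLE2 compound document, the container format used by .xls files.
OLE2_SIGNATURE = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1'


def sniff_spreadsheet(path):
    """Guess 'xls', 'xlsx', or 'unknown' from the file contents, ignoring the extension."""
    with open(path, 'rb') as f:
        if f.read(8) == OLE2_SIGNATURE:
            # Approximation: other OLE2 files (e.g. old Word documents) also match.
            return 'xls'
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as z:
            # Assumption: the workbook part sits at its conventional location.
            if 'xl/workbook.xml' in z.namelist():
                return 'xlsx'
    return 'unknown'
```

A caller could use the result to decide which `open_workbook` to import, treating `'unknown'` as the imposter case.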

score 0 · Accepted Answer

Here is a very rough implementation using only the standard library:

def xlsx(fname):
    import zipfile
    from xml.etree.ElementTree import iterparse
    z = zipfile.ZipFile(fname)
    strings = [el.text for e, el in iterparse(z.open('xl/sharedStrings.xml')) if el.tag.endswith('}t')]
    rows = []
    row = {}
    value = ''
    for e, el in iterparse(z.open('xl/worksheets/sheet1.xml')):
        if el.tag.endswith('}v'): # <v>84</v>
            value = el.text
        if el.tag.endswith('}c'): # <c r="A3" t="s"><v>84</v></c>
            if el.attrib.get('t') == 's':
                value = strings[int(value)]
            letter = el.attrib['r'] # AZ22
            while letter[-1].isdigit():
                letter = letter[:-1]
            row[letter] = value
            value = ''  # reset so a cell without <v> does not inherit the previous value
        if el.tag.endswith('}row'):
            rows.append(row)
            row = {}
    return dict(enumerate(rows))
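The function above returns every row keyed by column letter, so the row range and key names from the question still have to be applied. A minimal sketch of that step follows; `select_rows` and its `column_map` argument are hypothetical names, and it assumes the sheet has no missing rows, so that index 0 corresponds to spreadsheet row 1.

```python
def select_rows(table, first_row, last_row, column_map):
    """Pick 1-based rows first_row..last_row and rename column letters to keys.

    table: dict shaped like the return value of xlsx(), {index: {letter: value}}.
    column_map: hypothetical mapping of output key -> column letter.
    """
    result = {}
    for new_index, row_number in enumerate(range(first_row, last_row + 1)):
        row = table.get(row_number - 1, {})  # table indexes are zero-based
        # Missing cells come back as None instead of raising KeyError.
        result[new_index] = {key: row.get(letter) for key, letter in column_map.items()}
    return result


# Tiny stand-in for a parsed sheet; values are strings because xlsx() returns raw text.
table = {0: {'A': 'x', 'B': 'y'}, 1: {'A': '1', 'B': '100'}, 2: {'A': '8', 'B': '480'}}
picked = select_rows(table, 2, 3, {'a': 'A', 'b': 'B'})
print(picked)
```

For the question's range this would be called as `select_rows(xlsx('yourfile.xlsx'), 124, 141, {...})`, with the letters in the mapping chosen to match the real sheet. Numeric cells would still need converting with `int()` or `float()`.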

score 0 · Accepted Answer

Another option is openpyxl. I've been meaning to try it but haven't gotten around to it yet, so I can't say whether it's good or bad.
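For reference, a rough sketch of what the openpyxl route might look like. This is not from the answer's author. It assumes the keys 'a' to 'd' map to the first four columns of the active sheet, which the question does not confirm; `rows_to_dict` and `load_range` are hypothetical helper names.

```python
def rows_to_dict(header, rows):
    """Turn a header and an iterable of value tuples into {0: {...}, 1: {...}}."""
    return {i: dict(zip(header, row)) for i, row in enumerate(rows)}


def load_range(path, first_row=124, last_row=141, header=('a', 'b', 'c', 'd')):
    """Read 1-based rows first_row..last_row from the active sheet of an xlsx file."""
    import openpyxl  # third-party; imported lazily so rows_to_dict works without it

    wb = openpyxl.load_workbook(path, read_only=True, data_only=True)
    try:
        ws = wb.active
        # values_only=True yields plain tuples of cell values instead of Cell objects.
        rows = ws.iter_rows(min_row=first_row, max_row=last_row,
                            max_col=len(header), values_only=True)
        return rows_to_dict(header, rows)
    finally:
        wb.close()
```

Calling `load_range('yourfile.xlsx')` should then give a dict in the shape the question asks for, provided the column assumption holds.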
