python - Python: 多数の任意のオブジェクトの保存/取得/更新

Question

かなり頻繁に保存、取得、削除したい数百万のレコードがあります。これらの各レコードには「キー」がありますが、「値」は、私が書いていないモジュールメソッドから返された任意の Python オブジェクトであるため、辞書に簡単に変換できません (多くの階層データ構造がjson辞書としてうまく機能し、どのようjsonな場合でも優先されるデータベースであるかどうかはわかりません)。

各エントリを個別のファイルにピクルすることを考えています。より良い方法はありますか？

score 3 · Accepted Answer

モジュールを使用しshelveます。

のように辞書として使用できますが、jsonpickle を使用してオブジェクトを格納します。

Pythonの公式ドキュメントから：

import shelve

d = shelve.open(filename) # open -- file may get suffix added by low-level
                          # library

d[key] = data   # store data at key (overwrites old data if
                # using an existing key)
data = d[key]   # retrieve a COPY of data at key (raise KeyError if no
                # such key)
del d[key]      # delete data stored at key (raises KeyError
                # if no such key)
flag = d.has_key(key)   # true if the key exists
klist = d.keys() # a list of all existing keys (slow!)

# as d was opened WITHOUT writeback=True, beware:
d['xx'] = range(4)  # this works as expected, but...
d['xx'].append(5)   # *this doesn't!* -- d['xx'] is STILL range(4)!

# having opened d without writeback=True, you need to code carefully:
temp = d['xx']      # extracts the copy
temp.append(5)      # mutates the copy
d['xx'] = temp      # stores the copy right back, to persist it

# or, d=shelve.open(filename,writeback=True) would let you just code
# d['xx'].append(5) and have it work as expected, BUT it would also
# consume more memory and make the d.close() operation slower.

d.close()       # close it

score 1 · Accepted Answer

berkeleydb、kyoto Cabinet などのキー/値データベースの使用を評価します。これにより、すべての優れた機能に加えて、ディスク容量のより適切な処理が可能になります。ブロックサイズが 4096B のファイルシステムでは、オブジェクトのサイズに関係なく、100 万個のファイルが最大 4GB を占有します (下限として、オブジェクトが 4096B より大きい場合はサイズが大きくなります)。

python - Python: 多数の任意のオブジェクトの保存/取得/更新

2 に答える 2

Related

Reference