python - The proper way to use PyMySql with a memory-efficient generator

Question

I want to write a generator function, to be run on a memory-constrained system, that uses PyMySql (or MySQLDb) to return the results of a select query one at a time. The following works:

# Execute a select query and yield the result rows one at a time
def SQLSelectGenerator(self, stmt):
    # error handling code removed
    self.cur.execute(stmt)

    # Fetch before testing so the terminating None is never yielded
    row = self.cur.fetchone()
    while row is not None:
        yield row
        row = self.cur.fetchone()

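However, the following also seems to work, and I can't tell whether it is running fetchall() behind the scenes. I can't find anything in the Python DB API that says exactly what happens when you iterate over a cursor object as if it were a list.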

# Execute a select query and yield the result rows one at a time
def SQLSelectGenerator(self, stmt):
    # error handling code removed
    self.cur.execute(stmt)

    for row in self.cur:
        yield row

In both cases, the following prints all the rows successfully:

stmt = "select * from ..."
for l in SQLSelectGenerator(stmt):
    print(l)

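So I would like to know whether the second implementation is better or worse, and whether it calls fetchall or does something tricky with fetchone. There are millions of rows, so fetchall would blow up the system this is going to run on.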

score 3 · Accepted Answer

According to the PyMySql source, doing

for row in self.cur:
    yield row

means that fetchone() is executed repeatedly under the hood, just like in your first example:

class Cursor(object):
    '''
    This is the object you use to interact with the database.
    '''
    ...
    def __iter__(self):
        return iter(self.fetchone, None)
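The two-argument form iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns the sentinel, which is why iterating the cursor amounts to a fetchone() loop. Here is a minimal sketch that can be run without a database. FakeCursor, rows_via_loop and rows_via_iter are hypothetical names made up for illustration, not PyMySql's real classes or functions:

```python
class FakeCursor:
    """Hypothetical stand-in for a DB-API cursor; not PyMySql's real class."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0
        self.fetchone_calls = 0

    def fetchone(self):
        # Return the next row, or None once the rows are exhausted.
        self.fetchone_calls += 1
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def __iter__(self):
        # Same idiom as in the excerpt above: call fetchone() until it returns None.
        return iter(self.fetchone, None)


def rows_via_loop(cur):
    # Explicit fetchone() loop, as in the first example of the question.
    row = cur.fetchone()
    while row is not None:
        yield row
        row = cur.fetchone()


def rows_via_iter(cur):
    # Iterating the cursor directly, as in the second example of the question.
    for row in cur:
        yield row


data = [(1, "a"), (2, "b"), (3, "c")]
loop_result = list(rows_via_loop(FakeCursor(data)))
iter_result = list(rows_via_iter(FakeCursor(data)))
print(loop_result == iter_result)
```

Both generators call fetchone() once per row plus one final call that returns None, so they produce the same rows in the same order.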

So I would say the two approaches are essentially equal in terms of memory usage and performance. You might as well use the second one, since it is cleaner and simpler.
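A related DB-API pattern, not part of the original answer, is to pull rows in fixed-size batches with fetchmany() while still yielding them one at a time. The sketch below uses the standard library's sqlite3 only so that it runs without a MySQL server; select_in_batches and the table t are hypothetical names chosen for illustration. Whether rows are actually streamed from the server or buffered on the client depends on the driver and the cursor class (PyMySql, for example, provides pymysql.cursors.SSCursor as an unbuffered cursor), so treat this as a sketch under those assumptions rather than a guaranteed memory bound.

```python
import sqlite3


def select_in_batches(cur, stmt, batch_size=1000):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    cur.execute(stmt)
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            # An empty batch means the result set is exhausted.
            break
        yield from batch


# sqlite3 stands in for the real database here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

cur = conn.cursor()
rows = list(select_in_batches(cur, "SELECT n FROM t ORDER BY n", batch_size=3))
print(len(rows))
conn.close()
```

The caller still sees a plain generator of rows, and batch_size only controls how many rows are requested from the cursor per call.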
