python - Python: How to use generators to avoid SQL memory issues

Question

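I have the following method that accesses a MySQL database. The query runs on a server where I have no access to change anything regarding memory limits. I am new to generators, have started reading more about them, and thought I could convert this method to use one.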

def getUNames(self):
    globalUserQuery = ur'''SELECT gu_name FROM globaluser WHERE gu_locked = 0'''
    global_user_list = []
    try:
        self.gdbCursor.execute(globalUserQuery)
        rows = self.gdbCursor.fetchall()
        for row in rows:
            uName = unicode(row['gu_name'], 'utf-8')
            global_user_list.append(uName)
        return global_user_list
    except Exception, e:
        traceback.print_exc()

I use this code as follows:

for user_name in getUNames():
    ...

This is the error I was getting from the server side:

^GOut of memory (Needed 725528 bytes)
Traceback (most recent call last):
...
packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
OperationalError: (2008, 'MySQL client ran out of memory')

How should I use a generator to avoid this:

while true:
   self.gdbCursor.execute(globalUserQuery)
   row = self.gdbCursor.fetchone()
   if row is None: break
   yield row

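I am not sure whether the above is the right approach, since I expect a list as the result of my database method. I think it would be great to get a chunk from the query and return it as a list, with the generator fetching the next set for as long as the query keeps returning results.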

score 13 · Accepted Answer

With MySQLdb, the default cursor loads the entire result set into a Python list when the call to cursor.execute(..) is made. For a large query, that can cause a MemoryError whether or not you use a generator.

Instead, use an SSCursor or SSDictCursor. These keep the result set on the server side and let you work through the items of the result set on the client side:

import MySQLdb  
import MySQLdb.cursors as cursors
import traceback

def getUNames(self):
    # You may of course want to define `self.gdbCursor` somewhere else...
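    conn = MySQLdb.connect(..., cursorclass=cursors.SSDictCursor)
    #                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    #                       Set the cursor class to SSDictCursor here, because
    #                       rows are read by column name below (a plain
    #                       SSCursor returns tuples, not dicts)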
    self.gdbCursor = conn.cursor()

    globalUserQuery = ur'''SELECT gu_name FROM globaluser WHERE gu_locked = 0'''
    try:
        self.gdbCursor.execute(globalUserQuery)
        for row in self.gdbCursor:
            uName = unicode(row['gu_name'], 'utf-8')
            yield uName
    except Exception as e:
        traceback.print_exc()
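
The question also asks about getting results back in chunks as lists. One way to sketch that is to wrap the DB-API `fetchmany` call in a generator. The example below is an illustration only: it uses the standard-library `sqlite3` module as a stand-in for MySQLdb so that it runs without a server, and `iter_batches` and `make_demo_connection` are hypothetical names, not part of any library. With MySQLdb the memory benefit would still depend on using SSCursor or SSDictCursor, since the default cursor has already stored the full result on the client.

```python
import sqlite3


def iter_batches(cursor, query, batch_size=1000):
    """Run `query` on a DB-API cursor and yield lists of at most `batch_size` rows."""
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # empty list (or empty tuple) means the result set is exhausted
            break
        yield list(rows)


def make_demo_connection():
    """Build a small in-memory table shaped like the one in the question (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE globaluser (gu_name TEXT, gu_locked INTEGER)")
    conn.executemany(
        "INSERT INTO globaluser VALUES (?, ?)",
        [("alice", 0), ("bob", 1), ("carol", 0), ("dave", 0)],
    )
    return conn


demo_conn = make_demo_connection()
demo_query = "SELECT gu_name FROM globaluser WHERE gu_locked = 0 ORDER BY gu_name"
for batch in iter_batches(demo_conn.cursor(), demo_query, batch_size=2):
    print([row[0] for row in batch])
```

Each iteration hands the caller one list, so only `batch_size` rows need to be held in Python at a time.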

There isn't much documentation on the difference between the default Cursor and the SSCursor. The best source I know of is the docstrings of the Cursor MixIn classes themselves:

The default cursor uses a CursorStoreResultMixIn:

In [2]: import MySQLdb.cursors as cursors
In [8]: print(cursors.CursorStoreResultMixIn.__doc__)
This is a MixIn class which causes the entire result set to be
    stored on the client side, i.e. it uses mysql_store_result(). If the
    result set can be very large, consider adding a LIMIT clause to your
    query, or using CursorUseResultMixIn instead.

The SSCursor uses a CursorUseResultMixIn:

In [9]: print(cursors.CursorUseResultMixIn.__doc__)
This is a MixIn class which causes the result set to be stored
    in the server and sent row-by-row to client side, i.e. it uses
    mysql_use_result(). You MUST retrieve the entire result set and
    close() the cursor before additional queries can be peformed on
    the connection.
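
The docstring's requirement to close the cursor before issuing further queries suggests tying the cursor's lifetime to the generator. A minimal sketch of that idea follows, again using the standard-library `sqlite3` module as a stand-in so it can run anywhere; `iter_names` and `make_demo_connection` are hypothetical helpers, and how an unfinished server-side result is handled on close is up to the MySQLdb driver, which this example does not exercise.

```python
import sqlite3
from contextlib import closing


def iter_names(conn, query):
    """Yield the first column of each row; close the cursor when the generator ends."""
    # closing() calls cursor.close() when the loop finishes, when an error
    # propagates, or when the caller stops early and the generator is closed.
    with closing(conn.cursor()) as cursor:
        cursor.execute(query)
        for row in cursor:
            yield row[0]


def make_demo_connection():
    """Build a small in-memory table shaped like the one in the question (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE globaluser (gu_name TEXT, gu_locked INTEGER)")
    conn.executemany(
        "INSERT INTO globaluser VALUES (?, ?)",
        [("alice", 0), ("bob", 1), ("carol", 0), ("dave", 0)],
    )
    return conn


demo_conn = make_demo_connection()
demo_query = "SELECT gu_name FROM globaluser WHERE gu_locked = 0 ORDER BY gu_name"
names = iter_names(demo_conn, demo_query)
print(next(names))  # consume one name only
names.close()       # stopping early still closes the cursor
```

Calling `close()` on the generator (or letting a `for` loop run to completion) triggers the cleanup, so the caller does not have to manage the cursor separately.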

Since getUNames has been changed into a generator, it would be used like this:

for user_name in self.getUNames():
    ...
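
If the calling code still prefers to work with lists, the generator's output can be grouped on the consumer side without changing the database method. The helper below is a hypothetical sketch (`chunked` is not part of MySQLdb or the standard library); it works with any iterable, so a plain list of names stands in here for the generator.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items taken from `iterable`."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:  # nothing left to take
            return
        yield chunk


# Stand-in for the generator; in the answer's code this would be self.getUNames().
demo_names = ["alice", "carol", "dave", "erin", "frank"]
for group in chunked(demo_names, 2):
    print(group)
```

Only one chunk is held in memory at a time, as long as the underlying iterable is itself lazy.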
