python - What are the connection limits for Google Cloud SQL from App Engine, and how to best reuse DB connections?

Question

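I have a Google App Engine app that uses a Google Cloud SQL instance for storing data. I need my instance to be able to serve hundreds of clients at a time via RESTful calls, each of which results in one or a handful of DB queries. I've wrapped the methods that need DB access and store the handle to the DB connection in os.environ. See this SO question/answer for basically how I'm doing it.

However, as soon as a couple hundred clients connect to my app and trigger database calls, I start getting the following errors in the Google App Engine error logs (and my app returns 500, of course):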

could not connect: ApplicationError: 1033 Instance has too many concurrent requests: 100 Traceback (most recent call last): File "/base/python27_run

Any tips from experienced users of Google App Engine and Google Cloud SQL? Thanks in advance.

Here's the code for the decorator I use around methods that require a DB connection:

def with_db_cursor(do_commit = False):
    """ Decorator for managing DB connection by wrapping around web calls.
    Stores connections and open connection count in the os.environ dictionary
    between calls.  Sets a cursor variable in the wrapped function. Optionally
    does a commit.  Closes the cursor when wrapped method returns, and closes
    the DB connection if there are no outstanding cursors.

    If the wrapped method has a keyword argument 'existing_cursor', whose value
    is non-False, this wrapper is bypassed, as it is assumed another cursor is
    already in force because of an alternate call stack.

    Based mostly on post by : Shay Erlichmen
    At: https://stackoverflow.com/a/10162674/379037
    """

    def method_wrap(method):
        def wrap(*args, **kwargs):
            if kwargs.get('existing_cursor', False):
                # Bypass everything if method called with existing open cursor
                vdbg('Shortcircuiting db wrapper due to existing_cursor')
                return method(None, *args, **kwargs)

            conn = os.environ.get("__data_conn")

            # Recycling connection for the current request
            # For some reason threading.local() didn't work
            # and yes os.environ is supposed to be thread safe 
            if not conn:                    
                conn = _db_connect()
                os.environ["__data_conn"] = conn
                os.environ["__data_conn_ref"] = 1
                dbg('Opening first DB connection via wrapper.')
            else:
                os.environ["__data_conn_ref"] = (os.environ["__data_conn_ref"] + 1)
                vdbg('Reusing existing DB connection. Count using is now: {0}',
                    os.environ["__data_conn_ref"])        
            try:
                cursor = conn.cursor()
                try:
                    result = method(cursor, *args, **kwargs)
                    if do_commit or os.environ.get("__data_conn_commit"):
                        os.environ["__data_conn_commit"] = False
                        dbg('Wrapper executing DB commit.')
                        conn.commit()
                    return result                        
                finally:
                    cursor.close()                    
            finally:
                os.environ["__data_conn_ref"] = (os.environ["__data_conn_ref"] -
                        1)  
                vdbg('One less user of DB connection. Count using is now: {0}',
                    os.environ["__data_conn_ref"])
                if os.environ["__data_conn_ref"] == 0:
                    dbg("No more users of this DB connection. Closing.")
                    os.environ["__data_conn"] = None
                    db_close(conn)
        return wrap
    return method_wrap

def db_close(db_conn):
    if db_conn:
        try:
            db_conn.close()
        except Exception:
            err('Unable to close the DB connection.')
            raise
    else:
        err('Tried to close a non-connected DB handle.')

score 15 · Accepted Answer

Short answer: Your queries are probably too slow, and the mysql server doesn't have enough threads to process all of the requests you are trying to send it.

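Long answer:

As background, Cloud SQL has two limits that are relevant here:

Connections: These correspond to the 'conn' object in your code. There is a corresponding data structure on the server. Once you have too many of these objects (currently configured to 1000), the least recently used one will automatically be closed. When a connection gets closed underneath you, you'll get an unknown connection error (ApplicationError: 1007) the next time you try to use that connection.
Concurrent requests: These are queries that are executing on the server. Each executing query ties up a thread in the server, so there is a limit of 100. When there are too many concurrent requests, subsequent requests are rejected with the error you are getting (ApplicationError: 1033).

It doesn't sound like the connection limit is affecting you, but I wanted to mention it just in case.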
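To tell the two limits apart in application logs, the error code can be pulled out of the message text. The following is only a minimal sketch: the helper name `classify_cloud_sql_error` and the labels it returns are hypothetical, and it assumes the message contains the `ApplicationError: <code>` text shown above.

```python
import re

# Codes and meanings come from the description above; the labels are
# illustrative names chosen for this sketch.
_KNOWN_ERRORS = {
    1007: "connection_closed",             # connection was closed server-side
    1033: "too_many_concurrent_requests",  # over the concurrent request limit
}

_CODE_RE = re.compile(r"ApplicationError:\s*(\d+)")


def classify_cloud_sql_error(message):
    """Return a short label for a Cloud SQL error message, or None."""
    match = _CODE_RE.search(message)
    if match is None:
        return None  # not an ApplicationError message
    return _KNOWN_ERRORS.get(int(match.group(1)), "unknown")
```

A 1007 suggests opening a fresh connection, while a 1033 suggests backing off rather than retrying at once, since a retry counts against the same limit.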

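When it comes to concurrent requests, increasing the limit might help, but it usually makes the problem worse. There are two cases we've seen in the past:

Deadlock: A long-running query is locking a critical row of the database. All subsequent queries block on that lock. The app times out on those queries, but they keep running on the server, tying up those threads until the deadlock timeout triggers.
Slow queries: Each query is really, really slow. This usually happens when the query requires a temporary file sort. The application times out and retries the query while the first attempt is still running and counting against the concurrent request limit. If you can find your average query time, you can estimate how many QPS your mysql instance can support (e.g. 5 ms per query means 200 QPS per thread; since there are 100 threads, you could do 20,000 QPS. 50 ms per query means 2,000 QPS).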
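The QPS estimate above is simple arithmetic and can be written out as follows. The function name is made up for illustration, and the result is a rough upper bound that assumes every thread stays busy and all queries take the average time.

```python
def estimated_max_qps(avg_query_ms, threads=100):
    """Rough upper bound on queries per second.

    Each thread can finish 1000 / avg_query_ms queries per second,
    and up to `threads` queries run concurrently.
    """
    if avg_query_ms <= 0:
        raise ValueError("avg_query_ms must be positive")
    return (1000.0 / avg_query_ms) * threads


print(estimated_max_qps(5))   # 20000.0
print(estimated_max_qps(50))  # 2000.0
```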

You should use EXPLAIN and SHOW ENGINE INNODB STATUS to see which of the two problems is going on.
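One way to run both statements from application code is sketched below. The helper name `diagnose` is hypothetical, and `cursor` is assumed to be any DB-API 2.0 cursor connected to the MySQL instance.

```python
def diagnose(cursor, query, params=None):
    """Run EXPLAIN on `query` and fetch the InnoDB status report.

    Returns a (plan_rows, status_rows) tuple. `query` should be the
    SELECT statement suspected of being slow.
    """
    if params is None:
        cursor.execute("EXPLAIN " + query)
    else:
        cursor.execute("EXPLAIN " + query, params)
    plan_rows = cursor.fetchall()

    cursor.execute("SHOW ENGINE INNODB STATUS")
    status_rows = cursor.fetchall()
    return plan_rows, status_rows
```

In the EXPLAIN output, "Using temporary" or "Using filesort" in the Extra column points toward the slow-query case. In the InnoDB status text, the TRANSACTIONS and LATEST DETECTED DEADLOCK sections are the places to look for lock waits.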

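Of course, it is also possible that you are just driving a ton of traffic at your instance and there simply aren't enough threads. In that case, you'll probably be maxing out the CPU for the instance anyway, so adding more threads won't help.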

score 5 · Accepted Answer

I read the documentation and noticed there's a limit of 12 connections per instance.

Look for "Each App Engine instance cannot have more than 12 concurrent connections to a Google Cloud SQL instance" at https://developers.google.com/appengine/docs/python/cloud-sql/
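One way to stay under a per-instance cap while still reusing connections is a small bounded pool. This is a minimal sketch, not the asker's method: the class name `BoundedConnectionPool` is hypothetical, `connect` is any zero-argument function returning a DB-API connection, and the default of 12 is taken from the documentation quote above and may not match current limits. It is written for Python 3 (on Python 2.7 the `queue` module is named `Queue`).

```python
import contextlib
import queue
import threading


class BoundedConnectionPool(object):
    """Hands out at most `max_connections` connections at a time."""

    def __init__(self, connect, max_connections=12):
        self._connect = connect
        self._idle = queue.LifoQueue()
        self._slots = threading.BoundedSemaphore(max_connections)

    @contextlib.contextmanager
    def connection(self):
        self._slots.acquire()  # blocks while every slot is in use
        try:
            try:
                conn = self._idle.get_nowait()  # reuse an idle connection
            except queue.Empty:
                conn = self._connect()          # or open a new one
            try:
                yield conn
            except Exception:
                conn.close()  # do not reuse a possibly broken connection
                raise
            else:
                self._idle.put(conn)  # keep it for the next caller
        finally:
            self._slots.release()
```

With the question's code, this could be used as `pool = BoundedConnectionPool(_db_connect)` and then `with pool.connection() as conn:` around each unit of work. Callers beyond the cap wait for a free slot instead of opening a thirteenth connection.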
