django - Per-request cache in Django?

Question

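I would like to implement a decorator that provides per-request caching to arbitrary methods, not just views. Here is an example use case.

I have a custom tag that determines whether a record in a long list of records is a "favorite". To check whether an item is a favorite, you have to query the database. Ideally, you would run one query to get all the favorites, and then just check each record against that cached list.

One solution is to get all the favorites in the view, pass that set into the template, and then pass it into each tag call.

Alternatively, the tag itself could perform the query, but only the first time it is called. The result could then be cached for subsequent calls. The upside is that you can use this tag from any template, in any view, without alerting the view.

With the existing caching mechanism, you could just cache the result for 50 ms and assume that it correlates with the current request. I want to make that correlation reliable.

Here is an example of the tag I currently have.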

@register.filter()
def is_favorite(record, request):

    if "get_favorites" in request.POST:
        favorites = request.POST["get_favorites"]
    else:

        favorites = get_favorites(request.user)

        post = request.POST.copy()
        post["get_favorites"] = favorites
        request.POST = post

    return record in favorites

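Is there a way to get the current request object from Django? From a tag, I could just pass in request, which will always exist. But I would like to use this decorator from other functions as well.

Is there an existing implementation of a per-request cache?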

score 27 · Accepted Answer

Using a custom middleware, you can get a Django cache instance that is guaranteed to be cleared for each request.

This is what I used in a project:

from threading import currentThread
from django.core.cache.backends.locmem import LocMemCache

_request_cache = {}
_installed_middleware = False

def get_request_cache():
    assert _installed_middleware, 'RequestCacheMiddleware not loaded'
    return _request_cache[currentThread()]

# LocMemCache is a threadsafe local memory cache
class RequestCache(LocMemCache):
    def __init__(self):
        name = 'locmemcache@%i' % hash(currentThread())
        params = dict()
        super(RequestCache, self).__init__(name, params)

class RequestCacheMiddleware(object):
    def __init__(self):
        global _installed_middleware
        _installed_middleware = True

    def process_request(self, request):
        cache = _request_cache.get(currentThread()) or RequestCache()
        _request_cache[currentThread()] = cache

        cache.clear()

To use the middleware, register it in settings.py, e.g.:

MIDDLEWARE_CLASSES = (
    ...
    'myapp.request_cache.RequestCacheMiddleware'
)

You can then use the cache as follows:

from myapp.request_cache import get_request_cache

cache = get_request_cache()

Refer to the Django low-level cache API docs for more information.

Django low-level cache API

It should be easy to modify a memoize decorator to use the request cache. Have a look at the Python Decorator Library for a good example of a memoize decorator.

Python Decorator Library
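For what it's worth, here is a rough sketch of what such a memoize decorator could look like. It is not taken from the Python Decorator Library. The name request_memoize is made up for illustration, and get_request_cache() here is a dict-based stand-in for the middleware-backed function above so that the snippet runs without Django. With a real Django cache you would call cache.get(key, default) and cache.set(key, value) and build a string key instead.

```python
import functools
import threading

_local = threading.local()
_MISSING = object()  # sentinel, so that cached None/False values still count as hits


def get_request_cache():
    # Stand-in (assumption): a plain per-thread dict instead of the
    # middleware-managed Django cache from the answer above.
    if not hasattr(_local, "cache"):
        _local.cache = {}
    return _local.cache


def request_memoize(func):
    """Memoize func in whatever get_request_cache() returns (hypothetical helper)."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Arguments must be hashable because they become part of the key.
        key = (func.__module__, func.__qualname__, args, tuple(sorted(kwargs.items())))
        cache = get_request_cache()
        result = cache.get(key, _MISSING)
        if result is _MISSING:
            result = func(*args, **kwargs)
            cache[key] = result
        return result

    return wrapper
```

Decorating get_favorites with @request_memoize would then run the query once per cache lifetime for a given user, provided something clears the cache at the start of each request.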

score 4 · Accepted Answer

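EDIT:

The eventual solution I came up with has been compiled into a PyPI package: https://pypi.org/project/django-request-cache/

EDIT 2016-06-15:

I discovered a significantly simpler solution to this problem, and winced at not having realized how easy this should have been from the start.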

from django.core.cache.backends.base import BaseCache
from django.core.cache.backends.locmem import LocMemCache
from django.utils.synch import RWLock


class RequestCache(LocMemCache):
    """
    RequestCache is a customized LocMemCache which stores its data cache as an instance attribute, rather than
    a global. It's designed to live only as long as the request object that RequestCacheMiddleware attaches it to.
    """

    def __init__(self):
        # We explicitly do not call super() here, because while we want BaseCache.__init__() to run, we *don't*
        # want LocMemCache.__init__() to run, because that would store our caches in its globals.
        BaseCache.__init__(self, {})

        self._cache = {}
        self._expire_info = {}
        self._lock = RWLock()

class RequestCacheMiddleware(object):
    """
    Creates a fresh cache instance as request.cache. The cache instance lives only as long as request does.
    """

    def process_request(self, request):
        request.cache = RequestCache()

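With this, you can use request.cache as a cache instance that lives only as long as the request does, and it will be fully cleaned up by the garbage collector when the request is done.

If you need access to the request object from a context where it is not normally available, you can use one of the various implementations of a so-called "global request middleware" that can be found online.

** Initial answer: **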
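In case it helps, a "global request middleware" usually boils down to something like the sketch below. This is a minimal version under assumptions, not any particular package's code: the names GlobalRequestMiddleware and get_current_request are made up, it uses the newer callable middleware style (get_response passed to __init__) rather than the process_request style shown above, and it assumes each request is handled synchronously on a single thread.

```python
import threading

_state = threading.local()


def get_current_request():
    """Return the request being handled on this thread, or None outside a request."""
    return getattr(_state, "request", None)


class GlobalRequestMiddleware:
    """Hypothetical middleware that exposes the current request via a thread-local."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        _state.request = request
        try:
            return self.get_response(request)
        finally:
            # Drop the reference so the request (and anything attached to it,
            # such as request.cache) can be garbage collected.
            _state.request = None
```

Code that has no request argument could then look up get_current_request().cache, after checking that the result is not None.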

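The major problem that no other solution here addresses is the fact that LocMemCache leaks memory when you create and destroy several of them over the life of a single process. django.core.cache.backends.locmem defines several global dictionaries that hold references to every LocMemCache instance's cache data, and those dictionaries are never emptied.

The following code solves this problem. It started as a combination of @href_'s answer and the cleaner logic used by the code linked in @squarelogic.hayden's comment, which I then refined further.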

from uuid import uuid4
from threading import current_thread

from django.core.cache.backends.base import BaseCache
from django.core.cache.backends.locmem import LocMemCache
from django.utils.synch import RWLock


# Global in-memory store of cache data. Keyed by name, to provide multiple
# named local memory caches.
_caches = {}
_expire_info = {}
_locks = {}


class RequestCache(LocMemCache):
    """
    RequestCache is a customized LocMemCache with a destructor, ensuring that creating
    and destroying RequestCache objects over and over doesn't leak memory.
    """

    def __init__(self):
        # We explicitly do not call super() here, because while we want
        # BaseCache.__init__() to run, we *don't* want LocMemCache.__init__() to run.
        BaseCache.__init__(self, {})

        # Use a name that is guaranteed to be unique for each RequestCache instance.
        # This ensures that it will always be safe to call del _caches[self.name] in
        # the destructor, even when multiple threads are doing so at the same time.
        self.name = uuid4()
        self._cache = _caches.setdefault(self.name, {})
        self._expire_info = _expire_info.setdefault(self.name, {})
        self._lock = _locks.setdefault(self.name, RWLock())

    def __del__(self):
        del _caches[self.name]
        del _expire_info[self.name]
        del _locks[self.name]


class RequestCacheMiddleware(object):
    """
    Creates a cache instance that persists only for the duration of the current request.
    """

    _request_caches = {}

    def process_request(self, request):
        # The RequestCache object is keyed on the current thread because each request is
        # processed on a single thread, allowing us to retrieve the correct RequestCache
        # object in the other functions.
        self._request_caches[current_thread()] = RequestCache()

    def process_response(self, request, response):
        self.delete_cache()
        return response

    def process_exception(self, request, exception):
        self.delete_cache()

    @classmethod
    def get_cache(cls):
        """
        Retrieve the current request's cache.

        Returns None if RequestCacheMiddleware is not currently installed via 
        MIDDLEWARE_CLASSES, or if there is no active request.
        """
        return cls._request_caches.get(current_thread())

    @classmethod
    def clear_cache(cls):
        """
        Clear the current request's cache.
        """
        cache = cls.get_cache()
        if cache:
            cache.clear()

    @classmethod
    def delete_cache(cls):
        """
        Delete the current request's cache object to avoid leaking memory.
        """
        cache = cls._request_caches.pop(current_thread(), None)
        del cache

score 3 · Accepted Answer

I came up with a hack for caching things straight into the request object (instead of using the standard cache, which will be tied to memcached, file, database, etc.).

# get the request object's dictionary (or rather, the dictionary of one of its methods)
mycache = request.get_host.__dict__

# check whether we already have our value cached and return it
if mycache.get( 'c_category', False ):
    return mycache['c_category']
else:
    # get some object from the database (a category object in this case)
    c = Category.objects.get( id = cid )

    # cache the database object into a new key in the request object
    mycache['c_category'] = c

    return c

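So, basically, I am just storing the cached value (a category object in this case) under a new key 'c_category' in the request's dictionary. Or, to be more precise, because just creating a key on the request object is not enough, I am adding the key to one of the request object's methods, get_host().

Georgy.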
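One caveat worth checking before relying on this: as far as I can tell, in Python 3 the __dict__ of a bound method resolves to the __dict__ of the underlying function, which is shared by every request in the process, so the value may outlive the request. A variant that stores the cache as an ordinary attribute on the request itself could look like the sketch below. FakeRequest, get_category, load_category and the attribute name _category_cache are all made up for illustration. FakeRequest only stands in for django.http.HttpRequest so the snippet runs without Django.

```python
class FakeRequest:
    """Stand-in for django.http.HttpRequest, used only for this illustration."""

    def get_host(self):
        return "example.com"


def get_category(request, cid, load_category):
    """Return the category for cid, calling load_category at most once per request."""
    cache = getattr(request, "_category_cache", None)
    if cache is None:
        # First use on this request: attach an empty dict to the instance.
        cache = request._category_cache = {}
    if cid not in cache:
        cache[cid] = load_category(cid)
    return cache[cid]
```

In a real project, load_category would be something like lambda cid: Category.objects.get(id=cid), and the dict disappears together with the request object.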

score 3 · Accepted Answer

Years later, a super hack to cache SELECT statements inside a single Django request. You need to execute the patch() method early in the request scope, for example in a piece of middleware.

from threading import local
import itertools
from django.db.models.sql.constants import MULTI
from django.db.models.sql.compiler import SQLCompiler
from django.db.models.sql.datastructures import EmptyResultSet
from django.db.models.sql.constants import GET_ITERATOR_CHUNK_SIZE


_thread_locals = local()


def get_sql(compiler):
    ''' get a tuple of the SQL query and the arguments '''
    try:
        return compiler.as_sql()
    except EmptyResultSet:
        pass
    return ('', [])


def execute_sql_cache(self, result_type=MULTI):

    if hasattr(_thread_locals, 'query_cache'):

        sql = get_sql(self)  # ('SELECT * FROM ...', (50)) <= sql string, args tuple
        if sql[0][:6].upper() == 'SELECT':

            # uses the tuple of sql + args as the cache key
            if sql in _thread_locals.query_cache:
                return _thread_locals.query_cache[sql]

            result = self._execute_sql(result_type)
            if hasattr(result, 'next'):

                # only cache if this is not a full first page of a chunked set
                peek = result.next()
                result = list(itertools.chain([peek], result))

                if len(peek) == GET_ITERATOR_CHUNK_SIZE:
                    return result

            _thread_locals.query_cache[sql] = result

            return result

        else:
            # the database has been updated; throw away the cache
            _thread_locals.query_cache = {}

    return self._execute_sql(result_type)


def patch():
    ''' patch the django query runner to use our own method to execute sql '''
    _thread_locals.query_cache = {}
    if not hasattr(SQLCompiler, '_execute_sql'):
        SQLCompiler._execute_sql = SQLCompiler.execute_sql
        SQLCompiler.execute_sql = execute_sql_cache

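The patch() method replaces Django's internal execute_sql method with a stand-in called execute_sql_cache. That method looks at the SQL to be run, and if it is a SELECT statement, it checks a thread-local cache first. Only if the result is not found in the cache does it proceed to execute the SQL. On any other type of SQL statement, it blows away the cache. There is some logic to avoid caching large result sets, meaning anything over 100 records. This is to preserve Django's lazy query set evaluation.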
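Stripped of the Django internals, the caching policy described above can be sketched roughly as follows. This is only an illustration of the idea: SelectCache and run_query are made-up names, nothing here touches SQLCompiler, and the large-result-set check is left out.

```python
class SelectCache:
    """Cache SELECT results by (sql, params); any other statement clears the cache."""

    def __init__(self):
        self._results = {}

    def execute(self, sql, params, run_query):
        # run_query is a hypothetical callable that actually hits the database.
        if not sql.lstrip().upper().startswith("SELECT"):
            # A write may have changed the data, so throw the whole cache away.
            self._results.clear()
            return run_query(sql, params)

        key = (sql, tuple(params))
        if key not in self._results:
            self._results[key] = run_query(sql, params)
        return self._results[key]
```

Clearing everything on any write is crude but safe, which matches the "blow away the cache" behaviour of the hack.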

score 2 · Accepted Answer

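This one uses a Python dict as the cache (not Django's cache), and is dead simple and lightweight.

Whenever the thread is destroyed, its cache is destroyed automatically too.
It does not require any middleware, and the content is not pickled and unpickled on every access, which makes it faster.
Tested, and works with gevent's monkeypatching.

The same could probably be implemented with thread-local storage. I am not aware of any downsides of this approach. Feel free to add them in the comments.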

from threading import currentThread
import weakref

_request_cache = weakref.WeakKeyDictionary()

def get_request_cache():
    return _request_cache.setdefault(currentThread(), {})
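For comparison, the thread-local variant mentioned above might look like the sketch below. The function reset_request_cache is my own addition, not part of the answer. It is there on the assumption that a server may reuse worker threads across requests, in which case something would have to empty the dict at the start of each request.

```python
import threading

_request_local = threading.local()


def get_request_cache():
    """Return a plain dict that is private to the current thread."""
    try:
        return _request_local.cache
    except AttributeError:
        _request_local.cache = {}
        return _request_local.cache


def reset_request_cache():
    """Empty the current thread's cache (hypothetical helper, e.g. per request)."""
    get_request_cache().clear()
```

Like the WeakKeyDictionary version, the data goes away when the thread does, since thread-local values are released with their thread.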

score 1 · Accepted Answer

You can always do the caching manually.

    ...
    if "get_favorites" in request.POST:
        favorites = request.POST["get_favorites"]
    else:
        from django.core.cache import cache

        favorites = cache.get(request.user.username)
        if not favorites:
            favorites = get_favorites(request.user)
            cache.set(request.user.username, favorites, seconds)
    ...

score 1 · Accepted Answer

The answer from @href_ is great.

Just in case you want something shorter that could potentially also do the trick:

import time
from functools import lru_cache  # django.utils.lru_cache was removed in Django 3.0

def cached_call(func, *args, **kwargs):
    """Very basic temporary cache, will cache results
    for average of 1.5 sec and no more than 3 sec"""
    return _cached_call(int(time.time() / 3), func, *args, **kwargs)


@lru_cache(maxsize=100)
def _cached_call(time, func, *args, **kwargs):
    return func(*args, **kwargs)

Then get the favourites by calling it like this:

favourites = cached_call(get_favourites, request.user)

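This method makes use of an LRU cache and, by combining it with a timestamp, makes sure that the cache does not hold anything for longer than a few seconds. If you need to call a costly function several times in a short period of time, this solves the problem.

It is not a perfect way to invalidate the cache, because it will occasionally miss on very recent data: int(..2.99.. / 3) followed by int(..3.00.. / 3). Despite this drawback, it can still be very effective for the majority of hits.

Also, as a bonus, you can use it outside the request/response cycle, for example in Celery tasks or management command jobs.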
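To make the boundary effect concrete, here is a small sketch of the same idea with an injectable clock, so the bucket change can be stepped through by hand. make_cached_call and the clock parameter are made up for this illustration, and keyword arguments are dropped to keep it short.

```python
from functools import lru_cache


def make_cached_call(clock, window=3):
    """Build a cached_call whose time bucket comes from clock() (hypothetical helper)."""

    @lru_cache(maxsize=100)
    def _cached_call(bucket, func, *args):
        # bucket is only part of the cache key; it is not used in the call itself.
        return func(*args)

    def cached_call(func, *args):
        return _cached_call(int(clock() / window), func, *args)

    return cached_call


now = [2.99]
calls = []


def get_favourites(user):
    calls.append(user)
    return ("a", "b")


cached_call = make_cached_call(lambda: now[0])
cached_call(get_favourites, "alice")  # bucket int(2.99 / 3) == 0, runs the function
now[0] = 3.00
cached_call(get_favourites, "alice")  # bucket int(3.00 / 3) == 1, runs it again
now[0] = 5.90
cached_call(get_favourites, "alice")  # still bucket 1, served from the cache
```

So a result cached just before a bucket boundary is only reused for a fraction of a second, while one cached just after it can be reused for almost the full 3 seconds.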
