python - Python2.7のメモ化ライブラリ

Question

Python3.2にはfunctoolsライブラリのデコレータとしてメモ化が含まれていることがわかります。 http://docs.python.org/py3k/library/functools.html#functools.lru_cache

残念ながら、まだ2.7にバックポートされていません。2.7で利用できない特別な理由はありますか？同じ機能を提供するサードパーティのライブラリはありますか、それとも自分で作成する必要がありますか？

score 43 · Accepted Answer

2.7で利用できない特別な理由はありますか？

@Nirkはすでに理由を提供しています。残念ながら、2.xラインはバグ修正のみを受け取り、新機能は3.x専用に開発されています。

同じ機能を提供するサードパーティのライブラリはありますか？

repoze.lruは、Python 2.6、Python 2.7、およびPython3.2のLRUキャッシュ実装です。

ドキュメントとソースコードはGitHubで入手できます。

簡単な使用法：

from repoze.lru import lru_cache

@lru_cache(maxsize=500)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

score 30 · Accepted Answer

Python 2.7およびPyPyで使用するためのPython3.2.3functoolsのモジュールのバックポートがあります：functools32。

lru_cacheデコレータが含まれています。

score 24 · Accepted Answer

私も同じ状況で、自分で実装することを余儀なくされました。python3.xの実装には他にもいくつかの問題がありました。

主な問題は、インスタンスごとに個別のキャッシュを有効にしないことです（キャッシュされる関数がインスタンスメソッドの場合）。つまり、キャッシュに最大サイズを100に設定し、100個のインスタンスがあり、すべてが等しくアクティブである場合、キャッシュは事実上何もしません。
- また、clear_cacheを実行すると、すべてのインスタンスのキャッシュがクリアされます。
2つ目の重要な点は、X秒ごとにキャッシュをクリアするタイムアウト機能が必要だったことです。

Python 2.7の関数lru_cacheの実装：

import time
import functools
import collections

def lru_cache(maxsize = 255, timeout = None):
    """lru_cache(maxsize = 255, timeout = None) --> returns a decorator which returns an instance (a descriptor).

        Purpose         - This decorator factory will wrap a function / instance method and will supply a caching mechanism to the function.
                            For every given input params it will store the result in a queue of maxsize size, and will return a cached ret_val
                            if the same parameters are passed.

        Params          - maxsize - int, the cache size limit, anything added above that will delete the first values enterred (FIFO).
                            This size is per instance, thus 1000 instances with maxsize of 255, will contain at max 255K elements.
                        - timeout - int / float / None, every n seconds the cache is deleted, regardless of usage. If None - cache will never be refreshed.

        Notes           - If an instance method is wrapped, each instance will have it's own cache and it's own timeout.
                        - The wrapped function will have a cache_clear variable inserted into it and may be called to clear it's specific cache.
                        - The wrapped function will maintain the original function's docstring and name (wraps)
                        - The type of the wrapped function will no longer be that of a function but either an instance of _LRU_Cache_class or a functool.partial type.

        On Error        - No error handling is done, in case an exception is raised - it will permeate up.
    """

    class _LRU_Cache_class(object):
        def __init__(self, input_func, max_size, timeout):
            self._input_func        = input_func
            self._max_size          = max_size
            self._timeout           = timeout

            # This will store the cache for this function, format - {caller1 : [OrderedDict1, last_refresh_time1], caller2 : [OrderedDict2, last_refresh_time2]}.
            #   In case of an instance method - the caller is the instance, in case called from a regular function - the caller is None.
            self._caches_dict        = {}

        def cache_clear(self, caller = None):
            # Remove the cache for the caller, only if exists:
            if caller in self._caches_dict:
                del self._caches_dict[caller]
                self._caches_dict[caller] = [collections.OrderedDict(), time.time()]

        def __get__(self, obj, objtype):
            """ Called for instance methods """
            return_func = functools.partial(self._cache_wrapper, obj)
            return_func.cache_clear = functools.partial(self.cache_clear, obj)
            # Return the wrapped function and wraps it to maintain the docstring and the name of the original function:
            return functools.wraps(self._input_func)(return_func)

        def __call__(self, *args, **kwargs):
            """ Called for regular functions """
            return self._cache_wrapper(None, *args, **kwargs)
        # Set the cache_clear function in the __call__ operator:
        __call__.cache_clear = cache_clear


        def _cache_wrapper(self, caller, *args, **kwargs):
            # Create a unique key including the types (in order to differentiate between 1 and '1'):
            kwargs_key = "".join(map(lambda x : str(x) + str(type(kwargs[x])) + str(kwargs[x]), sorted(kwargs)))
            key = "".join(map(lambda x : str(type(x)) + str(x) , args)) + kwargs_key

            # Check if caller exists, if not create one:
            if caller not in self._caches_dict:
                self._caches_dict[caller] = [collections.OrderedDict(), time.time()]
            else:
                # Validate in case the refresh time has passed:
                if self._timeout != None:
                    if time.time() - self._caches_dict[caller][1] > self._timeout:
                        self.cache_clear(caller)

            # Check if the key exists, if so - return it:
            cur_caller_cache_dict = self._caches_dict[caller][0]
            if key in cur_caller_cache_dict:
                return cur_caller_cache_dict[key]

            # Validate we didn't exceed the max_size:
            if len(cur_caller_cache_dict) >= self._max_size:
                # Delete the first item in the dict:
                cur_caller_cache_dict.popitem(False)

            # Call the function and store the data in the cache (call it with the caller in case it's an instance function - Ternary condition):
            cur_caller_cache_dict[key] = self._input_func(caller, *args, **kwargs) if caller != None else self._input_func(*args, **kwargs)
            return cur_caller_cache_dict[key]


    # Return the decorator wrapping the class (also wraps the instance to maintain the docstring and the name of the original function):
    return (lambda input_func : functools.wraps(input_func)(_LRU_Cache_class(input_func, maxsize, timeout)))

ユニットテストコード：

#!/usr/bin/python
# -*- coding: utf-8 -*-
import time
import random
import unittest
import lru_cache

class Test_Decorators(unittest.TestCase):
    def test_decorator_lru_cache(self):
        class LRU_Test(object):
            """class"""
            def __init__(self):
                self.num = 0

            @lru_cache.lru_cache(maxsize = 10, timeout = 3)
            def test_method(self, num):
                """test_method_doc"""
                self.num += num
                return self.num

        @lru_cache.lru_cache(maxsize = 10, timeout = 3)
        def test_func(num):
            """test_func_doc"""
            return num

        @lru_cache.lru_cache(maxsize = 10, timeout = 3)
        def test_func_time(num):
            """test_func_time_doc"""
            return time.time()

        @lru_cache.lru_cache(maxsize = 10, timeout = None)
        def test_func_args(*args, **kwargs):
            return random.randint(1,10000000)



        # Init vars:
        c1 = LRU_Test()
        c2 = LRU_Test()
        m1 = c1.test_method
        m2 = c2.test_method
        f1 = test_func

        # Test basic caching functionality:
        self.assertEqual(m1(1), m1(1)) 
        self.assertEqual(c1.num, 1)     # c1.num now equals 1 - once cached, once real
        self.assertEqual(f1(1), f1(1))

        # Test caching is different between instances - once cached, once not cached:
        self.assertNotEqual(m1(2), m2(2))
        self.assertNotEqual(m1(2), m2(2))

        # Validate the cache_clear funcionality only on one instance:
        prev1 = m1(1)
        prev2 = m2(1)
        prev3 = f1(1)
        m1.cache_clear()
        self.assertNotEqual(m1(1), prev1)
        self.assertEqual(m2(1), prev2)
        self.assertEqual(f1(1), prev3)

        # Validate the docstring and the name are set correctly:
        self.assertEqual(m1.__doc__, "test_method_doc")
        self.assertEqual(f1.__doc__, "test_func_doc")
        self.assertEqual(m1.__name__, "test_method")
        self.assertEqual(f1.__name__, "test_func")

        # Test the limit of the cache, cache size is 10, fill 15 vars, the first 5 will be overwritten for each and the other 5 are untouched. Test that:
        c1.num = 0
        c2.num = 10
        m1.cache_clear()
        m2.cache_clear()
        f1.cache_clear()
        temp_list = map(lambda i : (test_func_time(i), m1(i), m2(i)), range(15))

        for i in range(5, 10):
            self.assertEqual(temp_list[i], (test_func_time(i), m1(i), m2(i)))
        for i in range(0, 5):
            self.assertNotEqual(temp_list[i], (test_func_time(i), m1(i), m2(i)))
        # With the last run the next 5 vars were overwritten, now it should have only 0..4 and 10..14:
        for i in range(5, 10):
            self.assertNotEqual(temp_list[i], (test_func_time(i), m1(i), m2(i)))

        # Test different vars don't collide:
        self.assertNotEqual(test_func_args(1), test_func_args('1'))
        self.assertNotEqual(test_func_args(1.0), test_func_args('1.0'))
        self.assertNotEqual(test_func_args(1.0), test_func_args(1))
        self.assertNotEqual(test_func_args(None), test_func_args('None'))
        self.assertEqual(test_func_args(test_func), test_func_args(test_func))
        self.assertEqual(test_func_args(LRU_Test), test_func_args(LRU_Test))
        self.assertEqual(test_func_args(object), test_func_args(object))
        self.assertNotEqual(test_func_args(1, num = 1), test_func_args(1, num = '1'))
        # Test the sorting of kwargs:
        self.assertEqual(test_func_args(1, aaa = 1, bbb = 2), test_func_args(1, bbb = 2, aaa = 1))
        self.assertNotEqual(test_func_args(1, aaa = '1', bbb = 2), test_func_args(1, bbb = 2, aaa = 1))


        # Sanity validation of values
        c1.num = 0
        c2.num = 10
        m1.cache_clear()
        m2.cache_clear()
        f1.cache_clear()
        self.assertEqual((f1(0), m1(0), m2(0)), (0, 0, 10))
        self.assertEqual((f1(0), m1(0), m2(0)), (0, 0, 10))
        self.assertEqual((f1(1), m1(1), m2(1)), (1, 1, 11))
        self.assertEqual((f1(2), m1(2), m2(2)), (2, 3, 13))
        self.assertEqual((f1(2), m1(2), m2(2)), (2, 3, 13))
        self.assertEqual((f1(3), m1(3), m2(3)), (3, 6, 16))
        self.assertEqual((f1(3), m1(3), m2(3)), (3, 6, 16))
        self.assertEqual((f1(4), m1(4), m2(4)), (4, 10, 20))
        self.assertEqual((f1(4), m1(4), m2(4)), (4, 10, 20))

        # Test timeout - sleep, it should refresh cache, and then check it was cleared:
        prev_time = test_func_time(0)
        self.assertEqual(test_func_time(0), prev_time)
        self.assertEqual(m1(4), 10)
        self.assertEqual(m2(4), 20)
        time.sleep(3.5)
        self.assertNotEqual(test_func_time(0), prev_time)
        self.assertNotEqual(m1(4), 10)
        self.assertNotEqual(m2(4), 20)


if __name__ == '__main__':
    unittest.main()

score 3 · Accepted Answer

http://www.python.org/download/releases/3.2.3/

Python 2.7の最終リリース以降、2.x行はバグ修正のみを受け取り、新機能は3.x専用に開発されています。

Python 2.7には3.1からのいくつかの機能がありますが、lru_cacheは3.2で追加されました

コメントで特定されているように、http：//code.activestate.com/recipes/578078-py26-and-py30-backport-of-python-33s-lru-cache/は潜在的なソリューションです

python - Python2.7のメモ化ライブラリ

4 に答える 4

Python 2.7の関数lru_cacheの実装：

ユニットテストコード：

Related

Reference