python - `scipy.stat.distributions` の組み込みの確率密度関数は、ユーザーが提供するものより遅いですか?

Question

配列 : があり、パラメーターおよびadata=array([0.5, 1.,2.,3.,6.,10.])が与えられた場合に、この配列のワイブル分布の対数尤度を計算したいとします。で既に提供されているため、このタスクのために独自のワイブル確率密度関数を記述する必要があるとは考えていませんでした。したがって、これはそれを行う必要があります：[5.,1.5][5.1,1.6]scipy.stat.distributions

from scipy import stats
from numpy import *
adata=array([0.5, 1.,2.,3.,6.,10.])
def wb2LL(p, x): #log-likelihood of 2 parameter Weibull distribution
    return sum(log(stats.weibull_min.pdf(x, p[1], 0., p[0])), axis=1)

と：

>>> wb2LL(array([[5.,1.5],[5.1,1.6]]).T[...,newaxis], adata)
array([-14.43743911, -14.68835298])

または、車輪を再発明して、次のような新しいワイブル pdf 関数を作成します。

wbp=lambda p, x: p[1]/p[0]*((x/p[0])**(p[1]-1))*exp(-((x/p[0])**p[1]))
def wb2LL1(p, x): #log-likelihood of 2 paramter Weibull
    return sum(log(wbp(p,x)), axis=1)

と：

>>> wb2LL1(array([[5.,1.5],[5.1,1.6]]).T[...,newaxis], adata)
array([-14.43743911, -14.68835298])

確かに、何かがすでににある場合scipy、それは非常に適切に最適化されているはずであり、車輪を再発明しても高速になることはめったにないと常に思っています。しかし、ここで驚きがあります。 I timeit, の 100000 回の呼び出しでwb2LL1(array([[5.,1.5],[5.1,1.6]])[...,newaxis], adata)2.156 秒かかり、wb2LL(array([[5.,1.5],[5.1,1.6]])[...,newaxis], adata)の 100000 回の呼び出しで 5.219 秒かかる場合、組み込みstats.weibull_min.pdf()は 2 倍以上遅くなります。

ソースコードを確認してpython_path/sitepackage/scipy/stat/distributions.pyも、少なくとも私にとっては簡単な答えは得られませんでした。どちらかといえば、それから私はstats.weibull_min.pdf()ほぼwbp().

関連するソースコード: 2999 ～ 3033 行目:

class frechet_r_gen(rv_continuous):
    """A Frechet right (or Weibull minimum) continuous random variable.

    %(before_notes)s

    See Also
    --------
    weibull_min : The same distribution as `frechet_r`.
    frechet_l, weibull_max

    Notes
    -----
    The probability density function for `frechet_r` is::

        frechet_r.pdf(x, c) = c * x**(c-1) * exp(-x**c)

    for ``x > 0``, ``c > 0``.

    %(example)s

    """
    def _pdf(self, x, c):
        return c*pow(x,c-1)*exp(-pow(x,c))
    def _logpdf(self, x, c):
        return log(c) + (c-1)*log(x) - pow(x,c)
    def _cdf(self, x, c):
        return -expm1(-pow(x,c))
    def _ppf(self, q, c):
        return pow(-log1p(-q),1.0/c)
    def _munp(self, n, c):
        return special.gamma(1.0+n*1.0/c)
    def _entropy(self, c):
        return -_EULER / c - log(c) + _EULER + 1
frechet_r = frechet_r_gen(a=0.0, name='frechet_r', shapes='c')
weibull_min = frechet_r_gen(a=0.0, name='weibull_min', shapes='c')

だから、問題は: はstats.weibull_min.pdf()本当に遅いですか? もしそうなら、どうしてですか？

python - `scipy.stat.distributions` の組み込みの確率密度関数は、ユーザーが提供するものより遅いですか?

1 に答える 1

Related

Reference