numpy - Why do Octave's prctile and numpy's percentile give different results?

Question

I am rewriting a matlab/octave program in numpy and have run into differences in the resulting values. This happens with both the percentile/prctile functions and the standard-deviation functions.

In numpy:

import matplotlib.mlab as ml
import numpy

>>> t = numpy.linspace(0,100, 100)
>>> numpy.percentile(t,95)
95.0
>>> numpy.std(t)
29.157646512850626
>>> ml.prctile(t,95)
95.000000000000014

In Octave:

octave:1> t = linspace(0,100,100)';
octave:2> prctile(t,95)
ans =  95.454545
octave:3> std(t)
ans =  29.304537

The values in the array 't' are the same, but the results differ by more than I would expect.

help(numpy.std) specifically states that the algorithm is:

std = sqrt(mean(abs(x - x.mean())**2))

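So I implemented that in Octave and got exactly the answer numpy gives. So the standard-deviation functions do seem to differ.
But why, and how? And which one is correct (if there is such a thing)?

And the same question for percentile/prctile?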

For reference, I'm on Linux aptosid, so...

GNU Octave, version 3.6.2

numpy, version '1.6.2rc1'

score 1 · Accepted Answer

Octave appears to assume ddof=1, at least by default, while numpy uses ddof=0 by default.

>>> numpy.std(t, ddof=0)
29.157646512850633
>>> numpy.std(t, ddof=1)
29.304537349375785
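
To make the role of ddof concrete, here is a minimal sketch that spells out the formula from help(numpy.std) with an adjustable divisor. The helper name std_with_ddof is made up for illustration; it is not part of numpy or Octave.

```python
import numpy as np

def std_with_ddof(x, ddof):
    """Standard deviation with divisor (N - ddof); hypothetical helper."""
    x = np.asarray(x, dtype=float)
    # Same expression as in help(numpy.std), but summed explicitly so the
    # divisor can be changed from N to N - ddof.
    squared_dev = np.abs(x - x.mean()) ** 2
    return float(np.sqrt(squared_dev.sum() / (x.size - ddof)))

t = np.linspace(0, 100, 100)
population_std = std_with_ddof(t, 0)  # divides by N, numpy's default
sample_std = std_with_ddof(t, 1)      # divides by N - 1, matches the Octave output above
print(population_std, sample_std)
```

Neither value is wrong: dividing by N describes the data as a complete population, while dividing by N - 1 is the usual correction when the data are a sample. As far as I know, Octave's std(t, 1) switches to the divide-by-N form, which would match numpy's default.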

score 1 · Accepted Answer

NumPy simply uses a different algorithm when the percentile lies between two data points. Octave, Matlab and R always center it exactly between two points when needed (I believe); numpy does a bit more than that. If you check http://en.wikipedia.org/wiki/Percentile you will see there are a couple of ways to calculate percentiles.
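
As a rough illustration of how two interpolation conventions can disagree on the same data, the sketch below reproduces both numbers from the question. It assumes that Octave's default places the k-th sorted sample at the percentile 100*(k - 0.5)/N and interpolates linearly between those positions; that is my reading of its behavior, not something stated in this thread. The helper name prctile_midpoint is made up for the example.

```python
import numpy as np

def prctile_midpoint(x, q):
    """Percentile with sorted samples placed at 100*(k - 0.5)/N (assumed rule)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Percentile position assigned to each sorted sample, k = 1..N.
    positions = 100.0 * (np.arange(1, n + 1) - 0.5) / n
    # Linear interpolation between positions; np.interp clamps to the
    # minimum/maximum for q outside the covered range.
    return float(np.interp(q, positions, x))

t = np.linspace(0, 100, 100)
numpy_default = float(np.percentile(t, 95))  # numpy's default linear rule
octave_like = prctile_midpoint(t, 95)        # assumed Octave-style rule
print(numpy_default, octave_like)            # roughly 95.0 and 95.4545
```

With this rule the 95th percentile falls halfway between the 95th and 96th sorted values, which gives the 95.4545 that Octave printed. Newer NumPy releases (1.22 and later) also accept a method argument on numpy.percentile, and method="hazen" should correspond to the same rule.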
