precision - IEEE-754 double precision and splitting method

Question

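When computing elementary functions, a change of constant is applied, in particular in implementations of exp(x). In all of these implementations, the correction by ln(2) is done in two steps, with ln(2) split into two numbers: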

static const double ln2p1   = 0.693145751953125;
static const double ln2p2   = 1.42860682030941723212E-6;
// then ln(2) = ln2p1 + ln2p2

The computation involving ln(2) is then carried out as follows:

 x -= k * ln2p1;  // k is the integer multiple of ln(2) being removed
 x -= k * ln2p2;
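For concreteness, the two-step subtraction can be sketched as below, next to a one-step variant that uses ln(2) rounded to a single double. This is only an illustration: the helper names `reduce_two_step` and `reduce_one_step` are hypothetical, and choosing k as floor(x / ln(2)) is an assumption taken from the test program later in this post, not from any particular library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Constants quoted in the question: ln(2) = ln2p1 + ln2p2.
static const double ln2p1 = 0.693145751953125;          // high part
static const double ln2p2 = 1.42860682030941723212E-6;  // low part

// Hypothetical helper: r = x - k*ln(2), subtracting the two parts in turn.
// Assumption: k = floor(x / ln(2)), as in the test program in this post.
double reduce_two_step(double x, std::int64_t& k) {
    assert(std::isfinite(x));
    k = static_cast<std::int64_t>(std::floor(x / std::log(2.0)));
    const double kd = static_cast<double>(k);
    double r = x;
    r -= kd * ln2p1;  // large correction first
    r -= kd * ln2p2;  // small correction second
    return r;
}

// Same reduction with ln(2) held in a single double, for comparison only.
double reduce_one_step(double x, std::int64_t& k) {
    assert(std::isfinite(x));
    k = static_cast<std::int64_t>(std::floor(x / std::log(2.0)));
    return x - static_cast<double>(k) * std::log(2.0);
}
```

Calling both helpers with the same x (for example 700.0) and printing the results with 16 or more digits shows how far the two variants drift apart in the last digits.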

I know this is done to avoid rounding effects. But why these two numbers in particular? And does anyone know how to obtain them?

Thank you!

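Following the first comment, I am completing this post with more material and a rather strange question. Working with my team, we agreed that the point is to potentially double the precision by splitting ln(2) into two numbers. Two transformations are applied for this: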

1) c_h = floor(2^k ln(2))/2^k
2) c_l = ln(2) - c_h

k sets the precision. In the Cephes library (~1980), k is fixed at 9 for float, 16 for double, and also 16 for long double (I do not know why). So for double, c_h has 16 bits of precision while c_l has 52 bits.
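The two transformations can be sketched as follows. This is a minimal sketch, not the way Cephes generated its tables: `Ln2Split` and `split_ln2` are hypothetical names, and the code assumes that `long double` carries more precision than `double`; where it does not, c_l is only as accurate as the double value of ln(2). With k = 16 the high part comes out as 0.693145751953125. With k = 9 the floor formula gives 354/512 = 0.69140625, whereas the nine-bit constant used in the program in this post, 0.693359375 = 355/512, corresponds to rounding to nearest instead of floor.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the two parts of ln(2).
struct Ln2Split {
    double hi;  // c_h: ln(2) truncated to k fractional bits
    double lo;  // c_l: remainder ln(2) - c_h
};

// Sketch of c_h = floor(2^k ln 2) / 2^k and c_l = ln 2 - c_h.
// Assumption: long double is wider than double, so that c_l keeps
// meaningful digits beyond those already stored in c_h.
Ln2Split split_ln2(int k) {
    assert(k > 0 && k < 53);
    const long double ln2 = std::log(2.0L);
    const long double scaled = std::floor(std::ldexp(ln2, k));  // floor(2^k ln 2)
    const long double c_h = std::ldexp(scaled, -k);             // exact scaling by 2^-k
    const long double c_l = ln2 - c_h;
    return Ln2Split{static_cast<double>(c_h), static_cast<double>(c_l)};
}
```

Printing `split_ln2(16).hi` and `split_ln2(16).lo` with `std::setprecision(17)` allows a direct comparison with the constants ln2p1 and ln2p2 quoted at the top of the post.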

Based on this, I wrote the following program, which also determines c_h with 52 bits of precision.

 #include <iostream>
 #include <math.h>    // floor, log
 #include <stdint.h>  // int64_t
 #include <stdlib.h>  // atof
 #include <iomanip>   // std::setprecision

 enum precision { nine = 9, sixteen = 16, fiftytwo = 52 };

 int64_t k_helper(double x){
     return floor(x/log(2));
 }

 template<class C>
 double z_helper(double x, int64_t k){
     x -= k*C::c_h;
     x -= k*C::c_l;
     return x;
 }

 template<precision p>
 struct coeff{};

 template<>
 struct coeff<nine>{
     constexpr const static double c_h = 0.693359375;
     constexpr const static double c_l = -2.12194440e-4;
 };

 template<>
 struct coeff<sixteen>{
     constexpr const static double c_h = 6.93145751953125E-1;
     constexpr const static double c_l = 1.42860682030941723212E-6;
 };

 template<>
 struct coeff<fiftytwo>{
     constexpr const static double c_h = 0.6931471805599453972490664455108344554901123046875;
     constexpr const static double c_l = -8.78318343240526578874146121703272447458793199905066E-17;
 };


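 int main(int argc, const char * argv[]) {
    // Expect the value of x as the only command-line argument.
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " x\n";
        return 1;
    }

    double x = atof(argv[1]);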
    int64_t k = k_helper(x);

    double z_9  = z_helper<coeff<nine> >(x,k);
    double z_16 = z_helper<coeff<sixteen> >(x,k);
    double z_52 = z_helper<coeff<fiftytwo> >(x,k);


    std::cout << std::setprecision(16) << " 9  bits precisions " << z_9 << "\n"
                                       << " 16 bits precisions " << z_16 << "\n"
                                       << " 52 bits precisions " << z_52 << "\n";



    return 0;

}

Running it for a range of different values gives:

bash-3.2$ g++ -std=c++11 main.cpp  
bash-3.2$ ./a.out 1
9  bits precisions 0.30685281944
16 bits precisions 0.3068528194400547
52 bits precisions 0.3068528194400547
bash-3.2$ ./a.out 2
9  bits precisions 0.61370563888
16 bits precisions 0.6137056388801094
52 bits precisions 0.6137056388801094
bash-3.2$ ./a.out 100
9  bits precisions 0.18680599936
16 bits precisions 0.1868059993678755
52 bits precisions 0.1868059993678755
bash-3.2$ ./a.out 200
9  bits precisions 0.37361199872
16 bits precisions 0.3736119987357509
52 bits precisions 0.3736119987357509
bash-3.2$ ./a.out 300
9  bits precisions 0.56041799808
16 bits precisions 0.5604179981036264
52 bits precisions 0.5604179981036548
bash-3.2$ ./a.out 400
9  bits precisions 0.05407681688
16 bits precisions 0.05407681691155647
52 bits precisions 0.05407681691155469
bash-3.2$ ./a.out 500
9  bits precisions 0.24088281624
16 bits precisions 0.2408828162794319
52 bits precisions 0.2408828162794586
bash-3.2$ ./a.out 600
9  bits precisions 0.4276888156
16 bits precisions 0.4276888156473074
52 bits precisions 0.4276888156473056
bash-3.2$ ./a.out 700
9  bits precisions 0.61449481496
16 bits precisions 0.6144948150151828
52 bits precisions 0.6144948150151526

Note that a difference appears once x exceeds 300. I had a look at the glibc implementation:

http://osxr.org:8080/glibc/source/sysdeps/ieee754/ldbl-128/s_expm1l.c

It currently uses 16-bit precision for c_h (line 84).

Well, I am probably missing something in the IEEE standard; I cannot imagine a precision error in glibc. What do you think?
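One way to probe where the variants start to differ is to test whether the product k*c_h is computed exactly. The sketch below uses `std::fma` to recover the rounding error of a double product; `product_error` is a hypothetical helper, and this is offered only as a diagnostic under the stated assumptions, not as an established explanation. A double holds 53 significant bits, so a c_h with few significant bits leaves room for a multi-bit k, while a c_h that fills most of the significand does not.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Rounding error of the double product k*c.
// fma(k, c, -p) evaluates k*c - p with a single rounding; that difference is
// itself representable (barring underflow), so a result of 0 means p is exact.
double product_error(std::int64_t k, double c) {
    const double kd = static_cast<double>(k);
    assert(static_cast<std::int64_t>(kd) == k);  // k must convert to double exactly
    const double p = kd * c;                     // rounded product
    return std::fma(kd, c, -p);                  // exact residual k*c - p
}
```

For x = 700 the program above uses k = 1009. Evaluating `product_error(1009, c_h)` with the 16-bit c_h and then with the 52-bit c_h shows whether the first subtraction in `z_helper` already carries a rounding error.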

Best,
