c++ - Large timing variation between processes, but small variation within a process for the same task

Question

I am running this code (full code here: http://codepad.org/5OJBLqIA ) to time repeated daxpy function calls, with and without flushing the operands from cache beforehand.

#define KB 1024

int main()
{
    int cache_size = 32*KB;
    double alpha = 42.5;

    int operand_size = cache_size/(sizeof(double)*2);
    double* X = new double[operand_size];
    double* Y = new double[operand_size];


    //95% confidence interval
    double max_risk = 0.05;
    //Interval half width
    double w;
    int n_iterations = 50000;
    students_t dist(n_iterations-1);
    double T = boost::math::quantile(complement(dist,max_risk/2));
    accumulator_set<double, stats<tag::mean,tag::variance> > unflushed_acc;

    for(int i = 0; i < n_iterations; ++i)
    {
        fill(X,operand_size);
        fill(Y,operand_size);
        double seconds = wall_time();
        daxpy(alpha,X,Y,operand_size);
        seconds = wall_time() - seconds;
        unflushed_acc(seconds);
    }

    w = T*sqrt(variance(unflushed_acc))/sqrt(count(unflushed_acc));
    printf("Without flush: time=%g +/- %g ns\n",mean(unflushed_acc)*1e9,w*1e9);

    //Using clflush instruction
    //We need to put the operands back in cache
    accumulator_set<double, stats<tag::mean,tag::variance> > clflush_acc;
    for(int i = 0; i < n_iterations; ++i)
    {
        fill(X,operand_size);
        fill(Y,operand_size);

        flush_array(X,operand_size);
        flush_array(Y,operand_size);
        double seconds = wall_time();
        daxpy(alpha,X,Y,operand_size);
        seconds = wall_time() - seconds;
        clflush_acc(seconds);
    }

    w = T*sqrt(variance(clflush_acc))/sqrt(count(clflush_acc));
    printf("With clflush: time=%g +/- %g ns\n",mean(clflush_acc)*1e9,w*1e9);

    return 0;
}

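The code measures the average time and its uncertainty over the specified number of iterations. Averaging over many iterations minimizes the variance caused by contention for memory access from different cores (discussed in my previous question).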
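For concreteness, the half-width printed after `+/-` is a critical value times the standard error of the mean. A minimal standalone sketch of that calculation, without Boost, might look like the following. The names `Summary` and `summarize` are hypothetical, and the normal quantile 1.96 is an assumption standing in for the Student's t quantile, which is nearly identical at 50000 samples.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Mean and confidence-interval half-width of a set of timing samples.
// (Hypothetical helper type, not part of the original code.)
struct Summary {
    double mean;        // average of the samples
    double half_width;  // half-width of the confidence interval for the mean
};

// z is the critical value. The default 1.96 is the two-sided 95% normal
// quantile, assumed here in place of the Student's t quantile.
Summary summarize(const std::vector<double>& samples, double z = 1.96)
{
    assert(samples.size() >= 2);
    double mean = 0.0;
    double m2 = 0.0;  // running sum of squared deviations from the mean
    std::size_t n = 0;
    for (double x : samples) {
        // Welford's one-pass update of the mean and squared deviations
        ++n;
        const double delta = x - mean;
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);
    }
    // Divide by n; with tens of thousands of samples the choice between
    // n and n - 1 makes no practical difference.
    const double variance = m2 / static_cast<double>(n);
    return {mean, z * std::sqrt(variance / static_cast<double>(n))};
}
```

Applied to the per-iteration times collected in one loop, `summarize` should give approximately the two numbers printed on each output line (in seconds, before scaling by 1e9).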

$ ./variance
Without flush: time=3107.76 +/- 0.268198 ns
With clflush: time=5862.33 +/- 9.84313 ns
$ ./variance
Without flush: time=3105.71 +/- 0.237823 ns
With clflush: time=7802.66 +/- 12.3163 ns

These were run one right after the other. Why do the timings for the flushed case (but not the unflushed case) vary so much between processes, yet so little within a given process?
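To put a number on the discrepancy, one rough approach is to express the gap between the means of two runs in units of their combined half-widths. This is only a sketch; the function name `gap_in_half_widths` is hypothetical, and combining the two half-widths in quadrature assumes the runs are independent.

```cpp
#include <cmath>

// Gap between two reported means, in units of their combined interval
// half-widths (added in quadrature, assuming independent runs).
// Hypothetical helper, not part of the original code.
double gap_in_half_widths(double mean_a, double w_a, double mean_b, double w_b)
{
    return std::fabs(mean_a - mean_b) / std::sqrt(w_a * w_a + w_b * w_b);
}
```

With the output above, `gap_in_half_widths(5862.33, 9.84313, 7802.66, 12.3163)` comes out at roughly 120 for the flushed case, against a little under 6 for the unflushed case (`gap_in_half_widths(3107.76, 0.268198, 3105.71, 0.237823)`), where the absolute gap is only about 2 ns.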

Addendum

The code is running on Mac OS X 10.8 on an Intel Xeon 5650.
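The helpers called in the listing (`fill`, `daxpy`, `flush_array`, `wall_time`) are defined only in the linked full code. For readers who cannot open it, here is a rough sketch of what such helpers might look like on an x86 machine like the one above. The signatures are inferred from the call sites; the bodies, the 64-byte cache line size, and the `HAVE_CLFLUSH` macro are assumptions, not the linked code.

```cpp
#include <chrono>
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_clflush, _mm_mfence (SSE2)
#define HAVE_CLFLUSH 1  // hypothetical macro used only in this sketch
#endif

// Write every element so the array ends up resident in cache.
void fill(double* a, int n)
{
    for (int i = 0; i < n; ++i) a[i] = static_cast<double>(i);
}

// y <- alpha * x + y
void daxpy(double alpha, const double* x, double* y, int n)
{
    for (int i = 0; i < n; ++i) y[i] += alpha * x[i];
}

// Evict every cache line that overlaps the array (assumes 64-byte lines).
void flush_array(const double* a, int n)
{
#ifdef HAVE_CLFLUSH
    const char* p = reinterpret_cast<const char*>(a);
    const std::size_t bytes = static_cast<std::size_t>(n) * sizeof(double);
    for (std::size_t off = 0; off < bytes; off += 64) _mm_clflush(p + off);
    // The array may not start on a line boundary, so flush the last byte too.
    if (bytes > 0) _mm_clflush(p + bytes - 1);
    _mm_mfence();  // make sure the flushes complete before timing starts
#else
    (void)a; (void)n;  // no clflush available: this sketch does nothing
#endif
}

// Seconds from a monotonic clock; only differences are meaningful.
double wall_time()
{
    using clock = std::chrono::steady_clock;
    return std::chrono::duration<double>(clock::now().time_since_epoch()).count();
}
```

A fence after the flush loop is one design choice among several; without it, some of the eviction work could still be in flight when the timed region begins.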
