c++ - Problem using multiple cores with pthreads

Question

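I'm developing a ray tracer in C++ using SDL and pthreads. I'm having trouble getting my program to use two cores. The threads work, but they don't drive both cores to 100%. To interface with SDL I write directly to the SDL_Surface.pixels memory, so I assume SDL cannot be locking me.

My thread function looks like this: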

void* renderLines(void* pArg){
    while(true){
        //Synchronize
        pthread_mutex_lock(&frame_mutex);
        pthread_cond_wait(&frame_cond, &frame_mutex);
        pthread_mutex_unlock(&frame_mutex);

        renderLinesArgs* arg = (renderLinesArgs*)pArg;
        for(int y = arg->y1; y < arg->y2; y++){
            for(int x = 0; x < arg->width; x++){
                Color C = arg->scene->renderPixel(x, y);
                putPixel(arg->screen, x, y, C);
            }
        }

        sem_post(&frame_rendered);
    }
}

Note: scene->renderPixel is const, so I assume both threads can read from the same memory. I have two worker threads doing this, and in my main loop I make them work using:

//Signal a new frame
pthread_mutex_lock(&frame_mutex);
pthread_cond_broadcast(&frame_cond);
pthread_mutex_unlock(&frame_mutex);

//Wait for workers to be done
sem_wait(&frame_rendered);
sem_wait(&frame_rendered);

//Unlock SDL surface and flip it...

Note: I've also tried creating and joining the threads instead of synchronizing them. I compile this with "-lpthread -D_POSIX_PTHREAD_SEMANTICS -pthread" and gcc does not complain.

My problem is best illustrated by a graph of the CPU usage during execution: (Source: jopsen.dk)

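As the graph shows, my program only uses one core at a time, switching between the two every once in a while, but it never drives both to 100%. What on earth have I done wrong? I'm not using any mutexes or semaphores in the scene. How can I find the bug?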

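Also, if I put while(true) around scene->renderPixel(), I can push both cores to 100%. So I suspected this was caused by overhead, but given a complex scene I only synchronize every 0.5 seconds (e.g. FPS: 0.5). I realize it might not be easy to tell me what my bug is, but an approach to debugging this would be great too... I haven't played with pthreads before.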

Also, could this be a hardware or kernel issue? My kernel is:

$uname -a
Linux jopsen-laptop 2.6.27-14-generic #1 SMP Fri Mar 13 18:00:20 UTC 2009 i686 GNU/Linux

score 2 · Accepted Answer

This is useless:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

If you want to wait for a new frame, do something like this:

int new_frame = 0;

First thread:

pthread_mutex_lock(&mutex); 
new_frame = 1; 
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);

Other thread:

pthread_mutex_lock(&mutex); 
while(new_frame == 0)
  pthread_cond_wait(&cond, &mutex); 
/* Here new_frame != 0, do things with the frame*/
pthread_mutex_unlock(&mutex);

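pthread_cond_wait() actually releases the mutex and deschedules the thread until the condition is signalled. When the condition is signalled, the thread is woken up and the mutex is re-acquired. All of this happens inside the pthread_cond_wait() function.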
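To make the predicate pattern above concrete for the frame loop in the question, here is a minimal sketch. All names (FrameSync, frame_worker, run_frames) are hypothetical, the rendering is replaced by a counter, and completion is reported through a second condition variable instead of the question's semaphore, which is an assumption made for portability rather than something the original code does.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical shared state; every field except the pthread objects is
// protected by `mutex`.
struct FrameSync {
    pthread_mutex_t mutex;
    pthread_cond_t frame_cond;  // signalled when a new frame starts or on quit
    pthread_cond_t done_cond;   // signalled when the last worker finishes
    int frame = 0;              // frames started so far (the predicate state)
    int pending = 0;            // workers still busy with the current frame
    bool quit = false;
    int rendered = 0;           // stand-in for real rendering output
};

void* frame_worker(void* p) {
    FrameSync* s = static_cast<FrameSync*>(p);
    int seen = 0;  // last frame number this worker has handled
    for (;;) {
        pthread_mutex_lock(&s->mutex);
        // Loop on the predicate: this copes with spurious wakeups and with
        // a broadcast that was sent before this thread started waiting.
        while (s->frame == seen && !s->quit)
            pthread_cond_wait(&s->frame_cond, &s->mutex);
        if (s->frame == seen) {  // woken only because quit was set
            pthread_mutex_unlock(&s->mutex);
            return nullptr;
        }
        seen = s->frame;
        pthread_mutex_unlock(&s->mutex);  // do the heavy work unlocked

        // ... render this worker's lines here ...

        pthread_mutex_lock(&s->mutex);
        ++s->rendered;
        if (--s->pending == 0)
            pthread_cond_signal(&s->done_cond);
        pthread_mutex_unlock(&s->mutex);
    }
}

// Runs `frames` frames on two workers; returns the number of work units done.
int run_frames(int frames) {
    const int kWorkers = 2;
    FrameSync s;
    pthread_mutex_init(&s.mutex, nullptr);
    pthread_cond_init(&s.frame_cond, nullptr);
    pthread_cond_init(&s.done_cond, nullptr);

    pthread_t threads[kWorkers];
    for (int i = 0; i < kWorkers; ++i) {
        int rc = pthread_create(&threads[i], nullptr, frame_worker, &s);
        assert(rc == 0);
        (void)rc;
    }

    for (int f = 0; f < frames; ++f) {
        pthread_mutex_lock(&s.mutex);
        ++s.frame;                 // change the predicate first ...
        s.pending = kWorkers;
        pthread_cond_broadcast(&s.frame_cond);  // ... then wake the workers
        while (s.pending > 0)      // wait for both workers, also with a predicate
            pthread_cond_wait(&s.done_cond, &s.mutex);
        pthread_mutex_unlock(&s.mutex);
    }

    pthread_mutex_lock(&s.mutex);
    s.quit = true;
    pthread_cond_broadcast(&s.frame_cond);
    pthread_mutex_unlock(&s.mutex);

    for (int i = 0; i < kWorkers; ++i)
        pthread_join(threads[i], nullptr);
    pthread_cond_destroy(&s.done_cond);
    pthread_cond_destroy(&s.frame_cond);
    pthread_mutex_destroy(&s.mutex);
    return s.rendered;
}
```

Calling run_frames(5) should have each of the two workers handle every frame exactly once. The point of the frame counter is that a worker that arrives late still sees that the state has changed, so a wakeup cannot be lost the way it can with a bare pthread_cond_wait().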

score 1 · Accepted Answer

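I'd take a wild stab in the dark and say your worker threads are spending a lot of time waiting on the condition variable. To get good CPU performance in this kind of situation, where your code is mostly CPU bound, the usual approach is a task-oriented style of programming: treat the threads as a "pool" and use a queue structure to feed work to them. They should spend very little time pulling work off the queue and most of their time doing the actual work.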
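As a rough illustration of that task-oriented style, the sketch below feeds scanline indices to the workers through a mutex-protected queue. LinePool, pool_worker and render_with_pool are hypothetical names, one scanline per task is an assumed granularity, and the actual rendering is replaced by marking the line as done.

```cpp
#include <pthread.h>
#include <queue>
#include <vector>

// Hypothetical work pool: each task is the index of one scanline.
struct LinePool {
    pthread_mutex_t mutex;   // protects `lines`
    std::queue<int> lines;   // scanlines still waiting to be rendered
    std::vector<int> done;   // done[y] == 1 once line y has been rendered
};

void* pool_worker(void* p) {
    LinePool* pool = static_cast<LinePool*>(p);
    for (;;) {
        // Hold the lock only long enough to take one task off the queue.
        pthread_mutex_lock(&pool->mutex);
        if (pool->lines.empty()) {
            pthread_mutex_unlock(&pool->mutex);
            return nullptr;  // no work left for this frame
        }
        int y = pool->lines.front();
        pool->lines.pop();
        pthread_mutex_unlock(&pool->mutex);

        // Stand-in for rendering line y. Each y is handed to exactly one
        // worker, so writing to distinct elements needs no lock.
        pool->done[y] = 1;
    }
}

// Renders `height` lines on `workers` threads; returns how many lines
// were rendered. Error handling is omitted to keep the sketch short.
int render_with_pool(int height, int workers) {
    LinePool pool;
    pthread_mutex_init(&pool.mutex, nullptr);
    pool.done.resize(height, 0);
    for (int y = 0; y < height; ++y)
        pool.lines.push(y);

    std::vector<pthread_t> threads(workers);
    for (int i = 0; i < workers; ++i)
        pthread_create(&threads[i], nullptr, pool_worker, &pool);
    for (int i = 0; i < workers; ++i)
        pthread_join(threads[i], nullptr);
    pthread_mutex_destroy(&pool.mutex);

    int count = 0;
    for (int y = 0; y < height; ++y)
        count += pool.done[y];
    return count;
}
```

Because a worker that finishes a cheap line immediately takes the next one, neither core sits idle while the other is still busy with an expensive part of the image. This sketch creates and joins the threads for every call; a longer-lived pool would keep the workers alive between frames and wake them with a condition variable.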

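What you have right now is a situation where they probably work for a while and then notify the main thread via the semaphore that they are done. The main thread will not release them again until both threads have finished working on the frame they are currently processing.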

Since you are using C++, have you considered using Boost.Threads? It makes working with multithreaded code much easier, and the API is actually similar to pthreads, but in a "modern C++" kind of way.
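To give a feel for that style without depending on Boost, the sketch below uses the C++11 standard library threading classes, which were modeled on Boost.Thread; this is a deliberate swap, not Boost code. FrameGate and wait_for_frame_demo are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical gate that a worker waits on until a frame is available.
struct FrameGate {
    std::mutex m;
    std::condition_variable cv;
    bool new_frame = false;  // the predicate, protected by `m`
};

// Starts one worker, signals one frame, and returns how many frames
// the worker observed.
int wait_for_frame_demo() {
    FrameGate gate;
    int frames_seen = 0;

    std::thread worker([&] {
        std::unique_lock<std::mutex> lock(gate.m);
        // The predicate overload loops internally, like
        // `while (!new_frame) pthread_cond_wait(...)`.
        gate.cv.wait(lock, [&] { return gate.new_frame; });
        ++frames_seen;
    });

    {
        std::lock_guard<std::mutex> lock(gate.m);  // unlocked at scope exit
        gate.new_frame = true;
    }
    gate.cv.notify_one();

    worker.join();  // join also makes frames_seen safe to read here
    return frames_seen;
}
```

The locks are released automatically when they go out of scope, which removes the manual pthread_mutex_unlock() calls, and the predicate is passed straight to wait() so it cannot be forgotten.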

score 1 · Accepted Answer

I'm no pthreads guru, but it seems to me that the following code is wrong:

pthread_mutex_lock(&frame_mutex);
pthread_cond_wait(&frame_cond, &frame_mutex);
pthread_mutex_unlock(&frame_mutex);

To quote this article:

pthread_cond_wait() blocks the calling thread until the specified condition is signalled. This routine should be called while mutex is locked, and it will automatically release the mutex while it waits. After signal is received and thread is awakened, mutex will be automatically locked for use by the thread. The programmer is then responsible for unlocking mutex when the thread is finished with it.

So it seems to me that you should be releasing the mutex after the block of code following the pthread_cond_wait.
