c++ - OpenMP(C/C++): Efficient way of sharing an unordered_map> and a vector between threads

Question

I have a for loop that I would like to make parallel, however the threads must share an unordered_map and a vector.

Because the for loop is somewhat big I will post here a concise overview of it so that I can make my main problem clear. Please read the comments.

   unordered_map<string, vector<int>> sharedUM;

   /*
      here I call a function that updates the unordered_map with some
      initial data, however the unordered_map will need to be updated by
      the threads inside the for loop
   */

   vector<int> sharedVector;
  /* 
     the shared vector initially is empty, the threads will 
    fill it with integers, the order of these integers should be in ascending
    order, however I can simply sort the array after all the 
    threads finish executing so I guess we can assume that the order 
    does not matter
  */

   #pragma omp parallel for
   for(int i=0; i<N; i++){

      key = generate_a_key_value_according_to_an_algorithm();
      std::unordered_map<string, vector<int>::iterator it = sharedUM.find(key);

      /*
       according to the data inside it->second(the value), 
       the thread makes some conclusions which then
       uses in order to figure out whether 
       it should run a high complexity algorithm
       or not.
      */
       bool conclusion = make_conclusion();

       if(conclusion == true){

           results = run_expensive_algorithm();

          /*
             According to the results, 
             the thread updates some values of
             the key that it previously searched for inside the unordered_map
             this update may help other threads avoid running 
             the expensive algorithm
          */

       }

       sharedVector.push_back(i);

   }

Initially I left the code as it is, so I just used that #pragma over the for loop, however I got a few problems regarding the update of the sharedVector. So I decided to use simple locks in order to force a thread acquire the lock before writing to the vector. So in my implementation I had something like this:

      omp_lock_t sharedVectorLock;
      omp_init_lock(&sharedVectorLock);
      ...
      for(...)
      ...
       omp_set_lock(&sharedVectorLock);
       sharedVector.push_back(i);
       omp_unset_lock(&sharedVectorLock);
      ...
      omp_destroy_lock(&sharedVectorLock);

I had run my application many times and everything seemed to be working great, and that's until I decided to rerun it automatically too many times until I got wrong results. Because I'm very new to the world of OpenMP and the threads in general, I wasn't aware of the fact that we should lock all the readers when a writer is updating some shared data. As you can see here in my application the threads always read some data from the unordered_map in order make some conclusions and learn things about the key that was assigned to them. What happens though if two threads have to work with the same key, and while some other thread is trying to read the values of this key, another one has reached the point of updating those values? I believe that's where my problem occurs.

However my main problem right now is that I'm not sure what would be the best way to avoid such things from happening. It's like my system works for 99% of the time, but that 1% ruins everything because two threads are rarely assigned with the same key which in turn is because my unordered_map is usually big.

Would locking the unordered_map do my job? Most likely, but that wouldn't be efficient because a thread A that wants to work with the key x would have to wait for a thread B that is already working with the key y where y can be different than x to finish.

So my main question is, how should I approach this problem? How can I lock the unordered_map if and only if two threads are working with the same key?

Thank you in advance

score 1 · Accepted Answer

1ロックとミューテックスの使用について。並列ブロックの外側( の前) でロック変数を宣言して初期化し、#pragma omp parallelそれらを並列ブロック内で使用する必要があります。(1) ロックを取得します (別のスレッドがロックしている場合、これはブロックされる可能性があります)。 (3) ロックを解除する。最後に、並列ブロックを抜けた後に破棄します。並列ブロック内で宣言されたロックはスレッドに対してローカルであるため、同期を提供できません。これはあなたの問題を説明するかもしれません。

複雑な C++ コンテナーへの書き込みに関する2 。OpenMP は、元々単純な FORTRAN do ループ (整数制御変数を使用した C/C++ for ループに似ています) 用に設計されました。より複雑なものはすべて頭痛の種になります。安全のために、C++ コンテナーに対する非定数操作は、ロック (同じコンテナーに対するそのような操作には同じロックを使用する) または omp クリティカル領域 (そのような操作には同じ名前を使用する)内で実行する必要があります。同じ容器）。これには、単純な読み取り以外のものが含まpop()れます。push()これは、そのような一定でないコンテナ操作がごくわずかな時間しかかからない場合にのみ、効率的であり続けることができます。

3もし私があなただったら、openMP を気にしないでしょう (使ったことはありますが、今は後悔しています)。C++ では、TBB を使用できます。TBB には、スレッドセーフだがロックフリーのコンテナーもいくつか付属しています。また、スレッドではなく、再帰的に実行されるタスク (親タスクが子タスクを生成するなど) の観点から考えることができますが、TBB には、並列 for ループなどの単純な実装がいくつかあります。

score 0 · Accepted Answer

別のアプローチは、TBB のconcurrent_unordered_mapを使用することです。

TBB の残りの並列処理サポートを使用する必要はありません (ただし、C++ でゼロから始める場合は、OpenMP よりも「c++ っぽい」ことは確かです)。

score -1 · Accepted Answer

これが役立つかもしれません：

    vector<bool> sv(N);

交換

    sharedVector.push_back(i);

に

    sv[i]=true;

これにより、ロックを回避でき（非常に時間がかかります）、sharedVector を簡単にソートできます。

    for(int i=0; i<N;i++){
        if(sv[i])sharedVector.push_back(i);
    }

c++ - OpenMP(C/C++): Efficient way of sharing an unordered_map> and a vector between threads

3 に答える 3

Related

Reference