c++ - マップに値を割り当てる最も効率的な方法

Question

マップに値を割り当てる最も効率的な方法はどれですか? それとも、それらはすべて同じコードに最適化されていますか (最新のコンパイラのほとんどで)?

   // 1) Assignment using array index notation
   Foo["Bar"] = 12345;

   // 2) Assignment using member function insert() and STL pair
   Foo.insert(std::pair<string,int>("Bar", 12345));

   // 3) Assignment using member function insert() and "value_type()"
   Foo.insert(map<string,int>::value_type("Bar", 12345));

   // 4) Assignment using member function insert() and "make_pair()"
   Foo.insert(std::make_pair("Bar", 12345));

(コンパイラの出力をベンチマークして確認できることはわかっていますが、この質問が今出てきて、手元にあるのは携帯電話だけです...へへ)

score 30 · Accepted Answer

まず、との間には意味的な違いが[]ありinsertます。

[]古い値を置き換えます (存在する場合)
insert新しい値は無視されます (古い値が既に存在する場合)

したがって、この 2 つを比較することは、一般的に無意味です。

インサートのバージョンについて:

std::map<std::string, int>::value_type std::pair<std::string const, int>3と4の間に（重要な）違いはありません
std::make_pair("Bar", 12345)型がコピー時に実行する操作を備えた本格的なクラスであり、単なる文字列リテラル (したがって単なるポインターコピー)でstd::pair<std::string, int>("Bar", 12345)あるためよりも安価です。ただし、最後に...を作成する必要があるため、コンパイラの品質に依存しますstd::string"Bar"std::string

一般的に、私はお勧めします：

[]更新用
insert(std::make_pair())重複を無視するため

std::make_pair短いだけでなく、DRY ガイドラインも尊重します: Don't Repeat Yourself.

ただし、完全を期すために、最速の（そして最も簡単な）ものはemplace（C++ 11が有効）です：

map.emplace("Bar", 12345);

その動作はの動作ですがinsert、新しい要素をその場で構築します。

score 2 · Accepted Answer

std::map::operator[]1)オブジェクトがまだ存在しない場合は最初にデフォルトでオブジェクトを作成し、次にoperator=目的の値を設定するために使用できる参照を返すため、他のメソッドよりもわずかに遅くなる可能性があります。つまり、2 つの操作です。

2-4) は同じ型map::value_typeの typedef であるstd::pairため、同等である必要があり、したがってmake_pair同等でもあります。コンパイラは、これらを同じように扱う必要があります。

また、要素の存在を確認し (たとえば、存在するかどうかに応じて特別なロジックを実行する) map::lower_bound、要素を挿入する必要がある場合は、パフォーマンスをさらに向上させることができます。したがって、全体を再度map::insert検索する必要はありません。map

 // get the iterator to where the key *should* be if it existed:
 std::map::iterator hint = mymap.lower_bound(key);

 if (hint == mymap.end() || mymap.key_comp()(key, hint->first)) {
     // key didn't exist in map
     // special logic A here...

     // insert at the correct location
     mymap.insert(hint, make_pair(key, new_value));
 } else { 
     // key exists in map already
     // special logic B here...

     // just update value
     hint->second = new_value;
 }

score 1 · Accepted Answer

最初の可能性：Foo["Bar"] = 12345;他のオブジェクトとはセマンティクスが多少異なります-存在しない場合は新しいオブジェクトを挿入しますが（他のオブジェクトと同様）、存在しない場合は現在のコンテンツを上書きinsertします（そのキーがすでに存在する場合、他のユーザーは失敗します）。

速度に関しては、他の速度よりも遅くなる可能性があります。新しいオブジェクトを挿入すると、指定されたキーとデフォルトで作成されたvalue_typeのペアが作成され、後で正しいvalue_typeが割り当てられます。他のすべては、正しいキーと値のペアを構築し、そのオブジェクトを挿入します。ただし、公平を期すために、私の経験では、違いが重要になることはめったにありません（古いコンパイラではより重要でしたが、新しいコンパイラではごくわずかです）。

次の2つは互いに同等です。同じタイプに名前を付けるために2つの異なる方法を使用しているだけです。実行時では、それらの間にまったく違いはありません。

4つ目は、理論的には追加レベルの関数呼び出しを伴う可能性のあるテンプレート関数（make_pair）を使用します。ただし、コンパイラが最適化（特にインライン化）をまったく行わないように注意した場合を除いて、これとの実際の違いを見て非常に驚きます。

結論：最初の部分は他の部分よりも少し遅くなることがよくあります（ただし、常にではなく、それほどではありません）。他の3つはほとんどの場合等しくなります（たとえば、通常、妥当なコンパイラーが3つすべてに対して同一のコードを生成することを期待します）。

score 1 · Accepted Answer

すでにいくつかの良い答えがありましたが、簡単なベンチマークを行ったほうがよいと思いました. それぞれ 500 万回実行し、c++11 の chrono を使用して所要時間を測定しました。

コードは次のとおりです。

#include <string>
#include <map>
#include <chrono>
#include <cstdio>

// 5 million
#define times 5000000

int main()
{
    std::map<std::string, int> foo1, foo2, foo3, foo4, foo5;
    std::chrono::steady_clock::time_point timeStart, timeEnd;
    int x = 0;

    // 1) Assignment using array index notation
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo1[std::to_string(x)] = 12345;
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("1) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 2) Assignment using member function insert() and STL pair
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo2.insert(std::pair<std::string, int>(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("2) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 3) Assignment using member function insert() and "value_type()"
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo3.insert(std::map<std::string, int>::value_type(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("3) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 4) Assignment using member function insert() and "make_pair()"
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo4.insert(std::make_pair(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("4) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 5) Matthieu M.'s suggestion of C++11's emplace
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo5.emplace(std::to_string(x), 12345);
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("5) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    return 0;
}

500 万回の反復の出力は次のとおりです。

1) took 23448 milliseconds
2) took 22854 milliseconds
3) took 22372 milliseconds
4) took 22988 milliseconds
5) took 21356 milliseconds

GCC バージョン:

g++ (Built by MinGW-builds project) 4.8.0 20121225 (experimental)

私のマシン：

Intel i5-3570k overclocked at 4.6 GHz

EDIT1: コードを修正し、ベンチマークをやり直しました。

EDIT2: C++ 11 の emplace に対する Matthieu M. の提案を追加しました。彼は正しいです。emplace は最速です

score 0 · Accepted Answer

If there is no object at that key location, then:

std::map::emplace is most efficient. insert is second (but will be extremely close). [] is least efficient.

[], if there is no object there, trivial constructs one. It then calls operator=.

insert does a copy constructor call on the std::pair argument.

However, in the case of maps, map.insert( make_pair( std::move(key), std::move(value) ) ) is going to be close to map.emplace( std::move(key), std::move(value) ).

If there is an object at the key location, then [] will call operator=, while insert/emplace will destroy the old object and create a new one. [] could easily be cheaper in this case.

In the end, it depends on what your operator= vs copy-construct vs trivial-construct vs destructor costs are for your key and value.

The actual work looking things up within the tree structure of the std::map will be so close to identical it isn't funny.

score 0 · Accepted Answer

3 番目が最良の選択 (IMHO) ですが、2、3、4 は同じです。

// 3) Assignment using member function insert() and "value_type()"
Foo.insert(map<string,int>::value_type("Bar", 12345));

3 番目の方法が最適な選択であると考える理由:値を挿入するための操作は 1 つだけです: 挿入するだけで (検索もあります)、値が挿入されたかどうかを確認しsecond、戻り値のメンバーを確認できます。実装は、値を上書きしないことを許可します。

の使用にvalue_typeも利点があります。マップされた型またはキーの型を知る必要がないため、テンプレートプログラミングに役立ちます。

最悪（IMHO）は最初のものです：

// 1) Assignment using array index notation
Foo["Bar"] = 12345;

std::map::operator[]オブジェクトを作成してそれへの参照を返すと、マップされたオブジェクトoperator =が呼び出されます。挿入のために 2 つの操作を行っています。1 つ目は挿入、2 つ目は割り当てです。

また、別の問題があります。値が挿入されたのか、上書きされたのかわかりません。

c++ - マップに値を割り当てる最も効率的な方法

6 に答える 6

Related

Reference