c++ - 挿入の順序を追跡する std::map ?

Question

現在std::map<std::string,int> 、整数値を一意の文字列識別子に格納するがあり、文字列を検索しています。挿入順序を追跡しないことを除いて、ほとんど私が望むことを行います。したがって、マップを反復して値を出力すると、文字列に従って並べ替えられます。しかし、（最初の）挿入の順序に従って並べ替えたいと思います。

代わりにa を使用することも考えましたが、文字列を調べて整数値を約 10,000,000 回インクリメントする必要があるため、aが大幅に遅くなるvector<pair<string,int>>かどうかはわかりません。std::vector

使用方法はありますか、それとも私のニーズにより適したstd::map別の容器はありますか?std

私は GCC 3.4 を使用しており、std::map..

score 62 · Accepted Answer

値が 50 個しかない場合は、印刷する前に値std::mapをコピーして、適切なファンクターを使用して並べ替えることができます。std::vectorstd::sort

または、 boost::multi_indexを使用できます。複数のインデックスを使用できます。あなたの場合、次のようになります。

struct value_t {
      string s;
      int    i;
};

struct string_tag {};

typedef multi_index_container<
    value_t,
    indexed_by<
        random_access<>, // this index represents insertion order
        hashed_unique< tag<string_tag>, member<value_t, string, &value_t::s> >
    >
> values_t;

score 33 · Accepted Answer

std::vectorをstd::tr1::unordered_map(ハッシュテーブル)と組み合わせることができます。Boost のドキュメントへのリンクは次のとおりですunordered_map。ベクトルを使用して挿入順序を追跡し、ハッシュテーブルを使用して頻繁に検索を行うことができます。std::map数十万回のルックアップを行っている場合、O(log n) ルックアップとハッシュテーブルの O(1)の違いは重要です。

std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;

// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");

/* Increment things in myTable 100000 times */

// Print the final results.
for (int i = 0; i < insertOrder.size(); ++i)
{
    const std::string &s = insertOrder[i];
    std::cout << s << ' ' << myTable[s] << '\n';
}

score 15 · Accepted Answer

平行を保ちlist<string> insertionOrderます。

印刷するときは、リストを繰り返し、マップを検索します。

each element in insertionOrder  // walks in insertionOrder..
    print map[ element ].second // but lookup is in map

score 4 · Accepted Answer

両方のルックアップ戦略が必要な場合は、最終的に 2 つのコンテナーが必要になります。vector実際の値 ( ints) でa を使用し、そのmap< string, vector< T >::difference_type> 隣に a を配置して、インデックスをベクトルに返すことができます。

これらすべてを完了するには、両方を 1 つのクラスにカプセル化できます。

しかし、ブーストには複数のインデックスを持つコンテナがあると思います。

score 3 · Accepted Answer

あなたが（Boostに頼らずに）欲しいのは、私が「順序付けられたハッシュ」と呼んでいるものです。これは、本質的にハッシュと、文字列または整数キー（または同時に両方）を持つリンクリストのマッシュアップです。順序付けられたハッシュは、ハッシュの絶対的なパフォーマンスで反復中に要素の順序を維持します。

私は比較的新しい C++ スニペットライブラリをまとめました。これは、C++ ライブラリ開発者にとって C++ 言語の穴として私が見ているものを埋めるものです。ここに行きます：

https://github.com/cubiclesoft/cross-platform-cpp

掴む：

templates/detachable_ordered_hash.cpp
templates/detachable_ordered_hash.h
templates/detachable_ordered_hash_util.h

ユーザー制御のデータがハッシュに配置される場合は、次のことも必要になる場合があります。

security/security_csprng.cpp
security/security_csprng.h

それを呼び出します：

#include "templates/detachable_ordered_hash.h"
...
// The 47 is the nearest prime to a power of two
// that is close to your data size.
//
// If your brain hurts, just use the lookup table
// in 'detachable_ordered_hash.cpp'.
//
// If you don't care about some minimal memory thrashing,
// just use a value of 3.  It'll auto-resize itself.
int y;
CubicleSoft::OrderedHash<int> TempHash(47);
// If you need a secure hash (many hashes are vulnerable
// to DoS attacks), pass in two randomly selected 64-bit
// integer keys.  Construct with CSPRNG.
// CubicleSoft::OrderedHash<int> TempHash(47, Key1, Key2);
CubicleSoft::OrderedHashNode<int> *Node;
...
// Push() for string keys takes a pointer to the string,
// its length, and the value to store.  The new node is
// pushed onto the end of the linked list and wherever it
// goes in the hash.
y = 80;
TempHash.Push("key1", 5, y++);
TempHash.Push("key22", 6, y++);
TempHash.Push("key3", 5, y++);
// Adding an integer key into the same hash just for kicks.
TempHash.Push(12345, y++);
...
// Finding a node and modifying its value.
Node = TempHash.Find("key1", 5);
Node->Value = y++;
...
Node = TempHash.FirstList();
while (Node != NULL)
{
  if (Node->GetStrKey())  printf("%s => %d\n", Node->GetStrKey(), Node->Value);
  else  printf("%d => %d\n", (int)Node->GetIntKey(), Node->Value);

  Node = Node->NextList();
}

私は調査段階でこの SO スレッドに遭遇し、大規模なライブラリにドロップする必要なく、OrderedHash のようなものが既に存在するかどうかを確認しました。がっかりしたよ。だから私は自分自身を書きました。そして今、私はそれを共有しました。

score 2 · Accepted Answer

これを実装する別の方法は、のmap代わりにを使用することvectorです。このアプローチを示し、違いについて説明します。

舞台裏で 2 つのマップを持つクラスを作成するだけです。

#include <map>
#include <string>

using namespace std;

class SpecialMap {
  // usual stuff...

 private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> data_;
};

data_その後、適切な順序で反復子を反復子に公開できます。それを行う方法は、を反復処理しinsertion_order_、その反復から取得した各要素についてdata_、insertion_order_

hash_mapを直接反復することを気にしないため、より効率的なを insert_order に使用できますinsertion_order_。

挿入を行うには、次のようなメソッドを使用できます。

void SpecialMap::Insert(const string& key, int value) {
  // This may be an over simplification... You ought to check
  // if you are overwriting a value in data_ so that you can update
  // insertion_order_ accordingly
  insertion_order_[counter_++] = key;
  data_[key] = value;
}

設計を改善し、パフォーマンスを気にする方法はたくさんありますが、これは、この機能を独自に実装するための良い骨組みです。テンプレート化することができ、実際にはペアを値として data_ に格納して、insertion_order_ のエントリを簡単に参照できるようにすることができます。しかし、これらの設計上の問題は演習として残しておきます :-)。

更新: insert_order_ に map と vector を使用する効率について何か言うべきだと思います

データへの直接ルックアップ、どちらの場合も O(1)
ベクトルアプローチでの挿入は O(1) であり、マップアプローチでの挿入は O(logn) です。
削除する項目をスキャンする必要があるため、ベクトルアプローチでの削除は O(n) です。マップアプローチでは、それらは O(logn) です。

おそらく、削除をあまり使用しない場合は、ベクトルアプローチを使用する必要があります。挿入順序ではなく別の順序 (優先度など) をサポートしている場合は、マップアプローチの方が適しています。

score 2 · Accepted Answer

ブーストのマルチインデックスを使用せずに標準テンプレートライブラリのみを必要とするソリューションを次に示し
ますstd::map<std::string,int>;。vector <data>;マップのどこにデータの場所のインデックスをベクターに保存し、ベクターはデータを挿入順に保存します。ここで、データへのアクセスには O(log n) の複雑さがあります。挿入順序でデータを表示するには、O(n) の複雑さがあります。データの挿入の複雑さは O(log n) です。

例えば：

#include<iostream>
#include<map>
#include<vector>

struct data{
int value;
std::string s;
}

typedef std::map<std::string,int> MapIndex;//this map stores the index of data stored 
                                           //in VectorData mapped to a string              
typedef std::vector<data> VectorData;//stores the data in insertion order

void display_data_according_insertion_order(VectorData vectorData){
    for(std::vector<data>::iterator it=vectorData.begin();it!=vectorData.end();it++){
        std::cout<<it->value<<it->s<<std::endl;
    }
}
int lookup_string(std::string s,MapIndex mapIndex){
    std::MapIndex::iterator pt=mapIndex.find(s)
    if (pt!=mapIndex.end())return it->second;
    else return -1;//it signifies that key does not exist in map
}
int insert_value(data d,mapIndex,vectorData){
    if(mapIndex.find(d.s)==mapIndex.end()){
        mapIndex.insert(std::make_pair(d.s,vectorData.size()));//as the data is to be
                                                               //inserted at back 
                                                               //therefore index is
                                                               //size of vector before
                                                               //insertion
        vectorData.push_back(d);
        return 1;
    }
    else return 0;//it signifies that insertion of data is failed due to the presence
                  //string in the map and map stores unique keys
}

score 2 · Accepted Answer

マップではそれを行うことはできませんが、マップとベクターの 2 つの別個の構造を使用してそれらを同期させることができます。つまり、マップから要素を削除し、ベクターから要素を見つけて削除する場合です。または、map<string, pair<int,int>>- を作成し、レコード位置への挿入時にマップの size() を int の値とともに格納し、印刷時に位置メンバーを使用してソートすることもできます。

score 1 · Accepted Answer

// この人のようになるべきです!

// これは、挿入の複雑さが O(logN) であり、削除も O(logN) であることを維持します。

class SpecialMap {
private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> insertion_order_reverse_look_up; // <- for fast delete
  map<string, Data> data_;
};

score 1 · Accepted Answer

これは、ファイサルの回答に多少関連しています。マップとベクターの周りにラッパークラスを作成するだけで、それらを簡単に同期させることができます。適切なカプセル化により、アクセス方法を制御できるため、ベクターまたはマップのどちらのコンテナーを使用するかを制御できます。これにより、ブーストなどの使用が回避されます。

score 1 · Accepted Answer

考慮すべきことの 1 つは、使用しているデータ要素の数が少ないことです。ベクトルのみを使用する方が高速になる可能性があります。単純なベクトルよりも小さいデータセットでルックアップを実行する方がコストが高くなる可能性があるマップのオーバーヘッドがあります。したがって、常にほぼ同じ数の要素を使用することがわかっている場合は、ベンチマークを実行して、マップとベクターのパフォーマンスが実際に考えているとおりかどうかを確認してください。要素が 50 個しかないベクトルでのルックアップは、マップとほぼ同じであることがわかる場合があります。

score 0 · Accepted Answer

広告掲載順を追跡するために別のコンテナやその他のコンテナを使用する必要はありません。std::vector以下に示すように、必要なことを行うことができます。挿入順序を維持したい場合は、次のプログラム (バージョン 1) を使用できます。

バージョン 1 :挿入順序std::map<std::string,int>で使用する一意の文字列をカウントするため

#include <iostream>
#include <map>
#include <sstream>
int findExactMatchIndex(const std::string &totalString, const std::string &toBeSearched)
{
    std::istringstream ss(totalString);
    std::string word;
    std::size_t index = 0;
    while(ss >> word)
    {
        if(word == toBeSearched)
        {
            return index;
        }
        ++index;
    }
    return -1;//return -1 when the string to be searched is not inside the inputString
}
int main() {
    std::string inputString = "this is a string containing my name again and again and again ", word;
   
   //this map maps the std::string to their respective count
    std::map<std::string, int> wordCount;
    
    std::istringstream ss(inputString);
    
    while(ss >> word)
    {
        //std::cout<<"word:"<<word<<std::endl;
    wordCount[word]++;
    }      
  
    std::cout<<"Total unique words are: "<<wordCount.size()<<std::endl;
    
    std::size_t i = 0;
    
    std::istringstream gothroughStream(inputString);
    
    //just go through the inputString(stream) instead of map
    while( gothroughStream >> word)
    {
        int index = findExactMatchIndex(inputString, word);
        
        
        if(index != -1 && (index == i)){
         std::cout << word <<"-" << wordCount.at(word)<<std::endl;
         
        }
        ++i;
    }
   
    return 0;
}

上記のプログラムの出力は次のとおりです。

Total unique words are: 9
this-1
is-1
a-1
string-1
containing-1
my-1
name-1
again-3
and-2

上記のプログラムでは、コンマまたはその他の区切り文字がある場合、それは別の単語としてカウントされることに注意してください。たとえば、this is, my name is文字列があり、文字列is,のカウントが 1 で、文字列のカウントが 1 であるとしisます。つまりis,、とisは異なります。これは、コンピューターが単語の定義を認識していないためです。

ノート

上記のプログラムは、How do i make the char in an array output in order in this nested for loop?に対する私の回答を修正したものです。これは、以下のバージョン 2 として提供されます。

バージョン 2 :挿入順序std::map<char, int>で使用して一意の文字をカウントする場合

#include <iostream>
#include <map>
int main() {
    std::string inputString;
    std::cout<<"Enter a string: ";
    std::getline(std::cin,inputString);
    //this map maps the char to their respective count
    std::map<char, int> charCount;
    
    for(char &c: inputString)
    {
        charCount[c]++;
    }
    
    std::size_t i = 0;
    //just go through the inputString instead of map
    for(char &c: inputString)
    {
        std::size_t index = inputString.find(c);
        if(index != inputString.npos && (index == i)){
         std::cout << c <<"-" << charCount.at(c)<<std::endl;
         
        }
        ++i;
    }
    return 0;
}

どちらの場合/バージョンでも、別のコンテナーまたは他のコンテナーを使用して挿入順序を追跡する必要はありません。std::vector

score -1 · Accepted Answer

-1

boost::multi_indexマップおよびリストインデックスと共に使用します。

于 2013-10-21T09:04:40.320 に答える

c++ - 挿入の順序を追跡する std::map ?

15 に答える 15

ノート

Related

Reference