c++ - Having a composite key for a hash map in C++

Question

I have a data structure that holds

<Book title>, <Author>, and <rate>

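Book titles and authors can each be duplicated, so I would like to build a composite key. (Let's say I cannot create an additional unique key such as an ID.)

Since the data is quite large, I am using GCC's unordered_map for speed, and built my structure like this: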

typedef pair<string, string> keys_t;
typedef unordered_map<keys_t, double> map_t;

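Everything works fine in general, but a problem arises when I want to look up by one specific part of the key.

For example, suppose I want to find the best-rated book among those titled "Math", or the average rating of Tolstoy's books.
This becomes very cumbersome, because I cannot look up by only one element of the key pair.

I came across boost::multi_index, but I am having trouble understanding the documentation. Does anyone have ideas or guidelines for this?

A solution for building multiple indices, a concise multi_index example, or any other approach would be welcome.

Thank you.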

score 3 · Accepted Answer

I figured out how to use boost::multi_index by referring to this code: Boost multi_index composite keys using MEM_FUN

Here is my code.

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/mem_fun.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/member.hpp>
#include <functional>
#include <iostream>
#include <string>

using namespace boost::multi_index;
using namespace std;

class Book {
public:
    Book(const string &lang1, const string &lang2, const double &value) : m_lang1(lang1) , m_lang2(lang2) , m_value(value) {}

    friend std::ostream& operator << (ostream& os,const Book& n)    {
        os << n.m_lang1 << " " << n.m_lang2 << " " << n.m_value << endl;
        return os;
    }

    const string &lang1() const { return m_lang1; }
    const string &lang2() const { return m_lang2; }
    const double &value() const { return m_value; }
private:
    string m_lang1, m_lang2;
    double m_value;
};

// These will be Tag names
struct lang1 {};
struct lang2 {};
struct value {};

typedef multi_index_container <
    Book, 
    indexed_by<
        ordered_non_unique<tag<lang1>, BOOST_MULTI_INDEX_CONST_MEM_FUN( Book, const string &, lang1)
        >,
        ordered_non_unique<tag<lang2>, BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang2)
        >,
        ordered_non_unique<tag<value>, BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const double &, value), greater<double>
        >,
        ordered_unique<
            // make as a composite key with Title and Author
            composite_key<
                Book,
                BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang1),
                BOOST_MULTI_INDEX_CONST_MEM_FUN(Book, const string &, lang2)
            >
        >
    >
> Book_set;

// Indices for iterators
typedef Book_set::index<lang1>::type Book_set_by_lang1;
typedef Book_set::index<lang2>::type Book_set_by_lang2;
typedef Book_set::index<value>::type Book_set_by_value;

int main() {

    Book_set books;
    books.insert(Book("Math", "shawn", 4.3));
    books.insert(Book("Math", "john", 4.2));
    books.insert(Book("Math2", "abel", 3.8));
    books.insert(Book("Novel1", "Tolstoy", 5.0));
    books.insert(Book("Novel1", "Tolstoy", 4.8)); // This will not be inserted(duplicated)
    books.insert(Book("Novel2", "Tolstoy", 4.2));
    books.insert(Book("Novel3", "Tolstoy", 4.4));
    books.insert(Book("Math", "abel", 2.5));
    books.insert(Book("Math2", "Tolstoy", 3.0));

    cout << "SORTED BY TITLE" << endl;
    for (Book_set_by_lang1::iterator itf = books.get<lang1>().begin(); itf != books.get<lang1>().end(); ++itf)
        cout << *itf;

    cout << endl<<"SORTED BY AUTHOR" << endl;
    for (Book_set_by_lang2::iterator itm = books.get<lang2>().begin(); itm != books.get<lang2>().end(); ++itm)
        cout << *itm;

    cout << endl<<"SORTED BY RATING" << endl;
    for (Book_set_by_value::iterator itl = books.get<value>().begin(); itl != books.get<value>().end(); ++itl)
        cout << *itl;

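    // Want to see Tolstoy's books? (this index orders by author only, so the ratings are not sorted)
    cout << endl;
    // equal_range yields exactly the matching elements and never dereferences end()
    pair<Book_set_by_lang2::iterator, Book_set_by_lang2::iterator> tolstoy = books.get<lang2>().equal_range(string("Tolstoy"));
    for (Book_set_by_lang2::iterator it = tolstoy.first; it != tolstoy.second; ++it)
        cout << *it;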

    return 0;
}

Thanks to everyone who commented!

score 1 · Accepted Answer

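There is an article on the same topic: http://marknelson.us/2011/09/03/hash-functions-for-c-unordered-containers/

The author, Mark Nelson, was trying to do something similar: "use a simple class or structure to hold a person's name". Basically, he uses a pair as the key of his unordered_map (just as you do).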

typedef pair<string,string> Name;

int main(int argc, char* argv[])
{
    unordered_map<Name,int> ids;
    ids[Name("Mark", "Nelson")] = 40561;
    ids[Name("Andrew","Binstock")] = 40562;
    for ( auto ii = ids.begin() ; ii != ids.end() ; ii++ )
        cout << ii->first.first
        << " "
        << ii->first.second
        << " : "
        << ii->second
        << endl;
    return 0;
}

He found that unordered_map does not know how to create a hash for a std::pair key type. He then shows four ways to create a hash function for use with unordered_map.
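As a rough illustration of one such approach, the sketch below passes a hash functor as the third template argument of unordered_map. The names PairHash and rating_or are hypothetical and not taken from the article, and the mixing step (in the style of boost::hash_combine) is just one reasonable choice, assuming a C++11 compiler.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical functor name; hashes a (title, author) pair.
struct PairHash {
    std::size_t operator()(const std::pair<std::string, std::string>& p) const {
        std::size_t h1 = std::hash<std::string>()(p.first);
        std::size_t h2 = std::hash<std::string>()(p.second);
        // Mix the two hashes asymmetrically so that ("a", "b") and
        // ("b", "a") are unlikely to land on the same value.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

typedef std::pair<std::string, std::string> keys_t;
typedef std::unordered_map<keys_t, double, PairHash> map_t;

// Hypothetical helper: returns the rating stored under (title, author),
// or fallback when the full key is not present.
double rating_or(const map_t& m, const std::string& title,
                 const std::string& author, double fallback) {
    map_t::const_iterator it = m.find(keys_t(title, author));
    return it == m.end() ? fallback : it->second;
}
```

Note that this only makes lookups by the full (title, author) key work; it does not by itself allow lookups by title alone or by author alone.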

score 1 · Accepted Answer

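What I did in a similar case was to use a single container to hold the objects, plus a separate std::multiset<ObjectType const*, CmpType> for each possible index. When inserting, I would do a push_back, recover the address from back(), and insert it into each of the sets. (std::unordered_set and std::unordered_multiset would be better in your case: in my case, not only was order important, but I also did not have access to a recent compiler with unordered_set.)

Note that this assumes the objects are immutable once they are in the container. If you need to modify one of them, you should probably remove it from all of the sets, make the modification, and then reinsert it.

This also assumes that the main container type never invalidates pointers and references to the objects. In my case, I knew the maximum size up front, so I could call reserve() and use std::vector. Failing that, you could use std::deque, or simply use a std::map on the primary (complete) key.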
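A minimal sketch of this layout, assuming C++11, might look like the following. The Book struct, the Library class and its member functions are hypothetical names chosen for illustration; std::deque is used as the main container because push_back on a deque does not invalidate references to existing elements.

```cpp
#include <deque>
#include <set>
#include <string>

struct Book {
    std::string title;
    std::string author;
    double rate;
};

// Comparators on pointers: each index orders the same objects differently.
struct ByTitle {
    bool operator()(const Book* a, const Book* b) const { return a->title < b->title; }
};
struct ByAuthor {
    bool operator()(const Book* a, const Book* b) const { return a->author < b->author; }
};

class Library {
public:
    Library() = default;
    Library(const Library&) = delete;             // a copy's indices would point into the original
    Library& operator=(const Library&) = delete;

    void add(const Book& b) {
        books_.push_back(b);                      // existing elements stay where they are
        const Book* p = &books_.back();
        by_title_.insert(p);
        by_author_.insert(p);
    }

    // Average rating over all books by one author; 0.0 if there are none.
    double average_rate_by_author(const std::string& author) const {
        Book probe = { "", author, 0.0 };         // temporary object used only as a search key
        auto range = by_author_.equal_range(&probe);
        double sum = 0.0;
        int n = 0;
        for (auto it = range.first; it != range.second; ++it) { sum += (*it)->rate; ++n; }
        return n == 0 ? 0.0 : sum / n;
    }

    // Highest-rated book with exactly this title, or nullptr if none.
    const Book* best_by_title(const std::string& title) const {
        Book probe = { title, "", 0.0 };
        auto range = by_title_.equal_range(&probe);
        const Book* best = nullptr;
        for (auto it = range.first; it != range.second; ++it)
            if (best == nullptr || (*it)->rate > best->rate) best = *it;
        return best;
    }

private:
    std::deque<Book> books_;                      // owns the objects
    std::multiset<const Book*, ByTitle> by_title_;
    std::multiset<const Book*, ByAuthor> by_author_;
};
```

The sketch never modifies or removes a stored Book, in line with the immutability assumption above; supporting updates would mean erasing the pointer from every index first and reinserting it afterwards.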

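Even this requires matching on a complete element of the key. It is not clear from your post whether that is sufficient: "books titled Math" suggests to me that you might need a substring search in the title (and should "Tolstoy" match "Leo Tolstoy"?). If you want to match an arbitrary substring, either your multiset will be very, very large (since you will insert every possible substring as an entry), or you will have to do a linear search. (On a long-running system where the entries do not change, a compromise might be worthwhile: do a linear search the first time a substring is requested, but cache the results in a multiset so that they can be found quickly the next time. People will often use the same substrings, e.g. "math" for any book with "math" in its title.)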

score -1 · Accepted Answer

For infrequent operations, you can simply scan the map and compare each element:

for(auto& p : m)
{
     if(p.second.name==name_to_find)
     {
          //you now have the element
     }
}

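However, this becomes a problem if the map is large, because it is a linear O(n) scan rather than a keyed lookup (average O(1) for unordered_map, O(log n) for std::map), and traversing a node-based map tends to be slow.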
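The snippet above assumes a mapped value with a name member. With the question's pair<string, string> key, the author is instead the second element of the key, so a linear scan could be sketched as follows. The function name average_rate_of_author is hypothetical, and the template accepts any map-like container keyed by (title, author), assuming C++11.

```cpp
#include <map>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> keys_t;   // (title, author)

// Hypothetical helper: visits every entry once (O(n)) and averages the
// ratings whose key names the given author; returns 0.0 if none match.
template <typename Map>
double average_rate_of_author(const Map& m, const std::string& author) {
    double sum = 0.0;
    int count = 0;
    for (const auto& p : m) {
        if (p.first.second == author) {   // p.first is the key, .second its author part
            sum += p.second;              // p.second is the rating
            ++count;
        }
    }
    return count == 0 ? 0.0 : sum / count;
}
```

This works unchanged for a std::map<keys_t, double> or for an unordered_map with a suitable hash for the pair key.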
