c++ - C++ で Hashmap を実装する :: テンプレート化されたデータ型のハッシュ関数

Question

最近、STL の unordered_map を使用していますが、うまく機能しているように見えますが、データ型がテンプレートパラメーターとして指定されている場合、ハッシュ関数がどのように機能するかはよくわかりません。このデータ構造をより完全に理解するために、私は独自の小さな Hashmap クラスを C++ で実装しました。

ハッシュマップインターフェイス:

#ifndef _HASHMAP_H_
#define _HASHMAP_H_

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <iostream>
#include <vector.h>


//Beginning of Hashmap class definition

template <class Key, class Value>
class Hashmap{
private:

int mappedElementCount;



public:
explicit Hashmap();
virtual ~Hashmap();


virtual void test();

virtual int hash(Key*);

int* getSize();

void putKVPair(Key*,Value*);

void clearMap();


//When we use these methods, we'll want a linear vector of keys and values to 
    //iterate over, so vector is good here
std::vector<Key>* getKeys();
std::vector<Value>* getValues();

}; //end hashmap class definition
#endif /*_HASHMAP_H_*/

ハッシュマップの実装:

#include "Hashmap.h"

template<class Key,class Value> Hashmap<Key,Value>::Hashmap(){
mappedElementCount = 0;
}
template<class Key,class Value> Hashmap<Key,Value>::~Hashmap(){
printf("\nDestroying the base Hashmap object!\n");
}

template<class Key,class Value> void Hashmap<Key,Value>::test(){
printf("The size of our Key is %i and the size of our Value is
    %i\n",sizeof(Key),sizeof(Value));
}


template<class Key,class Value> int Hashmap<Key,Value>::hash(Key* k_ptr){

    unsigned int hashval;

    /* we start our hash out at 0 */
    hashval = 0;

        //TODO: How do we generate a hash signature when we don't know what data type 
        //we're going to be working with?

    return hashval % mappedElementCount;

}

template<class Key,class Value> std::vector<Key>* Hashmap<Key,Value>::getKeys(){
//TODO: prepare a vector initialized with all Key objects and return it here
return keys;    
}

template<class Key,class Value> std::vector<Value>* Hashmap<Key,Value>::getValues(){
//TODO: prepare a vector initialized with all Value objects and return it here
return values;  
}

template<class Key,class Value> int* Hashmap<Key,Value>::getSize(){
return &mappedElementCount;
}

template<class Key,class Value> void Hashmap<Key,Value>::putKVPair(Key* k, Value* v){
    //TODO: implement hashing of the key object k to determine
    //the address of the value object v

    //first step, generate a hash from our key
    int tempHash = hash(k);

       //TODO: store the Value at an address given by or influenced by tempHash

    //If all was successfully completed, increment the mapped records counter
    mappedElementCount++;
}



template<class Key,class Value> void Hashmap<Key,Value>::clearMap(){
//TODO: implement a cascading chain of deallocation of stored objects within the 
    //hashmap
//MAYBE-- only if we create new objects rather than just mapping reference 
    //associations,
//which is really the goal here...  In the latter case, just empty the Hashmap 
    //itself
}

この問題に対処するために考えられる OOP メソッドの 1 つは、Hashmap を基本クラスとして使用し、次の Stringmap などの既知の Key データ型を持つ派生クラスを提供することです。

文字列マップインターフェイス:

#ifndef _STRINGMAP_H_
#define _STRINGMAP_H_

#include "Hashmap.h"

template <class Value>
class Stringmap:public Hashmap<std::string,Value>{
private:

public:
//Con/de 'structors
explicit Stringmap();
~Stringmap();

//Here we know our Key will be of type std::string
//so we can generate our hash sig by char values
    //Override hash from the base class
int hash(std::string*);

//override test from base class
void test();


};
#endif /*_STRINGMAP_H_ def*/

文字列マップの実装:

#include "Stringmap.h"

template<class Value> Stringmap<Value>::Stringmap():Hashmap<std::string,Value>(){

}
template<class Value> Stringmap<Value>::~Stringmap(){
printf("\nDestroying the derived stringmap object!\n");
}

template<class Value> void Stringmap<Value>::test(){
printf("The size of our Value is %i\n",sizeof values[0]);
}

template<class Value> int Stringmap<Value>::hash(std::string* str_ptr){

    unsigned int hashval;

    /* we start our hash out at 0 */
    hashval = 0;


    /* for each character, we multiply the old hash by 31 and add the current
     * character.  Remember that shifting a number left is equivalent to
     * multiplying it by 2 raised to the number of places shifted.  So we
     * are in effect multiplying hashval by 32 and then subtracting hashval.
     * Why do we do this?  Because shifting and subtraction are much more
     * efficient operations than multiplication.
     */
    for(int i=0;i<str_ptr->length();i++) {
        hashval = (*(str_ptr))[i] + ((hashval << 5) - hashval);
    }

    /* we then return the hash value mod the hashmap size so that it will
     * fit into the necessary range
     */
    return hashval % (*(Hashmap<std::string,Value>::getSize()));

}

質問は次のとおりです。ハッシュされるデータ型が現在不明な場合にハッシュ署名を作成することは可能ですか? もしそうなら、どのように？std::hash のドキュメントを見ると、C++ 標準では単純に各プリミティブデータ型と T* (任意の型 T) に対してハッシュ関数が定義されているように見えます...欠けているのは、指定されたデータに対してこのハッシュがどのように実装されているかです。プリミティブデータ型と、それがジェネリック T* でどのように実装されているかについて、さらに詳しく説明します。hash(Key) を呼び出して最善を尽くすことができると思いますが、舞台裏で何が起こっているのかを理解するのは良いことです。

ありがとう、CCJ

score 5 · Accepted Answer

std::unorderd_map2つの明示的なテンプレートパラメーター（、、KeyおよびValue）を取り、多数の非表示のテンプレートパラメーターもあります。これらのパラメーターのハッシュ関数はデフォルトで。になっていstd::hash<Key>ます。

このSTLハッシュ関数は、をstd::hash<Key>取り、Keyを返しますstd::size_t。すでにすべての整数型とに特化していますstd::string。このリファレンスサイトから

ハッシュテンプレートは、ハッシュ関数を実装する関数オブジェクトを定義します。この関数オブジェクトのインスタンスは、次のようなoperator（）を定義します。

Keyタイプの単一のパラメーターを受け入れます。

パラメータのハッシュ値を表すsize_t型の値を返します。

呼び出されたときに例外をスローしません。

等しい2つのパラメーターk1とk2の場合、std :: hash（）（k1）== std :: hash（）（k2）。

等しくない2つの異なるパラメーターk1とk2の場合、std :: hash（）（k1）== std :: hash（）（k2）の確率は非常に小さく、1.0 / std :: neuro_limits::maxに近づくはずです。（）。

ハッシュテンプレートは、CopyConstructibleとDestructibleの両方です。順序付けされていない連想コンテナstd::unordered_set、std :: unordered_multiset、std :: unordered_map、std :: unordered_multimapは、デフォルトのハッシュ関数としてテンプレートstd::hashの特殊化を使用します。

参照はこの引用で終わります：

** 実際のハッシュ関数は実装に依存しており、上記で指定されたものを除く他の品質基準を満たす必要はありません。 ****

したがって、システムの実装を確認することはできますが、他のシステムの実装については何も保証されません。

score 3 · Accepted Answer

さまざまなタイプに特化したstd::hash<T>テンプレートがあり、独自のタイプに特化できます。

デフォルトでは、std::unordered_map<T>ハッシュをに委任するだけですstd::hash<T>（または、テンプレート引数として別のハッシュ関数を指定できます）。

したがってstd::unordered_map、ハッシュの仕組みについて何も知る必要はありません。

std::hash実装方法については、指定されていません。ただし、適切なコンパイラが高品質の実装を提供すると想定するのは合理的だと思います。覚えておくべき1つの落とし穴はstd::hash<char*>、C文字列をハッシュせず、ポインタの値をハッシュするだけであるということです（そこにあります:)）

c++ - C++ で Hashmap を実装する :: テンプレート化されたデータ型のハッシュ関数

2 に答える 2

Related

Reference