java - Need a fast alternative to Java TreeMap that can hold many mappings without slowing down

Question

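I am writing a Java program that uses a TreeMap, but performance degrades once it holds tens of thousands of integer-to-character mappings.

I was wondering whether there is some kind of sorted set implementation that can use int and char primitives and offers something like the headMap and tailMap functions.

I am currently looking at Trove. I also looked into a linked list implementation that uses insertion sort, but it does not include head and tail functions. I would expect a linked list with insertion sort to be slower than a tree. Is that not the case?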

score 2 · Accepted Answer

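If you are looking for a replacement for something like TreeMap<Integer,Character> and your integer keys are dense, an array is the most efficient choice. Since you want to look up a char by an int key, it would be a char[] rather than an int[]. Then I read something about a "genome"?! Assuming you use char to represent Adenin, Guanin, Cytosin, and Thymin (I am no expert in that area), remember that each char takes 16 bits. You could probably get by with something like this: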

...
public static final byte UNDEF = (byte)-1;
public static final byte ADENIN = 0;
public static final byte GUANIN = 1;
public static final byte CYTOSIN = 2;
public static final byte THYMIN = 3;
...
private byte[] genome = new byte[ 26000000 ]; // or whatever size you need
...

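And if this still consumes too much memory, it gets tricky: assuming you do not need the UNDEF value, four values need only 2 bits each, which comes to about 6.5 MB. For that, though, you will have to do some bit fiddling...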
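The bit fiddling mentioned above could look roughly like the following. This is a minimal sketch, not the answerer's code: the class name PackedBases and its methods are made up for illustration, and it assumes that UNDEF is not needed, so every position defaults to code 0 (ADENIN).

```java
/** Stores one of four base codes (0..3) per position, packed four to a byte. */
final class PackedBases {
    public static final byte ADENIN = 0;
    public static final byte GUANIN = 1;
    public static final byte CYTOSIN = 2;
    public static final byte THYMIN = 3;

    private final byte[] data;
    private final int length;

    PackedBases(int length) {
        this.length = length;
        this.data = new byte[(length + 3) / 4]; // four 2-bit codes per byte
    }

    /** Returns the 2-bit code stored at the given position. */
    int get(int pos) {
        checkIndex(pos);
        int shift = (pos & 3) * 2;              // bit offset inside the byte
        return (data[pos >>> 2] >>> shift) & 0b11;
    }

    /** Stores a code in the range 0..3 at the given position. */
    void set(int pos, int code) {
        checkIndex(pos);
        if (code < 0 || code > 3) {
            throw new IllegalArgumentException("code must be 0..3");
        }
        int shift = (pos & 3) * 2;
        int idx = pos >>> 2;
        int cleared = data[idx] & ~(0b11 << shift); // zero the old 2 bits
        data[idx] = (byte) (cleared | (code << shift));
    }

    /** Number of bytes used by the backing array. */
    int sizeInBytes() {
        return data.length;
    }

    private void checkIndex(int pos) {
        if (pos < 0 || pos >= length) {
            throw new IndexOutOfBoundsException("pos: " + pos);
        }
    }
}
```

With 26,000,000 positions the backing array is 6,500,000 bytes, which matches the 6.5 MB estimate.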

score 1 · Accepted Answer

If I have understood the question, you want a data structure that preserves the order of the keys, that is, the position of the char that replaces the one in the reference sequence for an individual.

I am assuming that you process the items by increasing position order.

Since TreeMap is implemented as a red-black tree, its basic operations have logarithmic complexity.

If you just need to iterate the sequence in order, you are taking a serious performance hit on each insert.

If my assumptions are correct, I would say you may use a LinkedHashMap.

As the javadoc explains:

This implementation spares its clients from the unspecified, generally chaotic ordering provided by HashMap (and Hashtable), without incurring the increased cost associated with TreeMap.

This means you can iterate over your elements in the order you inserted them, while the basic operations keep the same complexity as a normal HashMap, with some overhead for maintaining the linked list.

You may picture this as a HashMap threaded with a doubly linked list that connects the keys in the order they were inserted.

Please note that I am not addressing whether your sequence fits in memory. Also, be aware that a LinkedHashMap takes more memory than a plain HashMap.
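A small sketch of that behavior, assuming the mappings are position-to-character pairs as in the question. The class and method names are invented for illustration. Note that iteration follows insertion order, not key order, and that LinkedHashMap has no headMap or tailMap, so this only helps if the items really do arrive in increasing position order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InsertionOrderDemo {
    /** Builds a position -> base map that remembers insertion order. */
    static Map<Integer, Character> build(int[] positions, char[] bases) {
        Map<Integer, Character> map =
                new LinkedHashMap<>(Math.max(16, positions.length * 2));
        for (int i = 0; i < positions.length; i++) {
            map.put(positions[i], bases[i]);
        }
        return map;
    }

    /** Joins the keys in the order the map iterates over them. */
    static String keysInIterationOrder(Map<Integer, Character> map) {
        StringBuilder sb = new StringBuilder();
        for (Integer key : map.keySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(key);
        }
        return sb.toString();
    }
}
```

If the keys are inserted as 30, 10, 20, they come back in exactly that order, which shows that the map does not sort anything by itself.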

score 0 · Accepted Answer

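If you need a faster Map implementation, have you considered HashMap? It still uses objects, but if it is created with sufficient capacity up front (see the third form of the constructor in the previous link), it should be faster than a TreeMap.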
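As a rough illustration of presizing, the sketch below picks an initial capacity from the expected number of mappings so the table does not need to grow while it is being filled. The helper name presized is hypothetical, and the calculation assumes the default load factor of 0.75. Keep in mind that a HashMap is unordered and has no headMap or tailMap.

```java
import java.util.HashMap;
import java.util.Map;

final class PresizedMapDemo {
    /** Creates a HashMap large enough to hold the expected mappings without rehashing. */
    static Map<Integer, Character> presized(int expectedMappings) {
        // The default load factor is 0.75, so the capacity must exceed expected / 0.75.
        int initialCapacity = (int) (expectedMappings / 0.75f) + 1;
        return new HashMap<>(initialCapacity);
    }
}
```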

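Alternatively, if you are only interested in SortedSet-like behavior from your map, you may be able to improve performance by using a TreeSet.

As for Trove, I am not familiar with it. I do think, though, that rather than relying on a third-party library, you can get a significant performance gain from the classes Java provides with just a little extra effort: work out what you actually need from your data structure, and what extra work is being wasted on providing features you do not need.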

score 0 · Accepted Answer

As Steve writes, it may be worth using a profiler to confirm that the TreeMap really is the cause.

Some other options are:

HashMap with a large initialCapacity
If your key set is dense, an int[]. That will be the fastest.
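The dense-array option could be sketched as follows. Because the values in the question are characters, this sketch uses a char[] indexed by key, as the first answer suggests. The class DenseCharMap is made up for illustration, and it assumes that keys are non-negative and that '\0' never occurs as a real value, so it can mark an empty slot.

```java
/** Maps small non-negative int keys to chars by using the key as an array index. */
final class DenseCharMap {
    private static final char ABSENT = '\0'; // assumption: never stored as a real value
    private final char[] values;

    DenseCharMap(int maxKeyExclusive) {
        values = new char[maxKeyExclusive];
    }

    void put(int key, char value) {
        values[key] = value;
    }

    char get(int key) {
        return values[key];
    }

    boolean containsKey(int key) {
        return key >= 0 && key < values.length && values[key] != ABSENT;
    }

    /** Smallest present key >= from, or -1 if there is none (a starting point for a tail scan). */
    int ceilingKey(int from) {
        for (int k = Math.max(from, 0); k < values.length; k++) {
            if (values[k] != ABSENT) {
                return k;
            }
        }
        return -1;
    }
}
```

Keys are implicitly sorted because they are array indices, so a head or tail traversal is a plain loop over an index range.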

score 0 · Accepted Answer

Have you looked at PriorityQueue? It has some useful methods and orders its elements according to a comparator you define.
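A minimal sketch of what that looks like, with a hypothetical helper name. One caveat to keep in mind: a PriorityQueue only guarantees order when elements are removed with poll (its iterator is not sorted), and it offers nothing comparable to headMap or tailMap.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class PriorityQueueDemo {
    /** Returns the keys from largest to smallest by polling a queue with a reversed comparator. */
    static List<Integer> drainDescending(Collection<Integer> keys) {
        PriorityQueue<Integer> queue = new PriorityQueue<>(Comparator.reverseOrder());
        queue.addAll(keys);
        List<Integer> ordered = new ArrayList<>(keys.size());
        while (!queue.isEmpty()) {
            ordered.add(queue.poll()); // poll always removes the current head
        }
        return ordered;
    }
}
```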

score 0 · Accepted Answer

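If you know that it is a performance bottleneck or a memory problem, consider using Trove's TIntCharHashMap. I have used Trove maps in the past to improve performance and reduce memory consumption.

Note that the keys will not be sorted. However, you can get the keys as an int[] very cheaply and then sort that array. So if you only need a sorted traversal occasionally, you can sort on demand.

If you find that ugly (or a drag on performance), you can wrap the TIntCharHashMap and a sorted int[] in your own sorted map. You just have to maintain the invariants yourself.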
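To show the sorted int[] idea without depending on Trove, the sketch below swaps the TIntCharHashMap for two plain parallel arrays, so lookups use binary search instead of hashing. It is an illustration under assumptions, not Trove's API: the class SortedIntCharTable and its methods are invented here, the table is built once and never modified, and keys are assumed to be unique. As with TreeMap, the head range excludes its bound and the tail range includes it.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

/** Immutable int -> char table backed by a sorted key array and a parallel value array. */
final class SortedIntCharTable {
    private final int[] keys;    // ascending, assumed unique
    private final char[] values; // values[i] belongs to keys[i]

    SortedIntCharTable(int[] unsortedKeys, char[] unsortedValues) {
        int n = unsortedKeys.length;
        long[] packed = new long[n];
        for (int i = 0; i < n; i++) {
            // Key in the high bits, 16-bit char in the low bits: sorting the longs sorts by key.
            packed[i] = ((long) unsortedKeys[i] << 16) | unsortedValues[i];
        }
        Arrays.sort(packed);
        keys = new int[n];
        values = new char[n];
        for (int i = 0; i < n; i++) {
            keys[i] = (int) (packed[i] >> 16);
            values[i] = (char) (packed[i] & 0xFFFF);
        }
    }

    char get(int key) {
        int i = Arrays.binarySearch(keys, key);
        if (i < 0) {
            throw new NoSuchElementException("no mapping for " + key);
        }
        return values[i];
    }

    /** Keys strictly less than toKey, in ascending order. */
    int[] headKeys(int toKey) {
        return Arrays.copyOfRange(keys, 0, lowerBound(toKey));
    }

    /** Keys greater than or equal to fromKey, in ascending order. */
    int[] tailKeys(int fromKey) {
        return Arrays.copyOfRange(keys, lowerBound(fromKey), keys.length);
    }

    /** Index of the first key >= target. */
    private int lowerBound(int target) {
        int i = Arrays.binarySearch(keys, target);
        return i >= 0 ? i : -i - 1;
    }
}
```

A mutable version would have to re-sort or shift the arrays on every insert, which is the invariant the answer says you need to maintain yourself.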

It is a bit of a shame that Trove does not directly provide map or set classes that maintain a tree-based ordering, but I am grateful for the tools it does provide.
