performance - MongoDB updates make reads extremely slow

Question

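I am relatively new to mongoDB. I set up a sharded mongo cluster with 2 replica sets, each one serving as a shard. -> 4 mongod daemons

The daemons are distributed across 2 Windows servers with 8 GB of RAM each. I have a test collection containing 10 million documents (about 600 bytes per document), and I connect to mongos (primaryPreferred) using the C# driver.

Now, when I run a few thousand single read queries on the shard key, I can see mongo consuming more and more memory until it stops at about 7.2 GB. There are almost no page faults and the queries are very fast. Good! The same holds for more complex queries on different document properties (compound indexes exist for these queries).

But

When I run just a few update queries, memory usage drops significantly: mongo immediately frees 3 GB of RAM, and the read queries that used to be very fast become very slow.

It gets even worse when I fire 500k upserts (Save) in a row. A complex query that took about 2 seconds to run now takes 22 minutes.

I get the same behavior with count queries that use the same query parameters.

Is this normal mongoDB behavior, or is there something I forgot to configure?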

--- Update @hwatkins

MongoDB バージョン: 2.2.2
1 document scanned per upsert
Very high disk activity during the bulk upserts

Explain() of the complex count query before the upserts

Count Explain: { "clusteredType" : "ParallelSort", "shards" : { "set1/xxxx:1234,yyyy:1234" : [{ "cursor" : "BtreeCursor AC", "isMultiKey" : false, "n" : 20799, "nscannedObjects" : 292741, "nscanned" : 292741, "nscannedObjectsAllPlans" : 294290, "nscannedAllPlans" : 294290, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 2, "nChunkSkips" : 0, "millis" : 2382, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] }, "allPlans" : [{ "cursor" : "BtreeCursor AC", "n" : 20795, "nscannedObjects" : 292741, "nscanned" : 292741, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, { "cursor" : "BasicCursor", "n" : 4, "nscannedObjects" : 1549, "nscanned" : 1549, "indexBounds" : { } }], "oldPlan" : { "cursor" : "BtreeCursor AC", "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, "server" : "xxxx:1234" }], "set2/xxxx:56789,yyyy:56789" : [{ "cursor" : "BtreeCursor AC", "isMultiKey" : false, "n" : 7000, "nscannedObjects" : 97692, "nscanned" : 97692, "nscannedObjectsAllPlans" : 98941, "nscannedAllPlans" : 98941, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 729, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] }, "allPlans" : [{ "cursor" : "BtreeCursor AC", "n" : 6996, "nscannedObjects" : 97692, "nscanned" : 97692, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, { "cursor" : "BasicCursor", "n" : 4, "nscannedObjects" : 1249, "nscanned" : 1249, "indexBounds" : { } }], "oldPlan" : { "cursor" : "BtreeCursor AC", "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, "server" : "yyyy:56789" }] }, "cursor" : "BtreeCursor AC", "n" : 27799, "nChunkSkips" : 0, "nYields" : 2, "nscanned" : 390433, "nscannedAllPlans" : 393231, "nscannedObjects" : 390433, "nscannedObjectsAllPlans" : 393231, "millisShardTotal" : 3111, "millisShardAvg" : 1555, "numQueries" : 2, "numShards" : 2, "millis" : 2384 }

Explain() of the same query after the upserts

{ "clusteredType" : "ParallelSort", "shards" : { "set1/xxxx:1234,yyyy:1234" : [{ "cursor" : "BtreeCursor AC", "isMultiKey" : false, "n" : 20799, "nscannedObjects" : 292741, "nscanned" : 292741, "nscannedObjectsAllPlans" : 294290, "nscannedAllPlans" : 294290, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 379, "nChunkSkips" : 0, "millis" : 391470, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] }, "allPlans" : [{ "cursor" : "BtreeCursor AC", "n" : 20795, "nscannedObjects" : 292741, "nscanned" : 292741, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, { "cursor" : "BasicCursor", "n" : 4, "nscannedObjects" : 1549, "nscanned" : 1549, "indexBounds" : { } }], "server" : "xxxx:1234" }], "set2/xxxx:56789,yyyy:56789" : [{ "cursor" : "BtreeCursor AC", "isMultiKey" : false, "n" : 7000, "nscannedObjects" : 97692, "nscanned" : 97692, "nscannedObjectsAllPlans" : 98941, "nscannedAllPlans" : 98941, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 910, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] }, "allPlans" : [{ "cursor" : "BtreeCursor AC", "n" : 6996, "nscannedObjects" : 97692, "nscanned" : 97692, "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, { "cursor" : "BasicCursor", "n" : 4, "nscannedObjects" : 1249, "nscanned" : 1249, "indexBounds" : { } }], "oldPlan" : { "cursor" : "BtreeCursor AC", "indexBounds" : { "f.14.b" : [["A", "A"]], "f.500.b" : [[10, 50]] } }, "server" : "yyyy:56789" }] }, "cursor" : "BtreeCursor AC", "n" : 27799, "nChunkSkips" : 0, "nYields" : 379, "nscanned" : 390433, "nscannedAllPlans" : 393231, "nscannedObjects" : 390433, "nscannedObjectsAllPlans" : 393231, "millisShardTotal" : 392380, "millisShardAvg" : 196190, "numQueries" : 2, "numShards" : 2, "millis" : 391486 }
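Read side by side, the two outputs can be summarized with a small sketch like the one below. It is written in Python purely for illustration; the per-shard figures are copied from the explain output above, while the dictionary layout and the `compare` helper are assumptions of this sketch and not part of any MongoDB API. It suggests that the plan and the amount of scanned data are identical before and after, and that only the time spent on set1 changed.

```python
# Per-shard figures copied from the two explain() outputs above.
before = {
    "set1": {"nscanned": 292741, "nYields": 2, "millis": 2382},
    "set2": {"nscanned": 97692, "nYields": 0, "millis": 729},
}
after = {
    "set1": {"nscanned": 292741, "nYields": 379, "millis": 391470},
    "set2": {"nscanned": 97692, "nYields": 0, "millis": 910},
}


def compare(before, after):
    """Return, per shard, whether scan work changed and how much slower it got."""
    report = {}
    for shard in before:
        b, a = before[shard], after[shard]
        report[shard] = {
            # Same nscanned means the query did the same logical work.
            "same_scan_work": b["nscanned"] == a["nscanned"],
            # Ratio of wall-clock time after vs. before the upserts.
            "slowdown": a["millis"] / b["millis"],
            # Average time per scanned entry after the upserts.
            "ms_per_scanned_after": a["millis"] / a["nscanned"],
        }
    return report


report = compare(before, after)
for shard, row in report.items():
    print(shard, row)
```

With unchanged `nscanned` and a much higher `nYields` on set1, the extra time is consistent with the query waiting on disk rather than choosing a different plan, which matches the high disk activity noted above.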

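By the way: * A single upsert (1 document affected) reduces memory usage by about 600 MB. --> ~4.5 GB of memory usage was only reached again after several queries.

If I take the query above and loop through the result set using a MongoCursor, it takes ages... (the query does get executed, though) :(

Update II @Daniel

Here is a sample document stored in the mongoDB cluster. The shard key is the b property of my doc (it corresponds to a phone number).

Upsert: I look up existing documents by shard key and update some properties in the f array. Then I call Save with the mongoDB driver on all of these documents, one by one (something like 500k times).

There is an index: { "f.14.b" : 1, "f.500.b" : 1 }. This index is used for the complex queries. As mentioned above, these queries are fast before the bulk update and very slow afterward.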

{
  "_id" : ObjectId("51248d6xxxxxxxxxxxxx"),
  "b" : "33600000000",
  "f" : {
    "500" : {
      "a" : ISODate("2013-02-20T08:45:38.075Z"),
      "b" : 91
    },
    "14" : {
      "a" : ISODate("2013-02-20T08:45:38.075Z"),
      "b" : "A"
    },
    "1501" : {
      "a" : ISODate("2013-02-20T08:45:38.141Z"),
      "b" : ["X", "Y", "Z"]
    },
    "2000" : {
      "a" : ISODate("2013-02-20T08:45:38.141Z"),
      "b" : false
    }
  }
}

Thanks a lot, ブルーム

score 0 · Accepted Answer

Which version of mongodb are you using?
Can you run .explain() on the upsert to see how many documents it is scanning?
What does the disk IO look like during the upserts?

score 0 · Accepted Answer

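This is interesting. First, your data does not appear to be evenly distributed. Your explain shows nscanned: 292741 on the first set and nscanned: 97692 on the second. That is a pretty big difference. It also shows nYields: 379 on the first set versus nYields: 0 on the second. That means you are reading from the sets unevenly and probably writing to them unevenly as well. If you pick a shard key with a more even distribution, you will get more out of your cluster.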
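To put a number on that imbalance, here is a minimal sketch in Python using only the `nscanned` values from the explain output. Note that these figures describe the index range touched by this one query, so treating them as a proxy for the overall data distribution is an assumption.

```python
# nscanned per replica set, copied from the explain() output in the question.
nscanned = {"set1": 292741, "set2": 97692}

total = sum(nscanned.values())

# Fraction of the scanned index entries that each shard had to handle.
shares = {shard: count / total for shard, count in nscanned.items()}

for shard, share in shares.items():
    print(f"{shard}: {share:.1%} of {total} scanned entries")
```

The total matches the top-level `nscanned` of 390433 in the explain output, and set1 handles roughly three quarters of the work for this query.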

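As for why this happens specifically with upserts: are you adding more data to existing documents? If so, you are probably a victim of document moves. Do you see queries with moved: 1 in the mongodb log? That would mean the slow queries in the log moved documents on disk, which causes a lot of churn in the indexes on arrays/subdocuments. I believe that when a document moves, MongoDB essentially has to rebuild the index entries for the whole document, so every index on subdocuments/arrays needs heavy updating.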
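A quick way to check for this could be to filter the log for that marker. The sketch below is in Python and makes assumptions: the pattern follows the "moved: 1" wording used here, the exact spelling in a real log may differ between MongoDB versions, and `moved_lines` is a hypothetical helper name.

```python
import re

# Matches "moved:1" and "moved: 1" (and a prefixed variant such as "nmoved:1").
# The exact spelling in real logs is an assumption based on the wording above.
MOVED = re.compile(r"moved:\s*1\b")


def moved_lines(lines):
    """Return the log lines that report a document move."""
    return [line.rstrip("\n") for line in lines if MOVED.search(line)]
```

It accepts any iterable of strings, so an open log file can be passed directly, for example `moved_lines(open(path))`, where `path` is wherever your mongod writes its log.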

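A workaround for document moves is to pre-allocate extra data when you create the document and then immediately remove it from the document. Mongo allocates every document on disk with a fixed amount of space plus a padding factor. If a document grows beyond that space, it has to be moved to a larger region on disk. If you create documents that already contain extra data and then remove it, you end up with plenty of extra padding on disk to accommodate document growth. This can certainly waste space, but it can save a lot of performance.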
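As a rough sketch of that idea, assuming the on-disk behavior described above, the two steps could be prepared like this in Python. The field name `_pad`, the padding size, and both helper names are hypothetical; only the `$unset` update operator is actual MongoDB syntax.

```python
def padded_insert(doc, pad_bytes, pad_field="_pad"):
    """Return a copy of doc with a throwaway string field of pad_bytes characters."""
    if pad_field in doc:
        raise ValueError(f"document already has a {pad_field!r} field")
    padded = dict(doc)  # shallow copy, the caller's document is left untouched
    padded[pad_field] = "x" * pad_bytes
    return padded


def unpad_update(pad_field="_pad"):
    """Return the update spec that removes the padding field again."""
    return {"$unset": {pad_field: ""}}


# Step 1: insert this document instead of the bare one.
first_write = padded_insert({"b": "33600000000", "f": {}}, pad_bytes=512)
# Step 2: immediately apply this update to the same document.
second_write = unpad_update()
```

With a driver, the first dictionary would be passed to the insert call and the second to an update on the same `_id`; how much padding is worth the wasted space depends on how much the documents are expected to grow.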
