c# - Reading from a single SQL Server table using multiple threads

Question

What is the best way to read from a single SQL Server table using multiple threads in C#, while making sure that no record is read twice by different threads?

Thanks in advance for your help.

score 3 · Accepted Answer

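Are you trying to read records from the table in parallel to speed up data retrieval, or are you worried about data corruption caused by threads accessing the same data?

Database management systems like MsSQL handle concurrency extremely well, so if you have multiple threads reading the same table, thread safety in that respect is not something your code needs to take care of.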

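If you want to read data in parallel without any overlap, you can run the SQL command with paging and have each thread fetch a different page. For example, you could have 20 threads each read one of 20 different pages at the same time, and be sure that no two of them read the same rows, as long as every thread pages over the same deterministic ordering (for example, ORDER BY a unique key) and the table is not modified while you read. You can then concatenate the data. The larger the page size, the more of a performance boost you get from creating a thread.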

Efficient way to implement paging
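
To make the one-page-per-thread idea concrete, here is a minimal C# sketch, not a definitive implementation. `ParallelPaging`, `ReadAllPages` and the `fetchPage` callback are hypothetical names that do not come from the answer; in real code `fetchPage` would open its own connection and run a paged query, while the demo below uses an in-memory array as a stand-in for the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelPaging
{
    // fetchPage(skip, take) is a hypothetical callback. Against a real database it
    // would open its own connection and return one page of rows, ordered by a
    // unique key so that pages never overlap.
    public static List<T> ReadAllPages<T>(int pageCount, int pageSize,
        Func<int, int, IReadOnlyList<T>> fetchPage)
    {
        // One task per page; each task gets a distinct page index.
        Task<IReadOnlyList<T>>[] tasks = Enumerable.Range(0, pageCount)
            .Select(page => Task.Run(() => fetchPage(page * pageSize, pageSize)))
            .ToArray();

        Task.WaitAll(tasks);

        // Concatenate the pages in page order.
        return tasks.SelectMany(t => t.Result).ToList();
    }

    public static void Main()
    {
        // Stand-in for the table: the "rows" are the integers 0..94.
        int[] table = Enumerable.Range(0, 95).ToArray();

        List<int> rows = ReadAllPages<int>(20, 5,
            (skip, take) => table.Skip(skip).Take(take).ToArray());

        Console.WriteLine(rows.Count); // prints 95
    }
}
```

Because every task receives a different `skip` value, the slices cannot overlap as long as the underlying query orders the rows the same way on every call.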

score 0 · Accepted Answer

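Assuming you are dependent on SQL Server, you could look into the SQL Server Service Broker features to provide queuing for you. One thing to keep in mind is that, at the time of writing, SQL Server Service Broker is not available in SQL Azure, so if you plan to move to the Azure cloud this could be a problem.

Anyway, with SQL Server Service Broker the concurrent access is managed at the database engine layer. Another way of doing it is to have one thread that reads the database and then dispatches the messages to worker threads as input. That is slightly easier than using transactions in the database to make sure a message is not read twice.

As I said, though, SQL Server Service Broker is probably the way to go. Or a proper external queuing mechanism.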
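
The single-reader alternative mentioned above could look roughly like this in C#. This is only a sketch under assumptions: `SingleReaderDispatch`, `readRows` and `handle` are hypothetical names, and `readRows` stands in for whatever code the one reading thread uses to pull rows from the database.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SingleReaderDispatch
{
    // readRows: hypothetical stand-in for the single thread that reads the table.
    // handle:   hypothetical per-row work done by the worker threads.
    public static void Run<T>(IEnumerable<T> readRows, int workerCount, Action<T> handle)
    {
        // Bounded so the reader cannot run far ahead of the workers.
        using var queue = new BlockingCollection<T>(boundedCapacity: 100);

        Task[] workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Run(() =>
            {
                // Each queued item is handed to exactly one consumer.
                foreach (T row in queue.GetConsumingEnumerable())
                    handle(row);
            }))
            .ToArray();

        try
        {
            // Only this thread touches the database, so no row is read twice.
            foreach (T row in readRows)
                queue.Add(row);
        }
        finally
        {
            queue.CompleteAdding(); // lets the workers' loops finish
        }

        Task.WaitAll(workers);
    }

    public static void Main()
    {
        var seen = new ConcurrentBag<int>();
        Run(Enumerable.Range(1, 1000), 4, row => seen.Add(row));
        Console.WriteLine(seen.Count); // prints 1000
    }
}
```

The design choice here is that deduplication is free: the rows are read once, in one place, and the queue guarantees each one goes to a single worker.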

score 0 · Accepted Answer

Solution 1:
I am assuming that you are attempting to process or extract data from a large table. If I were assigned this task, I would first look at paging, assuming you are trying to split the work among threads. So Thread 1 handles pages 0 to 10, Thread 2 handles pages 11 to 20, and so on. Alternatively, you could batch rows using the actual row number, so in your stored proc you would do this:

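-- Number the rows in a stable order, then return only this thread's slice.
WITH result_set AS (
  SELECT
    ROW_NUMBER() OVER (ORDER BY <ordering>) AS [row_number],
    x, y, z
  FROM
    <table>
  WHERE
    <search-clauses>
)
SELECT
  *
FROM
  result_set
WHERE
  -- BETWEEN is inclusive on both ends
  [row_number] BETWEEN @IN_Thread_Row_Start AND @IN_Thread_Row_End;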
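
On the C# side, each thread needs its own @IN_Thread_Row_Start and @IN_Thread_Row_End. A small sketch of how those values might be computed follows; `RowRanges` and `Split` are hypothetical names, and the total row count is assumed to be known up front (for example from a COUNT(*) run beforehand).

```csharp
using System;
using System.Collections.Generic;

public static class RowRanges
{
    // Splits rows 1..totalRows into at most threadCount contiguous ranges.
    // ROW_NUMBER() starts at 1 and BETWEEN includes both ends, so each range
    // starts one past the end of the previous one. threadCount must be positive.
    public static List<(long Start, long End)> Split(long totalRows, int threadCount)
    {
        var ranges = new List<(long Start, long End)>();
        long chunk = (totalRows + threadCount - 1) / threadCount; // ceiling division

        for (long start = 1; start <= totalRows; start += chunk)
            ranges.Add((start, Math.Min(start + chunk - 1, totalRows)));

        return ranges;
    }

    public static void Main()
    {
        // Prints the ranges 1..4, 5..8 and 9..10.
        foreach (var (start, end) in Split(10, 3))
            Console.WriteLine($"@IN_Thread_Row_Start = {start}, @IN_Thread_Row_End = {end}");
    }
}
```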

Another option, which would be more efficient if you have a natural key or a darn good surrogate key, is to page on that key and have each thread pass in key parameters rather than the row numbers (or page numbers) it is interested in.

Immediate concerns with the ROW_NUMBER approach would be:

ROW_NUMBER performance
CTE performance (a CTE is not cached between calls; SQL Server expands it into the query, so each thread's call re-evaluates it)

So if this was my problem to resolve I would look at paging via a key.
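
Paging via a key might be sketched like this. Everything named here is an assumption for illustration: `KeysetPaging`, `ReadRange`, the `fetch` callback, and the `dbo.Items` table with a `bigint` key `Id`. Each thread gets its own key range and seeks forward from the last key it saw, instead of asking for page numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeysetPaging
{
    // Assumed query shape; dbo.Items and Id are hypothetical names.
    public const string Sql =
        "SELECT TOP (@BatchSize) Id FROM dbo.Items " +
        "WHERE Id > @AfterId AND Id <= @UpToId ORDER BY Id;";

    // Walks one thread's key range (afterId, upToId] in batches. fetch is a
    // hypothetical callback that would run the query above and return the Ids.
    public static IEnumerable<long> ReadRange(long afterId, long upToId, int batchSize,
        Func<long, long, int, IReadOnlyList<long>> fetch)
    {
        while (true)
        {
            IReadOnlyList<long> batch = fetch(afterId, upToId, batchSize);
            if (batch.Count == 0) yield break;

            foreach (long id in batch) yield return id;

            afterId = batch[batch.Count - 1]; // seek past the last key seen
        }
    }

    public static void Main()
    {
        long[] ids = { 3, 7, 8, 15, 21, 22, 40 }; // in-memory stand-in, with gaps

        Func<long, long, int, IReadOnlyList<long>> fetch = (after, upTo, n) =>
            ids.Where(i => i > after && i <= upTo).OrderBy(i => i).Take(n).ToArray();

        // Two "threads" with adjacent, non-overlapping key ranges.
        Console.WriteLine(string.Join(",", ReadRange(0, 15, 2, fetch)));  // prints 3,7,8,15
        Console.WriteLine(string.Join(",", ReadRange(15, 40, 2, fetch))); // prints 21,22,40
    }
}
```

Gaps in the key values do not matter with this shape, because each batch starts from the last key actually returned rather than from a computed offset.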

Solution 2:
The second solution would be to mark rows as they are being processed, virtually locking them, provided you have data-writer permission. Your table would have a column called Processed or Locked, and as rows are selected by a thread they are updated with Locked = 1.

Then your select from other threads selects only rows that aren't locked. When your process is done and all rows are processed you could reset the lock.
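
One thing to watch with this approach: if a thread runs a SELECT and then a separate UPDATE, two threads can pick the same rows in between. A possible way around that, sketched below, is to select and mark in a single statement. The table `dbo.WorkItems`, its `int` column `Id`, and the `RowClaimer` class are hypothetical; only the `Locked` column name comes from the answer above. The code is written against the `System.Data` interfaces, so it compiles without a specific provider, but it needs a real SQL Server connection to run.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowClaimer
{
    // dbo.WorkItems and Id are hypothetical names. READPAST skips rows that
    // another session currently has locked; OUTPUT returns the rows that this
    // statement marked.
    public const string ClaimSql =
        "UPDATE TOP (@BatchSize) dbo.WorkItems WITH (ROWLOCK, UPDLOCK, READPAST) " +
        "SET Locked = 1 OUTPUT inserted.Id WHERE Locked = 0;";

    // Marks up to batchSize unlocked rows and returns their Ids. Selecting and
    // marking happen in one statement, so two threads cannot claim the same row.
    public static List<int> ClaimBatch(IDbConnection openConnection, int batchSize)
    {
        using IDbCommand cmd = openConnection.CreateCommand();
        cmd.CommandText = ClaimSql;

        IDbDataParameter p = cmd.CreateParameter();
        p.ParameterName = "@BatchSize";
        p.DbType = DbType.Int32;
        p.Value = batchSize;
        cmd.Parameters.Add(p);

        var ids = new List<int>();
        using IDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
            ids.Add(reader.GetInt32(0)); // assumes Id is an int column

        return ids;
    }
}
```

Each worker thread would open its own connection and call `ClaimBatch` in a loop until it returns an empty list.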

Hard to say what will perform best without some trials... Good luck.
