multithreading - How are locks implemented?

Question

I have the following code:

 while(lock)
      ;   
 lock = 1;
 // critical section
 lock = 0;

Reading or changing the lock value is itself several instructions:

read lock
change value
write it

So what if the following happens:

1) One thread reads the lock and is suspended there
2) Another thread reads it, sees it is free, locks it, and gets halfway through its work
3) The first thread wakes up and enters the critical section

So how are locks implemented in the system? Guarding one variable with another variable is no good.

Stopping the threads on the other processors isn't right either, is it?

score 2 · Accepted Answer

It is 100% platform specific. Generally, the CPU provides some form of atomic operation such as exchange or compare and swap. A typical lock might work like this:

1) Create: Store 0 (unlocked) in the variable.

2) Lock: Atomically attempt to switch the value of the variable from 0 (unlocked) to 1 (locked). If that fails (because it wasn't unlocked to begin with), let the CPU rest a bit, and then retry. Use a memory barrier to ensure that no later memory operations are reordered ahead of the acquisition.

3) Unlock: Use a memory barrier to ensure that earlier memory operations aren't reordered past the release. Atomically write 0 (unlocked) to the variable.
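The three steps above can be sketched with C11 atomics. This is only a minimal illustration, not how any particular platform does it: the names SpinLock, spin_init, spin_try_lock, spin_lock, spin_unlock and spin_demo are made up for this example, and acquire/release orderings stand in for the memory barriers described in steps 2 and 3.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = unlocked, 1 = locked. */
typedef struct {
    atomic_int state;
} SpinLock;

/* 1) Create: store 0 (unlocked) in the variable. */
void spin_init(SpinLock *l) {
    atomic_init(&l->state, 0);
}

/* One atomic attempt to switch 0 -> 1. atomic_exchange returns the
   previous value, so we own the lock only if that value was 0.
   Acquire ordering keeps later memory operations from being
   reordered ahead of the acquisition. */
bool spin_try_lock(SpinLock *l) {
    return atomic_exchange_explicit(&l->state, 1, memory_order_acquire) == 0;
}

/* 2) Lock: retry until an attempt succeeds. */
void spin_lock(SpinLock *l) {
    while (!spin_try_lock(l)) {
        /* "Let the CPU rest a bit": a real implementation would
           issue a pause hint or yield here (omitted in this sketch). */
    }
}

/* 3) Unlock: release ordering keeps earlier memory operations from
   being reordered past the store that frees the lock. */
void spin_unlock(SpinLock *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}

/* Single-threaded walk through the three operations. */
void spin_demo(void) {
    SpinLock l;
    spin_init(&l);
    spin_lock(&l);
    assert(!spin_try_lock(&l)); /* already held: second attempt fails */
    spin_unlock(&l);
    assert(spin_try_lock(&l));  /* free again: attempt succeeds */
    spin_unlock(&l);
}
```

The important difference from the loop in the question is that the test and the set happen in one atomic exchange, so no other thread can slip in between reading the value and writing it.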

Note that you really don't need to understand this unless you want to design your own synchronization primitives. And if you want to do that, you need to understand an awful lot more. It's certainly a good idea for every programmer to have a general idea of what they are making the hardware do. But this is an area filled with seriously heavy wizardry. There are so many, many ways this can go horribly wrong. So just use the locking primitives provided by the geniuses who made your platform, compiler, and threading library. Here be dragons.

For example, SMP Pentium Pro systems have an erratum that requires special handling in the unlock operation. A naive implementation of the lock algorithm will cause the branch prediction logic to expect the operation to keep spinning, incurring a massive performance penalty at the worst possible time -- when you first acquire the lock. A naive implementation of the lock algorithm may cause two cores each waiting for the same lock to saturate the bus, slowing the CPU that needs to get work done in order to release the lock to a crawl. These all require heavy wizardry and deep understanding of the hardware to deal with.
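To make the bus-saturation point concrete, one commonly described mitigation is "test and test-and-set": a waiter spins on plain reads and only retries the atomic exchange once the lock looks free. The sketch below is an illustration under that assumption, with hypothetical names (TtasLock, ttas_init, ttas_try_lock, ttas_lock, ttas_unlock, ttas_demo). It does not address the Pentium Pro erratum, and whether it helps on a given machine depends on the hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 = unlocked, 1 = locked. */
typedef struct {
    atomic_int state;
} TtasLock;

void ttas_init(TtasLock *l) {
    atomic_init(&l->state, 0);
}

/* Single attempt: read first, and skip the atomic write entirely
   if the lock already looks held. */
bool ttas_try_lock(TtasLock *l) {
    if (atomic_load_explicit(&l->state, memory_order_relaxed) != 0)
        return false;
    return atomic_exchange_explicit(&l->state, 1, memory_order_acquire) == 0;
}

void ttas_lock(TtasLock *l) {
    for (;;) {
        /* Try the exchange straight away, so the uncontended
           acquisition does not go through the waiting loop. */
        if (atomic_exchange_explicit(&l->state, 1, memory_order_acquire) == 0)
            return;
        /* Held by someone else: wait using reads only, rather than
           issuing an atomic write on every iteration. */
        while (atomic_load_explicit(&l->state, memory_order_relaxed) != 0) {
            /* A real implementation would place a CPU pause hint here. */
        }
    }
}

void ttas_unlock(TtasLock *l) {
    atomic_store_explicit(&l->state, 0, memory_order_release);
}

/* Single-threaded check of the basic behaviour. */
void ttas_demo(void) {
    TtasLock l;
    ttas_init(&l);
    ttas_lock(&l);
    assert(!ttas_try_lock(&l)); /* held: attempt fails without writing */
    ttas_unlock(&l);
    assert(ttas_try_lock(&l));  /* free: attempt succeeds */
    ttas_unlock(&l);
}
```

Even this version leaves out backoff, fairness, and handing the CPU back to the scheduler, which is part of why the advice to use the platform's own primitives still stands.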

score 0 · Accepted Answer

In a course I took at Uni, a possible firmware solution for implementing locks was presented in the form of an "atomicity bit" associated with each memory operation initiated by the processor.

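Basically, when locking, you will notice that there is a sequence of operations that must be performed atomically: test the value of a flag and, if it is not set, set it to locked; otherwise try again. This sequence can be made atomic by associating a bit with each memory request sent by the CPU. The first N-1 operations have the bit set, and the last one has it unset, marking the end of the atomic sequence.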

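When the memory module where the flag data is stored (there may be several modules) receives the request for the first operation in the sequence (whose bit is set), it serves it and then accepts no requests from any other CPU until the CPU that started the atomic sequence sends a request with the atomicity bit unset. These transactions are usually short, so such a coarse-grained approach is acceptable. Note that this is usually made easier by the assembler providing specialised instructions of the "compare-and-set" type, which do exactly what I described above.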
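From the programmer's side, the "test the flag, set it if clear, otherwise retry" sequence is what a compare-and-set operation exposes. As a rough sketch, C11's atomic_compare_exchange_strong can play that role. It does not model the atomicity bit or the memory module; it only shows the single indivisible step that such hardware support makes possible. The names UNLOCKED, LOCKED, flag_test_and_lock, flag_lock, flag_unlock and flag_demo are invented for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { UNLOCKED = 0, LOCKED = 1 };

/* Test the flag and set it to LOCKED only if it is currently
   UNLOCKED, as one indivisible step. Returns true if this caller
   was the one that set it. */
bool flag_test_and_lock(atomic_int *flag) {
    int expected = UNLOCKED;
    return atomic_compare_exchange_strong(flag, &expected, LOCKED);
}

/* Lock: repeat the compare-and-set until it succeeds. */
void flag_lock(atomic_int *flag) {
    while (!flag_test_and_lock(flag)) {
        /* flag was already LOCKED: try again */
    }
}

/* Unlock: atomically write UNLOCKED back. */
void flag_unlock(atomic_int *flag) {
    atomic_store(flag, UNLOCKED);
}

/* Single-threaded check of the compare-and-set behaviour. */
void flag_demo(void) {
    atomic_int flag;
    atomic_init(&flag, UNLOCKED);
    assert(flag_test_and_lock(&flag));  /* was UNLOCKED: we set it */
    assert(!flag_test_and_lock(&flag)); /* now LOCKED: attempt fails */
    flag_unlock(&flag);
}
```

The default sequentially consistent ordering is used here for simplicity; a tuned lock would likely choose weaker orderings.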
