I have a question about using 128-bit registers to speed up some code. Consider the following C/C++ code: I define two unsigned long long ints, a and b, and give them some values.

unsigned long long int a = 4369, b = 56481;

Then, I want to compute

a & b;

Here a is represented in the computer as the 64-bit number 4369 = 1000100010001 (leading zeros omitted), and likewise b = 56481 = 1101110010100001. I compute a & b, which is still a 64-bit number, given by the bit-by-bit logical AND of a and b:

a & b = 1000000000001
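The arithmetic above is easier to check in hexadecimal, where 4369 = 0x1111 and 56481 = 0xDCA1, so the AND works out nibble by nibble to 0x1001 = 4097 = 1000000000001 in binary. A minimal sketch of that check follows; the function name `and_example` is only illustrative.

```c
#include <stdint.h>

/* Returns a & b for the values used in the question.
   and_example is a hypothetical helper name, not part of any library. */
uint64_t and_example(void)
{
    uint64_t a = 4369;   /* 0x1111 = 0001 0001 0001 0001 */
    uint64_t b = 56481;  /* 0xDCA1 = 1101 1100 1010 0001 */
    return a & b;        /* 0x1001 = 0001 0000 0000 0001 */
}
```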

My question is the following: do computers have a 128-bit register in which I could do the operation above on 128-bit integers rather than 64-bit integers, in the same amount of time? To be clearer: I would like to gain a factor of two in speed by using 128-bit numbers rather than 64-bit numbers, i.e. I would like to compute 128 ANDs rather than 64 ANDs (one AND for every bit) in the same time. If this is possible, do you have a code example? I have heard that the SSE registers might do this, but I am not sure.

1 Answer

Yes, SSE2 has a 128-bit bitwise AND. You can use it via intrinsics in C or C++, e.g.

#include "emmintrin.h"          // SSE2 intrinsics

__m128i v0, v1, v2;             // 128 bit variables

v2 = _mm_and_si128(v0, v1);     // bitwise AND

Or you can use it directly in assembler; the instruction is PAND.
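To show how the intrinsic fits into real code, here is a minimal sketch that ANDs two arrays of 64-bit words, two words (128 bits) per iteration. The function name `and_u64_sse2` and the array-based interface are assumptions made for illustration. The unaligned load and store intrinsics are used so that the arrays need no special alignment.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* dst[i] = a[i] & b[i] for i in [0, n).
   and_u64_sse2 is an illustrative name, not a library function. */
void and_u64_sse2(uint64_t *dst, const uint64_t *a, const uint64_t *b, size_t n)
{
    size_t i = 0;

    /* Main loop: one 128-bit AND covers two 64-bit words. */
    for (; i + 2 <= n; i += 2) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_and_si128(va, vb));
    }

    /* Scalar tail: handles the last word when n is odd. */
    for (; i < n; ++i)
        dst[i] = a[i] & b[i];
}
```

Whether this actually gives a factor of two over the 64-bit loop depends on whether the code is limited by computation or by memory traffic, and on whether the compiler already vectorizes the scalar loop, so it is worth measuring.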

On Haswell and later CPUs, which have AVX2, you can also do a 256-bit AND:

#include "immintrin.h"          // AVX2 intrinsics

__m256i v0, v1, v2;             // 256 bit variables

v2 = _mm256_and_si256(v0, v1);  // bitwise AND

The corresponding instruction in this case is VPAND.
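A 256-bit version can be sketched the same way, four 64-bit words per iteration. Because not every x86 CPU has AVX2, this sketch checks for it at run time and falls back to a plain loop. It assumes GCC or Clang: the `target("avx2")` attribute and `__builtin_cpu_supports` are compiler extensions, and other compilers need a different mechanism (for example a compiler flag plus a CPUID check). The function names are illustrative.

```c
#include <immintrin.h>  /* AVX2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Compiled with AVX2 enabled for this function only (GCC/Clang extension). */
__attribute__((target("avx2")))
static void and_u64_avx2(uint64_t *dst, const uint64_t *a, const uint64_t *b, size_t n)
{
    size_t i = 0;

    /* Main loop: one 256-bit AND covers four 64-bit words. */
    for (; i + 4 <= n; i += 4) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_and_si256(va, vb));
    }

    /* Scalar tail: handles the remaining n % 4 words. */
    for (; i < n; ++i)
        dst[i] = a[i] & b[i];
}

/* Uses the AVX2 path only if the running CPU reports AVX2 support. */
void and_u64_dispatch(uint64_t *dst, const uint64_t *a, const uint64_t *b, size_t n)
{
    size_t i;

    if (__builtin_cpu_supports("avx2")) {
        and_u64_avx2(dst, a, b, n);
        return;
    }

    /* Fallback: ordinary 64-bit AND, one word at a time. */
    for (i = 0; i < n; ++i)
        dst[i] = a[i] & b[i];
}
```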

answered at 2013-09-10T18:53:12.327