
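I'm trying to implement Fast Inverse Square Root in Java to speed up vector normalization. However, when I implement the single-precision version in Java, it starts out at about the same speed as 1F / (float)Math.sqrt() and then quickly drops to half that speed. This is interesting because, although Math.sqrt uses (I presume) a native method, the expression also involves a floating-point division, which I've heard is really slow. My code for computing the numbers is as follows: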

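public static float fastInverseSquareRoot(float x){
    float xHalf = 0.5F * x;
    // Reinterpret the float's bits as an int.
    int temp = Float.floatToRawIntBits(x);
    // Magic constant minus half the bit pattern gives a rough first guess at 1/sqrt(x).
    temp = 0x5F3759DF - (temp >> 1);
    float newX = Float.intBitsToFloat(temp);
    // One Newton-Raphson step refines the guess.
    newX = newX * (1.5F - xHalf * newX * newX);
    return newX;
}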
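
As a side check before timing anything, it can help to confirm how close the approximation actually is. The following is a minimal sketch, not part of the original program: the class name FisrAccuracy and the helper maxRelativeError are made up for illustration, and the input range simply mirrors the one used in the test loop. With a single Newton-Raphson step the relative error is usually quoted as staying below 0.2 percent.

```java
class FisrAccuracy {

    // Same routine as in the question: bit-level first guess plus one Newton-Raphson step.
    static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int bits = Float.floatToRawIntBits(x);
        bits = 0x5F3759DF - (bits >> 1);
        float y = Float.intBitsToFloat(bits);
        return y * (1.5F - xHalf * y * y);
    }

    // Hypothetical helper: largest relative error against 1 / Math.sqrt
    // over the same inputs the question's benchmark loop visits.
    static double maxRelativeError() {
        double worst = 0.0;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            double exact = 1.0 / Math.sqrt(x);
            double error = Math.abs(fastInverseSquareRoot(x) - exact) / exact;
            worst = Math.max(worst, error);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.printf("max relative error: %.6f%n", maxRelativeError());
    }
}
```

Whether that error is acceptable depends on how the normalized vectors are used afterwards.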

Using a short program I wrote that runs each version 16 million times, reports the timings, and repeats, I get results like this:

1F / Math.sqrt() took 65209490 nanoseconds.
Fast Inverse Square Root took 65456128 nanoseconds.
Fast Inverse Square Root was 0.378224 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 64131293 nanoseconds.
Fast Inverse Square Root took 26214534 nanoseconds.
Fast Inverse Square Root was 59.123647 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 27312205 nanoseconds.
Fast Inverse Square Root took 56234714 nanoseconds.
Fast Inverse Square Root was 105.895914 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 26493281 nanoseconds.
Fast Inverse Square Root took 56004783 nanoseconds.
Fast Inverse Square Root was 111.392402 percent slower than 1F / Math.sqrt()

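I consistently get one iteration where both run at about the same speed, followed by an iteration where Fast Inverse Square Root saves about 60% of the time required by 1F / Math.sqrt(), followed by several iterations where Fast Inverse Square Root takes about twice as long as the control. I'm confused about why FISR would go from the same speed -> 60% faster -> 100% slower. This happens every time I run the program.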

EDIT: The data above is from running the program in Eclipse. When I run it with javac/java I get completely different data:

1F / Math.sqrt() took 57870498 nanoseconds.
Fast Inverse Square Root took 88206794 nanoseconds.
Fast Inverse Square Root was 52.421004 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 54982400 nanoseconds.
Fast Inverse Square Root took 83777562 nanoseconds.
Fast Inverse Square Root was 52.371599 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 21115822 nanoseconds.
Fast Inverse Square Root took 76705152 nanoseconds.
Fast Inverse Square Root was 263.259133 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 20159210 nanoseconds.
Fast Inverse Square Root took 80745616 nanoseconds.
Fast Inverse Square Root was 300.539585 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 21814675 nanoseconds.
Fast Inverse Square Root took 85261648 nanoseconds.
Fast Inverse Square Root was 290.845374 percent slower than 1F / Math.sqrt()

EDIT2: After a few responses, it seems the speed stabilizes after several iterations, but the value it stabilizes at varies a lot. Does anyone know why?

Here's my code (not exactly concise, but it's all here):

public class FastInverseSquareRootTest {

    public static FastInverseSquareRootTest conductTest() {
        float result = 0F;
        long startTime, endTime, midTime;
        startTime = System.nanoTime();
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            result = 1F / (float) Math.sqrt(x);
        }
        midTime = System.nanoTime();
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            result = fastInverseSquareRoot(x);
        }
        endTime = System.nanoTime();
        return new FastInverseSquareRootTest(midTime - startTime, endTime
                - midTime);
    }

    public static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int temp = Float.floatToRawIntBits(x);
        temp = 0x5F3759DF - (temp >> 1);
        float newX = Float.intBitsToFloat(temp);
        newX = newX * (1.5F - xHalf * newX * newX);
        return newX;
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 7; i++) {
            System.out.println(conductTest().toString());
        }
    }

    private long controlDiff;

    private long experimentalDiff;

    private double percentError;

    public FastInverseSquareRootTest(long controlDiff, long experimentalDiff) {
        this.experimentalDiff = experimentalDiff;
        this.controlDiff = controlDiff;
        this.percentError = 100D * (experimentalDiff - controlDiff)
                / controlDiff;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format("1F / Math.sqrt() took %d nanoseconds.%n",
                controlDiff));
        sb.append(String.format(
                "Fast Inverse Square Root took %d nanoseconds.%n",
                experimentalDiff));
        sb.append(String
                .format("Fast Inverse Square Root was %f percent %s than 1F / Math.sqrt()%n",
                        Math.abs(percentError), percentError > 0D ? "slower"
                                : "faster"));
        return sb.toString();
    }
}

3 Answers


It looks like the JIT optimizer threw away the call to Math.sqrt.

With your unmodified code, I got

1F / Math.sqrt() took 65358495 nanoseconds.
Fast Inverse Square Root took 77152791 nanoseconds.
Fast Inverse Square Root was 18,045544 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 52872498 nanoseconds.
Fast Inverse Square Root took 75242075 nanoseconds.
Fast Inverse Square Root was 42,308531 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 23386359 nanoseconds.
Fast Inverse Square Root took 73532080 nanoseconds.
Fast Inverse Square Root was 214,422951 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 23790209 nanoseconds.
Fast Inverse Square Root took 76254902 nanoseconds.
Fast Inverse Square Root was 220,530610 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 23885467 nanoseconds.
Fast Inverse Square Root took 74869636 nanoseconds.
Fast Inverse Square Root was 213,452678 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 23473514 nanoseconds.
Fast Inverse Square Root took 73063699 nanoseconds.
Fast Inverse Square Root was 211,260168 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 23738564 nanoseconds.
Fast Inverse Square Root took 71917013 nanoseconds.
Fast Inverse Square Root was 202,954353 percent slower than 1F / Math.sqrt()

The timings for fastInverseSquareRoot are consistently slower and all in the same ballpark, while the Math.sqrt calls sped up considerably.

After changing the code so that the calls to Math.sqrt could not be eliminated,

    for (float x = 1F; x < 4_000_000F; x += 0.25F) {
        result += 1F / (float) Math.sqrt(x);
    }
    midTime = System.nanoTime();
    for (float x = 1F; x < 4_000_000F; x += 0.25F) {
        result -= fastInverseSquareRoot(x);
    }
    endTime = System.nanoTime();
    if (result == 0) System.out.println("Wow!");

I got

1F / Math.sqrt() took 184884684 nanoseconds.
Fast Inverse Square Root took 85298761 nanoseconds.
Fast Inverse Square Root was 53,863804 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 182183542 nanoseconds.
Fast Inverse Square Root took 83040574 nanoseconds.
Fast Inverse Square Root was 54,419278 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 165269658 nanoseconds.
Fast Inverse Square Root took 81922280 nanoseconds.
Fast Inverse Square Root was 50,431143 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 163272877 nanoseconds.
Fast Inverse Square Root took 81906141 nanoseconds.
Fast Inverse Square Root was 49,834815 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 165314846 nanoseconds.
Fast Inverse Square Root took 81124465 nanoseconds.
Fast Inverse Square Root was 50,927296 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 164079534 nanoseconds.
Fast Inverse Square Root took 80453629 nanoseconds.
Fast Inverse Square Root was 50,966689 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 162350821 nanoseconds.
Fast Inverse Square Root took 79854355 nanoseconds.
Fast Inverse Square Root was 50,813704 percent faster than 1F / Math.sqrt()

That is, much slower times for Math.sqrt, and only slightly slower times for fastInverseSquareRoot (which now has to do a subtraction in each iteration).
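
The same idea can be packaged as a small harness. This is only a sketch under a few assumptions: the class name InverseSqrtBenchmark, the sink field, and the round counts are invented for illustration, and a double accumulator is used so the checksum stays meaningful. Each loop lives in its own method, returns its checksum so the work cannot be discarded, and a few warm-up rounds run before anything is reported. A dedicated harness such as JMH handles these concerns more rigorously.

```java
import java.util.function.DoubleSupplier;

class InverseSqrtBenchmark {

    // Every checksum is added here and printed at the end,
    // so the JIT cannot treat the loops as dead code.
    static double sink;

    static float fastInverseSquareRoot(float x) {
        float xHalf = 0.5F * x;
        int bits = Float.floatToRawIntBits(x);
        bits = 0x5F3759DF - (bits >> 1);
        float y = Float.intBitsToFloat(bits);
        return y * (1.5F - xHalf * y * y);
    }

    static double sumWithMathSqrt() {
        double sum = 0.0;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            sum += 1F / (float) Math.sqrt(x);
        }
        return sum;
    }

    static double sumWithFisr() {
        double sum = 0.0;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            sum += fastInverseSquareRoot(x);
        }
        return sum;
    }

    // Times one full pass of the given candidate and keeps its result alive.
    static long timeNanos(DoubleSupplier candidate) {
        long start = System.nanoTime();
        double checksum = candidate.getAsDouble();
        long elapsed = System.nanoTime() - start;
        sink += checksum;
        return elapsed;
    }

    public static void main(String[] args) {
        // Hypothetical round counts; adjust for your machine.
        final int warmUpRounds = 5;
        final int measuredRounds = 5;
        for (int i = 0; i < warmUpRounds; i++) {
            timeNanos(InverseSqrtBenchmark::sumWithMathSqrt);
            timeNanos(InverseSqrtBenchmark::sumWithFisr);
        }
        for (int i = 0; i < measuredRounds; i++) {
            long control = timeNanos(InverseSqrtBenchmark::sumWithMathSqrt);
            long experimental = timeNanos(InverseSqrtBenchmark::sumWithFisr);
            System.out.printf("1F / Math.sqrt(): %d ns, FISR: %d ns%n", control, experimental);
        }
        System.out.println("checksum: " + sink);
    }
}
```

The absolute numbers will still differ between JVM versions and launch modes, so only compare runs made under the same conditions.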

answered 2013-05-14T19:37:51.983

My JIT sped things up in two steps. The first is probably an algorithmic optimization and the second an assembly-level optimization.

1F / Math.sqrt() took 78202645 nanoseconds.
Fast Inverse Square Root took 79248400 nanoseconds.
Fast Inverse Square Root was 1,337237 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 76856008 nanoseconds.
Fast Inverse Square Root took 24788247 nanoseconds.
Fast Inverse Square Root was 67,747158 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 24162119 nanoseconds.
Fast Inverse Square Root took 70651968 nanoseconds.
Fast Inverse Square Root was 192,407996 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 24163301 nanoseconds.
Fast Inverse Square Root took 70598983 nanoseconds.
Fast Inverse Square Root was 192,174414 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 24201621 nanoseconds.
Fast Inverse Square Root took 70667344 nanoseconds.
Fast Inverse Square Root was 191,994259 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 24219835 nanoseconds.
Fast Inverse Square Root took 70698568 nanoseconds.
Fast Inverse Square Root was 191,903591 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 24231663 nanoseconds.
Fast Inverse Square Root took 70633991 nanoseconds.
Fast Inverse Square Root was 191,494608 percent slower than 1F / Math.sqrt()
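
One way to look for such steps is to time a single candidate in isolation over many rounds and watch where the time per round changes. The sketch below is an illustration only: the class name JitStepProbe, the lastChecksum field, and the round count are assumptions. On HotSpot JVMs, running it with the -XX:+PrintCompilation flag additionally logs when methods get compiled, which can be lined up against the printed timings.

```java
class JitStepProbe {

    // Keeps the summed results observable so the loop is not removed.
    static double lastChecksum;

    static double sumInverseSqrt() {
        double sum = 0.0;
        for (float x = 1F; x < 4_000_000F; x += 0.25F) {
            sum += 1F / (float) Math.sqrt(x);
        }
        return sum;
    }

    // Returns the elapsed nanoseconds of each round, in order.
    static long[] timeRounds(int rounds) {
        long[] elapsed = new long[rounds];
        double checksum = 0.0;
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            checksum += sumInverseSqrt();
            elapsed[i] = System.nanoTime() - start;
        }
        lastChecksum = checksum;
        return elapsed;
    }

    public static void main(String[] args) {
        long[] elapsed = timeRounds(10); // hypothetical round count
        for (int i = 0; i < elapsed.length; i++) {
            System.out.printf("round %d: %d ns%n", i, elapsed[i]);
        }
        System.out.println("checksum: " + lastChecksum);
    }
}
```
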
answered 2013-05-14T19:30:01.290

Here is my output for the posted code:

1F / Math.sqrt() took 165769968 nanoseconds.
Fast Inverse Square Root took 251809517 nanoseconds.
Fast Inverse Square Root was 51.902977 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 162953919 nanoseconds.
Fast Inverse Square Root took 251212721 nanoseconds.
Fast Inverse Square Root was 54.161816 percent slower than 1F / Math.sqrt()

1F / Math.sqrt() took 161524902 nanoseconds.
Fast Inverse Square Root took 36242909 nanoseconds.
Fast Inverse Square Root was 77.562030 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 162289014 nanoseconds.
Fast Inverse Square Root took 36552036 nanoseconds.
Fast Inverse Square Root was 77.477196 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 163157620 nanoseconds.
Fast Inverse Square Root took 36152720 nanoseconds.
Fast Inverse Square Root was 77.841844 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 162511997 nanoseconds.
Fast Inverse Square Root took 36426705 nanoseconds.
Fast Inverse Square Root was 77.585221 percent faster than 1F / Math.sqrt()

1F / Math.sqrt() took 162302698 nanoseconds.
Fast Inverse Square Root took 36797410 nanoseconds.
Fast Inverse Square Root was 77.327912 percent faster than 1F / Math.sqrt()

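It looks like the JIT kicked in and the fast version became roughly 7 times faster (about 251 ms down to about 36 ms per run). I hope someone with a better grasp of the JIT will come along and explain this. My environment: Java 6, Eclipse.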

answered 2013-05-14T19:26:52.127