c - Not getting proper output from Pollard's rho algorithm implementation

Question

I am trying to compute prime factorizations using Pollard's rho algorithm, and I cannot figure out where I went wrong.

#include<stdio.h>
#define f(x)  x*x-1

int pollard( int );
int gcd( int, int);

int main( void ) {
    int n;
    scanf( "%d",&n );
    pollard( n );
    return 0;  
}

int pollard( int n ) {
    int i=1,x,y,k=2,d;
    x = rand()%n;
    y = x;

    while(1) {
        i++;
        x = f( x ) % n;
        d = gcd( y-x, n);

        if(d!=1 && d!=n)
            printf( "%d\n", d);

        if(i == k) {
            y = x;
            k = 2 * k;
        }
    }
}   
int gcd( int a, int b ) {

    if( b == 0) 
        return a;
    else 
        return gcd( b, a % b);
}

score 6 · Accepted Answer

One immediate problem is, as Peter de Rivaz suspected, the macro

#define f(x)  x*x-1

Thus the line

x = f(x)%n;

becomes

x = x*x-1%n;

and the precedence of % is higher than that of -, so the expression is implicitly parenthesised as

x = (x*x) - (1%n);
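A minimal sketch of the usual remedy, which is to parenthesise both the macro argument and the whole expansion. The names F_BAD, F_GOOD, step_bad and step_good are made up for this illustration and do not appear in the question.

```c
/* Unparenthesised macro as in the question: F_BAD(x) % n expands to x*x - (1 % n). */
#define F_BAD(x)  x*x-1
/* Parenthesised variant: the % n now applies to the whole value of x*x - 1. */
#define F_GOOD(x) ((x)*(x)-1)

/* One iteration step with each macro, so the two results can be compared. */
int step_bad(int x, int n)  { return F_BAD(x) % n; }
int step_good(int x, int n) { return F_GOOD(x) % n; }
```

With x = 5 and n = 7, step_bad returns 24 (the reduction is never applied to x*x), while step_good returns 3.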

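That is equivalent to x = x*x - 1; (assuming n > 1; in any case it is x = x*x - constant;), and if you start with a value x >= 2, you get overflow before you have a realistic chance of finding a factor:

2 -> 2*2-1 = 3 -> 3*3 - 1 = 8 -> 8*8 - 1 = 63 -> 3968 -> 15745023 -> overflow if int is 32 bits

That does not immediately make it impossible for gcd(y-x,n) to be a factor, though. It just makes it likely that, at the stage where you would theoretically have found a factor, the overflow destroys the common factor that mathematically exists. That is more likely than the overflow introducing a common factor.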

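Overflow of signed integers is undefined behaviour, so there are no guarantees about how the program behaves. In practice it usually behaves consistently, so iterating f still produces a well-defined sequence for which the algorithm works in principle.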

Another problem is that y-x will frequently be negative, and then the computed gcd can also be negative, often -1. In that case, you print -1.
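A small sketch of that effect and of one way around it: take the absolute difference before calling gcd. gcd_signed mirrors the recursive gcd from the question; gcd_abs_diff is a hypothetical helper name.

```c
/* Recursive gcd in the style of the question; a negative argument can give a negative result. */
long long gcd_signed(long long a, long long b) {
    return b == 0 ? a : gcd_signed(b, a % b);
}

/* Hypothetical helper: compute |y - x| first so both gcd arguments are non-negative. */
long long gcd_abs_diff(long long x, long long y, long long n) {
    long long diff = (y > x) ? y - x : x - y;
    return gcd_signed(diff, n);
}
```

For example, gcd_signed(-21, 35) yields -7, whereas gcd_abs_diff(26, 5, 35) yields 7.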

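And then, iterating f may fail to reveal a factor because the cycles modulo both prime factors (taking n as a product of two distinct primes for the example) have the same length and are entered at the same time. You make no attempt to detect such a case. Whenever gcd(|y-x|, n) == n, any further work in that sequence is pointless, so you should break out of the loop when d == n.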
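A sketch of the question's loop with these points applied: the full value of f is reduced mod n, the gcd is taken of the absolute difference, and the loop stops when d == n. The function name pollard_once, the explicit start parameter, and the use of 64-bit unsigned arithmetic (valid only for 1 < n < 2^32) are assumptions made for this illustration.

```c
#include <stdint.h>

static uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* One run of the loop with f(x) = x*x - 1 (mod n).
   Assumes 1 < n < 2^32 so that x*x + n - 1 fits in 64 bits.
   Returns a nontrivial factor, or 0 when d == n (this sequence is exhausted). */
uint64_t pollard_once(uint64_t n, uint64_t start) {
    uint64_t x = start % n, y = x, i = 1, k = 2;
    for (;;) {
        i++;
        x = (x * x + n - 1) % n;                 /* x*x - 1, kept non-negative */
        uint64_t diff = (x > y) ? x - y : y - x;
        uint64_t d = gcd_u64(diff, n);
        if (d == n) return 0;                    /* x == y (mod n): no factor from this sequence */
        if (d != 1) return d;                    /* 1 < d < n */
        if (i == k) { y = x; k *= 2; }           /* save x at steps 2, 4, 8, ... as in the question */
    }
}
```

A return value of 0 means the caller should retry with a different start value or a different constant in f, or that n may be prime.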

Also, you never check whether n is prime, in which case trying to find a factor is a futile undertaking from the start.

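Furthermore, even after fixing f(x) so that the % n applies to the complete result of f(x), you have the problem that x*x still overflows for relatively small x (with the standard signed 32-bit ints, for x >= 46341), so factoring larger n may fail due to overflow. At the very least, you should use unsigned long long for the computations, so that overflow is avoided for n < 2^32. However, factorising such small numbers is typically done more efficiently with trial division. Pollard's Rho method and other advanced factoring algorithms are meant for larger numbers, where trial division is no longer efficient or even feasible.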
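For numbers in that small range, a trial-division sketch could look like the following. It also doubles as a primality check, since it returns n itself when n is prime. The function name smallest_factor is hypothetical.

```c
#include <stdint.h>

/* Returns the smallest prime factor of n, or n itself when n is prime. Assumes n >= 2. */
uint64_t smallest_factor(uint64_t n) {
    if (n % 2 == 0) return 2;
    /* p <= n / p is the same test as p*p <= n, without the risk of overflowing p*p. */
    for (uint64_t p = 3; p <= n / p; p += 2) {
        if (n % p == 0) return p;
    }
    return n;
}
```

For n below 2^32 the loop runs at most about 32768 times, which is why trial division is hard to beat at this size.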

score 3 · Accepted Answer

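I am a novice at C++, and I am new to Stack Overflow too, so some of what I have written may look sloppy, but this should get you going in the right direction. The program posted here should generally find and return one non-trivial factor of the number you enter at the prompt.

I tested it with a few semiprime numbers, and it worked for me. For 371156167103, it finds 607619 without any detectable delay after I hit the Enter key. I did not check it with larger numbers than this. I used unsigned long long variables, but if possible, you should get and use a library that provides even larger integer types.

Edit to add: the single call to the method f for X and the two such calls for Y are intentional and follow the way the algorithm works. I thought about nesting the call for Y inside another such call to keep it on one line, but I decided to do it this way so that it is easier to follow.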

#include "stdafx.h"
#include <stdio.h>
#include <iostream>
typedef unsigned long long ULL;

ULL pollard(ULL numberToFactor);
ULL gcd(ULL differenceBetweenCongruentFunctions, ULL numberToFactor);
ULL f(ULL x, ULL numberToFactor);

int main(void)
{
    ULL factor;
    ULL n;
    std::cout<<"Enter the number for which you want a prime factor: ";
    std::cin>>n;
    factor = pollard(n);
    if (factor == 0) std::cout<<"No factor found.  Your number may be prime, but it is not certain.\n\n";
    else std::cout<<"One factor is: "<<factor<<"\n\n";
}

ULL pollard(ULL n)
{
    ULL x = 2ULL;
    ULL y = 2ULL;
    ULL d = 1ULL;

    while(d==1||d==n)
    {
        x = f(x,n);
        y = f(y,n);
        y = f(y,n);
        if (y>x)
        {
            d = gcd(y-x, n);
        }
        else
        {
            d = gcd(x-y, n);
        }
    }

    return d;

}


ULL gcd(ULL a, ULL b)
{
    if (a==b||a==0)
        return 0;   // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
                    // a factor.  I think this is not proof of primality, so the process could be repeated with a new function.
                    // For example, by replacing x*x+1 with x*x+2, and so on.  If many such functions fail, primality is likely.

    ULL currentGCD = 1;
    while (currentGCD!=0) // This while loop is based on Euclid's algorithm
    {
        currentGCD = b % a;
        b=a;
        a=currentGCD;
    }

    return b;
}

ULL f(ULL x, ULL n)
{
    return (x * x + 1) % n;
}
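One caveat about f as written: x * x is computed in unsigned long long, so once n is at or above 2^32 the product can wrap around modulo 2^64 before the % n is applied. The following is a sketch of a wrap-free modular multiplication by double-and-add, assuming n < 2^63. The names mulmod and f_safe are hypothetical and are not part of the program above.

```c
typedef unsigned long long ULL;

/* (a * b) % n by double-and-add.
   Assumes 0 < n < 2^63, so that a + a and result + a cannot wrap around. */
ULL mulmod(ULL a, ULL b, ULL n) {
    ULL result = 0;
    a %= n;
    while (b > 0) {
        if (b & 1) result = (result + a) % n;   /* add the current multiple of a */
        a = (a + a) % n;                        /* double a for the next bit of b */
        b >>= 1;
    }
    return result;
}

/* Same iteration function as f, (x*x + 1) % n, built on mulmod. */
ULL f_safe(ULL x, ULL n) {
    return (mulmod(x, x, n) + 1) % n;
}
```

This costs up to 64 loop iterations per multiplication, so it trades speed for correctness on large n.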

score 1 · Accepted Answer

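Sorry for the long delay in getting back to this. As I mentioned in my first answer, I am a novice at C++, which will be evident from my excessive use of global variables, excessive use of BigIntegers and BigUnsigned where other types might be better, lack of error checking, and other programming habits on display which a more skilled person might not exhibit. That being said, let me explain what I did, and then I will post the code.

I am doing this in a second answer because my first answer is useful as a very simple demo of how easy Pollard's Rho algorithm is to implement once you understand what it does. What it does is first take 2 variables, call them x and y, and assign them the starting value 2. Then it runs x through a function, usually (x^2+1)%n, where n is the number you want to factor. It runs y through the same function twice each cycle. Then the difference between x and y is calculated, and finally the greatest common divisor of this difference and n is found. If that number is 1, you run x and y through the function again.

Continue this process until the GCD is not 1 or until x and y are equal again. If a GCD is found which is neither 1 nor n, then that GCD is a non-trivial factor of n. If x and y become equal, then the (x^2+1)%n function has failed. In that case, you should try again with another function, such as (x^2+2)%n, and so on.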
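The procedure just described can be sketched in plain C as follows. This is an illustration with fixed-width integers rather than the BigUnsigned type used in the program, so it assumes 1 < n < 2^32 and small values of c. The names rho_floyd and find_factor and the max_c limit are made up for this sketch.

```c
#include <stdint.h>

static uint64_t gcd_u64(uint64_t a, uint64_t b) {
    while (b != 0) { uint64_t t = a % b; a = b; b = t; }
    return a;
}

/* x takes one step and y two steps per cycle, both through f(x) = (x*x + c) % n.
   Assumes 1 < n < 2^32 and small c, so x*x + c fits in 64 bits.
   Returns a non-trivial factor, or 0 when x and y meet (this function failed). */
uint64_t rho_floyd(uint64_t n, uint64_t c) {
    uint64_t x = 2 % n, y = 2 % n;
    for (;;) {
        x = (x * x + c) % n;
        y = (y * y + c) % n;
        y = (y * y + c) % n;
        uint64_t diff = (x > y) ? x - y : y - x;
        uint64_t d = gcd_u64(diff, n);
        if (d == n) return 0;    /* x == y (mod n) */
        if (d != 1) return d;    /* 1 < d < n */
    }
}

/* Retry with c = 1, 2, ..., max_c; returns 0 if every function fails. */
uint64_t find_factor(uint64_t n, uint64_t max_c) {
    for (uint64_t c = 1; c <= max_c; c++) {
        uint64_t d = rho_floyd(n, c);
        if (d != 0) return d;
    }
    return 0;
}
```

A result of 0 from find_factor does not prove that n is prime; it only means none of the tried functions produced a factor.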

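Here is an example. Take 35, whose prime factors we know are 5 and 7. I will walk through Pollard Rho and show how it finds a non-trivial factor.

Cycle #1: X starts at 2. Then, using the function (x^2+1)%n, that is (2^2+1)%35, we get 5 for x. Y also starts at 2, and after one run through the function it also has the value 5. But y always goes through the function twice, so the second run is (5^2+1)%35, or 26. The difference between x and y is 21. The GCD of 21 (the difference) and 35 (n) is 7. We have already found a prime factor of 35! Note that the GCD of any 2 numbers, even extremely large ones, can be found very quickly using Euclid's algorithm, and that is what the program I will post here does.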
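To make the "very quickly" concrete, here is a sketch of iterative Euclid that also counts its remainder operations. The function name gcd_count and the steps out-parameter are assumptions for this illustration only.

```c
#include <stdint.h>

/* Iterative Euclid. On return, *steps holds the number of remainder operations performed. */
uint64_t gcd_count(uint64_t a, uint64_t b, unsigned *steps) {
    *steps = 0;
    while (b != 0) {
        uint64_t t = a % b;   /* remainder of the current pair */
        a = b;
        b = t;
        (*steps)++;
    }
    return a;
}
```

For the pair from the example, gcd_count(21, 35, &steps) returns 7 after 4 remainder operations: (21, 35) -> (35, 21) -> (21, 14) -> (14, 7) -> (7, 0).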

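On the subject of the GCD function, I am using one library I downloaded for this program, a library that allows me to use BigIntegers and BigUnsigned. That library also has a GCD function built in, and I could have used it. But I decided to stay with the hand-written GCD function for instructional purposes. If you want to improve the program's execution time, it might be a good idea to use the library's GCD function, because there are faster methods than Euclid, and the library may be written to use one of those faster methods.

Another side note. The .Net 4.5 library also supports big integers. I decided not to use it for this program because I wanted to write the whole thing in C++, not C++/CLI. You might get better performance from the .Net library, or you might not. I don't know, but I wanted to share that it is also an option.

I am jumping around a bit here, so let me start by explaining in broad strokes what the program does. Lastly, I will explain how to set it up on your computer if you use Visual Studio 11 (also called Visual Studio 2012).

The program allocates 3 arrays for storing the factors of any number you give it to process. These arrays are 1000 elements wide, which may be excessive, but it ensures that any number with 1000 or fewer prime factors will fit.

When you enter the number at the prompt, it is assumed to be composite and is put in the first element of the compositeFactors array. Then it goes through some admittedly inefficient while loops, which use Miller-Rabin to check whether the number is composite. Note that this test can either say that a number is composite with 100% confidence, or say that the number is prime with extremely high (but not 100%) confidence. The confidence is adjustable through the variable confidenceFactor in the program. The program makes one check for every value from 2 through confidenceFactor, inclusive, so one check fewer in total than the value of confidenceFactor itself.

The setting I have for confidenceFactor is 101, which does 100 checks. If it says a number is prime, the odds that it is really composite are about 1 in 4^100, the same as the odds of correctly calling the flip of a fair coin 200 consecutive times. In short, if it says the number is prime, it probably is, but confidenceFactor can be increased to get greater confidence at the cost of speed.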
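A compact sketch of the Miller-Rabin check described here, written in plain C for 32-bit inputs instead of the BigUnsigned type the program uses. It tries the bases 2 through max_base, mirroring the role of confidenceFactor. The names powmod, is_probable_prime and max_base are hypothetical.

```c
#include <stdint.h>

/* (base^exp) % mod by repeated squaring; safe while mod < 2^32. */
static uint64_t powmod(uint64_t base, uint64_t exp, uint64_t mod) {
    uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

/* Returns 0 if n is certainly composite, 1 if no base in 2..max_base proves it composite. */
int is_probable_prime(uint32_t n, unsigned max_base) {
    if (n < 2) return 0;
    if (n < 4) return 1;                 /* 2 and 3 */
    if (n % 2 == 0) return 0;
    uint64_t d = n - 1;
    unsigned s = 0;
    while (d % 2 == 0) { d /= 2; s++; }  /* n - 1 = 2^s * d with d odd */
    for (uint64_t a = 2; a <= max_base && a < n - 1; a++) {
        uint64_t x = powmod(a, d, n);
        if (x == 1 || x == n - 1) continue;       /* base a proves nothing */
        int witness = 1;
        for (unsigned r = 1; r < s; r++) {
            x = x * x % n;
            if (x == n - 1) { witness = 0; break; }
        }
        if (witness) return 0;                    /* base a proves n composite */
    }
    return 1;
}
```

Note that a return value of 1 is only "probably prime", exactly as with the program's own test.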

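Here might be as good a place as any to mention that, while Pollard's Rho algorithm can be pretty effective at factoring smaller numbers of type long long, the Miller-Rabin test to see whether a number is composite would be more or less useless without the BigInteger and BigUnsigned types. A BigInteger library is pretty much a requirement for reliably factoring large numbers all the way down to their prime factors like this.

When Miller-Rabin says a factor is composite, it is factored, the factor found is stored in a temp array, and the original entry in the composites array is divided by that same factor. When numbers are identified as likely prime, they are moved into the prime factors array and output to the screen. This process continues until there are no composite factors left. The factors tend to be found in ascending order, but this is coincidental. The program makes no effort to list them in ascending order; it only lists them as they are found.

Note that I could not find any function (x^2+c)%n which will factor the number 4, no matter what value I gave c. Pollard Rho seems to have a very hard time with all perfect squares, but 4 is the only composite number I found which is totally impervious to it using functions of the form described. Therefore I added a check for an n of 4 inside the pollard method, returning 2 immediately if so.

So, to set this program up, here is what you should do. Go to https://mattmccutchen.net/bigint/ and download bigint-2010.04.30.zip. Unzip it and put all of the .hh files and all of the C++ source files in your ~\Program Files\Microsoft Visual Studio 11.0\VC\include directory, excluding the Sample and C++ Testsuite source files. Then, in Visual Studio, create an empty project. In the Solution Explorer, right-click the Resource Files folder and select Add...Existing Item. Add all of the C++ source files from the directory just mentioned. Then, also in the Solution Explorer, right-click the Source Files folder and add a new item, select C++ file, name it, and paste the source code below into it.

Not to flatter overly much, but there are folks here on Stack Overflow who know a great deal more about C++ than I do, and they are welcome to improve the code below. But even if nobody does, the code is functional as-is, and it should help illustrate the principles involved in programmatically finding the prime factors of medium-sized numbers. It will not threaten the general number field sieve, but it can factor numbers with 12 to 14 digit prime factors in a reasonably short time, even on an old Core2 Duo computer like the one I am using.

The code follows. Good luck.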

#include <string>
#include <stdio.h>
#include <iostream>
#include "BigIntegerLibrary.hh"

typedef BigInteger BI;
typedef BigUnsigned BU;

using std::string;
using std::cin;
using std::cout;

BU pollard(BU numberToFactor);
BU gcda(BU differenceBetweenCongruentFunctions, BU numberToFactor);
BU f(BU x, BU numberToFactor, int increment);
void initializeArrays();
BU getNumberToFactor ();
void factorComposites();
bool testForComposite (BU num);

BU primeFactors[1000];
BU compositeFactors[1000];
BU tempFactors [1000];
int primeIndex;
int compositeIndex;
int tempIndex;
int numberOfCompositeFactors;
bool allJTestsShowComposite;

int main ()
{
    while(1)
    {
        primeIndex=0;
        compositeIndex=0;
        tempIndex=0;
        initializeArrays();
        compositeFactors[0] = getNumberToFactor();
        cout<<"\n\n";
        if (compositeFactors[0] == 0) return 0;
        numberOfCompositeFactors = 1;
        factorComposites();
    }
}

void initializeArrays()
{
    for (int i = 0; i<1000;i++)
    {
        primeFactors[i] = 0;
        compositeFactors[i]=0;
        tempFactors[i]=0;
    }
}

BU getNumberToFactor ()
{
    std::string s;
    std::cout<<"Enter the number for which you want a prime factor, or 0 to quit: ";
    std::cin>>s;
    return stringToBigUnsigned(s);
}

void factorComposites()
{
    while (numberOfCompositeFactors!=0)
    {
        compositeIndex = 0;
        tempIndex = 0;

        // This while loop finds non-zero values in compositeFactors.
        // If they are composite, it factors them and puts one factor in tempFactors,
        // then divides the element in compositeFactors by the same amount.
        // If the element is prime, it moves it into tempFactors (zeros the element in compositeFactors)
        while (compositeIndex < 1000)
        {
            if(compositeFactors[compositeIndex] == 0)
            {
                compositeIndex++;
                continue;
            }
            if(testForComposite(compositeFactors[compositeIndex]) == false)
            {
                tempFactors[tempIndex] = compositeFactors[compositeIndex];
                compositeFactors[compositeIndex] = 0;
                tempIndex++;
                compositeIndex++;
            }
            else
            {
                tempFactors[tempIndex] = pollard (compositeFactors[compositeIndex]);
                compositeFactors[compositeIndex] /= tempFactors[tempIndex];
                tempIndex++;
                compositeIndex++;
            }
        }
        compositeIndex = 0;

        // This while loop moves all remaining non-zero values from compositeFactors into tempFactors
        // When it is done, compositeFactors should be all 0 value elements
        while (compositeIndex < 1000)
        {
            if (compositeFactors[compositeIndex] != 0)
            {
                tempFactors[tempIndex] = compositeFactors[compositeIndex];
                compositeFactors[compositeIndex] = 0;
                tempIndex++;
                compositeIndex++;
            }
            else compositeIndex++;
        }
        compositeIndex = 0;
        tempIndex = 0;

        // This while loop checks all non-zero elements in tempFactors.
        // Those that are prime are shown on screen and moved to primeFactors
        // Those that are composite are moved to compositeFactors
        // When this is done, all elements in tempFactors should be 0
        while (tempIndex<1000)
        {
            if(tempFactors[tempIndex] == 0)
            {
                tempIndex++;
                continue;
            }
            if(testForComposite(tempFactors[tempIndex]) == false)
            {
                primeFactors[primeIndex] = tempFactors[tempIndex];
                cout<<primeFactors[primeIndex]<<"\n";
                tempFactors[tempIndex]=0;
                primeIndex++;
                tempIndex++;
            }
            else
            {
                compositeFactors[compositeIndex] = tempFactors[tempIndex];
                tempFactors[tempIndex]=0;
                compositeIndex++;
                tempIndex++;
            }
        }
        compositeIndex=0;
        numberOfCompositeFactors=0;

        // This while loop just checks to be sure there are still one or more composite factors.
        // As long as there are, the outer while loop will repeat
        while(compositeIndex<1000)
        {
            if(compositeFactors[compositeIndex]!=0) numberOfCompositeFactors++;
            compositeIndex ++;
        }
    }
    return;
}

// The following method uses the Miller-Rabin primality test to prove with 100% confidence a given number is composite,
// or to establish with a high level of confidence -- but not 100% -- that it is prime

bool testForComposite (BU num)
{
    BU confidenceFactor = 101;
    if (confidenceFactor >= num) confidenceFactor = num-1;
    BU a,d,s, nMinusOne;
    nMinusOne=num-1;
    d=nMinusOne;
    s=0;
    while(modexp(d,1,2)==0)
    {
        d /= 2;
        s++;
    }
    for (BI i = 2 ; i<=confidenceFactor;i++)
    {
        allJTestsShowComposite = true; // assume composite for this value of i until we can prove otherwise
        if (modexp(i,d,num) == 1) 
            continue;  // if this modulus is 1, then we cannot prove that num is composite with this value of i, so continue
        if (modexp(i,d,num) == nMinusOne)
        {
            allJTestsShowComposite = false;
            continue;
        }
        BU exponent(1);
        for (BU j(0); j.toInt()<s.toInt()-1;j++) // exponents 2*d, 4*d, ..., 2^(s-1)*d
        {
            exponent *= 2;
            if (modexp(i,exponent*d,num) == nMinusOne)
            {
                // one of the squarings gave num-1, so this i cannot prove num composite: stop and move on to the next i.
                allJTestsShowComposite = false;
                break;
            }
        }
        if (allJTestsShowComposite == true) return true; // proven composite with 100% certainty, no need to continue testing
    }
    }
    return false;
    /* not proven composite in any test, so assume prime with a possibility of error = 
    (1/4)^(number of different values of i tested).  The number of values tested is one less than the value of the
    confidenceFactor variable, and the "witnesses" to the primality of the number being tested will be all integers from
    2 through the value of confidenceFactor.

    Note that this makes this primality test cryptographically less secure than it could be.  It is theoretically possible,
    if difficult, for a malicious party to pass a known composite number for which all of the lowest n integers fail to
    detect that it is composite.  A safer way is to generate random integers in the outer "for" loop and use those in place of
    the variable i.  Better still if those random numbers are checked to ensure no duplicates are generated.
    */
}

BU pollard(BU n)
{
    if (n == 4) return 2;
    BU x = 2;
    BU y = 2;
    BU d = 1;
    int increment = 1;

    while(d==1||d==n||d==0)
    {
        x = f(x,n, increment);
        y = f(y,n, increment);
        y = f(y,n, increment);
        if (y>x)
        {
            d = gcda(y-x, n);
        }
        else
        {
            d = gcda(x-y, n);
        }
        if (d==0) 
        {
            x = 2;
            y = 2;
            d = 1;
            increment++; // This changes the pseudorandom function we use to increment x and y
        }
    }
    return d;
}


BU gcda(BU a, BU b)
{
    if (a==b||a==0)
        return 0;   // If x==y or if the absolute value of (x-y) == the number to be factored, then we have failed to find
                    // a factor.  I think this is not proof of primality, so the process could be repeated with a new function.
                    // For example, by replacing x*x+1 with x*x+2, and so on.  If many such functions fail, primality is likely.

    BU currentGCD = 1;
    while (currentGCD!=0) // This while loop is based on Euclid's algorithm
    {
        currentGCD = b % a;
        b=a;
        a=currentGCD;
    }
    return b;
}

BU f(BU x, BU n, int increment)
{
    return (x * x + increment) % n;
}
