java - アトキンのふるい-説明とJavaの例

Question

ウィキペディアでアトキンのふるいについて読んだことがありますが、現時点ではウィキは限られています。アトキンのふるいの概要とJavaの例を探していました。

ありがとう。

score 133 · Accepted Answer

素数、合成数、ふるいについてここで示した基本的な考え方のいくつかを知っているかもしれません（そしておそらく知っているかもしれません）が、アルゴリズムの性質を理解する上で他の読者に役立つかもしれません。この答えのいくつかは、StackOverflowと同等の数学に属することに危険なほど近づきますが、アルゴリズムの機能とその方法を関連付ける必要があると感じています。

このふるいに関するウィキペディアの記事の3つのモジュラー算術および2次対は、このふるいのコアを定理（およびそれらの証明）とともに公開したAtkinおよびBernsteinの論文の3つの対から派生しており、それらが集合的に素数のふるいを形成することを示しています。誰もが素数だけを生成しますが、すべての素数を生成するわけではありません。すべての素数を生成するには、3つすべてが必要です。

これは、ウィキペディアのアルゴリズムを実装するJavaプログラムです。ウィキペディアの記事アルゴリズムのJavaで直接実装されているだけで、実装の効率性については主張しません。

// SieveOfAtkin.java

import java.util.Arrays;

public class SieveOfAtkin {
private static int limit = 1000;
private static boolean[] sieve = new boolean[limit + 1];
private static int limitSqrt = (int)Math.sqrt((double)limit);

public static void main(String[] args) {
    // there may be more efficient data structure
    // arrangements than this (there are!) but
    // this is the algorithm in Wikipedia
    // initialize results array
    Arrays.fill(sieve, false);
    // the sieve works only for integers > 3, so 
    // set these trivially to their proper values
    sieve[0] = false;
    sieve[1] = false;
    sieve[2] = true;
    sieve[3] = true;

    // loop through all possible integer values for x and y
    // up to the square root of the max prime for the sieve
    // we don't need any larger values for x or y since the
    // max value for x or y will be the square root of n
    // in the quadratics
    // the theorem showed that the quadratics will produce all
    // primes that also satisfy their wheel factorizations, so
    // we can produce the value of n from the quadratic first
    // and then filter n through the wheel quadratic 
    // there may be more efficient ways to do this, but this
    // is the design in the Wikipedia article
    // loop through all integers for x and y for calculating
    // the quadratics
    for (int x = 1; x <= limitSqrt; x++) {
        for (int y = 1; y <= limitSqrt; y++) {
            // first quadratic using m = 12 and r in R1 = {r : 1, 5}
            int n = (4 * x * x) + (y * y);
            if (n <= limit && (n % 12 == 1 || n % 12 == 5)) {
                sieve[n] = !sieve[n];
            }
            // second quadratic using m = 12 and r in R2 = {r : 7}
            n = (3 * x * x) + (y * y);
            if (n <= limit && (n % 12 == 7)) {
                sieve[n] = !sieve[n];
            }
            // third quadratic using m = 12 and r in R3 = {r : 11}
            n = (3 * x * x) - (y * y);
            if (x > y && n <= limit && (n % 12 == 11)) {
                sieve[n] = !sieve[n];
            } // end if
            // note that R1 union R2 union R3 is the set R
            // R = {r : 1, 5, 7, 11}
            // which is all values 0 < r < 12 where r is 
            // a relative prime of 12
            // Thus all primes become candidates
        } // end for
    } // end for
    // remove all perfect squares since the quadratic
    // wheel factorization filter removes only some of them
    for (int n = 5; n <= limitSqrt; n++) {
        if (sieve[n]) {
            int x = n * n;
            for (int i = x; i <= limit; i += x) {
                sieve[i] = false;
            } // end for
        } // end if
    } // end for
    // put the results to the System.out device
    // in 10x10 blocks
    for (int i = 0, j = 0; i <= limit; i++) {
        if (sieve[i]) {
            System.out.printf("%,8d", i);
            if (++j % 10 == 0) {
                System.out.println();
            } // end if
            if (j % 100 == 0) {
                System.out.println();
            } // end if
        } // end if
    } // end for
} // end main
} // end class SieveOfAtkin

私は、ふるいを構成する定理を説明しているAtkin（Bernsteinとの共著）の元の論文のコピーを持っています。その論文はここで入手できます：http：//www.ams.org/mcom/2004-73-246/S0025-5718-03-01501-1/S0025-5718-03-01501-1.pdf。それは非数学者のための密な読書であり、アメリカ数学会の論文に典型的なすべての簡潔さを持っています。

ここに続くのは、アルゴリズムがAtkinとBernsteinの説明と論文からどのように導き出されるかについてのより深い説明です。

AtkinとBernsteinは（当然のことながら）、読者が素数ふるい、モジュラー演算、およびモジュラー演算を使用したホイール因数分解を完全に理解していることを前提としています。残念ながら、ウィキペディアの記事の説明とアルゴリズムは、程度は少し劣りますが、同様のことを前提としています。AtkinとBernsteinは、ホイール因数分解と既約二次方程式の3つのペアが使用できる唯一のものであると主張せず、使用できる他のペアの例を示し、方法についてこれ以上のコメントはありません。したがって、AtkinとBernsteinが定理と証明を与える3つは、彼らの仕事に基づくアルゴリズムで使用される3つです。AtkinとBernsteinも、3つのペアが最適なペアであるとは主張していません。それらは、明らかに便利なものです。

この種の議論に簡潔に役立つファンキーな数学記号は、ここでは利用できません。この回答の目的のために、私は使用します

{1つを定義するいくつかの列挙されたセットまたはプロパティ}

セットを表す

Nat0

ゼロを含む自然数のセットを表す、つまり、Nat0 = {0、1、2、...}、

ナット

ゼロを含まない自然数のセットを表すため、つまり、Nat = {1、2、3、...}と、セットを定義するための次の構成と、その要素のシンボル：

{セットの要素のシンボル：シンボルの観点からセットを定義する基準}

＃

セットのカーディナリティ、つまりセット内の要素の数を表す

^

べき乗を表す、つまりxの2乗をx^2と書く

説明、定理、およびアルゴリズムのホイール因数分解で使用されるモジュラー演算は、2つの同等の形式で表示されます。

n =（k * m）+ r for k in Nat0 and r in R = {r：r in Nat0 and r <m}

n mod m = rここで、r in R = {r：r in Nat0 and r <m}

以下は、彼らの論文の定理で与えられた定義と、モジュラー算術形式に関するいくつかの注記です。

nは常に素数です。ここで＃{（x、y）：n =（4 * x ^ 2）+（y ^ 2）、n in {n：（Nat0 * 4）+ 1}、ここでxおよびy> = 1 nには完全な平方係数がありません}は奇数です。つまり、2次n =（4 * x ^ 2）+（y ^ 2）を解く奇数の（x、y）ペアがある場合に限ります。ここで、xおよびy整数> = 1、n mod 4 = 1であり、nに完全な二次因子がない場合、nは素数です。ここで、rがRにあるn mod m=rの形式はm=4およびR={r：1}であることに注意してください。
nは常に素数です。ここで＃{（x、y）：n =（3 * x ^ 2）+（y ^ 2）、n in {n：（Nat0 * 6）+ 1}、ここでxおよびy> = 1 nには完全な平方係数がありません}は奇数です。つまり、2次n =（3 * x ^ 2）+（y ^ 2）を解く奇数の（x、y）ペアがある場合に限ります。ここで、xおよびy整数> = 1、n mod 6 = 1であり、nに完全な二次因子がない場合、nは素数です。ここで、rが集合Rにある形式n mod m = rは、m=6およびR={r：1}であることに注意してください。
nは常に素数です。ここで＃{（x、y）：n =（3 * x ^ 2）-（y ^ 2）、{n：（Nat0 * 12）+ 11}、x> y> = 1、nは完全な二乗係数はありません}は奇妙です。つまり、2次n =（3 * x ^ 2）-（y ^ 2）を解く奇数の（x、y）ペアがある場合に限り、x、y整数（x> y> = 1）、n mod 12 = 11であり、nに完全な二次因子がない場合、nは素数です。ここで、rが集合Rにある形式n mod m = rは、m=12およびR={r：11}であることに注意してください。

論文とウィキペディアの記事が読者がよく知っていると想定しているホイール因数分解の特性は、モジュラー演算を使用して、特定の素因数を持たない整数のみを選択的に選択できることです。フォームで

n mod m = r、r in R = {r：Nat0、r <m}、

mに対して互いに素であるRの要素のみを選択した場合、式を満たすすべての整数nは、素数またはmに対して互いに素になります。

mがnに対して互いに素であるということは、互いに素な除数が1より大きいことを意味します。互いに素な数の例は次のとおりです。モジュラー算術で使用される互いに素（残差）と、説明やアルゴリズムのさまざまなバージョンでそれらがどのように同等であるか。

以下では、定理、アルゴリズム、および説明がすべてどのように関連しているかを説明します。

最初の2次式の場合、n =（4 * x ^ 2）+（y ^ 2）：

定理は次の形式を使用します。

n =（k * 4）+ rここで、R1のr = {r：1}およびNat0のk

これは書くのと同じです

n mod 4 = rここで、R1のr = {r：1}

nは、Nat0で1から始まる1つおきの奇数、つまり{1、5、9、13、...}である必要があると定義されていることに注意してください。

アルゴリズムの場合、mに対してさまざまな選択を行うことができ、正しいセットRを使用すると、定理によって示されるプロパティが保持されます。論文とウィキペディアの記事の著者は、読者がすでにこれらすべてを知っており、すぐにそれを認識すると想定しています。論文とウィキペディアの記事で使用されているmの他の値については、同等のものは次のとおりです。

n mod 12 = rここで、R1aのr = {r：1、5、9}

n mod 60 = rここで、R1bのr = {r：1、5、9、13、17、21、25、29、33、37、41、45、49、53、57}

セットR1aおよびR1bの特定の要素は、後で説明する2つの理由で削除される可能性があり、定理は引き続き適用されます。

2次2次式の場合、n =（3 * x ^ 2）+（y ^ 2）：

定理は次の形式を使用します。

n =（k * 6）+ rここで、R2のr = {r：1}およびNat0のk

繰り返しますが、これはと同じです

n mod 6 = rここで、R2のr = {r：1}

ここで、これは1から始まるNat0の3分の1の奇数、つまり{1、7、13、19、...}であることに注意してください。

論文と記事の同等物は次のとおりです。

n mod 12 = rここで、R2aのr = {r：1、7}

n mod 60 = rここで、R2bのr = {r：1、7、13、19、25、31、37、43、49、55}

繰り返しますが、セットR2aとR2bの値は、後で説明する2つの理由で削除される可能性があり、定理は引き続き適用されます。

3次2次式の場合、（3 * x ^ 2）-（y ^ 2）：

定理は次の形式を使用します。

n = k * 12 + rここで、Nat0のkとR3aのr = {r：11}

繰り返しますが、これは次と同じです。

n mod 12 = rここで、R3aのr = {r：11}

ここで、これは11から始まるNat0の6番目ごとの奇数、つまり{11、23、35、47、...}であることに注意してください。

論文と記事の同等物は次のとおりです。

n mod 60 = rここで、R3bのr = {r：11、23、35、47、59}

繰り返しますが、セットR3bの値は、後で説明する理由で削除でき、定理は引き続き適用されます。

さまざまなアルゴリズムと説明では、m=12とm=60の値の場合、定理またはアルゴリズムの有効性に影響を与えることなく、集合Rの要素が削除されます。セットRの一部の値が破棄される理由は2つあります。

最初の理由は、ペアになっているmに対して互いに素ではない集合Rのrの値は、1つ以上の素因数mを持つ合成整数であるnの値を含めるためだけに機能するためです。素数になります。モジュラー演算のこの機能は、ホイール因数分解を使用して、素数であるかどうかについて、通常はより複雑で効率の低い、さらなるテストから大量の非素数を除外する理由です。このふるいでは、より複雑なテストは、特定の既約二次方程式の解の数が奇数であるかどうかです。つまり、このアルゴリズムのセットRで使用されているmの値に比較的プライムされていない、セットRのすべての値をすぐに破棄できます。

2番目の理由は、この論文では、ホイールの因数分解により、重なり合う素数を含む、重なり合う整数のセットが作成されるためです。それらは便利であり、定理にとって重複は問題ではありませんでしたが、アルゴリズムでは、簡単に回避できれば無駄です。この場合、それは自明に回避されます。また、ホイール因数分解からの整数のセットが重複する場合、1つの2次の奇数の解と別の2次の奇数の解は、累積偶数の解になります（奇数と奇数は常に偶数）。ウィキペディアの実装を含む多くの実装では、これは素数が素数ではないことを識別します。ウィキペディアのような実装は、すべての四分円の累積解から素数を決定し、しないからです。t各二次方程式から解を分離します。そのような場合、ホイール因数分解からの整数が整数の排他的なサブセットであることが不可欠です。

実装では、必要がない場合、複数の2次式で同じ数をテストする必要はありません。同じmが使用されていると仮定して、3つの2次方程式で使用される集合Rのrの値は、そのうちの1つだけに含める必要があります。複数の場合、nの同じ値が複数回表示され、複数の2次方程式でテストされます。使用中のmの値が同じである場合、Rの同じ要素が複数の2次方程式でRに表示されないようにすると、オーバーラップがなくなります。ウィキペディアの記事の場合、ホイールの因数分解によって防止されるオーバーラップは、1つの2次方程式では奇数であるが、2つの2次方程式では偶数に累積する累積2次解で発生する誤った結果を防ぎます。

別のアルゴリズムは、二次方程式を計算する前にオーバーラップを回避する場合があります。二次方程式とホイール因数分解の経験的テストでは、m = 12のホイール因数分解では、二次方程式を解くよりもnの値が大幅に少なくなります。m = 60の場合にホイール因数分解を使用すると、差が大幅に増加します。nの特定の値に対する二次解のアルゴリズムが非常に効率的である場合、二次方程式をテストするためにホイールの因数分解から得られた値のみを使用することにより、大幅な改善が得られる可能性があります。

互いに素ではない要素を削除した後のホイールの因数分解は次のとおりです。最初の2次式の場合：

n mod 12 = rここで、R1aのr = {1：1、5}（9には12と共通の約数3があります）

n mod 60 = rここで、R1bのr = {r：1、13、17、29、37、41、49、53}（5、25、および45には60と共通の除数5があります; 9、21、33、45および57は60と共通の約数3を持っています）そしてこれはAtkinとBernsteinの論文のアルゴリズムの形式です。

2次の場合：

n mod 12 = rここで、R2aのr = {1、7}（Rの要素には12と共通の約数がありません}

n mod 60 = rここで、R2bのr = {r：1、7、13、19、31、37、43、49}（25と55は60と共通の約数5を持ちます）、これはアルゴリズムの形式です。アトキンとバーンスタインの論文。

3次2次式の場合：

n mod 12 = rここで、R3aのr = {r：11}（Rのどの要素にも12と共通の約数はありません}

n mod 60 = 4ここで、R3bのr = {r：11、23、47、59}（35は60と共通の約数5を持ちます）、これはAtkinandBernsteinの論文のアルゴリズムの形式です。

同じ要素のいくつかが、第1および第2の二次方程式のセットR1aおよびR2aに表示されることに注意してください。セットR1bとR2bについても同じことが言えます。mが12の場合、共通要素のセットは{1}です。mが60の場合、共通要素のセットは{1、13、37、49}です。Rの要素が1つの二次方程式にのみ含まれるようにすると、次のフォームが作成されます。これは、ウィキペディアの記事から認識できるはずです。

最初の2次式の場合：

n mod 12 = rここで、R1aのr = {r：1、5}（重複は削除されません）（これはウィキペディアのアルゴリズムに示されている形式です）

n mod 60 = rここで、r in R1b = {r：1、13、17、29、37、41、49、53}（重複は削除されません）（これはウィキペディアの説明に示されている形式です）

2次の場合：

n mod 12 = rここで、R2aのr = {r：7}（要素1はすでにR1aにあるため削除されています）（これはウィキペディアのアルゴリズムに示されている形式です）

n mod 60 = r where r in R2b = {r : 7, 19, 31, 43} (elements 1, 13, 37 and 49 removed since they are already in R1b) (this is the form shown in the Wikipedia explanation)

For the third quadratic:

n mod 12 = r where r in R3a = {r: 11} (no duplicates)

n mod 60 = r where r in R3b = {r: 11, 23, 47, 59} (no duplicates)

One remaining question might be made as to why values of m range over 4, 6, 12 and 60. This has to do with how many composite (i.e. non-prime) numbers one wants to exclude from more complex testing for being prime using the quadratics versus the complexity of the wheel factorization used.

The value for m used can determine which composites can get immediately eliminated without eliminating primes. If m = 4 and R1 = {r : 1}, as in the theorem for the first quadratic, all numbers with prime factors of 2 get eliminated because 1 is relatively prime to all numbers and 4 has prime factors of 2. It is important to note that because 3 is not in this set R, a wheel factorization using m = 4 and set R1 will also exclude a large number of primes, perhaps half of them.

If m = 6 and R2 = {r : 1} as in the theorem for the second quadratic, all composite numbers with prime factors of 2 or 3 get eliminated because 1 is relatively prime to all numbers and 6 has prime factors of 2 and 3. Again, with m = 6 and set R2, which doesn't contain 5, a large number of primes, perhaps half of them, will get excluded.

If m = 12 and R3 = {r : 11} as in the theorem for the third quadratic, all composite mumbers with prime factors of 2 or 3 get eliminated because 11 is relatively prime to 12 and 12 has prime factors of 2 and 3. Again, with m = 12 and set R3, which doesn't contain 1, 5 or 7, a large number of primes, perhaps well more than half of them, will get excluded.

One of the things that Atkin and Bernstein informally show in their paper is that, although the wheel factors in the theorems individually exclude primes from their respective quadratics, collectively, the three wheel factorizations permit all primes and, in the case of the wheel factorizations in the first and second quadratics, permit significant overlap. Although they don't remove the overlap in their algorithms where m = 60, the Wikipedia article does where they set m = 12 in the article algorithm and m = 60 in the article explanation.

For the quadratics Atkin and Bernstein used in their theorems, weakening the wheel factorizations that go with them will invalidate that they will operate according to the theorems. However, we can strengthen them in ways that remove only more composites but still keeping the exact same primes. For the forms where m = 4, (4 = 2 * 2) every even integer gets filtered. For the forms where m = 12 (12 = 2 * 2 * 3), every integer with prime factors of 2 or 3 gets filtered. For the forms where m = 60 (60 = 2 * 2 * 3 * 5), every integer with prime factors of 2, 3 or 5 gets filtered. We could use potentially use filters with m = 6 for the same effect as m = 12 and m = 30 for the same effect as m = 60, but we would have to exercise care that what we create is equivalent to the ones used in the theorems.

Here are some useful statistics about wheel factorization. 50% of the integers in Nat0 are even and, other than 2, not prime. 33% of the integers in Nat0 have 3 as a prime factor and are not prime. 20% of the integers in Nat0 have 5 as a prime factor and are not prime. Collectively, 67 % of the integers in Nat0 have prime factors of 2 or 3 and are not prime. Collectively, about 75% of the integers in Nat0 have prime factors of 2, 3 or 5 and are not prime. A simple method for eliminating 1/2, 2/3 or 3/4 of the integers in Nat0 from more complex testing for being prime is very enticing and is the motive for using wheel factorization as a preliminary filter in prime number sieves. It is also a motivation for using values of m with an accompanying set R that can filter all composites wtih prime factors representing large quantities of composites.

As an absolute minimum, one would want to remove composites with a prime factor of 2 (i.e. all even numbers) and then add 2 back at the end. One would at least like to remove composites with prime factors of 2 or 3. Preferably, one would like to remove composites with prime factors of 2, 3 or 5. Past that, the statistics show diminishing returns. m = 4 with R1 accomplishes the bare minimum. m = 12 with R1a, R2a and R3a accomplishes the least one would like. m = 60 with R1b, R2b and R3b accomplish a very desirable result.

There are some more things to consider in working with values for m and sets R. Take note that the two forms are NOT equivalent for the first quadratic:

n mod 12 = r where r in R1a = {r : 1, 5}

and

n mod 6 = r where r in R = {r : 1, 5}

because the form where m = 6 is not equivalent to

n mod 4 = r where r in R1 = {r : 1}

Note that:

n mod 6 = r where r in R = {r : 1}

is equivalent to

n mod 12 = r where r in R = {r : 1, 7}

and the form where m = 6 could be used with the second quadratic. It is, in fact, the form used with the second quadratic in the theorem. If we were to use it instead of the examples already considered, we could remove the element 1 from the set R for first quadratic when its m = 12 to remove overlap.

In adjusting the algorithm, due diligence must be used to maintain the conditions that the theorems require. When choosing values for m and sets for R, one must ensure that the form is an equivalent one that does not introduce any new values for n to the quadratic that were not produced by the wheel factorization in the theorem.

For implementations, choices get made based on complexities and efficiencies involving the data structures, arithmetic, processor capabilities (especially with regard to multiplication and division), available processor cache, memory and the size of data structures. There are tradeoffs betweeen the values for m and the number of remainders (residues) in sets R that have to be checked. Some of these may be reasons for seeing m = 60 in the explanation and m = 12 in the algorithm. They are defninitely the reasons Atkin and Bernstein used forms with m = 60 in the algorithms in their paper.

In the Atkin and Bernstein paper, they also provide algorithms for finding solutions to the quadratics for specific values of n using lattices. These additional algorithms allowed Atkin and Bernstein to create sieve algorithms that filtered simultaneously on the quadratics and the wheel factorizations. None of the algorithms for lattice derived solutions to the quadratics with the wheel factorizations were considered in the Wikipedia article's algorithm. In the Wikipedia article, an exhaustive x, y value technique is used with the quadratics and the wheel factorizations are applied after the quadratics return values for n. Again, this is an efficiency issue that has to be decided by implementers.

java - アトキンのふるい-説明とJavaの例

1 に答える 1

Related

Reference