java - Javaでサイズnのセットからk個の要素のサブセットを繰り返し生成するにはどうすればよいですか？

Question

私はすべてのサイズkのサブセットを分析し、どれが最適かを判断することを含むパズルに取り組んでいます。サブセットの数が少ない場合に機能するソリューションを作成しましたが、より大きな問題ではメモリが不足します。現在、Pythonで記述された反復関数をJavaに変換して、作成された各サブセットを分析し、セット全体ではなく、最適化されていることを表す値のみを取得して、不足しないようにしようとしています。メモリー。これが私がこれまでに持っているものであり、非常に小さな問題でも終了しないようです。

public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
    int N = set.size();
    int maxsets = nCr(N, k);
    LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();

    int remains, thresh;
    LinkedList<Integer> newset; 
    for (int i=0; i<maxsets; i++)
    {
        remains = k;
        newset = new LinkedList<Integer>();
        for (int val=1; val<=N; val++)
        {
            if (remains==0)
                break;
            thresh = nCr(N-val, remains-1);
            if (i < thresh)
            {
                newset.add(set.get(val-1));
                remains --;
            }
            else 
            {
                i -= thresh;
            }
        }
        toRet.add(newset);
    }

    return toRet;

}

誰かが私がこの関数をデバッグするのを手伝ったり、サイズkのサブセットを繰り返し生成するための別のアルゴリズムを提案したりできますか？

編集：私はついにこの関数を機能させました。Pythonはループインデックスの処理が異なるため、iとthreshの比較を行うためにiと同じ新しい変数を作成する必要がありました。

score 28 · Accepted Answer

まず、リストにランダムアクセスを行う場合は、それを効率的にサポートするリスト実装を選択する必要があります。LinkedListのjavadocから：

すべての操作は、二重リンクリストで期待できるとおりに実行されます。リストにインデックスを付ける操作は、リストの最初または最後のどちらか、指定されたインデックスに近い方からトラバースします。

ArrayListは、スペース効率が高く、ランダムアクセスの方がはるかに高速です。実際には、事前に長さがわかっているので、単純な配列を使用することもできます。

アルゴリズムについて：簡単に始めましょう：サイズ1のすべてのサブセットをどのように生成しますか？おそらくこのように：

for (int i = 0; i < set.length; i++) {
    int[] subset = {i};
    process(subset);
}

ここで、processは、これまでに処理されたすべてのサブセットよりも「優れている」かどうかをチェックするなど、セットに対して何かを行うメソッドです。

では、サイズ2のサブセットで機能するように、これをどのように拡張しますか？サイズ2のサブセットとサイズ1のサブセットの関係は何ですか？サイズ2のサブセットは、最大の要素を削除することでサイズ1のサブセットに変えることができます。言い換えると、サイズ2の各サブセットは、サイズ1のサブセットを取得し、セット内の他のすべての要素よりも大きい新しい要素を追加することによって生成できます。コード内：

processSubset(int[] set) {
    int subset = new int[2];
    for (int i = 0; i < set.length; i++) {
        subset[0] = set[i];
        processLargerSets(set, subset, i);
    }
}

void processLargerSets(int[] set, int[] subset, int i) {
    for (int j = i + 1; j < set.length; j++) {
        subset[1] = set[j];
        process(subset);
    }
}

任意のサイズkのサブセットの場合、最大の要素を切り刻むことにより、サイズkのサブセットをサイズk-1のサブセットに変換できることに注意してください。つまり、サイズkのすべてのサブセットは、サイズk -1のすべてのサブセットを生成することによって生成でき、これらのそれぞれについて、サブセット内の最大値よりも大きい各値について、その値をセットに追加します。コード内：

static void processSubsets(int[] set, int k) {
    int[] subset = new int[k];
    processLargerSubsets(set, subset, 0, 0);
}

static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
    if (subsetSize == subset.length) {
        process(subset);
    } else {
        for (int j = nextIndex; j < set.length; j++) {
            subset[subsetSize] = set[j];
            processLargerSubsets(set, subset, subsetSize + 1, j + 1);
        }
    }
}

テストコード：

static void process(int[] subset) {
    System.out.println(Arrays.toString(subset));
}


public static void main(String[] args) throws Exception {
    int[] set = {1,2,3,4,5};
    processSubsets(set, 3);
}

ただし、巨大なセットでこれを呼び出す前に、サブセットの数がかなり急速に増える可能性があることを覚えておいてください。

score 11 · Accepted Answer

org.apache.commons.math3.util.Combinationsを使用できます。

例：

import java.util.Arrays;
import java.util.Iterator;

import org.apache.commons.math3.util.Combinations;

public class tmp {
    public static void main(String[] args) {
        for (Iterator<int[]> iter = new Combinations(5, 3).iterator(); iter.hasNext();) {
            System.out.println(Arrays.toString(iter.next()));
        }
    }

}

出力：[0、1、2] [0、1、3] [0、2、3] [1、2、3] [0、1、4] [0、2、4] [1、2、4 ] [0、3、4] [1、3、4] [2、3、4]

score 3 · Accepted Answer

これが私が書いた組み合わせイテレータです

package psychicpoker;

import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

import static com.google.common.base.Preconditions.checkArgument;

public class CombinationIterator<T> implements Iterator<Collection<T>> {

private int[] indices;
private List<T> elements;
private boolean hasNext = true;

public CombinationIterator(List<T> elements, int k) throws IllegalArgumentException {
    checkArgument(k<=elements.size(), "Impossible to select %d elements from hand of size %d", k, elements.size());
    this.indices = new int[k];
    for(int i=0; i<k; i++)
        indices[i] = k-1-i;
    this.elements = elements;
}

public boolean hasNext() {
    return hasNext;
}

private int inc(int[] indices, int maxIndex, int depth) throws IllegalStateException {
    if(depth == indices.length) {
        throw new IllegalStateException("The End");
    }
    if(indices[depth] < maxIndex) {
        indices[depth] = indices[depth]+1;
    } else {
        indices[depth] = inc(indices, maxIndex-1, depth+1)+1;
    }
    return indices[depth];
}

private boolean inc() {
    try {
        inc(indices, elements.size() - 1, 0);
        return true;
    } catch (IllegalStateException e) {
        return false;
    }
}

public Collection<T> next() {
    Collection<T> result = new ArrayList<T>(indices.length);
    for(int i=indices.length-1; i>=0; i--) {
        result.add(elements.get(indices[i]));
    }
    hasNext = inc();
    return result;
}

public void remove() {
    throw new UnsupportedOperationException();
}

}

score 3 · Accepted Answer

私は今日、nサイズのセットのすべてのkサイズのサブセットを生成するという同じ問題を抱えていました。

Haskellで書かれた再帰的アルゴリズムがありましたが、問題はJavaで新しいバージョンを書く必要がありました。
Javaでは、再帰を最適化するためにメモ化を使用する必要があると思いました。結局、私はそれを繰り返し行う方法を見つけました。私は、ウィキペディアの組み合わせに関する記事のこの画像に触発されました。

すべてのkサイズのサブセットを計算する方法：

public static int[][] combinations(int k, int[] set) {
    // binomial(N, K)
    int c = (int) binomial(set.length, k);
    // where all sets are stored
    int[][] res = new int[c][Math.max(0, k)];
    // the k indexes (from set) where the red squares are
    // see image above
    int[] ind = k < 0 ? null : new int[k];
    // initialize red squares
    for (int i = 0; i < k; ++i) { ind[i] = i; }
    // for every combination
    for (int i = 0; i < c; ++i) {
        // get its elements (red square indexes)
        for (int j = 0; j < k; ++j) {
            res[i][j] = set[ind[j]];
        }
        // update red squares, starting by the last
        int x = ind.length - 1;
        boolean loop;
        do {
            loop = false;
            // move to next
            ind[x] = ind[x] + 1;
            // if crossing boundaries, move previous
            if (ind[x] > set.length - (k - x)) {
                --x;
                loop = x >= 0;
            } else {
                // update every following square
                for (int x1 = x + 1; x1 < ind.length; ++x1) {
                    ind[x1] = ind[x1 - 1] + 1;
                }
            }
        } while (loop);
    }
    return res;
}

二項分布の方法:
(Pythonの例、ウィキペディアから採用）

private static long binomial(int n, int k) {
    if (k < 0 || k > n) return 0;
    if (k > n - k) {    // take advantage of symmetry
        k = n - k;
    }
    long c = 1;
    for (int i = 1; i < k+1; ++i) {
        c = c * (n - (k - i));
        c = c / i;
    }
    return c;
}

もちろん、組み合わせは爆発する可能性があるため、常にスペースの問題があります。
私自身の問題の文脈では、可能な最大値は約2,000,000サブセットです。私のマシンはこれを1032ミリ秒で計算しました。

score 2 · Accepted Answer

afsantosの答えに触発されました：-）...フルセットから特定のサイズのすべてのサブセットの組み合わせを生成するC＃.NET実装を作成することにしました。可能なサブセットの総数を計算する必要はありません。終わりに達したときに検出します。ここにあります：

public static List<object[]> generateAllSubsetCombinations(object[] fullSet, ulong subsetSize) {
    if (fullSet == null) {
        throw new ArgumentException("Value cannot be null.", "fullSet");
    }
    else if (subsetSize < 1) {
        throw new ArgumentException("Subset size must be 1 or greater.", "subsetSize");
    }
    else if ((ulong)fullSet.LongLength < subsetSize) {
        throw new ArgumentException("Subset size cannot be greater than the total number of entries in the full set.", "subsetSize");
    }

    // All possible subsets will be stored here
    List<object[]> allSubsets = new List<object[]>();

    // Initialize current pick; will always be the leftmost consecutive x where x is subset size
    ulong[] currentPick = new ulong[subsetSize];
    for (ulong i = 0; i < subsetSize; i++) {
        currentPick[i] = i;
    }

    while (true) {
        // Add this subset's values to list of all subsets based on current pick
        object[] subset = new object[subsetSize];
        for (ulong i = 0; i < subsetSize; i++) {
            subset[i] = fullSet[currentPick[i]];
        }
        allSubsets.Add(subset);

        if (currentPick[0] + subsetSize >= (ulong)fullSet.LongLength) {
            // Last pick must have been the final 3; end of subset generation
            break;
        }

        // Update current pick for next subset
        ulong shiftAfter = (ulong)currentPick.LongLength - 1;
        bool loop;
        do {
            loop = false;

            // Move current picker right
            currentPick[shiftAfter]++;

            // If we've gotten to the end of the full set, move left one picker
            if (currentPick[shiftAfter] > (ulong)fullSet.LongLength - (subsetSize - shiftAfter)) {
                if (shiftAfter > 0) {
                    shiftAfter--;
                    loop = true;
                }
            }
            else {
                // Update pickers to be consecutive
                for (ulong i = shiftAfter+1; i < (ulong)currentPick.LongLength; i++) {
                    currentPick[i] = currentPick[i-1] + 1;
                }
            }
        } while (loop);
    }

    return allSubsets;
}

score 2 · Accepted Answer

この解決策は私のために働いた：

 private static void findSubsets(int array[])
    {
      int numOfSubsets = 1 << array.length; 

      for(int i = 0; i < numOfSubsets; i++)
     {
        int pos = array.length - 1;
       int bitmask = i;

       System.out.print("{");
       while(bitmask > 0)
       {
        if((bitmask & 1) == 1)
         System.out.print(array[pos]+",");
        bitmask >>= 1;
        pos--;
       }
       System.out.print("}");
     }
    }

score 0 · Accepted Answer

Swiftの実装：

以下は、afsantosによって提供された回答の2つのバリエーションです。

関数の最初の実装はcombinations、元のJava実装の機能を反映しています。

k2番目の実装は、セットから値のすべての組み合わせを見つけるための一般的なケースです[0, setSize)。これが本当に必要なすべてである場合、この実装はもう少し効率的です。

さらに、いくつかのマイナーな最適化とsmidginロジックの簡略化が含まれています。

/// Calculate the binomial for a set with a subset size
func binomial(setSize: Int, subsetSize: Int) -> Int
{
    if (subsetSize <= 0 || subsetSize > setSize) { return 0 }

    // Take advantage of symmetry
    var subsetSizeDelta = subsetSize
    if (subsetSizeDelta > setSize - subsetSizeDelta)
    {
        subsetSizeDelta = setSize - subsetSizeDelta
    }

    // Early-out
    if subsetSizeDelta == 0 { return 1 }

    var c = 1
    for i in 1...subsetSizeDelta
    {
        c = c * (setSize - (subsetSizeDelta - i))
        c = c / i
    }
    return c
}

/// Calculates all possible combinations of subsets of `subsetSize` values within `set`
func combinations(subsetSize: Int, set: [Int]) -> [[Int]]?
{
    // Validate inputs
    if subsetSize <= 0 || subsetSize > set.count { return nil }

    // Use a binomial to calculate total possible combinations
    let comboCount = binomial(setSize: set.count, subsetSize: subsetSize)
    if comboCount == 0 { return nil }

    // Our set of combinations
    var combos = [[Int]]()
    combos.reserveCapacity(comboCount)

    // Initialize the combination to the first group of set indices
    var subsetIndices = [Int](0..<subsetSize)

    // For every combination
    for _ in 0..<comboCount
    {
        // Add the new combination
        var comboArr = [Int]()
        comboArr.reserveCapacity(subsetSize)
        for j in subsetIndices { comboArr.append(set[j]) }
        combos.append(comboArr)

        // Update combination, starting with the last
        var x = subsetSize - 1
        while true
        {
            // Move to next
            subsetIndices[x] = subsetIndices[x] + 1

            // If crossing boundaries, move previous
            if (subsetIndices[x] > set.count - (subsetSize - x))
            {
                x -= 1
                if x >= 0 { continue }
            }
            else
            {
                for x1 in x+1..<subsetSize
                {
                    subsetIndices[x1] = subsetIndices[x1 - 1] + 1
                }
            }
            break
        }
    }
    return combos
}


/// Calculates all possible combinations of subsets of `subsetSize` values within a set
/// of zero-based values for the set [0, `setSize`)
func combinations(subsetSize: Int, setSize: Int) -> [[Int]]?
{
    // Validate inputs
    if subsetSize <= 0 || subsetSize > setSize { return nil }

    // Use a binomial to calculate total possible combinations
    let comboCount = binomial(setSize: setSize, subsetSize: subsetSize)
    if comboCount == 0 { return nil }

    // Our set of combinations
    var combos = [[Int]]()
    combos.reserveCapacity(comboCount)

    // Initialize the combination to the first group of elements
    var subsetValues = [Int](0..<subsetSize)

    // For every combination
    for _ in 0..<comboCount
    {
        // Add the new combination
        combos.append([Int](subsetValues))

        // Update combination, starting with the last
        var x = subsetSize - 1
        while true
        {
            // Move to next
            subsetValues[x] = subsetValues[x] + 1

            // If crossing boundaries, move previous
            if (subsetValues[x] > setSize - (subsetSize - x))
            {
                x -= 1
                if x >= 0 { continue }
            }
            else
            {
                for x1 in x+1..<subsetSize
                {
                    subsetValues[x1] = subsetValues[x1 - 1] + 1
                }
            }
            break
        }
    }
    return combos
}

java - Javaでサイズnのセットからk個の要素のサブセットを繰り返し生成するにはどうすればよいですか？

7 に答える 7

Related

Reference