java - 積がすべての必要な順列を含む生成セットを決定するための効率的なアルゴリズムは何ですか?

Question

次の形式の順列 (順序に関連する組み合わせ) のリストを考えてみましょう。

(1 2 3)
(1 2 4)
(5 2 3)
(5 2 4)

この順列グループの生成セットの最小数を見つける必要があります。たとえば、上記の順列を考えると、

(1 2 3,4)
(5 2 3,4)

最適解ではありません。最適解は (1,5 2 3,4) です。

この解にはセット A={1, 5} B={2} および C={3,4} が含まれていることがわかります。順列の元のリストは、これらのセットの順序付けられたデカルト積です: AXBX C.

順列のリストを、セット A、B、および C として表される可能な限り少ないグループに分割したいと思います。その積には、リスト内のすべての順列が含まれます。順列のリストを生成セットの単一のリストに減らすことが常に可能であるとは限らないため、最終的な答えは通常、セットのリストのリストですが、常にではありません。つまり、セット A、B、および C の積がリスト内の順列の一部を説明するのが通常のケースですが、セット D、E、および F はリスト内の他の順列を説明する必要があります。 .

この問題を解決するための私の大まかな試みは、リスト内の 2 つの順列スロットのいずれかに一致するかどうかを確認し、それらを再帰的にマージすることでした。もしそうなら、私はそれらの2つの組み合わせをマージしました。

(1 2 3)
(1 2 4)

生産

(1 2 3,4)

残念ながら、このような組み合わせのマージ順序は関連付けられていません。理想的な解決策は、セットの最終的なリストが可能な限り多くの順列を含むように、これらの組み合わせをマージすることです。

結合性の問題を示すために、次の例を取り上げます。これは、生成セットのリストを 2 つ未満に減らすことはできません。

(1 2 4) 
(1 2 3)
(1 2 5) 
(5 2 3) 
(5 2 4)

これらを次のルールに従って再帰的にマージするとします。次に、3 番目の列で 2 つのセットをマージして、新しいセットを作成します。この 2 つの行を結合した後、元の 2 つの行を破棄するので、それらが再結合されたり、二重にカウントされたりすることはありません。」これらのマージの順序は重要です。上記の順列のリストが与えられた場合、(1 2 3) と (1 2 4) をマージすると、(1 2 3,4) が得られます。では、次のマージを実行して生成セットを最適化するにはどうすればよいでしょうか? (1 2 5) を見て、2 つの列で (1 2 3,4) と一致することがわかったとします。マージを実行して (1 2 3,4,5) を取得します。すべて順調に見えます。ただし、(5 2 3) と (5 2 4) をマージすると、(5 2 3,4) になります。(5 2 3,4) と (1 2 3,4,5) を比較します。2 列の一致がないので、マージを停止します。

ここで、別の順序でマージしたとします。(1 2 3) と (1 2 4) をマージして (1 2 3,4) を生成します。次に、(5 2 3) と (5 2 4) をマージして (5 2 3,4) を生成します。これら 2 つの製品が一致していることがわかります。次に、(1 2 3,4) と (5 2 3,4) をマージして (1,5 2 3,4) を生成します。生成セットの最終的なリストは (1,5 2 3,4) と (1 2 5) です。したがって、マージ順序が 2 つの異なる回答を生成したことがわかります: (1,5 2 3,4) と (1 2 5) 対 (5 2 3,4) と (1 2 3,4,5)。

この場合、同じ数 (2) の生成セットのリストが各回答で発生するため、おそらくどちらかの回答に落ち着くでしょう。ただし、(1,5 2 3,4) と (1 2 5) の方がやや好ましいです。これは、(1,5 2 3,4) が可能な最大数の組み合わせを含むためです。ただし、900 の組み合わせのリストがあるとします。問題に対するボトムアップソリューションのマージ順序により、最適化がセットのリストのリストで可能な最小数である最適化されていない方法で問題を減らすことになります。可能性のあるすべてのマージパスを事前に調べずに、マージ順序が何であるかを知ることは困難です。これは、私も試した一致を見つける力ずくの方法よりも効率的ではありません。

力ずくの方法も試してみました。ブルートフォース方式の効率が受け入れられないのはなぜですか? このメソッドは、最初にすべての列で一意の整数のリストを見つけます。次に、これらの整数の可能なすべての組み合わせの累乗セットを生成します。列/セット A、列/セット B、および列/セット C についても同様です。次に、これらのセットをサイズ (最大から最小) で並べ替え、他の列の他のすべてのセットの各セットのデカルト積を計算します。これらのデカルト積をループします。これらのデカルト積は、生成セットによってキー付けされ、デカルト積が入力からの順列のリストと一致するかどうかを確認します。これはおおよそ O(n^3) で、n は入力リスト内の組み合わせの数です。これがO(n^2)でもできれば、今より勝てると思います。

メモリに関する考慮事項に関する限り。各スロットのドメインが 1 ～ 12 の整数であると仮定しましょう。3 つのスロットにわたる異なる組み合わせの数は 12!/3! です。7,900 万を超える生の組み合わせを見ています。それは、Google の Guava Collection API によってセットに分割される前のことです (ところで、これを強くお勧めします)。どうにかしてセットを遅延生成することもできます。Google がそうしているように感じますが、メモリの制約は依然として大きいです。

この問題を解決するアルゴリズムを本当に探しています。ただし、最小限の負担でこれを処理する Java ライブラリまたはメソッドがあれば、それも歓迎します。ありがとう。

score 0 · Accepted Answer

皆さんの洞察と回答に感謝します。この回答は、実行可能なソリューションを次のように提示したJohn Kolen ( http://www.johnfkolen.com ) の功績によるものです。

トリプルの最大カバレッジのための貪欲なアルゴリズム

入力: A、B、および C が整数のセットである、トリプルのサブセット A x B x C を持つセット。

出力: n 個のトリプル (A_i、B_i、C_i) のセット。A の A_i、B の B_i、C の C_i、および Union_i A_i x B_i x C_i = X および Intersection_i A_i x B_i x C_i = 空。

アルゴリズム

アルゴリズムは 3 つの部分に分けることができます: ペアカバーの検索、トリプルカバーの検索、最後にトータルカバーの作成です。

B x C のサブセットからペアのカバーを見つけるには、B の要素から C のセットへのマップを作成する必要があります。基本的に、このデータ構造は小さなプレフィックスツリーまたはトライであり、プレフィックスのセットの最大カバーを簡単に見つけることができます。たとえば、b_1->C_1 および b_2->C_2 の場合、b_1 と b_2 を含む最大カバーは <[b_1,b_2],C_1 交差点 C_2> になります。

最大のペアを見つけたら、最大のトリプルを構築することができます。ただし、今回は、プレフィックス (A) が一連のペア (BxC) にマップされます。前の方法を使用すると、A のすべてのサブセットと、接尾辞に対するそれらの一致するペアカバレッジを調べることができます。

貪欲なアプローチは、局所的に最適なカバーを見つけて、それをソリューションセットに追加します。現在のカバーによってキャプチャされたトリプルは考慮から除外され、ローカルで最適なカバーがシングルトンになるまでプロセスが続行されます。残りのトリプルがソリューションに追加されます。

関連するコードが添付されています。

import java.io.IOException;
import java.io.FileReader;
import java.io.BufferedReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;
import java.util.Arrays;

class Triples {
    private ArrayList<int[]> _data;
    private HashMap<Integer,HashSet<Integer>> _tree;

    /**
     * Instantiates a container of triples read from the provided file.
     * <p>
     * File should contain lines of the form '[#,#,#]', where # is an
     * integer. Spaces are ignored.
     *
     * @param filename a path to a file containing triples
     */
    public Triples(String filename) {
    _data = new ArrayList<int[]>();
    _tree = new HashMap<Integer, HashSet<Integer>>();
    readfile(filename);
    }

    /**
     * Returns the number of triples.
     *
     * @return the number of triples.
     */
    int size() {
    return _data.size();
    }

    /**
     * Returns a set of encoded pairs (b,c) where (first, b, c) is in the
     * set of triples. The pairs are encoded as integers using b * 255 + c.
     *
     * @param  first  the first value of a triple
     * @return        a set of pair encondig
     */
    HashSet<Integer> tree(int first) {
    return _tree.get(first);
    }

    public class PairCover 
    {
    public ArrayList<Integer> a;
    public ArrayList<Integer> b;
    PairCover() 
    {
        a = b = null;
    }
    PairCover(ArrayList<Integer> ax, ArrayList<Integer> bx) 
    {
        a = ax;
        b = bx;
    }
    /**
     * Returns the number of pairs covered by this cover.
     *
     * @return the number of pairs covered by this cover.
     */
    public int score()
    {
        if (a != null && b != null)
        return a.size() * b.size();
        else
        return 0;
    }
    public String toString() {
        return "<"+a+","+b+">";
    }
    }

    /**
     * Compute the maximal pair covering (B,C) for a set of suffix pairs.
     *
     * @return pair covering where |BxC| is maximized.
     */
    public PairCover maxPairCover(HashSet<Integer> set) {
    // unpack the pairs
    HashMap<Integer,NSet> pairs = new HashMap<Integer,NSet>();
    for(Integer value : set) {
        Integer a = value / 256;
        Integer b = value & 255;
        if (!pairs.containsKey(a))
        pairs.put(a, new NSet());
        pairs.get(a).add(b);
    }

    _mpc_best = new PairCover();
    Integer[] remain_ary = pairs.keySet().toArray(new Integer[0]);

    // recursively examine all subsets pair first values and find matching
    // second value groups.
    maxPairCoverRec(pairs,
            new ArrayList<Integer>(),
            new ArrayList<Integer>(Arrays.asList(remain_ary)),
            null);

    return _mpc_best;
    }

    /**
     * Recursively compute the maximal pair covering (B,C) for a set of 
     * suffix pairs from a set of active prefixes by combining all possible
     * subsets of the remaining prefixes.
     * <p>
     * Stores the best pair cover in _mpc_best. This "global" variable makes it
     * easy to prune search.
     *
     * @param pairs tree-compressed set of pairs
     * @param active currently active pair prefixes
     * @param remain set of pair prefixes to explore
     * @param set the pair suffixes shared by all the active prefixes
     * @return pair covering where |BxC| is maximized.
     */
    void maxPairCoverRec(HashMap<Integer,NSet> pairs,
                 ArrayList<Integer> active,
                 ArrayList<Integer> remain,
                 NSet set) 
    {
    if (set != null) {
        // Prune search if suffix set is empty or that there is no way
        // to generate a better pair cover than the current cover.
        if (set.isEmpty() ||
        (active.size() + remain.size()) * set.size()
        <= _mpc_best.score())
        return ;
    }
    if (remain.isEmpty()) {
        if (active.size() > 0) {
        // Save the current pair cover if it's better than the current
        // cover.
        if (_mpc_best.score() < set.size() * active.size()) {
            _mpc_best.a = (ArrayList<Integer>)(active.clone());
            _mpc_best.b = (ArrayList<Integer>)(set.to_array_list());
        }
        }
        return;
    }

    ArrayList<Integer> active_r = (ArrayList<Integer>)(active.clone());
    ArrayList<Integer> remain_r = (ArrayList<Integer>)(remain.clone());
    Integer a = remain_r.remove(0);
    // Explore active sets without element a.
    maxPairCoverRec(pairs, active_r, remain_r, set);

    // Explore active sets with element a. Perform intersection of 
    // current suffix set with suffixes of a.
    if (set == null) {
        set = pairs.get(a);
    }
    else {
        set = set.intersection(pairs.get(a));
    }
    active_r.add(a);
    maxPairCoverRec(pairs, active_r, remain_r, set);
    }

    public class TripleCover 
    {
    public ArrayList<Integer> a;
    public ArrayList<Integer> b;
    public ArrayList<Integer> c;
    TripleCover() 
    {
        a = b = c = null;
    }
    TripleCover(ArrayList<Integer> ax,
            ArrayList<Integer> bx,
            ArrayList<Integer> cx) 
    {
        a = ax;
        b = bx;
        c = cx;
    }
    TripleCover(int ax,
            int bx,
            int cx) 
    {
        a = new ArrayList<Integer>();
        a.add(ax);
        b = new ArrayList<Integer>();
        b.add(bx);
        c = new ArrayList<Integer>();
        c.add(cx);
    }
    /**
     * Returns the number of pairs covered by this cover.
     *
     * @return the number of pairs covered by this cover.
     */
    public int score()
    {
        if (a != null && b != null && c != null)
        return a.size() * b.size() * c.size();
        else
        return 0;
    }
    public String toString() {
        return "<"+a+","+b+","+c+">";
    }
    }

    /**
     * Compute the maximal triple cover for the pairs currently stored
     * in _tree.
     *
     * @return a triple cover with |AxBxC| maximized
     */
    TripleCover maxTripleCover() {
    _mtc_best = new TripleCover();
    Integer[] remain_ary = _tree.keySet().toArray(new Integer[0]);
    ArrayList<Integer> remain = 
        new ArrayList<Integer>(Arrays.asList(remain_ary));
    maxTripleCoverRec(new ArrayList<Integer>(),
                 remain,
                 null);
    return _mtc_best;
    }

    /**
     * Recursively explore all subsets of values in first position and
     * find the largest cover for the second and third positions.
     * <p>
     * Stores the best triple cover in _mtc_best. This "global" variable 
     * makes it easy to prune search.
     *
     * @param active the active values of the first position
     * @param remain the additional values of the first position to explore
     * @param set the current set of second/third position pairs that are shared by the active values
     */
    void maxTripleCoverRec(ArrayList<Integer> active,
               ArrayList<Integer> remain,
               HashSet<Integer> set) {
    if (set != null &&
        (set.isEmpty() ||
         (remain.size()>0 && 
          (active.size() + remain.size()) * set.size()
          <= _mtc_best.score()))){
        return;
    }
    if (remain.isEmpty()) {
        if (active.size() > 0) {
        PairCover mpc = maxPairCover(set);
        if (_mtc_best.score() < mpc.score() * active.size()) {
            _mtc_best.a = (ArrayList<Integer>)(active.clone());
            _mtc_best.b = mpc.a;
            _mtc_best.c = mpc.b;
        }
        }
        return;
    }

    ArrayList<Integer> active_r = (ArrayList<Integer>)(active.clone());
    ArrayList<Integer> remain_r = (ArrayList<Integer>)(remain.clone());
    Integer a = remain_r.remove(0);
    maxTripleCoverRec(active_r, remain_r, set);
    if (set == null) {
        set = (HashSet<Integer>)(_tree.get(a).clone());
    }
    else {
        set = (HashSet<Integer>)(set.clone());
        set.retainAll(_tree.get(a));
    }
    active_r.add(a);
    maxTripleCoverRec(active_r, remain_r, set);

    }

    /**
     * Finds a covering set for the triples using a greedy approach.
     * The largest cover of the current triples is found, recorded, and
     * the affected triples are removed from consideration. This process
     * continues until singleton covers are left.
     *
     * @return a list of covers
     */
    ArrayList<TripleCover> greedy() {
    // Since the prefix tree is modified as new covers are found,
    // we make a copy
    HashMap<Integer,HashSet<Integer>> old_tree = _tree;
    _tree = (HashMap<Integer,HashSet<Integer>>)(_tree.clone());

    ArrayList<TripleCover> solution = new ArrayList<TripleCover>();
    TripleCover cover;
    do {
        cover = maxTripleCover();  // find best
        solution.add(cover);  // record it
        remove(cover);  // remove covered triples from tree
        System.out.println("" + cover + " ==> " + cover.score());
    } while (cover.score() > 1);

    // Nothing left but singletons, so add them to solution
    for(Integer a : _tree.keySet()) {
        for(Integer pair : _tree.get(a)) {
        int b = pair / 256;
        int c = pair & 255;
        solution.add(new TripleCover(a,b,c));
        }
    }

    // restore prefix tree
    _tree = old_tree; 

    return solution;
    }

    //************************************************************

    private PairCover _mpc_best;
    private TripleCover _mtc_best;


    /**
     * Reads triples from the specified file. Will exit if file does not
     * exist.
     *
     * @param filename a path to a file containing triples
     */
    private void readfile(String filename) {
    try {
        FileReader fr = new FileReader(filename);
        BufferedReader reader = new BufferedReader(fr);
        String line = null;
        while ((line = reader.readLine()) != null) {
        line = line.replace("[","").replace("]","").replace(" ","");
        String[] svalues = line.split(",");
        int[] values = new int[3];
        values[0] = Integer.parseInt(svalues[0]);
        values[1] = Integer.parseInt(svalues[1]);
        values[2] = Integer.parseInt(svalues[2]);
        _data.add(values);
        Integer key = new Integer(values[0]);
        if (!_tree.containsKey(key)) {
            _tree.put(key, new HashSet<Integer>());
        }
        _tree.get(key).add(values[1] * 256 + values[2]);
        }
    } catch (IOException e) {
        System.out.println("Could not open " + filename);
        System.exit(1);
    }
    }

    /**
     * Removes covered triples from the prefix tree
     *
     * @param tc a covering
     */
    private void remove(TripleCover tc) {
    for(Integer a : tc.a) {
        HashSet<Integer> set = _tree.get(a);
        for(Integer b : tc.b) {
        for(Integer c : tc.c) {
            set.remove(b * 256 + c);
        }
        }
        if (set.isEmpty()) {
        _tree.remove(a);
        }
    }
    }

}

java - 積がすべての必要な順列を含む生成セットを決定するための効率的なアルゴリズムは何ですか?

1 に答える 1

Related

Reference