algorithm - 最も「高価な」配列の要素数を返します

Question

最近、興味深い問題に出くわしました。私の解決策が最適かどうか疑問に思っています。

0 と 1 の配列が与えられます。目標は、最も高価なサブ配列でゼロの量と 1 の量を返すことです。

配列のコストは、1 の量を 0 の量で割ったものです。サブ配列にゼロがない場合、コストはゼロです。

最初はブルートフォーシングを試みましたが、10,000 要素の配列では遅すぎて、メモリが足りなくなりました。
私の 2 番目のアイデアは、これらのサブ配列を作成する代わりに、サブ配列の開始と終了を記憶することでした。こうすることで大量のメモリを節約できましたが、複雑さは依然として O(n ² ) でした。
私が思いついた最終的な解決策は、O(n) だと思います。こんなふうになります：

配列の先頭から開始し、要素ごとに、1 から現在のインデックスまでのサブ配列のコストを計算します。したがって、最初の要素からなるサブ配列から始め、次に 1 番目と 2 番目などとなります。コストを計算する必要があるのは、サブ配列内の 1 と 0 の量だけなので、サブアレイの最適な端。

2 番目のステップは、ステップ 1 のサブアレイの最後から開始し、同じことを繰り返して最適な開始点を見つけることでした。そうすれば、配列全体でこれ以上の組み合わせはないと確信できます。

この解決策は正しいですか？そうでない場合、この解決策が正しくないことを示す反例はありますか?

編集

わかりやすくするために、入力配列が 0101 であるとします。10 個のサブ配列があります: 0、1、0、1、01、10、01、010、101、および 0101。

101 が最も高価なサブアレイであるため、最も高価なサブアレイのコストは 2 になります。したがって、アルゴリズムは 1,2 を返す必要があります

編集 2

忘れていたことがもう 1 つあります。2 つのサブアレイのコストが同じ場合、長い方が「より高価」になります。

score 2 · Accepted Answer

私の仮定の証明をスケッチしましょう。

(a = 配列全体、*=ゼロ以上、+=1 つ以上、{n}=正確に n)

ケースa=0*とa=1+: c=0

a=01+とa=1+0：に適合し、1*0{1,2}1*a が最適

For the normal case, a contains one or more 0s and 1s.
This means there is some optimum sub-array of non-zero cost.
(S) Assume s is an optimum sub-array of a.
It contains one or more zeros. (Otherwise its cost would be zero).
(T) Let t be the longest `1*0{1,2}+1*` sequence within s 
(and among the equally long the one with with most 1s).
(Note: There is always one such, e.g. `10` or `01`.)
Let N be the number of 1s in t.
Now, we prove that always t = s.
By showing it is not possible to add adjacent parts of s to t if (S).
(E) Assume t shorter than s.
We cannot add 1s at either side, otherwise not (T).
For each 0 we add from s, we have to add at least N more 1s 
later to get at least the same cost as our `1*0+1*`.
This means: We have to add at least one run of N 1s.
If we add some run of N+1, N+2 ... somewhere than not (T).
If we add consecutive zeros, we need to compensate 
with longer runs of 1s, thus not (T).
This leaves us with the only option of adding single zeors and runs of N 1s each.
This would give (symmetry) `1{n}*0{1,2}1{m}01{n+m}...`
If m>0 then `1{m}01{n+m}` is longer than `1{n}0{1,2}1{m}`, thus not (T).
If m=0 then we get `1{n}001{n}`, thus not (T).
So assumption (E) must be wrong.

結論: 最適なサブアレイはに準拠する必要があり1*0{1,2}1*ます。

1*01*最後のコメント (または1*001*)の仮定によると、Java での O(n) impl は次のとおりです。

public class Q19596345 {
    public static void main(String[] args) {
        try {
            String array = "0101001110111100111111001111110";
            System.out.println("array=" + array);
            SubArray current = new SubArray();
            current.array = array;
            SubArray best = (SubArray) current.clone();
            for (int i = 0; i < array.length(); i++) {
                current.accept(array.charAt(i));
                SubArray candidate = (SubArray) current.clone();
                candidate.trim();
                if (candidate.cost() > best.cost()) {
                    best = candidate;
                    System.out.println("better: " + candidate);
                }
            }
            System.out.println("best: " + best);
        } catch (Exception ex) { ex.printStackTrace(System.err); }
    }
    static class SubArray implements Cloneable {
        String array;
        int start, leftOnes, zeros, rightOnes;

        // optimize 1*0*1* by cutting
        void trim() {
            if (zeros > 1) {
                if (leftOnes < rightOnes) {
                    start += leftOnes + (zeros - 1);
                    leftOnes = 0;
                    zeros = 1;
                } else if (leftOnes > rightOnes) {
                    zeros = 1;
                    rightOnes = 0;
                }
            }
        }

        double cost() {
            if (zeros == 0) return 0;
            else return (leftOnes + rightOnes) / (double) zeros + 
                (leftOnes + zeros + rightOnes) * 0.00001;
        }
        void accept(char c) {
            if (c == '1') {
                if (zeros == 0) leftOnes++;
                else rightOnes++;
            } else {
                if (rightOnes > 0) {
                    start += leftOnes + zeros;
                    leftOnes = rightOnes;
                    zeros = 0;
                    rightOnes = 0;
                }
                zeros++;
            }
        }
        public Object clone() throws CloneNotSupportedException { return super.clone(); }
        public String toString() { return String.format("%s at %d with cost %.3f with zeros,ones=%d,%d", 
            array.substring(start, start + leftOnes + zeros + rightOnes), start, cost(), zeros, leftOnes + rightOnes);
        }
    }
}

score 1 · Accepted Answer

最大配列が常に 1+0+1+、1+0、または 01+ であることを示すことができれば (正規表現の表記法で実行回数を計算できます)

したがって、配列 (010011) の場合、(常に 1 の連続で始まります)

0,1,1,2,2

したがって、比率は (0, 1, 0.3, 1.5, 1) であり、最終結果として 10011 の配列になり、1 回の実行は無視されます。

左端のコストは 0 右端のコストは 2

したがって、この場合、右端が正解です -- 011

反例はまだ思いつきませんが、証明も自明ではありません。うまくいけば、クラウドソーシングができるでしょう:)

縮退のケースはより単純です。1 と 0 はすべて同じコストであるため、すべて 1 と 0 は明らかです。1+、0+、またはその逆の文字列は、すべて 1 で 1 つの 0 です。

score 0 · Accepted Answer

これはどう？C# プログラマーとして、Dictionary のようなものを使用できると考えています<int,int,int>. 。最初の int はキーとして使用され、2 番目はサブ配列番号として使用され、3 番目はサブ配列の要素として使用されます。

あなたの例のキー|サブ配列番号|要素

   1|1|0
   2|2|1
   3|3|0
   4|4|1
   5|5|0
   6|5|1
   7|6|1
   8|6|0
   9|7|0
   10|7|1
   11|8|0
   12|8|1
   13|8|0
   14|9|1
   15|9|0
   16|9|1
   17|10|0
   18|10|1
   19|10|0
   20|10|1

次に、辞書を実行して、最高のものを変数に格納できます。

var maxcost=0
var arrnumber=1;
var zeros=0;
var ones=0;
var cost=0;
for (var i=1;i++;i<=20+1)
{
    if ( dictionary.arraynumber[i]!=dictionary.arraynumber[i-1])
    {
       zeros=0;
       ones=0;
       cost=0;
       if (cost>maxcost)
       {
           maxcost=cost;
       }
    }
    else
    {
        if (dictionary.values[i]==0)
        {
          zeros++; 
        }
        else
        {
         ones++;
        }
        cost=ones/zeros;
    }

}

これはlog（n ^ 2）になります。配列のメモリのサイズが3nだけ必要ですか？

algorithm - 最も「高価な」配列の要素数を返します

編集

編集 2

4 に答える 4

Related

Reference