performance - O(log n) で並べ替えられた行列 (行 n 列) の数値を検索します

Question

MxN行と列が並べ替えられた行列 ( ) があるとします。

各行のすべての要素は昇順で配置されます
各列のすべての要素は昇順で配置されます
すべての要素は整数です
それ以外の仮定はできない

例：

[1 5 8 20]

[2 9 19 21]

[12 15 25 30]

特定の数値が行列に存在するかどうかを確認する必要があります (基本検索)。私は実行するアルゴリズムを持っていますO(n)

int row = 0;
int col = N-1;

while (row < M && col >= 0) {
  if (mat[row][col] == elem) { 
    return true;
  } else if (mat[row][col] > elem) { 
    col--;
  } else { 
    row++;
  } 
}

しかし、私はO(log (MxN)) == O(Log(n))解決策を求められました。何か案は？？

score 76 · Accepted Answer

このタスクでは、O(log (M * N)) のソリューションは使用できません。

単純化されたタスクを見てみましょう: 「ソートされた」正方行列では、2 次対角線 (緑) より上のすべての要素が指定された数よりも小さく、2 次対角線より下にあるすべての要素 (赤) が指定された数よりも大きく、2 次対角線上の要素に対して追加の仮定がない (黄色）。

ここに画像の説明を入力

このタスクの元の仮定も、これらの追加の仮定も、二次対角要素が互いにどのように関連しているかを教えてくれません。つまり、ソートされていない N 個の整数の配列しかないということです。ソートされていない配列で指定された数値を O(N) より速く見つけることはできません。したがって、正方行列を使用した元の (より複雑な) 問題では、O(N) よりも優れた解を得ることができません。

長方形の行列の場合、正方形の図を引き伸ばし、それに応じて追加の仮定を設定します。ここでは、それぞれ max(N,M)/min(N,M) のサイズの min(N,M) ソートされたサブ配列があります。ここで検索する最良の方法は、線形検索を使用して、指定された値を含む可能性のある 1 つまたは複数のサブ配列を見つけてから、これらのサブ配列内でバイナリ検索を使用することです。最悪の場合、各サブ配列でバイナリ検索を行う必要があります。複雑さは O(min(N,M) * (1 + log(max(N,M) / min(N,M)))) です。したがって、長方形行列を使用した元の (より複雑な) 問題では、O(min(N,M) * ( 1 + log(max(N,M)) - log(min(N,M))) よりも優れた解を得ることができません。）。

score 7 · Accepted Answer

O(n)以上のことはできません。一部の人 (このページには少なくとも 3 人います) はもっとうまくやれると思っていますが、それは彼らのアルゴリズムが間違っているか、アルゴリズムの複雑さを計算する方法がわからないため、推測しようとするからです。このブログ投稿は非常に優れており、これらの人たちの誤りを説明しています.

O(n) が最適であることの証明のドラフト: 次の行列を考えます。

1     2     3     4     5     6 … (n-2)  (n-1) (n+1)
2     3     4     5     6     7 … (n-1)  (n+1) (n+2)
3     4     5     6     7     8 … (n+1)  (n+2) (n+3)
…     …     …     …     …     … … …      …     …
(n-2) (n-1) …     …     …     … … …      …     (2n-1)
(n-1) (n+1) …     …     …     … … …      …     2n
(n+1) (n+2) …     …     …     … … (2n-1) 2n    (2n+1)

このマトリックスで探している場合は、行にある場合は、行にある可能性があるため、行nごとに少なくとも 1 回チェックする必要があります。（証明は完全ではありませんが、アイデアは次のとおりです）nn

score 4 · Accepted Answer

この問題を解決するには、再帰を使用する必要があります。行列 X と数値 y を指定すると、X の中央の行で y のバイナリ検索を実行し、行列を次のように 4 つの部分に分割できます。

A|B
---
C|D

A のすべての要素は y よりも小さく、D のすべての要素は y よりも大きく、y は B と C にある可能性があります。B と C で y を繰り返し見つけます。

height(A)=height(B)\approx= height(C)=height(D) なので、 size(X)>= 2*(size(B)+size(C)) . したがって、O(logn) の場合の結果の複雑さ。

def find(X,y):
    a,b = X.shape
    i = a /2
    j = binsearch(X[i,:], y)
    if X[i,j]==y:
        return True
    else:
        return find( X[ (i+1):a, 0:(j-1)], y ) or find( X[ 0:i, j:b], y )

score 2 · Accepted Answer

Since both rows and columns are sorted, if we look at the first element of each row we can find which one contains the number we're looking for. Then, again, we can exploit the fact that the elements in each row are sorted and find that number.
The fastest search algorithm I know is Binary Search, which has a complexity of O(log n), so the total complexity will be O(log m + log n).
Here's an example, suppose we're looking for 28:

 1  2  3  4  5  6  7  8  9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50

We do a binary search over the elements of the first column (1, 11, 21, 31, 41) and find that the row is the third, because its first element is smaller than our number but the next row's first element is larger. Number of steps: 2 (21, 31, found)
We do a binary search again over the third row (21, 22, 23, 24, 25, 26, 27, 28, 29, 30) and find our number. Number of steps: 2 - 3 (25, 27 or 28, found)

performance - O(log n) で並べ替えられた行列 (行 n 列) の数値を検索します

5 に答える 5

Related

Reference