python - Python 間隔交差

Question

私の問題は次のとおりです。

間隔のリストを含むファイルを持つ：

そして、範囲

0 200

指定された範囲内の間隔の間の位置 [start end] を報告するような交差をしたいと思います。

例えば：

8 9
12 20
30 200

どのようにこれを噛むかについてのアイデアの他に、入力ファイルは常に巨大になるため、最適化に関するいくつかの考えを読んでおくとよいでしょう。

score 1 · Accepted Answer

大まかなアルゴリズム:

ブール値の配列を作成し、すべて false に設定しますseen = [False]*200
各行start endセットについて、入力ファイルを反復処理しますseen[start]..seen[end]True
完了したら、配列を簡単にウォークして、未使用の間隔を見つけることができます。

最適化に関しては、入力範囲のリストが開始番号でソートされている場合、最大の数値を追跡し、それを使用して処理中に範囲をフィルタリングできます-たとえば、次のようなものです

for (start,end) in input:
  if end<=lowest_unseen:
    next
  if start<lowest_unseen:
    start=lowest_unseen
  ...

(元の並べ替えのコストを無視して) 全体を O(n) にする必要があります。配列を 1 回通過して、見た/見られないタグを付け、一度見られないものを出力します。

気持ちよさそうです。入力ファイルが呼び出されると仮定すると、(最適化されていない) コードは次のとおりです。input

seen = [False]*200
file = open('input','r')
rows = file.readlines()
for row in rows:
  (start,end) = row.split(' ')
  print "%s %s" % (start,end)
  for x in range( int(start)-1, int(end)-1 ):
    seen[x] = True

print seen[0:10]

in_unseen_block=False
start=1
for x in range(1,200):
  val=seen[x-1]
  if val and not in_unseen_block:
    continue
  if not val and in_unseen_block:
    continue
  # Must be at a change point.
  if val:
    # we have reached the end of the block
    print "%s %s" % (start,x)
    in_unseen_block = False
  else:
    # start of new block
    start = x
    in_unseen_block = True
# Handle end block
if in_unseen_block:
  print "%s %s" % (start, 200)

最適化は、読者の演習として残しておきます。

score 1 · Accepted Answer

入力区間の 1 つが開いたり閉じたりするたびにメモを作成するopensと、とのキーをまとめてcloses、順序付けられたセットに分類することで、やりたいことを実行できます。、隣接する数値の各ペアが間隔を形成するとしましょう。そうすれば、すべてのロジックをこれらの間隔に個別のチャンクとして集中させることができます。」

myRange = range(201)
intervals = [(1,5), (2,8), (9,12), (20,30)]
opens = {}
closes = {}

def open(index):
    if index not in opens:
        opens[index] = 0
    opens[index] += 1

def close(index):
    if index not in closes:
        closes[index] = 0
    closes[index] += 1

for start, end in intervals:
    if end > start: # Making sure to exclude empty intervals, which can be problematic later
        open(start)
        close(end)

# Sort all the interval-endpoints that we really need to look at
oset = {0:None, 200:None}
for k in opens.keys():
    oset[k] = None
for k in closes.keys():
    oset[k] = None
relevant_indices = sorted(oset.keys())

# Find the clear ranges
state = 0
results = []
for i in range(len(relevant_indices) - 1):
    start = relevant_indices[i]
    end = relevant_indices[i+1]

    start_state = state
    if start in opens:
        start_state += opens[start]
    if start in closes:
        start_state -= closes[start]

    end_state = start_state
    if end in opens:
        end_state += opens[end]
    if end in closes:
        end_state -= closes[end]
    state = end_state

    if start_state == 0:
        result_start = start
        result_end = end
        results.append((result_start, result_end))

for start, end in results:
    print(str(start) + " " + str(end))

これは以下を出力します:

間隔をソートする必要はありません。

score 1 · Accepted Answer

この質問はMerging interval in Pythonの複製のようです。

私が問題をよく理解していれば、間隔 (1 5; 2 8; 9 12; 20 30) と範囲 (0 200) のリストがあり、間隔外の位置を取得したいが、指定された範囲内にあります。右？

これに役立つ Python ライブラリがあります: python-intervals (pip を使用して PyPI からも入手可能)。免責事項: 私はそのライブラリのメンテナーです。

このライブラリを次のようにインポートするとします。

import intervals as I

あなたの答えを得るのはとても簡単です。基本的に、最初に提供したものに基づいて間隔の論理和を作成します。

inters = I.closed(1, 5) | I.closed(2, 8) | I.closed(9, 12) | I.closed(20, 30)

次に、これらの間隔の補数を計算して、「外側」にあるすべてのものを取得します。

compl = ~inters

次に、ポイントをその間隔に制限したいので、[0, 200] でユニオンを作成します。

print(compl & I.closed(0, 200))

これにより、次の結果が得られます。

[0,1) | (8,9) | (12,20) | (30,200]

python - Python 間隔交差

4 に答える 4

Related

Reference