
Let's say that we have a recursive data structure, like a binary tree. There are many ways to traverse it, and they have different memory-usage profiles. For instance, if we were to simply print the value of each node, using pseudo-code like the following in-order traversal...

visitNode(node) {
    if (node == null) return;
    visitNode(node.leftChild);
    print(node.value);
    visitNode(node.rightChild);
}

...our heap usage would be constant, but each recursive call would add a frame to the call stack, so stack usage grows with the depth of the tree. On a very deep (for instance, badly unbalanced) tree, this could overflow it.
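
For concreteness, here is a sketch of the same in-order traversal in Scala; the Tree, Empty, and Branch names are invented for illustration:

sealed trait Tree
case object Empty extends Tree
final case class Branch(value: Int, left: Tree, right: Tree) extends Tree

def visit(node: Tree): Unit = node match {
  case Empty => ()
  case Branch(v, left, right) =>
    visit(left)   // not a tail call: this frame stays live until the left subtree is done
    println(v)
    visit(right)
}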

Let's say that we decided to optimize for call-stack size; assuming that this language is capable of proper tail calls, we could rewrite this as the following pre-order traversal...

visitNode(node, nodes = []) {
    if (node != null) {
        print(node.value);
        // visit the left subtree next, deferring the right subtree
        visitNode(node.leftChild, [node.rightChild] + nodes);
    } else if (nodes.length != 0) {
        // the current path is exhausted; pop the next deferred subtree
        visitNode(nodes.head, nodes.tail);
    } else return;
}

While we would never blow the stack, we would now see heap usage grow instead: the list of deferred nodes holds every right subtree we have postponed, which is proportional to the depth of the tree and, in the worst case of a degenerate tree, linear in its size.
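
In Scala, where @tailrec guarantees that self-recursive calls compile to a loop, a sketch of this version might look like the following (the illustrative Tree type is repeated so the snippet stands alone):

import scala.annotation.tailrec

sealed trait Tree
case object Empty extends Tree
final case class Branch(value: Int, left: Tree, right: Tree) extends Tree

@tailrec
def visit(node: Tree, pending: List[Tree] = Nil): Unit = node match {
  case Branch(v, left, right) =>
    println(v)
    // the pending list now plays the role the call stack played before
    visit(left, right :: pending)
  case Empty =>
    pending match {
      case next :: rest => visit(next, rest)
      case Nil          => ()
    }
}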

Let's say we were then to attempt to traverse the tree lazily; here is where my reasoning gets fuzzy. I think that even using a basic lazy evaluation strategy, we would grow memory at the same rate as the tail-call optimized version. Here is a concrete example using Scala's Stream class, which provides lazy evaluation:

sealed abstract class Node[A] {
  def toStream: Stream[Node[A]]
  def value: A
}

case class Fork[A](value: A, left: Node[A], right: Node[A]) extends Node[A] {
  def toStream: Stream[Node[A]] = this #:: left.toStream.append(right.toStream)
}

case class Leaf[A](value: A) extends Node[A] {
  def toStream: Stream[Node[A]] = this #:: Stream.empty
}

Although only the head of the stream is strictly evaluated, any time left.toStream.append(right.toStream) is evaluated, I think this would actually evaluate the head of both the left and right streams. Even if it doesn't (due to cleverness in append), I think that recursively building up this thunk (to borrow a term from Haskell) would grow memory at essentially the same rate. Rather than saying, "put this node in the list of nodes to traverse", we're basically saying, "here's another value to evaluate that will tell you what to traverse next", but the outcome is the same: linear memory growth.
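
For instance (a hypothetical usage; the tree shape is arbitrary), the stream does yield the nodes in pre-order, but each element we force just replaces one suspended cell with another suspended append, so the amount of retained, unevaluated structure tracks the number of deferred subtrees:

val tree = Fork(1, Fork(2, Leaf(3), Leaf(4)), Leaf(5))
tree.toStream.map(_.value).toList  // List(1, 2, 3, 4, 5), i.e. pre-order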

The only strategy I can think of that would avoid this is having mutable state in each node declaring which paths have been traversed. This would allow us to have a referentially transparent function that says, "Given a node, I will tell you which single node you should traverse next", and we could use that to build an O(1) iterator.

Is there another way to accomplish O(1), tail-call optimized traversal of a binary tree, possibly without mutable state?

4 Answers

"Is there another way to accomplish O(1), possibly without mutable state?"

As mentioned in the comments, you can do this if the tree doesn't need to survive the traversal. Here's an example in Haskell:

data T = Leaf | Node T Int T

inOrder :: T -> [Int]
inOrder Leaf                     =  []
inOrder (Node Leaf x r)          =  x : inOrder r
inOrder (Node (Node l x m) y r)  =  inOrder $ Node l x (Node m y r)

If we assume that the garbage collector cleans up the Node we have just processed, so that we effectively replace it with a right-rotated version, then this requires O(1) auxiliary space. However, if the nodes we process cannot be garbage-collected immediately, the final clause may build up O(n) Nodes before we hit a leaf.
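
For comparison, here is a sketch of the same right-rotation trick in Scala; this is my translation, not the answer's code, and the accumulator holds the output list, so only the auxiliary space is O(1) under the same garbage-collection assumption:

import scala.annotation.tailrec

sealed trait T
case object Leaf extends T
final case class Node(left: T, value: Int, right: T) extends T

@tailrec
def inOrder(t: T, acc: List[Int] = Nil): List[Int] = t match {
  case Leaf                      => acc.reverse
  case Node(Leaf, x, r)          => inOrder(r, x :: acc)                        // emit x, continue right
  case Node(Node(l, x, m), y, r) => inOrder(Node(l, x, Node(m, y, r)), acc)    // right-rotate and retry
}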

If you have parent pointers, it's also doable. However, parent pointers require mutable state and prevent sharing of subtrees, so they're not really functional. If you represent an iterator by a pair (cur, prev), which is initially (root, nil), you can perform the iteration as outlined here. You do need a language with pointer comparison to make this work, though.
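
A sketch of that (cur, prev) loop in Scala, using reference equality (eq) as the pointer comparison; the MNode class with its mutable parent field is made up for illustration:

final class MNode(val value: Int) {
  var left: MNode = null
  var right: MNode = null
  var parent: MNode = null
}

def inOrder(root: MNode): Unit = {
  var cur = root
  var prev: MNode = null
  while (cur != null) {
    var next: MNode = null
    if (prev eq cur.parent) {          // arrived from above: go left if possible
      if (cur.left != null) next = cur.left
      else { println(cur.value); next = if (cur.right != null) cur.right else cur.parent }
    } else if (prev eq cur.left) {     // finished the left subtree: visit, then go right
      println(cur.value)
      next = if (cur.right != null) cur.right else cur.parent
    } else {                           // prev eq cur.right: this subtree is done, go back up
      next = cur.parent
    }
    prev = cur
    cur = next
  }
}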

Without parent pointers and mutable state, you need to maintain some data structure that at least keeps track of where the root of the tree is and how to get there, because you'll need such a structure at some point during an in-order or post-order traversal. Such a structure necessarily takes Ω(d) space, where d is the depth of the tree.
