scala - Seq[T]、immutable.Seq[T]、List[T]、または Vector[T] を保持するように Scala の不変ケースクラスを定義する必要がありますか?

Question

タプルなどの単一のオブジェクトを保持するケースクラスを定義したい場合は、簡単に実行できます。

sealed case class A(x: (Int, Int))

この場合、「x」値の取得には一定の短い時間がかかり、このクラスは作成方法に関係なく、一定の小さなスペースしか必要としません。

さて、代わりに一連の値を保持したいとしましょう。次のようにできます。

sealed final case class A(x: Seq[Int])

これは以前と同じように機能するように見えるかもしれませんが、x のすべてを読み取るストレージと時間が x.length に比例するようになりました。

ただし、誰かが次のようなことを行う可能性があるため、これは実際には当てはまりません。

val hugeList = (1 to 1000000000).toList
val a = A(hugeList.view.filter(_ == 500000000))

この場合、a オブジェクトは単一の int をシーケンスに保持する無害なケースクラスのように見えますが、実際には数ギガバイトのメモリが必要であり、毎回その単一の要素にアクセスするのに数秒かかります。

これは、Seq[T] の代わりに List[T] のようなものをタイプとして指定することで修正できます。ただし、これは特定の実装への参照を追加するため見苦しく見えますが、実際には、Vector[T] などの他の適切に動作する実装も同様です。

もう 1 つの懸念される問題は、変更可能な Seq[T] を渡すことができることです。そのため、少なくとも scala.collection.Seq の代わりに immutable.Seq を使用する必要があるようです (ただし、現時点では、コンパイラーは実際に不変性を強制することはできません)。

ほとんどのライブラリを見ると scala.collection.Seq[T] を使用するのが一般的なパターンのようですが、これは本当に良い考えですか?

あるいは、Seq が型付けが最も短いという理由だけで使用されているのではないでしょうか。

編集で追加された新しいテキスト

クラスライブラリを見ると、scala.reflect.api.Trees のような最もコアな機能のいくつかは実際に List[T] を使用しており、一般的に具象クラスを使用することは良い考えのようです。

しかし、なぜベクトルではなくリストを使用するのでしょうか?

ベクトルの長さは O(1)/O(log(n)) で、prepend、append、およびランダムアクセスであり、漸近的に小さく (リストは vtable と次のポインターにより ~3 ～ 4 倍大きくなります)、キャッシュ効率の高い並列計算をサポートします。、一方 List には O(1) prepend 以外のプロパティはありません。

したがって、個人的には、Vector[T] がライブラリのデータ構造で公開されているものに対して正しい選択であることに傾いています。ライブラリのユーザーが必要とする操作がわからない場合、あまり人気がないように見えます。

score 1 · Accepted Answer

First of all, you talk both about space and time requirements. In terms of space, your object will always be as large as the collection. It doesn't matter whether you wrap a mutable or immutable collection, that collection for obvious reasons needs to be in memory, and the case class wrapping it doesn't take any additional space (except its own small object reference). So if your collection takes "gigabytes of memory", that's a problem of your collection, not whether you wrap it in a case class or not.

You then go on to argue that a problem arises when using views instead of eager collections. But again the question is what the problem actually is? You use the example of lazily filtering a collection. In general running a filter will be an O(n) operation just as if you were iterating over the original list. In that example it would be O(1) for successive calls if that collection was made strict. But that's a problem of the calling site of your case class, not the definition of your case class.

The only valid point I see is with respect to mutable collections. Given the defining semantics of case classes, you should really only use effectively immutable objects as arguments, so either pure immutable collections or collections to which no instance has any more write access.

There is a design error in Scala in that scala.Seq is not aliased to collection.immutable.Seq but a general seq which can be either mutable or immutable. I advise against any use of unqualified Seq. It is really wrong and should be rectified in the Scala standard library. Use collection.immutable.Seq instead, or if the collection doesn't need to be ordered, collection.immutable.Traversable.

So I agree with your suspicion:

Looking at most libraries it seems that the common pattern is to use scala.collection.Seq[T], but is this really a good idea?

No! Not good. It might be convenient, because you can pass in an Array for example without explicit conversion, but I think a cleaner design is to require immutability.

scala - Seq[T]、immutable.Seq[T]、List[T]、または Vector[T] を保持するように Scala の不変ケース クラスを定義する必要がありますか?

編集で追加された新しいテキスト

1 に答える 1

Related

Reference

scala - Seq[T]、immutable.Seq[T]、List[T]、または Vector[T] を保持するように Scala の不変ケースクラスを定義する必要がありますか?