scala - Scala parser combinators: large file problem

Question

I wrote a parser as follows:

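import scala.util.parsing.combinator.JavaTokenParsers

class LogParser extends JavaTokenParsers {

  // Three header numbers (discarded), followed by the postings lists.
  def invertedIndex: Parser[Array[Array[(Int, Int)]]] = {
    num ~> num ~> num ~> rep(postingsList) ^^ {
      _.toArray
    }
  }

  // A leading number (discarded), followed by its entries.
  def postingsList: Parser[Array[(Int, Int)]] = {
    num ~> rep(entry) ^^ {
      _.toArray
    }
  }

  // One "docID,count" pair; num already yields Int, so no further conversion.
  def entry: Parser[(Int, Int)] = {
    num ~ "," ~ num ^^ {
      case docID ~ "," ~ count => (docID, count)
    }
  }

  def num: Parser[Int] = wholeNumber ^^ (_.toInt)

}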

When I parse a (270 MB) file using a FileReader, like this:

val index = parseAll(invertedIndex, new FileReader("path/to/file")).get

I get Exception in thread "main" java.lang.StackOverflowError (I also tried wrapping the reader in a BufferedReader), but I can fix it by first reading the file into a String, like this:

val input = io.Source.fromFile("path/to/file")
val str = input.mkString
input.close()
val index = parseAll(invertedIndex, str).get

Why is this? Is there a way to avoid reading the file into a String first? It seems wasteful.

score 1 · Accepted Answer

There is another library [1], very similar to the Scala parser combinators, that supports trampolining, which is what you need to stop the stack overflow errors.

[1] https://github.com/djspiewak/gll-combinators
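
If you would rather stay with the standard parser combinators and still avoid building one 270 MB String, one possible workaround is to feed the parser one line at a time. The following is only a sketch and rests on assumptions the question does not confirm: that the first line of the file holds the three header numbers, and that every following non-empty line holds exactly one postings list. PostingsLineParser and parseLines are hypothetical names introduced for illustration. It needs the scala-parser-combinators module on the classpath, and whether it avoids the overflow for your data depends on where the deep recursion actually comes from.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Same grammar pieces as LogParser, but applied to one line at a time.
object PostingsLineParser extends JavaTokenParsers {

  def num: Parser[Int] = wholeNumber ^^ (_.toInt)

  // One "docID,count" pair.
  def entry: Parser[(Int, Int)] =
    (num <~ ",") ~ num ^^ { case docID ~ count => (docID, count) }

  // A leading number (discarded, as in the question) followed by entries.
  def postingsList: Parser[Array[(Int, Int)]] =
    num ~> rep(entry) ^^ (_.toArray)

  // ASSUMPTION: line 1 is the three-number header (skipped here), and each
  // following non-empty line is exactly one postings list.
  def parseLines(lines: Iterator[String]): Array[Array[(Int, Int)]] =
    lines
      .drop(1)
      .filter(_.trim.nonEmpty)
      .map { line =>
        parseAll(postingsList, line) match {
          case Success(result, _) => result
          case ns: NoSuccess      => sys.error(s"Cannot parse line: ${ns.msg}")
        }
      }
      .toArray
}

// Usage (the path is the placeholder from the question):
//   val source = scala.io.Source.fromFile("path/to/file")
//   val index =
//     try PostingsLineParser.parseLines(source.getLines())
//     finally source.close()
```

Only one line is held as a String at any moment, so each parser run sees a small input. If the file does not follow the one-list-per-line layout assumed above, this approach does not apply as written.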
