As with most parsing problems, I try to build a grammar that best describes the elements of the input format.
In this case, we have nouns:
[comma ending value-chars qmark quoted-chars value header row]
Some verbs:
[row-feed emit-value]
And the operative nouns:
[current chunk current-row width]
I suppose I could break it down a little further, but this is enough to work with. First, the foundation:
comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]
quoted-chars: complement charset reduce [qmark]
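
As a quick check of what these character sets accept, here is a minimal sketch that assumes Rebol 2 (where parse/all is available). It only reuses the definitions above; the sample strings are made up for illustration.

```rebol
REBOL []

comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]

; an unquoted field is a run of value-chars, so a plain string matches
probe parse/all "plain text" [some value-chars]

; the comma is excluded from value-chars, so the rule stops after "a"
; and parse reports failure because the input is not fully consumed
probe parse/all "a,b" [some value-chars]
```

The first call should report true and the second false, since a comma, quote, or newline always ends an unquoted value.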
Next, the value structure. Quoted values are built up from chunks of valid characters or quotes as we find them:
current: chunk: none
quoted-value: [
    qmark (current: copy "")
    any [
        copy chunk some quoted-chars (append current chunk)
        |
        qmark qmark (append current qmark)
    ]
    qmark
]
value: [
    copy current some value-chars
    | quoted-value
]
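
To see how a doubled quote inside a quoted value collapses to a single quote, the value rule can be tried on its own. This is a sketch assuming Rebol 2; the input string is an invented example, and everything else is copied from the definitions above.

```rebol
REBOL []

comma: ","
ending: "^/"
qmark: {"}
value-chars: complement charset reduce [qmark comma ending]
quoted-chars: complement charset reduce [qmark]

current: chunk: none
quoted-value: [
    qmark (current: copy "")
    any [
        ; a run of ordinary characters is appended as one chunk
        copy chunk some quoted-chars (append current chunk)
        |
        ; two quotes in a row stand for one literal quote
        qmark qmark (append current qmark)
    ]
    qmark
]
value: [
    copy current some value-chars
    | quoted-value
]

; the input is the quoted field "say ""hi"", ok"
parse/all {"say ""hi"", ok"} value
print current
```

After the parse, current should hold the text say "hi", ok with the outer quotes removed and the embedded comma preserved.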
emit-value: [
    (
        delimiter: comma
        append current-row current
    )
]
emit-none: [
    (
        delimiter: comma
        append current-row none
    )
]
Note that delimiter is set to ending at the start of each row, then changed to comma as soon as a value has been passed. An input row is therefore defined as:
[ending value any [comma value]]
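
The row definition relies on parse looking up the word delimiter each time it is reached, so a paren can swap the rule in the middle of a row. Here is a minimal sketch of that mechanism in isolation, assuming Rebol 2; digit and rule are illustrative names and not part of the grammar above.

```rebol
REBOL []

delimiter: none
digit: charset "0123456789"

rule: [
    ; every row starts by expecting a newline
    (delimiter: "^/")
    3 [
        ; delimiter is resolved on each pass: newline first, comma afterwards
        delimiter some digit (delimiter: ",")
    ]
]

probe parse/all "^/1,2,3" rule
```

The parse should succeed because the first field is preceded by a newline and the remaining two by commas, which is the same shape as the row definition.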
All that remains is to define the document structure:
current-row: none
row-feed: [
    (
        delimiter: ending
        append/only out current-row: copy []
    )
]
width: none
header: [
    (out: copy [])
    row-feed any [
        value comma
        emit-value
    ]
    value body: ending :body
    emit-value
    (width: length? current-row)
]
row: [
    row-feed width [
        delimiter [
            value emit-value
            | emit-none
        ]
    ]
]
if parse/all stream [header some row opt ending][out]
Wrapping it all up to protect all of those words, you get:
REBOL [
    Title: "CSV Parser"
    Date: 19-Nov-2012
    Author: "Christopher Ross-Gill"
]
parse-csv: use [
    comma ending delimiter value-chars qmark quoted-chars
    value quoted-value header row
    row-feed emit-value emit-none
    out current current-row width
][
    ; foundation: literals and character sets
    comma: ","
    ending: "^/"
    qmark: {"}
    value-chars: complement charset reduce [qmark comma ending]
    quoted-chars: complement charset reduce [qmark]
    ; value structure: plain or quoted, result left in current
    current: none
    quoted-value: use [chunk][
        [
            qmark (current: copy "")
            any [
                copy chunk some quoted-chars (append current chunk)
                |
                qmark qmark (append current qmark)
            ]
            qmark
        ]
    ]
    value: [
        copy current some value-chars
        | quoted-value
    ]
    ; verbs: start a new row, emit a value or an empty cell
    current-row: none
    row-feed: [
        (
            delimiter: ending
            append/only out current-row: copy []
        )
    ]
    emit-value: [
        (
            delimiter: comma
            append current-row current
        )
    ]
    emit-none: [
        (
            delimiter: comma
            append current-row none
        )
    ]
    ; document structure: the header row fixes the width of every later row
    width: none
    header: [
        (out: copy [])
        row-feed any [
            value comma
            emit-value
        ]
        value body: ending :body
        emit-value
        (width: length? current-row)
    ]
    row: [
        ; stop at the end of input, allowing one trailing newline
        opt ending end break
        |
        row-feed width [
            delimiter [
                value emit-value
                | emit-none
            ]
        ]
    ]
    ; returns a block of rows, or none if the input does not match
    func [stream [string!]][
        if parse/all stream [header some row][out]
    ]
]
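
A short usage sketch follows. It assumes Rebol 2 and that parse-csv has already been defined in the current session by running the script above; sample, result, and record are illustrative names, and the data is invented.

```rebol
; assumes parse-csv from the script above is already defined in this session
sample: {name,qty^/apple,3^/"pear, ripe",^/}

result: parse-csv sample

either result [
    ; one block per row, header first
    foreach record result [probe record]
][
    ; parse-csv returns none when the input does not match the grammar
    print "input did not match the CSV grammar"
]
```

If the grammar behaves as described, each row comes back as a block of strings with the same width as the header, and the empty trailing cell in the last row is represented by none rather than by an empty string.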