First question
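Hi all,
This may be a follow-up to this question: Antlr rule priorities
I'm trying to write an ANTLR grammar for the reStructuredText markup language.
The main problem I'm facing is: how do I match an arbitrary sequence of characters (regular text) without masking the other grammar rules?
Let's take an example of a paragraph with inline markup: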
In `Figure 17-6`_, we have positioned ``before_ptr`` so that it points to the element
*before* the insert point. The variable ``after_ptr`` points to the element *after* the
insert. In other words, we are going to put our new element **in between** ``before_ptr``
and ``after_ptr``.
I thought it would be easy to write rules for inline markup text, so I wrote a simple grammar:
grammar Rst;

options {
  output=AST;
  language=Java;
  backtrack=true;
  //memoize=true;
}

@members {
  boolean inInlineMarkup = false;
}

// PARSER

text
  : inline_markup (WS? inline_markup)* WS? EOF
  ;

inline_markup
@after {
  inInlineMarkup = false;
}
  : {!inInlineMarkup}? (emphasis|strong|litteral|link)
  ;

emphasis
@init {
  inInlineMarkup = true;
}
  : '*' (~'*')+ '*' {System.out.println("emphasis: " + $text);}
  ;

strong
@init {
  inInlineMarkup = true;
}
  : '**' (~'*')+ '**' {System.out.println("bold: " + $text);}
  ;

litteral
@init {
  inInlineMarkup = true;
}
  : '``' (~'`')+ '``' {System.out.println("litteral: " + $text);}
  ;

link
@init {
  inInlineMarkup = true;
}
  : inline_internal_target
  | footnote_reference
  | hyperlink_reference
  ;

inline_internal_target
  : '_`' (~'`')+ '`' {System.out.println("inline_internal_target: " + $text);}
  ;

footnote_reference
  : '[' (~']')+ ']_' {System.out.println("footnote_reference: " + $text);}
  ;

hyperlink_reference
  : ~(' '|'\t'|'\u000C'|'_')+ '_' {System.out.println("hyperlink_reference: " + $text);}
  | '`' (~'`')+ '`_' {System.out.println("hyperlink_reference (long): " + $text);}
  ;

// LEXER

WS
  : (' '|'\t'|'\u000C')+
  ;

NEWLINE
  : '\r'? '\n'
  ;
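This simple grammar doesn't work. And I haven't even tried to match regular text yet...
My questions:
- Could someone point out my errors and give me a hint on how to match regular text?
- Is there a way to set precedence on grammar rules? Maybe that could be a lead.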
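To make the precedence idea concrete, here is a minimal hand-written sketch in plain Java. It is not ANTLR and not a proposed solution: the class name `InlineMarkupSketch`, the rule names, and the regular expressions are all assumptions made for illustration. It tries the markup patterns in a fixed order at each position and, when none of them matches, treats the current character as regular text. That is roughly the behaviour I would like the grammar to have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: a hand-written scanner, not ANTLR-generated code.
class InlineMarkupSketch {

    // Rule names and patterns, listed in priority order (first match wins).
    private static final String[] NAMES = {
        "STRONG", "EMPHASIS", "LITERAL", "HYPERLINK_REFERENCE", "INTERPRETED_TEXT"
    };
    private static final Pattern[] RULES = {
        Pattern.compile("\\*\\*[^*\\r\\n]+\\*\\*"),
        Pattern.compile("\\*[^*\\r\\n]+\\*"),
        Pattern.compile("``[^`\\r\\n]+``"),
        // Listed before INTERPRETED_TEXT so that `x`_ is not cut short as `x`.
        Pattern.compile("`[^`\\r\\n]+`_"),
        Pattern.compile("`[^`\\r\\n]+`"),
    };

    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            boolean matched = false;
            for (int i = 0; i < RULES.length && !matched; i++) {
                Matcher m = RULES[i].matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt()) {           // match must start exactly at pos
                    flush(text, out);
                    out.add(NAMES[i] + ":" + m.group());
                    pos = m.end();
                    matched = true;
                }
            }
            if (!matched) {
                // Fallback: no markup rule applies, so this character is regular text.
                text.append(input.charAt(pos++));
            }
        }
        flush(text, out);
        return out;
    }

    // Emits the accumulated regular text, if any, as one TEXT token.
    private static void flush(StringBuilder text, List<String> out) {
        if (text.length() > 0) {
            out.add("TEXT:" + text);
            text.setLength(0);
        }
    }

    public static void main(String[] args) {
        for (String t : tokenize("In `Figure 17-6`_, we have *before* and ``after_ptr``.")) {
            System.out.println(t);
        }
    }
}
```

The ordering only matters where two patterns can match at the same position. In this sketch that is the case for `HYPERLINK_REFERENCE` and `INTERPRETED_TEXT`, which both start with a single backtick.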
Thanks in advance :-)
Robin
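Second question
Thanks for your help! I would have had a hard time figuring out my errors... I'm not writing this grammar (only) to learn ANTLR; I'm trying to code an IDE plugin for Eclipse. And for that, I need a grammar ;)
I managed to take the grammar further and wrote a text rule: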
grammar Rst;

options {
  output=AST;
  language=Java;
}

@members {
  boolean inInlineMarkup = false;
}

//////////////////
// PARSER RULES //
//////////////////

file
  : line* EOF
  ;

line
  : text* NEWLINE
  ;

text
  : inline_markup
  | normal_text
  ;

inline_markup
@after {
  inInlineMarkup = false;
}
  : {!inInlineMarkup}? {inInlineMarkup = true;}
    (
    | STRONG
    | EMPHASIS
    | LITTERAL
    | INTERPRETED_TEXT
    | SUBSTITUTION_REFERENCE
    | link
    )
  ;

link
  : INLINE_INTERNAL_TARGET
  | FOOTNOTE_REFERENCE
  | HYPERLINK_REFERENCE
  ;

normal_text
  : {!inInlineMarkup}?
    ~(EMPHASIS
     |SUBSTITUTION_REFERENCE
     |STRONG
     |LITTERAL
     |INTERPRETED_TEXT
     |INLINE_INTERNAL_TARGET
     |FOOTNOTE_REFERENCE
     |HYPERLINK_REFERENCE
     |NEWLINE
     )
  ;

//////////////////
// LEXER TOKENS //
//////////////////

EMPHASIS
  : STAR ANY_BUT_STAR+ STAR {System.out.println("EMPHASIS: " + $text);}
  ;

SUBSTITUTION_REFERENCE
  : PIPE ANY_BUT_PIPE+ PIPE {System.out.println("SUBST_REF: " + $text);}
  ;

STRONG
  : STAR STAR ANY_BUT_STAR+ STAR STAR {System.out.println("STRONG: " + $text);}
  ;

LITTERAL
  : BACKTICK BACKTICK ANY_BUT_BACKTICK+ BACKTICK BACKTICK {System.out.println("LITTERAL: " + $text);}
  ;

INTERPRETED_TEXT
  : BACKTICK ANY_BUT_BACKTICK+ BACKTICK {System.out.println("LITTERAL: " + $text);}
  ;

INLINE_INTERNAL_TARGET
  : UNDERSCORE BACKTICK ANY_BUT_BACKTICK+ BACKTICK {System.out.println("INLINE_INTERNAL_TARGET: " + $text);}
  ;

FOOTNOTE_REFERENCE
  : L_BRACKET ANY_BUT_BRACKET+ R_BRACKET UNDERSCORE {System.out.println("FOOTNOTE_REFERENCE: " + $text);}
  ;

HYPERLINK_REFERENCE
  : BACKTICK ANY_BUT_BACKTICK+ BACKTICK UNDERSCORE {System.out.println("HYPERLINK_REFERENCE (long): " + $text);}
  | ANY_BUT_ENDLINK+ UNDERSCORE {System.out.println("HYPERLINK_REFERENCE (short): " + $text);}
  ;

WS
  : (' '|'\t')+ {$channel=HIDDEN;}
  ;

NEWLINE
  : '\r'? '\n' {$channel=HIDDEN;}
  ;

///////////////
// FRAGMENTS //
///////////////

fragment ANY_BUT_PIPE
  : ESC PIPE
  | ~(PIPE|'\n'|'\r')
  ;

fragment ANY_BUT_BRACKET
  : ESC R_BRACKET
  | ~(R_BRACKET|'\n'|'\r')
  ;

fragment ANY_BUT_STAR
  : ESC STAR
  | ~(STAR|'\n'|'\r')
  ;

fragment ANY_BUT_BACKTICK
  : ESC BACKTICK
  | ~(BACKTICK|'\n'|'\r')
  ;

fragment ANY_BUT_ENDLINK
  : ~(UNDERSCORE|' '|'\t'|'\n'|'\r')
  ;

fragment ESC
  : '\\'
  ;

fragment STAR
  : '*'
  ;

fragment BACKTICK
  : '`'
  ;

fragment PIPE
  : '|'
  ;

fragment L_BRACKET
  : '['
  ;

fragment R_BRACKET
  : ']'
  ;

fragment UNDERSCORE
  : '_'
  ;
The grammar works fine for inline_markup, but normal_text is never matched.
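One thing worth checking is what the negated set in normal_text can actually match. In a parser rule, `~( ... )` negates token types, not characters, so it can only match token types that the lexer emits and that are not in the list. The sketch below models this with plain Java sets. The class name `NegatedTokenSetSketch` and the use of strings for token types are assumptions made for illustration; this is not the ANTLR API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration: token types modelled as strings, not ANTLR API.
class NegatedTokenSetSketch {

    // Token types the lexer in this grammar can emit (fragment rules never become tokens).
    static final Set<String> EMITTED = new LinkedHashSet<>(Arrays.asList(
        "EMPHASIS", "SUBSTITUTION_REFERENCE", "STRONG", "LITTERAL",
        "INTERPRETED_TEXT", "INLINE_INTERNAL_TARGET", "FOOTNOTE_REFERENCE",
        "HYPERLINK_REFERENCE", "WS", "NEWLINE"));

    // Token types listed inside ~( ... ) in the normal_text rule.
    static final Set<String> EXCLUDED = new LinkedHashSet<>(Arrays.asList(
        "EMPHASIS", "SUBSTITUTION_REFERENCE", "STRONG", "LITTERAL",
        "INTERPRETED_TEXT", "INLINE_INTERNAL_TARGET", "FOOTNOTE_REFERENCE",
        "HYPERLINK_REFERENCE", "NEWLINE"));

    // Token types sent to the hidden channel by {$channel=HIDDEN;}.
    static final Set<String> HIDDEN = new LinkedHashSet<>(Arrays.asList("WS", "NEWLINE"));

    // Complement of EXCLUDED over the emitted token types.
    static Set<String> matchableByNormalText() {
        Set<String> result = new LinkedHashSet<>(EMITTED);
        result.removeAll(EXCLUDED);
        return result;
    }

    // The subset of matchable types that the parser would actually see.
    static Set<String> visibleToParser() {
        Set<String> result = matchableByNormalText();
        result.removeAll(HIDDEN);
        return result;
    }

    public static void main(String[] args) {
        System.out.println("normal_text could match: " + matchableByNormalText());
        System.out.println("of which visible to the parser: " + visibleToParser());
    }
}
```

Under these assumptions the only token type left for normal_text is WS, and WS is hidden, so the rule has nothing to match: the lexer defines no token for ordinary words. The same reasoning applies to NEWLINE, which is also on the hidden channel even though the line rule requires a NEWLINE token.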
Here is my test class:
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
import org.antlr.runtime.tree.Tree;
public class Test {
    public static void main(String[] args) throws RecognitionException, IOException {
        // Read the whole test resource into memory.
        InputStream is = Test.class.getResourceAsStream("test.rst");
        Reader r = new InputStreamReader(is);
        StringBuilder source = new StringBuilder();
        char[] buffer = new char[1024];
        int readLength = 0;
        while ((readLength = r.read(buffer)) > 0) {
            // Append only the characters actually read in this pass.
            source.append(buffer, 0, readLength);
        }
        r.close();
        System.out.println(source.toString());

        // Lex and parse the input, then print the resulting tree.
        ANTLRStringStream in = new ANTLRStringStream(source.toString());
        RstLexer lexer = new RstLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        RstParser parser = new RstParser(tokens);
        RstParser.file_return out = parser.file();
        System.out.println(((Tree)out.getTree()).toStringTree());
    }
}
And the input file I use:
In `Figure 17-6`_, we have positioned ``before_ptr`` so that it points to the element
*before* the insert point. The variable ``after_ptr`` points to the |element| *after* the
insert. In other words, `we are going`_ to put_ our new element **in between** ``before_ptr``
and ``after_ptr``.
And I get this output:
HYPERLINK_REFERENCE (short): 7-6`_
line 1:2 mismatched character ' ' expecting '_'
line 1:10 mismatched character ' ' expecting '_'
line 1:18 mismatched character ' ' expecting '_'
line 1:21 mismatched character ' ' expecting '_'
line 1:26 mismatched character ' ' expecting '_'
line 1:37 mismatched character ' ' expecting '_'
LITTERAL: `before_ptr`
line 1:86 no viable alternative at character '\r'
line 1:55 mismatched character ' ' expecting '_'
line 1:60 mismatched character ' ' expecting '_'
line 1:63 mismatched character ' ' expecting '_'
line 1:70 mismatched character ' ' expecting '_'
line 1:73 mismatched character ' ' expecting '_'
line 1:77 mismatched character ' ' expecting '_'
line 1:85 mismatched character ' ' expecting '_'
EMPHASIS: *before*
line 2:12 mismatched character ' ' expecting '_'
line 2:19 mismatched character ' ' expecting '_'
line 2:26 mismatched character ' ' expecting '_'
LITTERAL: `after_ptr`
line 2:30 mismatched character ' ' expecting '_'
line 2:39 mismatched character ' ' expecting '_'
line 2:90 no viable alternative at character '\r'
line 2:60 mismatched character ' ' expecting '_'
line 2:63 mismatched character ' ' expecting '_'
line 2:67 mismatched character ' ' expecting '_'
line 2:77 mismatched character ' ' expecting '_'
line 2:85 mismatched character ' ' expecting '_'
line 2:89 mismatched character ' ' expecting '_'
line 3:7 mismatched character ' ' expecting '_'
line 3:10 mismatched character ' ' expecting '_'
line 3:16 mismatched character ' ' expecting '_'
line 3:23 mismatched character ' ' expecting '_'
line 3:27 mismatched character ' ' expecting '_'
line 3:31 mismatched character ' ' expecting '_'
line 3:42 mismatched character ' ' expecting '_'
line 3:51 mismatched character ' ' expecting '_'
line 3:55 mismatched character ' ' expecting '_'
line 3:63 mismatched character ' ' expecting '_'
line 3:94 mismatched character '\r' expecting '*'
line 4:3 mismatched character ' ' expecting '_'
line 4:18 no viable alternative at character '\r'
line 4:18 mismatched character '\r' expecting '_'
HYPERLINK_REFERENCE (short): oing`_
HYPERLINK_REFERENCE (short): ut_
EMPHASIS: *in between*
LITTERAL: `after_ptr`
BR.recoverFromMismatchedToken
line 0:-1 mismatched input '<EOF>' expecting NEWLINE
null
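Most of the messages above have the form `mismatched character ' ' expecting '_'`. That would be consistent with the lexer entering the short alternative of HYPERLINK_REFERENCE for every ordinary word (no other rule can start with a letter) and only finding out at the following space that there is no underscore. The plain-Java sketch below shows the check one would want to happen before committing to that alternative. The class and method names are hypothetical, and this is not how ANTLR implements lookahead.

```java
// Hypothetical illustration: look ahead to the end of the word before deciding
// that it is a short hyperlink reference (ANY_BUT_ENDLINK+ UNDERSCORE).
class ShortReferenceSketch {

    // Mirrors the characters excluded by the ANY_BUT_ENDLINK fragment.
    static boolean isEndLink(char c) {
        return c == '_' || c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Returns the end index (exclusive) of a short reference starting at pos,
    // or -1 if the word at pos is not immediately followed by an underscore.
    static int matchShortReference(String s, int pos) {
        int i = pos;
        while (i < s.length() && !isEndLink(s.charAt(i))) {
            i++;
        }
        boolean hasBody = i > pos;  // the '+' in ANY_BUT_ENDLINK+ needs at least one character
        if (hasBody && i < s.length() && s.charAt(i) == '_') {
            return i + 1;           // include the trailing underscore
        }
        return -1;                  // not a reference: leave the word to another rule
    }

    public static void main(String[] args) {
        System.out.println(matchShortReference("put_ our new element", 0));
        System.out.println(matchShortReference("In other words", 0));
    }
}
```

In this sketch a word such as `In` is rejected up front instead of producing an error at the space, which leaves room for a separate rule that accepts it as regular text.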
Could you point out my error? (When I add the filter=true; option to the grammar, the parser handles the inline markup without errors.)
Robin