python - Lexer for parsing to the end of the line

Question

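Given a keyword, how can I grab the rest of the line once the keyword is found and return it as a string? When the end of the line is reached, everything on that line after the keyword should be returned.

Here is the line I am looking at: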

  description here is the rest of my text to collect

So when the lexer encounters description, I would like "here is the rest of my text to collect" returned as a string.

I have the following defined, but it seems to be throwing an error:

states = (
     ('bcdescription', 'exclusive'),
)

def t_bcdescription(t):
    r'description '
    t.lexer.code_start = t.lexer.lexpos
    t.lexer.level = 1
    t.lexer.begin('bcdescription')

def t_bcdescription_close(t):
    r'\n'
    t.value = t.lexer.lexdata[t.lexer.code_start:t.lexer.lexpos+1]
    t.type="BCDESCRIPTION"
    t.lexer.lineno += t.value.count('\n')
    t.lexer.begin('INITIAL')
    return t

Here is part of the error that is returned:

  File "/Users/me/Coding/wm/wm_parser/ply/lex.py", line 393, in token
    raise LexError("Illegal character '%s' at index %d" % (lexdata[lexpos],lexpos), lexdata[lexpos:])
ply.lex.LexError: Illegal character ' ' at index 40

Finally, if I wanted this functionality for more than one token, how could I accomplish that?

Thanks for your time.

score 0 · Accepted Answer

There is no big problem with your code. In fact, I just copied your code and ran it, and it works well:

import ply.lex as lex 

states = ( 
     ('bcdescription', 'exclusive'),
)

tokens = ("BCDESCRIPTION",)

def t_bcdescription(t):
    r'\bdescription\b'
    t.lexer.code_start = t.lexer.lexpos
    t.lexer.level = 1 
    t.lexer.begin('bcdescription')

def t_bcdescription_close(t):
    r'\n'
    t.value = t.lexer.lexdata[t.lexer.code_start:t.lexer.lexpos+1]
    t.type="BCDESCRIPTION"
    t.lexer.lineno += t.value.count('\n')
    t.lexer.begin('INITIAL')
    return t

def t_bcdescription_content(t):
    r'[^\n]+'
    # Consume everything up to the newline; returning nothing discards the token.

lexer = lex.lex()
data = 'description here is the rest of my text to collect\n'
lexer.input(data)

# Print every token the lexer produces.
while True:
    tok = lexer.token()
    if not tok:
        break
    print(tok)

The result is:

LexToken(BCDESCRIPTION,' here is the rest of my text to collect\n',1,50)

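So maybe you can check the other parts of your code.

If you need this behavior for several tokens, just capture the keyword: whenever one of those keywords appears, you can start capturing the rest of the line with the code above.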
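To make the multi-keyword idea concrete, here is a minimal sketch rather than a definitive implementation. It assumes a second keyword, `title`, which does not appear in the question, and the names `KEYWORDS`, `restofline`, `pending_type`, and `collect` are made up for illustration. One INITIAL rule matches any keyword and remembers which token type to emit; a single exclusive state then collects the rest of the line.

```python
import ply.lex as lex

# Hypothetical keyword-to-token mapping; only 'description' comes from the question.
KEYWORDS = {
    'description': 'BCDESCRIPTION',
    'title': 'BCTITLE',
}

tokens = tuple(KEYWORDS.values())

states = (
    ('restofline', 'exclusive'),
)

t_ignore = ' \t'          # skip blanks outside keyword lines
t_restofline_ignore = ''  # keep every character while collecting a line


def t_keyword(t):
    r'\b(?:description|title)\b'
    # Remember which token to emit and where the remaining text starts.
    t.lexer.pending_type = KEYWORDS[t.value]
    t.lexer.code_start = t.lexer.lexpos
    t.lexer.begin('restofline')


def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)


def t_error(t):
    # Skip anything unexpected outside a keyword line.
    t.lexer.skip(1)


def t_restofline_content(t):
    r'[^\n]+'
    # Consume the rest of the line; the text is sliced out in the close rule.


def t_restofline_close(t):
    r'\n'
    # Slice from just after the keyword up to (not including) the newline.
    t.value = t.lexer.lexdata[t.lexer.code_start:t.lexpos].strip()
    t.type = t.lexer.pending_type
    t.lexer.lineno += 1
    t.lexer.begin('INITIAL')
    return t


def t_restofline_error(t):
    t.lexer.skip(1)


lexer = lex.lex()


def collect(text):
    """Return (token type, rest of line) pairs for every keyword line in text."""
    lexer.input(text)
    return [(tok.type, tok.value) for tok in lexer]
```

Unlike the code above, this sketch strips the leading space and the trailing newline from the value, and it only emits a token for lines that end with a newline. The `content` rule matters: inside an exclusive state only that state's rules apply, so without a rule that matches the non-newline characters PLY has nothing to match the space after the keyword and reports an illegal character.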

score -1 · Accepted Answer

Without more information, it is not clear why you need a lexer/parser for this.

>>> x = 'description here is the rest of my text to collect'
>>> a, b = x.split(' ', 1)
>>> a
'description'
>>> b
'here is the rest of my text to collect'
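If several keywords are involved, the same idea can be wrapped in a small helper. This is only a sketch: the function name `rest_of_line` and the `keywords` parameter are made up for illustration, and it assumes the keyword is the first word on the line, separated from the rest by a single space.

```python
def rest_of_line(line, keywords=('description',)):
    """Return (keyword, rest) if the line starts with a known keyword, else None."""
    # Drop surrounding whitespace (including the newline), then split on the first space.
    head, _, rest = line.strip().partition(' ')
    if head in keywords:
        return head, rest
    return None
```

`str.partition` is used instead of `split(' ', 1)` so that a line containing only the keyword still unpacks cleanly, with an empty string as the rest.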
