python - Is there a simple string parser for Python?

Question

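I have a string like this: 'aaa(cc(kkk)c)ddd[lll]{m(aa)mm}'. From that string I want to get the following structure: ['aaa', '(cc(kkk)c)', 'ddd', '[lll]', '{m(aa)mm}']. In other words, I want to separate out the substrings that sit inside the different types of brackets.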

score 7 · Accepted Answer

You need to use a stack approach to keep track of the nesting level:

pairs = {'{': '}', '[': ']', '(': ')'}

def parse_groups(string):
    stack = []
    last = 0
    for i, c in enumerate(string):
        if c in pairs:
            # push onto the stack when we find an opener
            if not stack and last < i:
                # yield anything *not* grouped
                yield string[last:i]
            stack.append((c, i))
        elif c in pairs.values():
            if stack and pairs[stack[-1][0]] == c:
                # Found a closing bracket, pop the stack
                start = stack.pop()[1]
                if not stack:
                    # Group fully closed, yield
                    yield string[start:i + 1]
                    last = i + 1
            else:
                raise ValueError('Missing opening parenthesis')

    if stack:
        raise ValueError('Missing closing parenthesis')

    if last < len(string):
        # yield the tail
        yield string[last:]

This generates the groups; cast the result to a list if you need one:

>>> list(parse_groups('aaa(cc(kkk)c)ddd[lll]{m(aa)mm}'))
['aaa', '(cc(kkk)c)', 'ddd', '[lll]', '{m(aa)mm}']

If the brackets/parentheses are not balanced, an exception is raised:

>>> list(parse_groups('aa(bb'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 19, in parse_groups
ValueError: Missing closing parenthesis
>>> list(parse_groups('aa[{bb}}'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 20, in parse_groups
ValueError: Missing opening parenthesis
>>> list(parse_groups('aa)bb'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 20, in parse_groups
ValueError: Missing opening parenthesis

score 1 · Accepted Answer

You could also take a look at pyparsing. Interestingly, this can be implemented as a stack: push a string fragment when you find one of {[( and pop it when you find one of )]}.
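The push/pop idea can be sketched in plain Python roughly as follows. This is only an illustration of the fragment stack, not pyparsing's API; the names split_groups, PAIRS and CLOSERS are made up for the example, and it assumes the three bracket types from the question.

```python
# Hypothetical sketch of the "push fragments, pop on close" idea.
PAIRS = {'(': ')', '[': ']', '{': '}'}
CLOSERS = set(PAIRS.values())


def split_groups(text):
    """Split text into top-level plain runs and bracketed groups."""
    result = []  # finished top-level pieces
    stack = []   # one [opener, pieces] entry per bracket that is still open
    plain = []   # characters seen outside any bracket
    for ch in text:
        if ch in PAIRS:
            if not stack and plain:
                # A top-level group starts: flush the plain run before it.
                result.append(''.join(plain))
                plain = []
            # Push a new fragment that starts with the opener itself.
            stack.append([ch, [ch]])
        elif ch in CLOSERS:
            if not stack or PAIRS[stack[-1][0]] != ch:
                raise ValueError('Unmatched closing bracket: %r' % ch)
            # Pop the finished fragment and close it.
            _, pieces = stack.pop()
            fragment = ''.join(pieces) + ch
            if stack:
                # Nested group: fold it back into the enclosing fragment.
                stack[-1][1].append(fragment)
            else:
                result.append(fragment)
        elif stack:
            stack[-1][1].append(ch)
        else:
            plain.append(ch)
    if stack:
        raise ValueError('Unclosed bracket: %r' % stack[-1][0])
    if plain:
        result.append(''.join(plain))
    return result


print(split_groups('aaa(cc(kkk)c)ddd[lll]{m(aa)mm}'))
# ['aaa', '(cc(kkk)c)', 'ddd', '[lll]', '{m(aa)mm}']
```

Because each popped fragment is folded back into its parent, only fully closed top-level groups end up in the result, which matches the output asked for in the question.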

score 0 · Accepted Answer

I think you could try the Custom String Parser library (I'm its author). It is designed to work with data that has any logical structure, so you can customize it to fit your needs ;)
