python - How can I parse a C format string in Python?

Question

I have this code in a C file:

printf("Worker name is %s and id is %d", worker.name, worker.id);

I would like to use Python to parse the format string and locate the "%s" and the "%d".

So I want a function:

>>> my_function("Worker name is %s and id is %d")
[Out1]: ((15, "%s"), (28, "%d"))

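I tried to achieve this with libclang's Python bindings and with pycparser, but I could not see how to do it with these tools.

I also tried solving this with regular expressions, but that is not simple at all: consider cases where the printf format contains "%%s" and the like.

Both gcc and clang obviously do this as part of compiling. Has no one exported this logic to Python?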

score 9 · Accepted Answer

You can certainly find properly formatted candidates with a regex.

Take a look at the definition of the C Format Specification. (I'm using Microsoft's, but use whichever you need.)

It is:

%[flags] [width] [.precision] [{h | l | ll | w | I | I32 | I64}] type
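To make the grammar above concrete, here is a minimal sketch that splits a single specifier into its fields using named groups. The group names and the helper `split_spec` are hypothetical names chosen for illustration; the allowed flags, sizes, and type characters are taken from the grammar line and the regex in this answer, not from an authoritative parser.

```python
import re

# Illustrative pattern; the group names (flags, width, precision, size, type)
# are assumptions that simply mirror the grammar line above.
SPEC_RE = re.compile(r'''
    %
    (?P<flags>[-+0 #]*)                # zero or more flag characters
    (?P<width>\d+|\*)?                 # digits, or "*" for "taken from an argument"
    (?:\.(?P<precision>\d+|\*))?       # "." followed by digits or "*"
    (?P<size>h|ll|l|w|I32|I64|I)?      # size prefix
    (?P<type>[cCdiouxXeEfgGaAnpsSZ])   # conversion type
''', re.VERBOSE)


def split_spec(spec):
    """Return the fields of one specifier as a dict, or None if it does not match."""
    m = SPEC_RE.fullmatch(spec)
    return m.groupdict() if m else None


print(split_spec("%-10.3lf"))
print(split_spec("%*.*s"))
```

Fields that are absent come back as `None`, except `flags`, which is an empty string because that group can match zero characters.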

There is also the special case of %%, which printf outputs as a single %.

You can translate that pattern into a regex:

(                                 # start of capture group 1
%                                 # literal "%"
(?:                               # first option
(?:[-+0 #]{0,5})                  # optional flags
(?:\d+|\*)?                       # width
(?:\.(?:\d+|\*))?                 # precision
(?:h|l|ll|w|I|I32|I64)?           # size
[cCdiouxXeEfgGaAnpsSZ]            # type
) |                               # OR
%%)                               # literal "%%"
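The "%%" alternative is what handles the "%%s" case raised in the question. As a small sketch (the sample string and the simplified `%[sd]` pattern are made up for illustration), compare a naive search with one that also consumes "%%". Because regex matches never overlap, matching "%%" first uses up both percent signs, so the following "s" is no longer mistaken for part of a specifier.

```python
import re

text = "100%%s and %s"

# Naive: any "%" followed by "s" or "d". It wrongly reports the second
# percent sign of "%%" plus the "s" after it as a "%s" specifier.
naive = [(m.start(), m.group()) for m in re.finditer(r'%[sd]', text)]

# Escape-aware: "%%" is matched as a unit, so only the real "%s" remains
# once the "%%" matches are filtered out.
aware = [(m.start(), m.group()) for m in re.finditer(r'%%|%[sd]', text)]
specs = [(pos, tok) for pos, tok in aware if tok != '%%']

print(naive)  # [(4, '%s'), (11, '%s')]
print(aware)  # [(3, '%%'), (11, '%s')]
print(specs)  # [(11, '%s')]
```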

Demo

And then into a Python regex:

import re

lines='''\
Worker name is %s and id is %d
That is %i%%
%c
Decimal: %d  Justified: %.6d
%10c%5hc%5C%5lc
The temp is %.*f
%ss%lii
%*.*s | %.3d | %lC | %s%%%02d'''

# Raw string so the regex backslashes reach re unchanged; re.X ignores the leading newline
cfmt = r'''
(                                  # start of capture group 1
%                                  # literal "%"
(?:                                # first option
(?:[-+0 #]{0,5})                   # optional flags
(?:\d+|\*)?                        # width
(?:\.(?:\d+|\*))?                  # precision
(?:h|l|ll|w|I|I32|I64)?            # size
[cCdiouxXeEfgGaAnpsSZ]             # type
) |                                # OR
%%)                                # literal "%%"
'''

for line in lines.splitlines():
    # Collect (start index, matched text) for every specifier on the line
    found = tuple((m.start(1), m.group(1)) for m in re.finditer(cfmt, line, flags=re.X))
    print('"{}"\n\t{}\n'.format(line, found))

Prints:

"Worker name is %s and id is %d"
    ((15, '%s'), (28, '%d'))

"That is %i%%"
    ((8, '%i'), (10, '%%'))

"%c"
    ((0, '%c'),)

"Decimal: %d  Justified: %.6d"
    ((9, '%d'), (24, '%.6d'))

"%10c%5hc%5C%5lc"
    ((0, '%10c'), (4, '%5hc'), (8, '%5C'), (11, '%5lc'))

"The temp is %.*f"
    ((12, '%.*f'),)

"%ss%lii"
    ((0, '%s'), (3, '%li'))

"%*.*s | %.3d | %lC | %s%%%02d"
    ((0, '%*.*s'), (8, '%.3d'), (15, '%lC'), (21, '%s'), (23, '%%'), (25, '%02d'))
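For the interface requested in the question, the same pattern can be wrapped in a function. This is only a sketch: `my_function` is the name used in the question, while the `include_escapes` parameter is a hypothetical addition that controls whether the literal "%%" matches are reported alongside the real specifiers.

```python
import re

# Same pattern as above, compiled once in verbose mode.
C_FORMAT_RE = re.compile(r'''
    (                                # start of capture group 1
    %                                # literal "%"
    (?:
        [-+0 #]{0,5}                 # optional flags
        (?:\d+|\*)?                  # width
        (?:\.(?:\d+|\*))?            # precision
        (?:h|l|ll|w|I|I32|I64)?      # size
        [cCdiouxXeEfgGaAnpsSZ]       # type
    ) |                              # OR
    %%)                              # literal "%%"
''', re.VERBOSE)


def my_function(fmt, include_escapes=False):
    """Return a tuple of (index, specifier) pairs found in fmt.

    include_escapes is a hypothetical option: when False, "%%" is skipped
    because it does not consume a printf argument.
    """
    return tuple(
        (m.start(1), m.group(1))
        for m in C_FORMAT_RE.finditer(fmt)
        if include_escapes or m.group(1) != '%%'
    )


print(my_function("Worker name is %s and id is %d"))
# ((15, '%s'), (28, '%d'))
```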

score 0 · Answer

Here is some code I wrote that repeatedly finds and prints the index of each %s, %d, or similar format specifier:

import re

def myfunc(text):
    # Take the first parenthesised group, e.g. the argument list of printf(...)
    match = re.search(r'\(.*?\)', text)
    if match:
        # Strip the parentheses and double quotes
        new_str = match.group().translate(str.maketrans('', '', '()"'))
        print(new_str)
        parse(new_str)
    else:
        print("No match")

def parse(text):
    try:
        g = text.index('%')
    except ValueError:
        return  # no '%' left
    # Print the character after '%' and the index of '%' in the stripped string
    print(" %", text[g + 1], " = ", g)
    # Blank out this '%' so the next call finds the following one
    parse(text[:g] + ' ' + text[g + 1:])

text = input()
myfunc(text)
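The same index-based idea can also be written as a loop that does not modify the string and that treats "%%" as an escape. This is a rough sketch under those assumptions; `percent_positions` is a hypothetical helper, and it reports only the single character after each "%", not a full specifier with flags or width.

```python
def percent_positions(fmt):
    """Return (index, next_char) for each "%" in fmt, skipping "%%" escapes."""
    found = []
    i = fmt.find('%')
    while i != -1:
        nxt = fmt[i + 1:i + 2]  # empty string if "%" is the last character
        if nxt == '%':
            # "%%" is a literal percent sign: jump past both characters
            i = fmt.find('%', i + 2)
            continue
        found.append((i, nxt))
        i = fmt.find('%', i + 1)
    return found


print(percent_positions("Worker name is %s and id is %d"))
# [(15, 's'), (28, 'd')]
```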

Hope that helps.
