python - Removing C function bodies with Python

Question

I am looking for a way to remove the entire body of every function in a C source file.

For example, I have a file with the following content:

1.  int func1 (int para) {
2.    return para;
3.  }
4.
5.  int func2 (int para) {
6.    if (1) {
7.      return para;
8.    }
9.    return para;
10. }

I tried this regular expression:

content = re.sub('(\{[.*]?\})', '', content, flags=re.DOTALL)

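But it has a problem with nested { }. The regex only replaces up to the first }, so lines 9 and 10 are still in the content. I think the solution is to count the { and } brackets and stop replacing when the counter reaches 0: { found => counter++, } found => counter--. But I don't know how to implement this in Python. Can anyone give me a kick-start?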
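As a rough illustration of the counter idea described above, here is a minimal sketch. The name strip_function_bodies is hypothetical, and the sketch assumes that braces never appear inside string literals, character constants, or comments. It also drops every top-level brace block, so struct, enum, and initializer bodies would disappear along with function bodies.

```python
def strip_function_bodies(source: str) -> str:
    """Drop every top-level {...} block, braces included (hypothetical helper)."""
    kept = []
    depth = 0  # number of braces currently open
    for ch in source:
        if ch == "{":
            depth += 1  # entering a (possibly nested) block
        elif ch == "}":
            depth -= 1  # leaving a block
            if depth < 0:
                raise ValueError("unbalanced '}' in input")
        elif depth == 0:
            kept.append(ch)  # keep only text that is outside every block
    if depth != 0:
        raise ValueError("unbalanced '{' in input")
    return "".join(kept)
```

Because the counter only returns to 0 at the brace that closes the function, the nested if block in func2 is skipped together with the rest of the body.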

score 9 · Accepted Answer

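I think you are trying to reinvent a wheel that has been implemented many times before. If all you need is to extract the signature of each function in a C file, there is a much easier way.

The ctags utility takes care of this for you: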

~/test$ ctags -x --c-types=f ./test.c
func1            function      1 ./test.c         int func1 (int para) {
func2            function      5 ./test.c         int func2 (int para) {
~/test$ # Clean up the output a little bit
~/test$ ctags -x --c-types=f ./test.c | sed -e 's/\s\+/ /g' | cut -d ' ' -f 5-
int func1 (int para) {
int func2 (int para) {
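If the signatures are needed inside a Python program, the cross-reference lines shown above can be split into fields. This is only a sketch: TagEntry and parse_ctags_xref are hypothetical names, and the column layout (name, kind, line number, file, source text) is assumed from the sample output above. A file path that contains spaces would break the split.

```python
from typing import NamedTuple


class TagEntry(NamedTuple):
    """One line of `ctags -x` output (hypothetical container)."""
    name: str
    kind: str
    line: int
    path: str
    text: str


def parse_ctags_xref(line: str) -> TagEntry:
    # Split on runs of whitespace at most four times, so the trailing
    # source text keeps its internal spacing.
    name, kind, lineno, path, text = line.rstrip("\n").split(None, 4)
    return TagEntry(name, kind, int(lineno), path, text)


def function_signatures(lines):
    """Return the source text of every entry whose kind is 'function'."""
    entries = (parse_ctags_xref(line) for line in lines if line.strip())
    return [entry.text for entry in entries if entry.kind == "function"]
```

The lines themselves could come from running the ctags command above through subprocess and splitting its standard output on newlines.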

score 0 · Accepted Answer

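Here is one of my scripts for removing function bodies from a C source file. The only requirement is the ctags installed with brew on Mac OS X, not the ctags that ships with Mac OS X. I could not figure out why it does not work with the built-in ctags. You can install ctags with brew by typing the following command: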

$ brew install ctags

Then use the following Perl script, named dummyc.pl, with your C source file. For example, given this input C source:

int
func1 (int para)
{
  return para;
}

int
func2 (int para)
{
  if (1)
    {
      return para;
    }
  return para;
}

this is the output:

int
func1 (int para)
{
  return 0;
}

int
func2 (int para)
{
  return 0;
}

This is the Perl script:

#!/usr/bin/env perl
use strict;
use warnings;

unless ( @ARGV == 1 )
{
  print "Filter out the body of C functions.
Usage: dummyc.pl file.c
Required: ctags (e.g., \$ brew install ctags)\n";
  exit;
}

my $cfile = $ARGV[0];
my $lc = 1;
my $kindPrev = "comment";
my $lnPrev = 1;
my $lsPrev = "comment";
my $namePrev = "comment";
my $line = 1;
open(CFILE, $cfile) or die "could not open $cfile: $!";
open(PIPE, "/usr/local/Cellar/ctags/5.8/bin/ctags -xu $cfile|") or die "couldn't start pipe: $!";
while ($line)
{
  last unless $line;
  # R_USE_SIGNALS    macro        24 errors.c         #define R_USE_SIGNALS 1
  $line = <PIPE>;
  my $name;
  my $kind;
  my $ln;
  my $ls;
  if ($line)
  {
    $line =~ /^(\S+)\s+(\w+)\s+(\d+)\s+$cfile\s+(.+)/;
    $name = $1;
    $kind = $2;
    $ln = $3;
    $ls = $4;
  }
  else
  {
    $ln = 1000000;
  }

  if ($kindPrev eq "function") 
  {
    my $isFunctionBody = 0;
    my $hasStartBrace = 0;
    my $hasReturnValue = 1;
    my $noReturn = 0;
    for (my $i = $lnPrev; $i < $ln; $i++)
    {
      my $cline = <CFILE>;
      last unless $cline;

      if ($cline =~ /void.+$namePrev/)
      {
        $hasReturnValue = 0;  
      }
      if ($cline =~ /NORET.+$namePrev/)
      {
        $noReturn = 1;  
      }
      if ($isFunctionBody == 0 and $cline =~ /\{/)
      {
        $isFunctionBody = 1;
        unless ($cline =~ /^\{/)
        {
          $hasStartBrace = 1;
          print $cline;
        }
      }
      elsif ($cline =~ /^\}/)
      {
        $isFunctionBody = 0;
        print "{\n" if $hasStartBrace == 0;
        if ($noReturn == 0)
        {
          if ($hasReturnValue == 1)
          {
            print "  return 0;\n";
          }
          else
          {
            print "  return;\n";
          }
        }
      }
      unless ($isFunctionBody == 1)
      {
        print $cline;
      }
    }
  }
  else
  {
    for (my $i = $lnPrev; $i < $ln; $i++)
    {
      my $cline = <CFILE>;
      last unless $cline;
      print $cline;
    }
  }
  $kindPrev = $kind;
  $lnPrev = $ln;
  $lsPrev = $ls;
  $namePrev = $name;
}
close(PIPE) or die "couldn't close pipe: $! $?";
close(CFILE) or die "couldn't close $cfile: $! $?";

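You may need to edit the Perl script, though. For example, the path to ctags is hard-coded as /usr/local/Cellar/ctags/5.8/bin/ctags.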

score 0 · Accepted Answer

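This is a pure Python solution, and it is very simple to implement.

Function to extract the body

Basically, you try to match each { with its corresponding }:

- If there are two { before the next }, you are entering a scope.
- On the other hand, if there is one } before the next {, you are exiting the scope.

The implementation is straightforward:

- Look for all the indexes of { and keep them in a list; do the same for } in another list.
- Also maintain a scope depth variable.
  - If the current { position is below the current } position, you are entering a scope: add 1 to the scope depth and move to the next { position.
  - If the current { position is above the current } position, you are exiting a scope: subtract 1 from the scope depth and move to the next } position.
- When the scope depth variable reaches 0, you have found the closing brace of the function body.

Suppose you have the string that starts right after the first brace of the function body (brace excluded). Calling the following function with this substring gives you the position of the last brace: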

import re

def find_ending_brace(string_from_first_brace):
  # Offsets of every opening and closing brace in the substring.
  starts = [m.start() for m in re.finditer(r'\{', string_from_first_brace)]
  ends = [m.start() for m in re.finditer(r'\}', string_from_first_brace)]

  i = 0
  j = 0
  current_scope_depth = 1  # the function's own opening brace is already behind us

  while current_scope_depth > 0:
    if j == len(ends):
      raise ValueError("unbalanced braces: no closing brace left")
    if i < len(starts) and starts[i] < ends[j]:
      # the next brace is an opening one: entering a nested scope
      current_scope_depth += 1
      i += 1
    else:
      # the next brace is a closing one (or no { is left): exiting a scope
      current_scope_depth -= 1
      j += 1

  return ends[j-1]

Extracting candidate function definitions

Now, if the original text of your file is in the variable my_content,

find_func_begins = [m for m in re.finditer(r"\w+\s+(\w+)\s*\((.*?)\)\s*\{", my_content)]

gives you the start of each function definition (find_func_begins[0].group(1) == 'func1' and find_func_begins[0].group(2) == 'int para'), and

my_content[
  find_func_begins[0].start():
    find_func_begins[0].end() +
    find_ending_brace(my_content[find_func_begins[0].end():])]

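gives you the text of the first function, from its signature through the end of its body (the closing brace excluded).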

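Extracting the prototypes

Since the regular expression behind find_func_begins is a bit loose, I suppose you should only look for the next function definition after the closing brace of the current one has been reached. Iterating over each function definition and its matching brace yields the following iterative algorithm:

reg_ex = r"\w+\s+(\w+)\s*\((.*?)\)\s*\{"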
last = 0
protos = ""
find_func_begins = [m for m in re.finditer(reg_ex, my_content[last:], re.MULTILINE | re.DOTALL)]
while(len(find_func_begins) > 0):
  function_begin = find_func_begins[0]
  function_proto_end = last + function_begin.end()
  protos += my_content[last: function_proto_end-1].strip() + ";\n\n"

  last = function_proto_end + find_ending_brace(my_content[function_proto_end:]) + 1
  find_func_begins = [m for m in re.finditer(reg_ex, my_content[last:], re.MULTILINE | re.DOTALL)]

What you want should now be in protos. Hope this helps!
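For comparison, the same loop can be written without re-slicing the file on every iteration, by passing a start position to a compiled pattern's search method. This is a sketch under the same assumptions as above (no braces inside strings or comments). The names FUNC_HEADER, matching_brace, and iter_prototypes are hypothetical, and the brace matcher here is a single-pass variant, not the two-list function shown earlier.

```python
import re
from typing import Iterator

# Same loose pattern as above: a word, a name, a parameter list, then '{'.
FUNC_HEADER = re.compile(r"\w+\s+(\w+)\s*\((.*?)\)\s*\{", re.DOTALL)


def matching_brace(source: str, open_pos: int) -> int:
    """Return the index of the '}' that closes the '{' at open_pos."""
    depth = 0
    for pos in range(open_pos, len(source)):
        if source[pos] == "{":
            depth += 1
        elif source[pos] == "}":
            depth -= 1
            if depth == 0:
                return pos
    raise ValueError("unbalanced braces")


def iter_prototypes(source: str) -> Iterator[str]:
    """Yield 'signature;' for each top-level function definition found."""
    pos = 0
    while True:
        match = FUNC_HEADER.search(source, pos)
        if match is None:
            return
        open_brace = match.end() - 1  # the pattern ends on the body's '{'
        # Like the loop above, keep everything since the previous body.
        yield source[pos:open_brace].strip() + ";"
        pos = matching_brace(source, open_brace) + 1
```

Joining the yielded strings with blank lines gives the same kind of text as protos.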
