c - flex+bison output into a glib hash container

Question

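I have made considerable progress in parsing a bib file, but the next step is very hard at my current level of understanding. I have written bison and flex code that parses my bib file correctly (an example input is given further down).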

%{
#include <stdio.h>
%}

// Symbols.
%union
{
    char    *sval;
};
%token <sval> VALUE
%token <sval> KEY
%token OBRACE
%token EBRACE
%token QUOTE
%token SEMICOLON 

%start Input
%%
Input: 
     /* empty */ 
     | Input Entry ;  /* input is zero or more entries */
Entry: 
     '@' KEY '{' KEY ','{ printf("===========\n%s : %s\n",$2, $4); } 
     KeyVals '}' 
     ;
KeyVals: 
       /* empty */ 
       | KeyVals KeyVal ; /* zero or more keyvals */
KeyVal: 
      KEY '=' VALUE ',' { printf("%s : %s\n",$1, $3); };

%%

int yyerror(char *s) {
  printf("yyerror : %s\n",s);
}

int main(void) {
  yyparse();
}

and

%{
#include "bib.tab.h"
%}

%%
[A-Za-z][A-Za-z0-9]*      { yylval.sval = strdup(yytext); return KEY; }
\"([^\"]|\\.)*\"|\{([^\"]|\\.)*\}     { yylval.sval = strdup(yytext); return VALUE; }
[ \t\n]                   ; /* ignore whitespace */
[{}@=,]                   { return *yytext; }
.                         { fprintf(stderr, "Unrecognized character %c in input\n", *yytext); }
%%

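I want to put these values into a container. Over the last few days I have read through glib's extensive documentation and found that a hash container suits my case best. Below is some basic hash code: once the values are placed in the arrays keys and vals, the hash is populated correctly.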

#include <glib.h>
#include <glib/gprintf.h>   /* declares g_printf() */
#define slen 1024

int main(gint argc, gchar** argv) 
{
  char *keys[] = {"id", "type", "author", "year",NULL};
  char *vals[] = {"one",  "Book",  "RB", "2013", NULL};
  gint i;
  GHashTable* table = g_hash_table_new(g_str_hash, g_str_equal);
  for (i= 0; i<=3; i++)
  {
    g_hash_table_insert(table, keys[i],vals[i]);
    /* g_hash_table_lookup() returns a gpointer, so cast it for %s */
    g_printf("%d=>%s:%s\n",i,keys[i],(char *)g_hash_table_lookup(table,keys[i]));
  }
  g_hash_table_destroy(table);
  return 0;
}

The problem is how to integrate these two pieces of code, i.e. how to use the parsed data in my C code. Any kind help is much appreciated.

EDIT: To clarify, in response to @UncleO's answer:

Thank you for your comments. I don't know how to explain it better, so let me try again here. The current state of my (bison) code is:

%{
#include <stdio.h>
#include <glib.h>
%}

// Symbols.
%union
{
    char    *sval;
};
%token <sval> VALUE
%token <sval> KEY
%token OBRACE
%token EBRACE
%token QUOTE
%token SEMICOLON 

%start Input
%%
Input: 
     /* empty */ 
     | Input Entry ;  /* input is zero or more entries */
Entry: 
     '@' KEY '{' KEY ','{ printf("===========\n%s : %s\n",$2, $4); } 
     KeyVals '}' 
     ;
KeyVals: 
       /* empty */ 
       | KeyVals KeyVal ; /* zero or more keyvals */
KeyVal: 
      KEY '=' VALUE ',' { printf("%s : %s\n",$1, $3); };

%%

int yyerror(char *s) {
  printf("yyerror : %s\n",s);
}

int main(void) {
 GHashTable* table = g_hash_table_new(g_str_hash, g_str_equal);
  char *keys[] = {"id", "type", "author", "year",NULL};
  char *vals[] = {"one",  "Book",  "RB", "2013", NULL};
  gint i;
  yyparse();
  GHashTableIter iter;
  g_hash_table_iter_init (&iter, table);
  for (i= 0; i<=3; i++)
  {
    g_hash_table_insert(table, keys[i],vals[i]);
    g_printf("%d=>%s:%s\n",i,keys[i],g_hash_table_lookup(table,keys[i]));
  }
}

The lex file is unchanged. The elements of the arrays keys and vals are only there for testing. An example input file is

@Booklet{ab19,
    Author="Rudra Banerjee and A. Mookerjee",
    Editor="sm1",
    Title="sm2",
    Publisher="sm3",
    Volume="sm4",
    Issue="sm5",
    Page="sm6",
    Month="sm8",
    Note="sm9",
    Key="sm10",
    Year="1980",
    Add="osm1",
    Edition="osm2",
}

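So during parsing, the code picks up the values correctly. I want to take these values from the parsed input and insert them into a hash table, a different one for each entry. My final goal is therefore to remove the arrays keys and vals from the code, and the line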

g_hash_table_insert(table, keys[i],vals[i]);

should be replaced by something like

g_hash_table_insert(table, <$1 from bison>,<$3 from bison>);

Does this make sense?

EDIT:

@UncleO: Here is the updated code. Perhaps it makes my intention clearer. I have tried many things to fix it: the prints from the bison actions come out as expected, but the prints from the hash table (the last line of the code) do not.

%{
#include <stdio.h>
#include <glib.h>
#define slen 1024
GHashTable* table;
%}

// Symbols.
%union
{
    char    *sval;
};
%token <sval> VALUE
%token <sval> KEY
%token OBRACE
%token EBRACE
%token QUOTE
%token SEMICOLON 

%start Input
%%
Input: 
     /* empty */ 
     | Input Entry ;  /* input is zero or more entries */
Entry: 
     '@' KEY '{' KEY ','{ g_hash_table_insert(table, "TYPE", $2);
                  g_hash_table_insert(table, "ID", $4);
              g_printf("%s: %s\n", $2, $4);
              } 
     KeyVals '}' 
     ;
KeyVals: 
       /* empty */ 
       | KeyVals KeyVal ; /* zero or more keyvals */
KeyVal: 
      KEY '=' VALUE ',' { g_hash_table_insert(table, $1, $3);
                          g_printf("%s: %s\n", $1, $3); };

%%

int yyerror(char *s) {
  printf("yyerror : %s\n",s);
}

int main(void) {
  table = g_hash_table_new(g_str_hash, g_str_equal);
gint i;
do{
   g_hash_table_remove_all (table);
   yyparse();
   parse_entry (table);
//  g_printf("%s:%s\n","Author=>",g_hash_table_lookup(table,"Author"));
//  g_printf("%s:%s\n","KEY=>",g_hash_table_lookup(table,"KEY"));
  }
  while(!EOF);
}
void parse_entry (GHashTable *table)
{
  GHashTableIter iter;
  gchar *key, *val;
  char *keys[] = {"id", "type", "author", "year", "title", "publisher", "editor", 
    "volume", "number", "pages", "month", "note", "address", "edition", "journal",
    "series", "book", "chapter", "organization", NULL};
  char *vals[] = {NULL,  NULL,  NULL, NULL, NULL,
    NULL,  NULL,  NULL, NULL, NULL,
    NULL,  NULL,  NULL, NULL, NULL,
    NULL,    NULL,  NULL, NULL, NULL};

  gchar **kiter;
  gint i;
  g_hash_table_iter_init (&iter, table);
  while (g_hash_table_iter_next (&iter, (void **)&key, (void **)&val))
  {
    for (kiter = keys, i = 0; *kiter; kiter++, i++)
    {
      if (!g_ascii_strcasecmp(*kiter, key))
      {
    vals[i] = g_strndup(val,slen);
    break;
      }
    g_printf("%d=>%s:%s\n",i,keys[i],vals[i]);
    }
  }
}

score 1 · Accepted Answer

You haven't been clear on what you want to do with the input, but here is an explanation to get you started.

flex is going to take your file of regular expressions and produce a function called yylex().

bison is going to take your grammar file and produce a function called yyparse() that uses the yylex() function repeatedly to tokenize strings. The main() function will only call yyparse() once, and each time the yyparse() function matches a rule in the grammar, it will execute the code fragments you have specified. Right now, you are merely printing the values, but you can do other things like insert into the hash table or whatever you want.

The grammar.y file has sections for code that comes before the definition of yyparse() and code that comes after. It is okay to put the main() function at the end of this file if you want to, but it is usually better to put it in another file and link the two. Usually, the main() function does things like open the input for reading, etc., then calls yyparse() to perform the bulk of the work. After yyparse() returns, main() can clean up.

EDIT: Hi Rudra,

I see you want to keep main() in the grammar file. That's okay.

All you need to do now is change the printf statements in the snippets to insert into the table. The table variable will have to be declared outside of main() for yyparse() to see it.

%{
#include <stdio.h>
#include <glib.h>

/* Global, so that the actions inside yyparse() can see it. */
GHashTable* table;

/* Declarations for the functions the generated parser calls. */
int yylex(void);
int yyerror(const char *s);

%}

/* Symbols. */
%union
{
    char    *sval;
};
%token <sval> VALUE
%token <sval> KEY
%token OBRACE
%token EBRACE
%token QUOTE
%token SEMICOLON 

%start Input
%%
Input: 
     /* empty */ 
     | Input Entry ;  /* input is zero or more entries */
Entry: 
     '@' KEY '{' KEY ','{ printf("===========\n%s : %s\n",$2, $4); } 
     KeyVals '}' 
     ;
KeyVals: 
       /* empty */ 
       | KeyVals KeyVal ; /* zero or more keyvals */
KeyVal: 
      /* store the pair in the table as well as printing it */
      KEY '=' VALUE ',' { g_hash_table_insert(table, $1, $3); printf("%s : %s\n",$1, $3); };

%%

int yyerror(const char *s) {
  printf("yyerror : %s\n",s);
  return 0;
}

int main(void) {
  table = g_hash_table_new(g_str_hash, g_str_equal);

  yyparse();
  return 0;
}
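One thing to watch: the VALUE rule in your lexer copies the whole match with strdup(yytext), so the strings that reach the table still carry their surrounding quotes or braces (the Year value is stored as "1980" including the quote characters). If that is not what you want, something along these lines could replace the strdup() call. strip_delims is a made-up helper name; it only removes one outer pair of delimiters and does not interpret backslash escapes inside the value.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly malloc'd copy of tok without its
 * outer "..." or {...} delimiters. A token that is not wrapped in a
 * matching pair is copied unchanged. Returns NULL if malloc fails.
 * The caller owns the result and releases it with free(). */
char *strip_delims(const char *tok)
{
    size_t len = strlen(tok);
    int wrapped = len >= 2 &&
        ((tok[0] == '"' && tok[len - 1] == '"') ||
         (tok[0] == '{' && tok[len - 1] == '}'));
    size_t start = wrapped ? 1 : 0;
    size_t n = wrapped ? len - 2 : len;
    char *out = malloc(n + 1);

    if (out == NULL)
        return NULL;
    memcpy(out, tok + start, n);
    out[n] = '\0';
    return out;
}
```

In the lexer, the VALUE action would then read yylval.sval = strip_delims(yytext); return VALUE; with the helper defined or declared in the %{ ... %} section of the lex file.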

Are you sure you don't want to do anything with the first terms in the data? It seems like you are not using them for anything.
