c++ - Parser issues with a custom lexer

Question

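I'm looking for help with a custom-built Lexer class that I'm using to parse input. Our professor gave us some skeleton code for our project, and we are required to use it. My problem is this: I need to be able to call multiple functions at once, so that I can sort a table and merge/sort the columns of another table. For example, the input would look like this: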

display <'file_name> sortedby <'column2>

Here 'display' and 'sortedby' act as keywords, and column2 is sorted numerically or alphabetically depending on its contents.

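We were given the algorithm to use for sorting. My problem right now is not implementing it, but getting the lexer/parser to read multiple inputs. Currently I can only get the 'display' bit to work; anything beyond that just spits out an error message.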

I have gone through the code and tried changing some of the logic (switching statements from true to false, swapping && and ||, and trying a few if-else statements), but nothing worked.

I could really use some advice! Here is part of the code, in the form it was provided to us:

Lexer.h:

#ifndef _LEXER_H
#define _LEXER_H
#include <string>

enum token_types_t { 
IDENT,  // a sequence of alphanumeric characters and _, starting with alpha
TAG, // sequence of characters between < >, no escape
ENDTOK, // end of string/file, no more token
ERRTOK  // unrecognized token
};

struct Token {
token_types_t type;
std::string value;
// constructor for Token
Token(token_types_t tt=ENDTOK, std::string val="") : type(tt), value(val) {}
};

class Lexer {
public:
// constructor
Lexer(std::string str="") : input_str(str), cur_pos(0), in_err(false), 
    separators(" \t\n\r") { }

//modifiers 
void set_input(std::string); // set a new input, 
void restart();              // move cursor to the beginning, restart

Token next_token();    // returns the next token
bool has_more_token(); // are there more token(s)?

private:
std::string input_str;  // the input string to be scanned
size_t      cur_pos;    // current position in the input string
bool        in_err;     // are we in the error state?
std::string separators; // set of separators; *not* the best option!
};
#endif

Lexer.cpp:

#include "Lexer.h"
#include <iostream>
using namespace std;

Token Lexer::next_token() {
Token ret;
size_t last;

if (in_err) {
    ret.type = ERRTOK;
    ret.value = "";
    return ret;
}

// if not in error state, the default token is the ENDTOK
ret.type = ENDTOK;
ret.value = "";

if (has_more_token()) {
    last = cur_pos; // input_str[last] is a non-space char
    if (input_str[cur_pos] == '<') {
        cur_pos++;
        while (cur_pos < input_str.length() && input_str[cur_pos] != '>')
            cur_pos++;
        if (cur_pos < input_str.length()) {
            ret.type = TAG;
            ret.value = input_str.substr(last+1, cur_pos-last-1);
            cur_pos++; // move past the closing '>'
        } else {
            in_err = true;
            ret.type = ERRTOK;
            ret.value = "";
        }
    } else {
        while (cur_pos < input_str.length() &&
               separators.find(input_str[cur_pos]) == string::npos &&
               input_str[cur_pos] != '<') {
            cur_pos++;
        }
        ret.type  = IDENT;
        ret.value = input_str.substr(last, cur_pos-last);
    }
}
return ret;
}

void Lexer::set_input(string str) {
input_str = str;
restart();
}

bool Lexer::has_more_token() {
while (cur_pos < input_str.length() && 
       separators.find(input_str[cur_pos]) != string::npos) {
    cur_pos++;
}
return (cur_pos < input_str.length());
}

void Lexer::restart() {
cur_pos = 0;
in_err = false;
}

Parser (part of a larger .cpp file):

bool parse_input(Lexer lexer, string& file_name) {    
Token file_name_tok;

if (!lexer.has_more_token() || 
    (file_name_tok = lexer.next_token()).type != TAG)
    return false;

if  (lexer.has_more_token())
    return false;

file_name = file_name_tok.value;
return true;
}

Display function (part of the same .cpp file as the parser):

void display(Lexer cmd_lexer) {
string file_name, line;

if (!parse_input(cmd_lexer, file_name)) {
    error_return("Syntax error: display <filename>");
    return;
}

ifstream ifs(file_name.c_str());
string error_msg;
if (ifs) {
    if (!is_well_formed(ifs, error_msg)) {
        error_return(error_msg);
    } else {
        ifs.clear();
        ifs.seekg(0, ios::beg);
        print_well_formed_file(ifs);
    }
    while (ifs.good()) {
        getline(ifs, line);
        cout << line << endl;
    }

} else {
    error_return("Can't open " + file_name + " for reading");
}
ifs.close();
}

score 1 · Accepted Answer

Depending on your answer to my comment, here is how I would approach the problem:

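If the display command should read and parse a source file, it can be implemented with a stack. Each time a display directive is found and parsed, push a new lexer instance onto the stack, and use the top of the stack as the "current" lexer.
If the display command should read a file and perform some operation unrelated to the actual parsing, consider storing the instructions in a fixed-format intermediate form, and "execute" that intermediate form once parsing is done. This is how almost all modern scripting languages do it.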
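The second suggestion could be sketched roughly as follows. This is only an illustration under assumptions, not the skeleton code from the question: tokenize is a compact stand-in for the provided Lexer that follows the same tag/identifier rules, and DisplayCommand, parse_display and describe are hypothetical names. It also assumes the "display" keyword is still part of the line being parsed. It needs C++17 for std::optional.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Compact stand-in for the question's Lexer (hypothetical, same token kinds).
enum class TokType { Ident, Tag, Error };

struct Tok {
    TokType type;
    std::string value;
};

// <...> becomes a Tag; any other run of non-separator characters becomes an
// Ident. An unterminated '<' produces a single Error token and stops.
std::vector<Tok> tokenize(const std::string& s) {
    const std::string seps = " \t\n\r";
    std::vector<Tok> out;
    std::size_t i = 0;
    while (i < s.size()) {
        if (seps.find(s[i]) != std::string::npos) { ++i; continue; }
        if (s[i] == '<') {
            std::size_t close = s.find('>', i + 1);
            if (close == std::string::npos) {
                out.push_back({TokType::Error, ""});
                return out;
            }
            assert(close > i);  // the search started just past the '<'
            out.push_back({TokType::Tag, s.substr(i + 1, close - i - 1)});
            i = close + 1;
        } else {
            std::size_t start = i;
            while (i < s.size() && seps.find(s[i]) == std::string::npos &&
                   s[i] != '<') {
                ++i;
            }
            out.push_back({TokType::Ident, s.substr(start, i - start)});
        }
    }
    return out;
}

// Hypothetical intermediate form: the file to show and an optional sort column.
struct DisplayCommand {
    std::string file_name;
    std::optional<std::string> sort_column;
};

// Phase 1: parse. Accepts "display <file>" or "display <file> sortedby <column>".
std::optional<DisplayCommand> parse_display(const std::string& line) {
    std::vector<Tok> t = tokenize(line);
    if (t.size() != 2 && t.size() != 4) return std::nullopt;
    if (t[0].type != TokType::Ident || t[0].value != "display") return std::nullopt;
    if (t[1].type != TokType::Tag) return std::nullopt;
    DisplayCommand cmd{t[1].value, std::nullopt};
    if (t.size() == 4) {
        if (t[2].type != TokType::Ident || t[2].value != "sortedby") return std::nullopt;
        if (t[3].type != TokType::Tag) return std::nullopt;
        cmd.sort_column = t[3].value;
    }
    return cmd;
}

// Phase 2: execute. Here it only reports the plan; real code would open the
// file and call the sort routine that was provided for the assignment.
std::string describe(const DisplayCommand& cmd) {
    std::string plan = "display " + cmd.file_name;
    if (cmd.sort_column) plan += " sorted by " + *cmd.sort_column;
    return plan;
}
```

With this split, parse_display("display <table.txt> sortedby <column2>") yields a command whose sort_column is set, while the plain "display <table.txt>" form leaves it empty, so the code that does the work never has to look at tokens.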

score 0 · Accepted Answer

It seems simple. To read multiple inputs, you need multiple lexers/parsers. Just create one for each input you need to read.
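A minimal sketch of that idea, assuming a simplified stand-in for the lexer: WordLexer and lex_all are hypothetical names, and WordLexer splits on whitespace only, unlike the Lexer in the question. The point is just that each input gets its own instance, so no cursor position or error state carries over from one input to the next.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Lexer: whitespace-separated words.
class WordLexer {
public:
    explicit WordLexer(const std::string& input) : in_(input) {}

    // Reads the next word into `word`; returns false once the input is exhausted.
    bool next(std::string& word) { return static_cast<bool>(in_ >> word); }

private:
    std::istringstream in_;
};

// Lex every input with its own lexer instance and collect the words per input.
std::vector<std::vector<std::string>> lex_all(const std::vector<std::string>& inputs) {
    std::vector<std::vector<std::string>> result;
    for (const std::string& input : inputs) {
        WordLexer lexer(input);  // fresh instance, nothing shared between inputs
        std::vector<std::string> words;
        std::string w;
        while (lexer.next(w)) words.push_back(w);
        result.push_back(words);
    }
    assert(result.size() == inputs.size());  // one word list per input
    return result;
}
```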
