javascript - Suggestions for a table based approach to stream parsing

Question

I'm trying to parse a protocol where each message looks something like this:

[01001ACP01010100]

That is, each message has a start character ([), an end character (]), a 5-byte sequence number, and a type (ACP in this case). The data between the type and the end character is determined by the type.

What I'm looking for is a way to declare the structure of all valid messages in one or more tables, and then make a parser that utilizes that table.

I'd also like a solution that can handle node.js streams and partially transmitted messages.
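
To make the partial-message requirement concrete, here is a minimal sketch of the kind of framing I have in mind: buffer incoming chunks and hand over only complete frames. It assumes that "[" and "]" never occur inside a message body (the protocol description above does not guarantee this), and makeFramer is a hypothetical name, not part of any library.

```javascript
// Hypothetical helper: buffers chunks and calls onMessage with the text
// between "[" and "]" for every complete frame seen so far.
// Assumption: "[" and "]" never appear inside a message body.
function makeFramer(onMessage) {
    var buf = "";
    return function feed(chunk) {
        buf += String(chunk);
        for (;;) {
            var start = buf.indexOf("[");
            if (start === -1) { buf = ""; return; }              // no frame start: drop noise
            var end = buf.indexOf("]", start + 1);
            if (end === -1) { buf = buf.slice(start); return; }  // partial frame: wait for more
            onMessage(buf.slice(start + 1, end));
            buf = buf.slice(end + 1);
        }
    };
}

// Two messages split across three chunks at arbitrary points.
var seen = [];
var feed = makeFramer(function (msg) { seen.push(msg); });
feed("[01001ACP0101");
feed("0100][01002AC");
feed("P1]");
// seen is now ["01001ACP01010100", "01002ACP1"]
```

With a node.js readable stream, feed could presumably be attached as the "data" listener after calling setEncoding("utf8") on the stream, so that chunks arrive as strings.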

My first attempt looked something like this:

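// Returns a sub-parser that accepts `length` characters satisfying `valid`.
// The sub-parser returns the matched string when complete, null on an
// invalid character, and undefined while it still needs more input.
// Defined before sub_parsers so that it exists when the table is built.
var make_parser = function (valid, length) {
    var buf, ret;

    buf = "";

    return function (char) {

        buf += char;

        if (valid(char)) {
            if (buf.length === length) {
                ret = buf.slice(0);
                buf = "";
                return ret;
            }
        } else {
            buf = "";
            return null;
        }
        return undefined;
    };
};

// isnum and isupper are not shown in the original code; these are assumed definitions.
var isnum = function (char) {return char >= "0" && char <= "9"};
var isupper = function (char) {return char >= "A" && char <= "Z"};

var sub_parsers = {
    "beg" : make_parser(function (char) {return char === "["}, 1), // start character
    "end" : make_parser(function (char) {return char === "]"}, 1), // end character
    "seq" : make_parser(isnum, 5), // sequence number
    "typ" : make_parser(isupper, 3), // type (must be all uppercase)
};

var order = ["beg", "seq", "typ"];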

Then I keep the current state somewhere, and pump characters into the parse function corresponding to that state.
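
Roughly, the state handling can be sketched as below. This is a self-contained illustration rather than my exact code: makeDriver, makeParser, isDigit and isUpper are hypothetical names, and resetting to the first state on an invalid character is an assumed error policy.

```javascript
// Sub-parser contract: returns the matched string when complete,
// null on an invalid character, undefined while more input is needed.
function makeParser(valid, length) {
    var buf = "";
    return function (ch) {
        if (!valid(ch)) { buf = ""; return null; }
        buf += ch;
        if (buf.length < length) { return undefined; }
        var ret = buf;
        buf = "";
        return ret;
    };
}

// Hypothetical driver: tracks the index of the current state in `order`
// and calls onDone with the collected fields once every state has matched.
function makeDriver(order, parsers, onDone) {
    var state = 0, result = {};
    return function (ch) {
        var name = order[state];
        var value = parsers[name](ch);
        if (value === null) { state = 0; result = {}; return; } // assumed policy: resync
        if (value === undefined) { return; }                    // field not complete yet
        result[name] = value;
        state += 1;
        if (state === order.length) { onDone(result); state = 0; result = {}; }
    };
}

var isDigit = function (c) { return c >= "0" && c <= "9"; };
var isUpper = function (c) { return c >= "A" && c <= "Z"; };

var headers = [];
var pump = makeDriver(["beg", "seq", "typ"], {
    beg: makeParser(function (c) { return c === "["; }, 1),
    seq: makeParser(isDigit, 5),
    typ: makeParser(isUpper, 3)
}, function (r) { headers.push(r); });

"[01001ACP".split("").forEach(function (c) { pump(c); });
// headers[0] is { beg: "[", seq: "01001", typ: "ACP" }
```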

There are several problems with this approach:

There are some states, such as the "typ" state above, where the actual value parsed affects how the rest of the message must be parsed. I have no way of encoding this in the table above.
I'd like to have a table that encodes not only how to parse the messages, but also how to serialize new ones.
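
To illustrate the kind of table I am after, here is a rough sketch in which the type selects a second table for the body, and the same tables drive both parsing and serialization. The field names and the split of the ACP body into two 4-character fields are invented for illustration; the sketch also works on a complete message (the text between "[" and "]") rather than on a character stream.

```javascript
// Hypothetical field tables: each entry is [name, length, pattern].
var HEADER = [["seq", 5, /^[0-9]+$/], ["typ", 3, /^[A-Z]+$/]];
// Body layout per type. The ACP layout below is an invented example.
var BODIES = {
    ACP: [["a", 4, /^[01]+$/], ["b", 4, /^[01]+$/]]
};

// Reads `fields` from `text` starting at `pos`; returns the new position or -1.
function readFields(fields, text, pos, out) {
    for (var i = 0; i < fields.length; i++) {
        var name = fields[i][0], len = fields[i][1], re = fields[i][2];
        var piece = text.slice(pos, pos + len);
        if (piece.length !== len || !re.test(piece)) { return -1; }
        out[name] = piece;
        pos += len;
    }
    return pos;
}

// Validates and concatenates the values of `fields` taken from `msg`.
function writeFields(fields, msg) {
    return fields.map(function (f) {
        var v = String(msg[f[0]]);
        if (v.length !== f[1] || !f[2].test(v)) { throw new Error("bad field " + f[0]); }
        return v;
    }).join("");
}

// Parses the text between "[" and "]"; returns null if it does not fit the tables.
function parse(text) {
    var msg = {};
    var pos = readFields(HEADER, text, 0, msg);
    if (pos < 0 || !BODIES.hasOwnProperty(msg.typ)) { return null; }
    pos = readFields(BODIES[msg.typ], text, pos, msg);
    return pos === text.length ? msg : null;
}

function serialize(msg) {
    if (!BODIES.hasOwnProperty(msg.typ)) { throw new Error("unknown type " + msg.typ); }
    return "[" + writeFields(HEADER, msg) + writeFields(BODIES[msg.typ], msg) + "]";
}

var msg = parse("01001ACP01010100");
// msg is { seq: "01001", typ: "ACP", a: "0101", b: "0100" }
var wire = serialize(msg);
// wire is "[01001ACP01010100]"
```

Because the type is read from the header table first, the value parsed decides which body table applies, which is the dependency the flat sub-parser table cannot express.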
