I'm making a simple stack-based language that uses commands to manipulate the stack. When I find a command in the source, I use this regex to separate the command name, such as sum, from the command's arguments. Arguments are enclosed in angle brackets and separated by commas.

Here's the regex I'm currently using:

(?<command>[^<>\s]+)(\<(?<args>(\d+)+(?>,\s*\d+)*)\>)?

Now this works fine, and here are some examples of it working:

+              => command: '+', args: nil
sum<5>         => command: 'sum', args: '5'
print<1, 2, 3> => command: 'print', args: '1, 2, 3'

This works exactly as I want for each one but the last. My question is, is there a way to capture each argument separately? I mean like this:

print<1, 2, 3> => command: 'print', args: ['1', '2', '3']

By the way, I'm using the latest Ruby regex engine.

1 Answer

You cannot get that output with a plain regex that repeats a capturing group, because Ruby's regex engine does not keep a capture stack: a repeated group retains only its last match.
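A minimal sketch of that limitation, assuming a made-up pattern and sample string (neither is the regex from the question): the inner group runs once per argument, and each iteration overwrites what the previous one captured.

```ruby
# Illustrative pattern (an assumption for this sketch, not the question's regex):
# the inner group (\d+) is matched once per argument.
REPEATED = /\A(?:(\d+)(?:,\s*)?)+\z/

m = REPEATED.match("1, 2, 3")

# Only the final iteration of the repeated group is kept.
last_only = m[1]
puts last_only # prints 3
```

The earlier arguments took part in the match, but they are no longer reachable through the MatchData object.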

As a post-processing step, you need to split the second capture (the args group).

See the Ruby demo:

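def cmd_split(s)
    # `command` is the command name; `args` is the raw text between < and >.
    rx = /(?<command>[^<>\s]+)(?:<(?<args>\d+(?:,\s*\d+)*)>)?/
    res = []
    s.scan(rx) do
        m = Regexp.last_match
        if m[:args]
            # Split the raw argument list into individual arguments.
            res << { "command" => m[:command], "args" => m[:args].split(/,\s*/) }
        else
            # No argument list: map the command name to an empty string.
            res << { m[:command] => "" }
        end
    end
    res
end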

puts cmd_split("print<1, 2, 3>") # => {"command"=>"print", "args"=>["1", "2", "3"]}
puts cmd_split("disp<1>")        # => {"command"=>"disp", "args"=>["1"]}
puts cmd_split("+")              # => {"+"=>""}
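If you would rather have every result carry the same keys, with args set to nil when there is no argument list (the shape shown in the question's examples), a variant along these lines might work. The names COMMAND_RX and parse_commands are hypothetical, introduced only for this sketch.

```ruby
# Hypothetical variant (not part of the answer above): every result has both
# keys, and :args is nil when the command has no <...> list.
COMMAND_RX = /(?<command>[^<>\s]+)(?:<(?<args>\d+(?:,\s*\d+)*)>)?/

def parse_commands(source)
  results = []
  source.scan(COMMAND_RX) do
    m = Regexp.last_match
    # &. skips the split when the optional args group did not participate.
    results << { command: m[:command], args: m[:args]&.split(/,\s*/) }
  end
  results
end

p parse_commands("print<1, 2, 3>")
p parse_commands("+ sum<5>")
```

The safe navigation operator keeps the no-argument case at nil rather than an empty array, so callers can tell "no list given" apart from a list.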
answered 2016-11-08T22:08:57.420