112

My example string is as follows:

This is 02G05 a test string 20-Jul-2012

From the above string I want to extract 02G05. To do that, I tried the following regex with sed:

$ echo "This is 02G05 a test string 20-Jul-2012" | sed -n '/\d+G\d+/p'

But the above command prints nothing. I believe the reason is that sed cannot match anything against the pattern I supplied.

So my question is: what am I doing wrong here, and how do I correct it?

When I try the same string and pattern in Python, I get the expected result:

>>> import re
>>> st = "This is 02G05 a test string 20-Jul-2012"
>>> re.findall(r'\d+G\d+', st)
['02G05']
>>>
4

5 Answers

118

How about using grep -E?

echo "This is 02G05 a test string 20-Jul-2012" | grep -Eo '[0-9]+G[0-9]+'
answered 2012-07-19T20:42:43.843
116

The pattern \d might not be supported by your sed. Try [0-9] or [[:digit:]] instead.

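To print only the actual match (not the entire matching line), use a substitution. Restrict the prefix to non-digits rather than a greedy .*, which would swallow all but the last digit before the G and print 2G05. This assumes the token is the first run of digits on the line.

sed -n 's/^[^0-9]*\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p'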
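As a rough illustration of the difference between printing the matching line and printing only the match, and of how a greedy .* prefix affects the captured group, here is a small sketch. The variable names (input, whole, greedy, anchored) are made up for the example, and the non-digit prefix assumes the token is the first run of digits on the line.

```shell
input='This is 02G05 a test string 20-Jul-2012'

# Address form: p prints the whole line whenever the pattern matches.
whole=$(printf '%s\n' "$input" | sed -n '/[0-9][0-9]*G[0-9][0-9]*/p')

# Substitution with a greedy .* prefix: .* takes as much as it can,
# leaving only a single digit for the group in front of the G.
greedy=$(printf '%s\n' "$input" | sed -n 's/.*\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p')

# Prefix restricted to non-digits: the group receives the full digit run.
anchored=$(printf '%s\n' "$input" | sed -n 's/^[^0-9]*\([0-9][0-9]*G[0-9][0-9]*\).*/\1/p')

printf '%s\n' "$whole" "$greedy" "$anchored"
```

The first line printed is the unchanged input, the second is 2G05, and the third is 02G05.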
answered 2012-07-19T20:39:57.473
6

sed doesn't recognize \d; use [[:digit:]] instead. You will also need to escape the + (written \+, which is a GNU extension) or use the -r switch (-E on OS X).

Note that [0-9] works as well for the Hindu-Arabic numerals 0 through 9.
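A minimal sketch of the two spellings, assuming a sed that accepts -E (GNU sed and the sed shipped with OS X both do). The variable names are illustrative only. Because \+ is a GNU extension, the basic-regex variant below uses the POSIX interval \{1,\} to say "one or more".

```shell
input='This is 02G05 a test string 20-Jul-2012'

# Extended regular expressions (-E): + works unescaped and groups use plain ( ).
# -n together with the p flag prints only lines where the substitution succeeded.
ere=$(printf '%s\n' "$input" | sed -nE 's/^[^[:digit:]]*([[:digit:]]+G[[:digit:]]+).*/\1/p')

# Basic regular expressions: groups are \( \) and "one or more" is \{1,\}.
bre=$(printf '%s\n' "$input" | sed -n 's/^[^[:digit:]]*\([[:digit:]]\{1,\}G[[:digit:]]\{1,\}\).*/\1/p')

echo "$ere"
echo "$bre"
```

Both commands print 02G05 for this input.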

answered 2012-07-19T20:37:52.847
5

Try this instead:

echo "This is 02G05 a test string 20-Jul-2012" | sed 's/.* \([0-9]\+G[0-9]\+\) .*/\1/'

But note that if there are two matches on one line, it prints the second one, because the leading .* is greedy.
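To see that behaviour on a line with two tokens, here is a short sketch. The sample line and variable names are invented for the illustration, and [0-9][0-9]* is used in place of [0-9]\+ only so that the command does not depend on GNU sed. When every match is wanted, grep -Eo (as used elsewhere in this thread) prints each one on its own line.

```shell
two='first 02G05 then 11G22 end'

# The greedy ".* " consumes everything up to the last space-delimited token,
# so only the final match is captured.
last=$(printf '%s\n' "$two" | sed 's/.* \([0-9][0-9]*G[0-9][0-9]*\) .*/\1/')

# grep -o emits every match, one per line.
all=$(printf '%s\n' "$two" | grep -Eo '[0-9]+G[0-9]+')

echo "$last"
echo "$all"
```

Here last holds 11G22, while all holds 02G05 and 11G22 on separate lines. Without -n and the p flag, this sed command prints a non-matching line unchanged.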

answered 2012-07-19T20:40:07.053
-1

Try using rextract. It will let you extract text using a regular expression and reformat it.

Example:

$ echo "This is 02G05 a test string 20-Jul-2012" | ./rextract '([\d]+G[\d]+)' '${1}'

2G05
answered 2016-09-13T03:03:35.520