Input string
---------------
South Africa 109/0 
Australia 100
Sri Lanka 111
Sri Lanka 331/4

Expected Output
---------------
['South Africa', '109', '0']
['Australia', '100']
['Sri Lanka', '111']
['Sri Lanka', '331', '4']

I tried several regular expressions but could not work out the right one. Country names may or may not contain a space (South Africa, India), so simply splitting on spaces does not help. Thanks in advance.

5 Answers

Try:

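import re
# Split on whitespace between a letter and a digit (in either order), or on "/".
re.split(r"(?<=[a-zA-Z])\s+(?=\d)|(?<=\d)\s+(?=[a-zA-Z])|/", "South Africa 109/0")
# ['South Africa', '109', '0']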
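
As a minimal sketch, a split of this kind can be applied to every line of the sample input. The names `SPLIT_RE` and `split_score` are hypothetical, and stripping each line first is an assumption made so that trailing whitespace (as in the first sample line) does not end up in the last field. The second alternative is written with a lookbehind, `(?<=\d)`, because a lookahead for a digit immediately before `\s+` can never succeed.

```python
import re

# Whitespace between a letter and a digit (either order), or a literal "/".
SPLIT_RE = re.compile(r"(?<=[a-zA-Z])\s+(?=\d)|(?<=\d)\s+(?=[a-zA-Z])|/")

def split_score(line):
    """Split one line into [country, number] or [country, number, number]."""
    return SPLIT_RE.split(line.strip())

lines = ["South Africa 109/0 ", "Australia 100", "Sri Lanka 111", "Sri Lanka 331/4"]
results = [split_score(line) for line in lines]
```

Spaces inside the country name are left alone because they sit between two letters, not between a letter and a digit.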
answered 2012-09-13T09:26:19.417

This is the regular expression you need:

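import re

# inputText is the multi-line input string; (?m) lets ^ and $ match at each line.
for match in re.finditer(r"(?m)^(?P<Country>.*?)\s*(?P<Number1>\d+)\s*?/?\s*?(?P<Number2>\d*?)\s*?$", inputText):
    country = match.group("Country")
    number1 = match.group("Number1")
    number2 = match.group("Number2")  # empty string when there is no second number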

You can see the results here.

And here is an explanation of the pattern:

# ^(?P<Country>.*?)\s*(?P<Number1>\d+)\s*?/?\s*?(?P<Number2>\d*?)\s*?$
# 
# Options: ^ and $ match at line breaks
# 
# Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
# Match the regular expression below and capture its match into backreference with name “Country” «(?P<Country>.*?)»
#    Match any single character that is not a line break character «.*?»
#       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
#    Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
# Match the regular expression below and capture its match into backreference with name “Number1” «(?P<Number1>\d+)»
#    Match a single digit 0..9 «\d+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match the character “/” literally «/?»
#    Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
# Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match the regular expression below and capture its match into backreference with name “Number2” «(?P<Number2>\d*?)»
#    Match a single digit 0..9 «\d*?»
#       Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
# Assert position at the end of a line (at the end of the string or before a line break character) «$»
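
To connect the pattern explained above to the expected output in the question, here is a hedged sketch that runs it over the sample input and turns each match into a list. The names `PATTERN`, `input_text` and `rows` are hypothetical; dropping empty groups is an assumption based on the fact that `Number2` matches an empty string when a line has no second number.

```python
import re

PATTERN = re.compile(
    r"(?m)^(?P<Country>.*?)\s*(?P<Number1>\d+)\s*?/?\s*?(?P<Number2>\d*?)\s*?$"
)

input_text = "South Africa 109/0 \nAustralia 100\nSri Lanka 111\nSri Lanka 331/4"

rows = []
for match in PATTERN.finditer(input_text):
    # group() with several names returns a tuple; keep only the non-empty parts.
    groups = match.group("Country", "Number1", "Number2")
    rows.append([g for g in groups if g])
```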
answered 2012-09-13T09:33:19.747

You have already got regex answers, but I would suggest also considering the built-in str methods (for this use case, anyway):

s = 'South Africa 109/0'
country, numbers = s.rsplit(' ', 1)
# ('South Africa', '109/0')
new_list = [country] + numbers.split('/')
# ['South Africa', '109', '0'] 
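
A small sketch of the same idea wrapped in a function and applied to all four sample lines. The name `parse_line` is hypothetical, and the `strip()` call is an assumption added because the first sample line ends with a space, which would otherwise make `rsplit(' ', 1)` split off an empty string.

```python
def parse_line(line):
    """Return [country, number] or [country, number, number] for one line."""
    # Remove surrounding whitespace so the last space is the one before the numbers.
    country, numbers = line.strip().rsplit(" ", 1)
    return [country] + numbers.split("/")

lines = ["South Africa 109/0 ", "Australia 100", "Sri Lanka 111", "Sri Lanka 331/4"]
parsed = [parse_line(line) for line in lines]
```

This relies on the numbers always being the last space-separated token of the line, which holds for the sample input.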
answered 2012-09-13T09:52:34.687
import re
re.compile(r"^([\w\s]+)\s(\d+)/?(\d+)?")

You get three groups. The pattern breaks down as follows:

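  • ([\w\s]+): a group of word characters and whitespace, anchored at the start of the line (^); note that \w also matches digits and underscores
  • \s: a single whitespace character
  • (\d+): a group of digits, at least one
  • /?: an optional slash
  • (\d+)?: an optional group of digits (potentially None)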
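
The breakdown above can be sketched in use as follows. The names `SCORE_RE` and `parse_score` are hypothetical; filtering out `None` is an assumption made to reproduce the two-element lists in the question, since the last group is `None` when a line has no second number.

```python
import re

SCORE_RE = re.compile(r"^([\w\s]+)\s(\d+)/?(\d+)?")

def parse_score(line):
    """Return the captured groups as a list, or None if the line does not match."""
    match = SCORE_RE.match(line)
    if match is None:
        return None
    # The optional third group is None when absent, so leave it out.
    return [g for g in match.groups() if g is not None]

lines = ["South Africa 109/0 ", "Australia 100", "Sri Lanka 111", "Sri Lanka 331/4"]
scores = [parse_score(line) for line in lines]
```

Because `[\w\s]+` is greedy, it first swallows the digits too and then backtracks until the remaining `\s(\d+)` can match, which is what leaves the country name alone in the first group.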
answered 2012-09-13T09:18:52.500