python - Convert a string containing a Roman numeral to its integer equivalent

Question

I have the following string:

str = "MMX Lions Television Inc"

And I need to convert it to:

conv_str = "2010 Lions Television Inc"

I have the following function that converts a Roman numeral to its integer equivalent:

numeral_map = zip(
    (1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1),
    ('M', 'CM', 'D', 'CD', 'C', 'XC', 'L', 'XL', 'X', 'IX', 'V', 'IV', 'I')
)

def roman_to_int(n):
    n = unicode(n).upper()

    i = result = 0
    for integer, numeral in numeral_map:
        while n[i:i + len(numeral)] == numeral:
            result += integer
            i += len(numeral)
    return result

How can I use re.sub here to get the correct string?

(Note: I tried the regex described here: "How do you match only valid Roman numerals with a regular expression?", but it didn't work.)

score 7 · Accepted Answer

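When looking for a common function or library, always try the Python Package Index first.

Here is the list of modules related to the keyword "roman".

For example, "romanclass" has a class that implements the conversion. Quoting its documentation: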

So a programmer can say:

>>> import romanclass as roman

>>> two = roman.Roman(2)

>>> five = roman.Roman('V')

>>> print (two+five)

and the computer will print:

VII

score 2 · Accepted Answer

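re.sub() can accept a function as the replacement. The function must take a single argument, which is a Match object, and return the replacement string. You already have a function that converts a Roman numeral string to an int, so this isn't difficult.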
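As a minimal illustration of that mechanism, independent of Roman numerals, the sketch below passes a function to re.sub() and doubles every integer in a string. It assumes Python 3; the `double` helper and the sample text are made up for the example.

```python
import re

def double(match):
    # match.group(0) is the text matched by the whole pattern
    return str(int(match.group(0)) * 2)

result = re.sub(r'\d+', double, "3 apples and 10 pears")
print(result)  # 6 apples and 20 pears
```

The function is called once per match, and whatever string it returns is spliced in where the match was.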

In your case, you need a function like this:

def roman_to_int_repl(match):
    return str(roman_to_int(match.group(0)))

Now you can modify the regex from the question you linked so that it finds matches inside a larger string:

import re

s = "MMX Lions Television Inc"
regex = re.compile(r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
print(regex.sub(roman_to_int_repl, s))
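Putting the pieces together, here is a self-contained sketch adapted for Python 3. Two things differ from the question's code, and both are my adaptations rather than part of the original: `unicode` no longer exists, so `str` is used, and `zip()` returns a one-shot iterator, so the map is stored as a tuple to keep it reusable across calls.

```python
import re

# Stored as a tuple so it can be iterated on every call
# (a bare zip() object is exhausted after one pass in Python 3).
numeral_map = tuple(zip(
    (1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1),
    ('M', 'CM', 'D', 'CD', 'C', 'XC', 'L', 'XL', 'X', 'IX', 'V', 'IV', 'I'),
))

regex = re.compile(
    r'\b(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b'
)

def roman_to_int(n):
    n = str(n).upper()
    i = result = 0
    # Consume the numeral left to right, largest values first
    for integer, numeral in numeral_map:
        while n[i:i + len(numeral)] == numeral:
            result += integer
            i += len(numeral)
    return result

def roman_to_int_repl(match):
    return str(roman_to_int(match.group(0)))

s = "MMX Lions Television Inc"
conv_str = regex.sub(roman_to_int_repl, s)
print(conv_str)  # 2010 Lions Television Inc
```

Because the pattern is case-sensitive, the "I" in "Inc" is not treated as a numeral: the lookahead requires the whole word to consist of uppercase Roman letters.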

Here is a version of the regex that won't replace "LLC" in a string:

regex = re.compile(r'\b(?!LLC)(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b')
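To see why the extra lookahead matters: every part of the numeral pattern is optional, so for a word like "LLC", which consists of Roman letters but is not a well-formed numeral, the original pattern can still match an empty string at the start of the word. A replacement function that converts its input would turn that empty match into "0". The sketch below (Python 3) makes the matches visible; `mark`, `BODY`, and the sample string are names invented for this illustration, and `mark` is only a stand-in for roman_to_int_repl.

```python
import re

BODY = r'(?=[MDCLXVI]+\b)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b'
original = re.compile(r'\b' + BODY)
no_llc = re.compile(r'\b(?!LLC)' + BODY)

def mark(match):
    # Stand-in for roman_to_int_repl: wrap whatever was matched in brackets
    return '[' + match.group(0) + ']'

s = "MMX Lions LLC"
with_original = original.sub(mark, s)
with_no_llc = no_llc.sub(mark, s)
print(with_original)  # [MMX] Lions []LLC
print(with_no_llc)    # [MMX] Lions LLC
```

The empty brackets in the first result show the zero-length match in front of "LLC"; the negative lookahead `(?!LLC)` prevents any match from starting there.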

You can also use the original regex with a modified replacement function:

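def roman_to_int_repl(match):
    numeral = match.group(0)
    exclude = set(["LLC"])   # add any other strings you don't want to replace
    # For a word like "LLC" the pattern matches only an empty string at the
    # start of the word, so skip empty matches as well as excluded strings.
    if not numeral or numeral in exclude:
        return numeral
    return str(roman_to_int(numeral))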
