python - Why is Python .title() sticky? Applying title case to future strings?

Question

This is a weird one:

I'm trying to format a string in Python using (obviously) string.title().

Here is my code:

    def format_trade_name(self):
        tr_name = self.trade_name.title()
        tr_cap = [
            'oxy',
            'depo',
            'edex',
            'emla',
            'pred',
        ]
        tr_join = '|'.join(tr_cap)   
        tr_regex = r'\b(?!(' +tr_join + r'))(\w{1,4})\b'
        tr_matches = re.search(tr_regex, self.trade_name,re.IGNORECASE)
        for i in tr_matches.groups():
            if i is not None:
                tr_name = re.sub(r'\b'+i+r'\b',i.upper(),tr_name)
        return tr_name

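Here's the problem: the function needs to capitalize the first letter of each word and convert every string of up to 4 characters (that is not in tr_cap) to all uppercase. So if the original string is 'tylenol depo er', I want the formatted string to be 'Tylenol Depo ER'.

If I change the second line to tr_name = self.trade_name.capitalize(), the function turns 'tylenol depo er' into 'Tylenol depo ER' (depo is not capitalized).

If I leave the second line as tr_name = self.trade_name.title(), the function turns 'tylenol depo er' into 'Tylenol Depo Er' (Er stays title-cased instead of becoming uppercase), even though the uppercase formatting is applied after .title() is used.

Can someone explain why the string stays title-cased even after I try to apply the new formatting?

Update: I fixed it, but I don't understand why the fix works. I feel like there is a key principle I'm missing.

If I change tr_matches = re.search(tr_regex, self.trade_name, re.IGNORECASE) to tr_matches = re.search(tr_regex, tr_name, re.IGNORECASE), it works.

So this works: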

    def format_trade_name(self):
        tr_name = self.trade_name.title()
        tr_cap = [
            'oxy',
            'depo',
            'edex',
            'emla',
            'pred',
        ]
        tr_join = '|'.join(tr_cap)   
        tr_regex = r'\b(?!(' +tr_join + r'))(\w{1,4})\b'
        tr_matches = re.search(tr_regex, tr_name ,re.IGNORECASE)
        for i in tr_matches.groups():
            if i is not None:
                tr_name = re.sub(r'\b'+i+r'\b',i.upper(),tr_name)
        return tr_name

Any idea why?

score 1 · Accepted Answer

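You are not using the title-cased string throughout the function.

    tr_matches = re.search(tr_regex, self.trade_name, re.IGNORECASE)

This search runs on the original string, so for a lowercase input the captured match is lowercase ('er'). re.sub then looks for that lowercase text in tr_name, which is title-cased ('Er'). Because re.sub is called without re.IGNORECASE, the pattern finds nothing and nothing is replaced. .title() is not sticky; the uppercase substitution simply never happens.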

Switch the code to:

    tr_matches = re.search(tr_regex, tr_name, re.IGNORECASE)
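The mismatch can be reproduced outside the class. This is a minimal sketch, assuming a plain lowercase string in place of self.trade_name; the variable names (original, titled, word, and so on) are illustrative and not part of the question's code.

```python
import re

# Illustrative stand-ins for self.trade_name and the title-cased copy.
original = 'tylenol depo er'
titled = original.title()          # 'Tylenol Depo Er'

# Same pattern the question builds from tr_cap.
pattern = r'\b(?!(oxy|depo|edex|emla|pred))(\w{1,4})\b'

# Searching the lowercase original captures the lowercase text 'er'.
word = re.search(pattern, original, re.IGNORECASE).group(2)

# re.sub runs without re.IGNORECASE, so r'\ber\b' cannot match 'Er'.
unchanged = re.sub(r'\b' + word + r'\b', word.upper(), titled)

# Searching the title-cased copy captures 'Er', which re.sub can then find.
word_titled = re.search(pattern, titled, re.IGNORECASE).group(2)
fixed = re.sub(r'\b' + word_titled + r'\b', word_titled.upper(), titled)

print(unchanged)  # Tylenol Depo Er
print(fixed)      # Tylenol Depo ER
```

The only difference between the two runs is which string the match text is taken from; the .title() call behaves the same in both.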

Edit: If you want to uppercase more than one substring, re.search won't do it, because it only returns the first match. findall should do the trick:

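    # Search the title-cased tr_name so the captured text matches what re.sub looks for.
    tr_matches = re.findall(tr_regex, tr_name, re.IGNORECASE)
    print(tr_matches)
    # findall returns one (group 1, group 2) tuple per match; group 2 is the word.
    for _, i in tr_matches:
        if i:
            tr_name = re.sub(r'\b'+i+r'\b', i.upper(), tr_name)
            print(tr_name)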
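The difference between re.search and re.findall shows up on an input with more than one short word. This is a minimal sketch, assuming a hypothetical input 'tylenol er xl' (not from the question) and the same pattern; the matching is done on the title-cased string so the captured text agrees with what re.sub looks for.

```python
import re

pattern = r'\b(?!(oxy|depo|edex|emla|pred))(\w{1,4})\b'

# Hypothetical input with two short words.
titled = 'tylenol er xl'.title()   # 'Tylenol Er Xl'

# re.search stops at the first match.
first_only = re.search(pattern, titled, re.IGNORECASE).group(2)

# re.findall returns one tuple per match. Group 1 sits inside the negative
# lookahead and never takes part in a successful match, so it is always ''.
matches = re.findall(pattern, titled, re.IGNORECASE)

result = titled
for _, word in matches:
    result = re.sub(r'\b' + word + r'\b', word.upper(), result)

print(first_only)  # Er
print(matches)     # [('', 'Er'), ('', 'Xl')]
print(result)      # Tylenol ER XL
```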

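Edit 2: re.sub() is flexible enough that you can drop the matching step and the loop entirely. It can match each word of 1 to 4 characters and convert it to uppercase with a lambda function.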

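    def format_trade_name(self):
        # Title-case first so every word starts with a capital letter.
        tr_name = self.trade_name.title()
        tr_cap = [
            'oxy',
            'depo',
            'edex',
            'emla',
            'pred',
        ]
        tr_join = '|'.join(tr_cap)
        tr_regex = r'\b(?!' + tr_join + r')(\w{1,4})\b'

        # Uppercase every 1-4 character word that does not start with a tr_cap entry.
        # re.IGNORECASE lets the lookahead recognize the title-cased 'Depo'.
        tr_name = re.sub(
            tr_regex,
            lambda match: match.group(0).upper(),
            tr_name,
            flags=re.IGNORECASE,
        )
        return tr_name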
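As a standalone illustration, the single re.sub approach can be written as a plain function. This sketch is not the original method: it assumes a module-level list (KEEP_TITLE_CASE, a hypothetical name) in place of tr_cap, takes the name as an argument rather than reading self.trade_name, title-cases the input, and passes flags=re.IGNORECASE so the exclusion list still applies to title-cased words such as 'Depo'.

```python
import re

# Hypothetical module-level copy of tr_cap from the question.
KEEP_TITLE_CASE = ['oxy', 'depo', 'edex', 'emla', 'pred']


def format_trade_name(trade_name):
    """Title-case a name, then uppercase short words not excluded by the list."""
    # Negative lookahead skips words that start with an excluded entry.
    pattern = r'\b(?!' + '|'.join(KEEP_TITLE_CASE) + r')(\w{1,4})\b'
    return re.sub(
        pattern,
        lambda match: match.group(0).upper(),
        trade_name.title(),
        flags=re.IGNORECASE,
    )


print(format_trade_name('tylenol depo er'))  # Tylenol Depo ER
print(format_trade_name('oxy ir'))           # Oxy IR
```

Because the lookahead tests a prefix, a short word that merely begins with a listed entry is also left in title case; whether that is desirable depends on the trade names being formatted.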
