python - The right way to check whether a string contains Hebrew characters

Question

Hebrew characters occupy Unicode code points 1424 to 1514 (hex 0590 to 05EA).

I'm looking for the most efficient and most Pythonic way to check whether a string contains any of them.

At first I came up with this:

for c in s:
    if ord(c) >= 1424 and ord(c) <= 1514:
        return True
return False

Then I wrote a more elegant implementation:

return any(map(lambda c: (ord(c) >= 1424 and ord(c) <= 1514), s))

And maybe:

return any([(ord(c) >= 1424 and ord(c) <= 1514) for c in s])

Which of these is best? Or should I do it some other way?

score 16 · Accepted Answer

You can do:

# Python 3.
return any("\u0590" <= c <= "\u05EA" for c in s)
# Python 2.
return any(u"\u0590" <= c <= u"\u05EA" for c in s)
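For reference, the same check wrapped in a function might look like the sketch below. The name `contains_hebrew` and the two constants are assumptions for illustration; the bounds are the ones given in the question (U+0590 to U+05EA). The Unicode Hebrew block itself extends to U+05FF, so the upper bound may need widening if the few code points after U+05EA should also count.

```python
# Bounds taken from the question; both names are illustrative.
HEBREW_FIRST = "\u0590"  # decimal 1424, start of the Hebrew block
HEBREW_LAST = "\u05EA"   # decimal 1514, HEBREW LETTER TAV


def contains_hebrew(s):
    """Return True if any character of s lies in the range from the question."""
    # any() with a generator stops at the first matching character.
    return any(HEBREW_FIRST <= c <= HEBREW_LAST for c in s)
```

Because the argument to `any` is a generator expression rather than a list, no intermediate list is built and iteration stops as soon as a Hebrew character is found.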

score 1 · Answer

It's easy to check the first character with unicodedata:

import unicodedata

def is_greek(term):
    return 'GREEK' in unicodedata.name(term.strip()[0])


def is_hebrew(term):
    return 'HEBREW' in unicodedata.name(term.strip()[0])
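The functions above look only at the first character, and they raise an exception on an empty string or on a character that has no Unicode name. A possible variant, sketched below, checks every character and passes a default to `unicodedata.name` so that unnamed characters are simply skipped. The function name `has_hebrew_char` is hypothetical.

```python
import unicodedata


def has_hebrew_char(term):
    """Return True if any character's Unicode name contains 'HEBREW'."""
    # unicodedata.name(c, "") returns "" instead of raising ValueError
    # for characters without a name (for example, control characters).
    return any("HEBREW" in unicodedata.name(c, "") for c in term)
```

Note that a name-based test is broader than a fixed code point range: it accepts any character whose official name contains the word HEBREW, wherever that character is encoded.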

score 1 · Answer

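The basic options are:

Match against a regular expression containing the character range; or
Iterate over the string, testing each character for membership in a string or set containing all the target characters, and break out when a match is found.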
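The two options above could be sketched roughly as follows, assuming Python 3 and the code point range from the question. The names `HEBREW_RE`, `HEBREW_CHARS`, `contains_hebrew_regex`, and `contains_hebrew_set` are made up for this illustration.

```python
import re

# Character range taken from the question: U+0590 to U+05EA.
HEBREW_RE = re.compile("[\u0590-\u05EA]")
HEBREW_CHARS = frozenset(chr(cp) for cp in range(0x0590, 0x05EA + 1))


def contains_hebrew_regex(s):
    # search() scans the string and stops at the first match.
    return HEBREW_RE.search(s) is not None


def contains_hebrew_set(s):
    # any() short-circuits at the first character found in the set.
    return any(c in HEBREW_CHARS for c in s)
```

Both the compiled pattern and the set are built once at import time, so repeated calls pay only for the scan itself.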

Only actual testing can show which will be faster.
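One way to run such a test is with the standard `timeit` module. The sketch below is only an illustration: the function names, the sample string, and the repetition count are all assumptions, and results will vary with the Python version and the input.

```python
import re
import timeit

HEBREW_RE = re.compile("[\u0590-\u05EA]")


def by_range(s):
    return any("\u0590" <= c <= "\u05EA" for c in s)


def by_regex(s):
    return HEBREW_RE.search(s) is not None


# Worst case for both: a long string whose only Hebrew letter is at the end.
sample = "a" * 10_000 + "\u05d0"

for func in (by_range, by_regex):
    # Time 200 calls of each candidate on the same input.
    seconds = timeit.timeit(lambda: func(sample), number=200)
    print(f"{func.__name__}: {seconds:.4f} s")
```

It is worth timing inputs that resemble the real data as well, since a match near the start of the string lets every approach return almost immediately.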
