python - Extract a string from a variable in Python

Question

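After processing the parsed URL, I stored something like the text below in a variable using ''.join(soup.findAll(text=True)). Given a command-line argument, e.g. test.py "norfolk st.", I need to get that school together with its score and the team it is playing, such as 'Norfolk St. 0-38 Rutgers'. I tried parsing the text with several functions, including re.search() and string.find(), but could not get the expected result. I need help.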

Norfolk St. 


0 - 38




    Rutgers 
    Final


     South Florida 


    6 - 21


     Michigan St. 
    Final


     Chowan 


    7 - 47


     Charlotte 
    Final


     SE Louisiana 


    17 - 38


     (24) TCU 
    Final


     W. Kentucky 


    20 - 52


     Tennessee 
    Final


     S. Carolina St. 


    13 - 52


     (4) Clemson 
    Final


     Middle Tenn. St. 


    20 - 40


     North Carolina 
    Final


     Central Conn. St. 


    44 - 51


     Lehigh 
    Final OT


     Army 


    14 - 40


     Ball St. 
    Final

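The problem is that I need to get football box scores from this URL: http://sports.yahoo.com/college-football/scoreboard/?conf=all. Whenever the user gives a school name as a command-line argument, the program needs to go to this URI, look for the school name, and, if it has a hyperlink, follow the link and retrieve the box score. The box score looks like this: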

     1   2   3   4   Total
FAU  3   3   0   7   13
ECU  7   14  10  0   31

If the game is still in progress, the program should sleep for a specified number of seconds and then fetch the latest score. I'm new to Python, so I need help.

score 0 · Accepted Answer

I wouldn't bother with regular expressions. Based on your text, the string, once whitespace is stripped, appears to roughly follow this repeating format:

thing 1
score
thing 2
"final"

As a result, you can clean up the string, iterate over it in groups of four lines, and return each group as an entry in a dictionary.

For example:

def chunk(iterable, n):
    '''chunk([1, 2, 3, 4, 5, 6], 2) -> [[1, 2], [3, 4], [5, 6]]'''
    return [iterable[i:i+n] for i in range(0, len(iterable), n)]

def get_scores(raw):
    clean = [line.strip() for line in raw.split('\n') if line.strip() != '']
    return {thing1: (thing1, score, thing2) for (thing1, score, thing2, _) in chunk(clean, 4)}

Then you can do the following:

>>> raw = ''.join(soup.findAll(text=True))
>>> scores = get_scores(raw)
>>> print scores['Norfolk St.']
('Norfolk St.', '0 - 38', 'Rutgers')

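To make the lookup case-insensitive, lowercase every line when building the dictionary (the search term must then be lowercased as well before it is looked up):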

def get_scores(raw):
    clean = [line.strip().lower() for line in raw.split('\n') if line.strip() != '']
    return {thing1: (thing1, score, thing2) for (thing1, score, thing2, _) in chunk(clean, 4)}
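
As a rough sketch of how the command-line argument from the question could be matched against those lowercased keys: the sample text, the `lookup` helper, and the `main` function below are hypothetical names added for illustration, and the sample stands in for the real `''.join(soup.findAll(text=True))` output.

```python
import sys


def chunk(items, n):
    """Split a list into consecutive groups of n items."""
    return [items[i:i + n] for i in range(0, len(items), n)]


def get_scores(raw):
    """Map each lowercased first-listed team to (team, score, opponent)."""
    clean = [line.strip().lower() for line in raw.split('\n') if line.strip() != '']
    return {t1: (t1, score, t2) for (t1, score, t2, _) in chunk(clean, 4)}


# Hypothetical sample standing in for ''.join(soup.findAll(text=True)).
SAMPLE = """
Norfolk St.

0 - 38

    Rutgers
    Final

     South Florida

    6 - 21

     Michigan St.
    Final
"""


def lookup(scores, query):
    """Normalize the query the same way the keys were normalized."""
    return scores.get(query.strip().lower())


def main(argv):
    # Assumed invocation: test.py "norfolk st."
    query = argv[1] if len(argv) > 1 else "norfolk st."
    print(lookup(get_scores(SAMPLE), query))


if __name__ == "__main__":
    main(sys.argv)
```

Using `dict.get` means an unknown school yields `None` instead of raising `KeyError`. With this version only the first-listed team of each game is a key, so a query for the second team returns `None`.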

If you want to be able to look up either 'Norfolk St.' or 'Rutgers' and get the same result, do this:

def get_scores(raw):
    clean = [line.strip().lower() for line in raw.split('\n') if line.strip() != '']
    output = {}
    for (thing1, score, thing2, _) in chunk(clean, 4):
        data = (thing1, score, thing2)
        output[thing1] = data
        output[thing2] = data
    return output
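
One caveat: the chunking approach assumes every game contributes exactly four non-empty lines. A minimal sketch of a guard for that assumption follows; `get_scores_checked` is a hypothetical variant, not part of the code above, and it also keeps the fourth line (for example 'final' or 'final ot') in the result. How an in-progress game appears in the text is not shown in the question, so that case is not handled here.

```python
def chunk(items, n):
    """Split a list into consecutive groups of n items."""
    return [items[i:i + n] for i in range(0, len(items), n)]


def get_scores_checked(raw):
    """Like get_scores, but fail loudly if the text is not in groups of four.

    Returns a dict mapping each lowercased team name (either side of a game)
    to a tuple (team1, score, team2, status).
    """
    clean = [line.strip().lower() for line in raw.split('\n') if line.strip()]
    if len(clean) % 4 != 0:
        # A leftover partial group means the 4-line assumption does not hold.
        raise ValueError("expected groups of 4 lines, got %d lines" % len(clean))
    output = {}
    for team1, score, team2, status in chunk(clean, 4):
        data = (team1, score, team2, status)
        output[team1] = data
        output[team2] = data
    return output
```

Failing early here avoids silently pairing a team with the wrong score when the scraped text has an extra or missing line.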
