python - Pandas groupby values in another column

Question

I have this dataframe:

frame =  pd.DataFrame({'player1' : ['Joe', 'Steve', 'Bill', 'Doug', 'Steve','Bill','Joe','Steve'],
                      'player2' : ['Bill', 'Doug', 'Steve', 'Joe', 'Bill', 'Steve', 'Doug', 'Bill'],
                      'winner' : ['Joe','Steve' , 'Steve','Doug', 'Bill', 'Steve', 'Doug', 'Steve'],
                      'loser' : ['Bill', 'Doug', 'Bill', 'Joe', 'Steve', 'Bill', 'Joe', 'Bill'],
                      'ones' : 1})

By doing this, I can keep a running total of how many times the winner has won:

frame['winners_wins'] = frame.groupby('winner')['ones'].cumsum()

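I would like to keep a running count of player 1's wins, and the same for player 2. I think this should be possible with the groupby function, but I can't figure out how to write it.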

Edit:

I didn't explain it well the first time. I want to keep track of the wins for each individual player, so the desired output would look like this:

player1  player2  winner  loser  player1_wins  player2_wins
Joe      Bill     Joe     Bill   1             0
Steve    Doug     Steve   Doug   1             0
Bill     Steve    Steve   Bill   0             2
Doug     Joe      Doug    Joe    1             1
Steve    Bill     Bill    Steve  2             1
Bill     Steve    Steve   Bill   1             3
Joe      Doug     Doug    Joe    1             2
Steve    Bill     Steve   Bill   3             1

score 1 · Accepted Answer

It looks like you want a running total of player1's and player2's wins. Here is a rather pedestrian way to do it, one that uses more Python than Pandas.

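Computations that require stepping through the rows in order, using the previous result to compute the next row, tend not to be conducive to Pandas/NumPy operations. cumsum is an exception. So I don't think there is a slick way to do this with Pandas operations, but I could be wrong.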

import pandas as pd
import collections

df = pd.DataFrame({'player1' : ['Joe', 'Steve', 'Bill', 'Doug',
                      'Steve','Bill','Joe','Steve'], 'player2' : ['Bill',
                      'Doug', 'Steve', 'Joe', 'Bill', 'Steve', 'Doug', 'Bill'],
                      'winner' : ['Joe','Steve' , 'Steve','Doug', 'Bill',
                      'Steve', 'Doug', 'Steve'], 'loser' : ['Bill', 'Doug',
                      'Bill', 'Joe', 'Steve', 'Bill', 'Joe', 'Bill'], },
                  columns = ['player1', 'player2', 'winner', 'loser'])

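# Running win totals, keyed by player name.
wins = collections.Counter()

def count_wins():
    # Walk the games in order, credit the winner of each game, then
    # report both players' totals so far (current game included).
    for idx, row in df.iterrows():
        wins[row['winner']] += 1
        yield wins[row['player1']], wins[row['player2']]

# Unzip the (player1_wins, player2_wins) pairs into two columns.
df['player1_wins'], df['player2_wins'] = zip(*list(count_wins()))
print(df)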

prints

  player1 player2 winner  loser  player1_wins  player2_wins
0     Joe    Bill    Joe   Bill             1             0
1   Steve    Doug  Steve   Doug             1             0
2    Bill   Steve  Steve   Bill             0             2
3    Doug     Joe   Doug    Joe             1             1
4   Steve    Bill   Bill  Steve             2             1
5    Bill   Steve  Steve   Bill             1             3
6     Joe    Doug   Doug    Joe             1             2
7   Steve    Bill  Steve   Bill             4             1
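
For comparison, here is one possible way to get the same two columns without an explicit Python loop. It is only a sketch, not part of the original answer, and it assumes the number of distinct players is small enough that a games-by-players table fits comfortably in memory. The idea is to lean on the cumsum exception mentioned above: build one indicator column per winner with `pd.get_dummies`, take the cumulative sum down the rows, and then pick out the entry for player1 and player2 in each row. The names `running`, `players`, `rows` and `cols` are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'player1': ['Joe', 'Steve', 'Bill', 'Doug', 'Steve', 'Bill', 'Joe', 'Steve'],
    'player2': ['Bill', 'Doug', 'Steve', 'Joe', 'Bill', 'Steve', 'Doug', 'Bill'],
    'winner':  ['Joe', 'Steve', 'Steve', 'Doug', 'Bill', 'Steve', 'Doug', 'Steve'],
    'loser':   ['Bill', 'Doug', 'Bill', 'Joe', 'Steve', 'Bill', 'Joe', 'Bill'],
})

# One 0/1 column per winner; the cumulative sum down the rows gives every
# player's win total as of each game (the current game included).
running = pd.get_dummies(df['winner']).astype(int).cumsum()

# A player who never wins has no indicator column, so add zero columns
# for every name that appears as player1 or player2.
players = pd.unique(df[['player1', 'player2']].to_numpy().ravel())
running = running.reindex(columns=players, fill_value=0)

# For each row, look up the column named by player1 / player2.
rows = np.arange(len(df))
totals = running.to_numpy()
for col in ['player1', 'player2']:
    cols = running.columns.get_indexer(df[col])
    df[col + '_wins'] = totals[rows, cols]

print(df)
```

On the sample data this gives the same player1_wins and player2_wins values as the loop above. Whether it is actually faster depends on the data: the intermediate table has one column per player, so with many distinct players the plain loop may well be the better choice.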
