python - How can I efficiently rearrange pandas data as follows?

Question

I need some help with a concise and efficient way to express the following operation in pandas.

Given a data frame of the form

id    a   b    c   d
1     0   -1   1   1
42    0    1   0   0
128   1   -1   0   1

I want to create a data frame of the following form:

id     one_entries
1      "c d"
42     "b"
128    "a d"

That is, the column "one_entries" contains the space-joined names of the columns whose entry in the original frame is 1.

score 5 · Accepted Answer

Here is one way: build a boolean mask and apply a lambda function to each row.

In [58]: df
Out[58]:
    id  a  b  c  d
0    1  0 -1  1  1
1   42  0  1  0  0
2  128  1 -1  0  1

In [59]: cols = list('abcd')

In [60]: (df[cols] > 0).apply(lambda x: ' '.join(x[x].index), axis=1)
Out[60]:
0    c d
1      b
2    a d
dtype: object

You can assign the result to df['one_entries'].
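For reference, the session above can be reproduced as a standalone script. This is a minimal sketch that rebuilds the sample frame from the question; the helper name names_of_true is made up for illustration and simply replaces the lambda.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})
cols = list("abcd")


def names_of_true(row):
    # row is a boolean Series indexed by column name;
    # row[row] keeps only the True entries, so its index holds the matching names.
    return " ".join(row[row].index)


# Hypothetical helper used in place of the lambda from the answer.
df["one_entries"] = (df[cols] > 0).apply(names_of_true, axis=1)
print(df[["id", "one_entries"]])
```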

Details of the applied function:

Take the first row.

In [83]: x = df[cols].iloc[0] > 0  # .ix was removed in pandas 1.0; .iloc selects by position

In [84]: x
Out[84]:
a    False
b    False
c     True
d     True
Name: 0, dtype: bool

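x holds the boolean values for the row: True where the value is greater than zero. x[x] returns only the True entries. Essentially, it is a Series with the column names as its index.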

In [85]: x[x]
Out[85]:
c    True
d    True
Name: 0, dtype: bool

x[x].index gives the column names.

In [86]: x[x].index
Out[86]: Index([u'c', u'd'], dtype='object')
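Note that the answer tests for values greater than zero, while the question asks for entries equal to 1; the two agree on the sample data but would differ if a column held, say, 2. The following is a sketch of a variant that compares against 1 exactly and skips the per-row apply by looping over a NumPy boolean array. It is not part of the original answer, and whether it is faster on real data is an assumption that should be measured.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})
cols = list("abcd")

# 2-D boolean array: True where the entry is exactly 1.
mask = df[cols].to_numpy() == 1

# For each row, join the names of the columns flagged True.
one_entries = [
    " ".join(name for name, hit in zip(cols, row) if hit)
    for row in mask
]

result = pd.DataFrame({"id": df["id"], "one_entries": one_entries})
print(result)
```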

score 2 · Accepted Answer

Same reasoning as John Galt's answer, but a bit shorter, and it builds a new DataFrame from a dict.

pd.DataFrame({
    'one_entries': (test_df > 0).apply(lambda x: ' '.join(x[x].index), axis=1)
})

#       one_entries
#   1           c d
#  42             b
# 128           a d
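The snippet above does not show how test_df is defined. Judging by the output, whose row labels are the id values, it is presumably the sample frame with id as its index; that is an assumption. Under it, a self-contained version might look like this:

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

# Assumption: test_df is the sample frame indexed by id,
# so only the a..d columns take part in the comparison.
test_df = df.set_index("id")

out = pd.DataFrame({
    "one_entries": (test_df > 0).apply(lambda x: " ".join(x[x].index), axis=1)
})
print(out)
```

Because id is the index, the resulting frame keeps the id labels without a separate merge step.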
