python - How do I change a pandas DataFrame index value?

Question

I have a df:

>>> df
                   sales     cash
STK_ID RPT_Date                  
000568 20120930   80.093   57.488
000596 20120930   32.585   26.177
000799 20120930   14.784    8.157

I want to change the index value of the first row from ('000568','20120930') to ('000999','20121231'). The final result should look like this:

>>> df
                   sales     cash
STK_ID RPT_Date                  
000999 20121231   80.093   57.488
000596 20120930   32.585   26.177
000799 20120930   14.784    8.157

How can I achieve this?

score 22 · Accepted Answer

With this setup:

import pandas as pd
import io

text = '''\
STK_ID RPT_Date sales cash
000568 20120930 80.093 57.488
000596 20120930 32.585 26.177
000799 20120930 14.784 8.157
'''

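# io.StringIO wraps a str for read_csv; io.BytesIO requires bytes on Python 3
df = pd.read_csv(io.StringIO(text), delimiter = ' ',
                 converters = {0:str, 1:str})  # keep both index columns as strings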
df.set_index(['STK_ID','RPT_Date'], inplace = True)

You can replace the index by assigning a new MultiIndex to df.index, like this:

index = df.index
names = index.names
index = [('000999','20121231')] + df.index.tolist()[1:]
df.index = pd.MultiIndex.from_tuples(index, names = names)
print(df)
#                   sales    cash
# STK_ID RPT_Date                
# 000999 20121231  80.093  57.488
# 000596 20120930  32.585  26.177
# 000799 20120930  14.784   8.157
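The same reassignment can be wrapped in a small helper that replaces the entry at any position, not just the first row. This is only a sketch: `replace_index_entry` is a hypothetical name that does not appear in the answer, and the DataFrame is rebuilt inline with both index levels as strings so the block runs on its own.

```python
import pandas as pd


def replace_index_entry(df, position, new_key):
    """Return a copy of df whose index entry at `position` is `new_key`.

    Hypothetical helper: same technique as above (rebuild the MultiIndex
    from a list of tuples), but without mutating the input frame.
    """
    entries = df.index.tolist()          # list of (STK_ID, RPT_Date) tuples
    entries[position] = new_key          # swap in the new tuple
    result = df.copy()
    result.index = pd.MultiIndex.from_tuples(entries, names=df.index.names)
    return result


# Sample data from the question, with both index levels stored as strings.
df = pd.DataFrame(
    {"sales": [80.093, 32.585, 14.784], "cash": [57.488, 26.177, 8.157]},
    index=pd.MultiIndex.from_tuples(
        [("000568", "20120930"), ("000596", "20120930"), ("000799", "20120930")],
        names=["STK_ID", "RPT_Date"],
    ),
)

new_df = replace_index_entry(df, 0, ("000999", "20121231"))
print(new_df)
```

Returning a copy leaves the original `df` untouched, which is convenient when the old index is still needed; assign the result back to `df` to get the in-place behaviour of the snippet above.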

Alternatively, you can turn the index into columns, reassign the column values, and then set the columns back as the index.

df.reset_index(inplace = True)
# .ix was removed in pandas 1.0; after reset_index the row label 0 is also position 0
df.loc[0, ['STK_ID', 'RPT_Date']] = ['000999','20121231']
df = df.set_index(['STK_ID','RPT_Date'])
print(df)

#                   sales    cash
# STK_ID RPT_Date                
# 000999 20121231  80.093  57.488
# 000596 20120930  32.585  26.177
# 000799 20120930  14.784   8.157

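Benchmarks using IPython's %timeit show that reassigning the index (the first method above) is significantly faster than resetting the index, changing the column values, and setting the index again (the second method above).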

In [2]: %timeit reassign_index(df)
10000 loops, best of 3: 158 us per loop

In [3]: %timeit reassign_columns(df)
1000 loops, best of 3: 843 us per loop
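The answer does not show the definitions of `reassign_index` and `reassign_columns`. One plausible reconstruction, assuming each function simply wraps the corresponding method above, is sketched here. It uses the standard `timeit` module in place of IPython's `%timeit`, and each function works on a copy so repeated calls start from the same data. The absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

NEW_KEY = ("000999", "20121231")


def make_df():
    # Sample data from the question, with both index levels as strings.
    return pd.DataFrame(
        {"sales": [80.093, 32.585, 14.784], "cash": [57.488, 26.177, 8.157]},
        index=pd.MultiIndex.from_tuples(
            [("000568", "20120930"), ("000596", "20120930"), ("000799", "20120930")],
            names=["STK_ID", "RPT_Date"],
        ),
    )


def reassign_index(df):
    # First method: build a new MultiIndex and assign it to the frame.
    entries = [NEW_KEY] + df.index.tolist()[1:]
    out = df.copy()
    out.index = pd.MultiIndex.from_tuples(entries, names=df.index.names)
    return out


def reassign_columns(df):
    # Second method: index -> columns, edit the cells, columns -> index.
    out = df.reset_index()               # reset_index returns a new frame
    out.loc[0, ["STK_ID", "RPT_Date"]] = list(NEW_KEY)
    return out.set_index(["STK_ID", "RPT_Date"])


df = make_df()
for func in (reassign_index, reassign_columns):
    seconds = timeit.timeit(lambda: func(df), number=200)
    print(f"{func.__name__}: {seconds / 200 * 1e6:.0f} us per call")
```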
