python - Build a new DataFrame from an existing DataFrame that has a column of lists (using the lists to populate new rows)

Question

I have a DataFrame like this:

import pandas as pd

# Build the list column directly; assigning through df['choices'][0] = ...
# is chained assignment and may not modify df in recent pandas versions.
df = pd.DataFrame({
    'choices': [[1, 2, 3], [5, 4, 3, 1], [6, 3, 2, 1, 5, 4]],
    'name': ['toto', 'tata', 'tati'],
})

print(df)

             choices  name
0           [1, 2, 3]  toto
1        [5, 4, 3, 1]  tata
2  [6, 3, 2, 1, 5, 4]  tati

Based on this df, I would like to build a DataFrame like this:

             choice  rank  name
0                 1     0  toto
1                 2     1  toto
2                 3     2  toto
3                 5     0  tata
4                 4     1  tata
5                 3     2  tata
6                 1     3  tata
7                 6     0  tati
8                 3     1  tati
9                 2     2  tati
10                1     3  tati
11                5     4  tati
12                4     5  tati

I want to create one new row for each value in a list, together with that value's index in the list.

This is what I did:

import numpy as np

size = df['choices'].map(len).sum()
df2 = pd.DataFrame(index=range(size), columns=df.columns)
del df2['choices']
df2['choice'] = np.nan
df2['rank'] = np.nan

k = 0
for i in df.index:
    choices = df['choices'][i]
    for rank, choice in enumerate(choices):
        # .at writes a single cell in place; df2['name'][k] = ... is chained
        # assignment and may write to a temporary copy instead of df2
        df2.at[k, 'name'] = df.at[i, 'name']
        df2.at[k, 'choice'] = choice
        df2.at[k, 'rank'] = rank
        k += 1

But I would prefer a vectorized solution. Is that possible in Python/pandas?

score 5 · Accepted Answer

In [4]: s = df.choices.apply(pd.Series).stack().dropna().astype(int)

In [5]: s.name = 'choices' # needs a name to join

In [6]: del df['choices']

In [7]: df1 = df.join(s.reset_index(level=1))

In [8]: df1.columns = ['name', 'rank', 'choice']

In [9]: df1.sort_values(['name', 'rank']).reset_index(drop=True)
Out[9]: 
    name  rank  choice
0   tata     0       5
1   tata     1       4
2   tata     2       3
3   tata     3       1
4   tati     0       6
5   tati     1       3
6   tati     2       2
7   tati     3       1
8   tati     4       5
9   tati     5       4
10  toto     0       1
11  toto     1       2
12  toto     2       3
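
The output above is sorted by name, whereas the question lists the rows in the original order (toto, tata, tati) with the columns choice, rank, name. As a minimal sketch of the same stack-and-join idea that skips the sort, something like the following may work. The names `wide`, `long` and `result` are illustrative and not part of the answer, and the `dropna()` call is a defensive assumption so that the NaN padding is removed regardless of how the installed pandas version's `stack()` treats missing values.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['toto', 'tata', 'tati'],
    'choices': [[1, 2, 3], [5, 4, 3, 1], [6, 3, 2, 1, 5, 4]],
})

# One row per list, one column per list position; shorter lists are padded with NaN.
wide = df['choices'].apply(pd.Series)

# Long form: the index becomes (original row label, position in list).
# dropna() removes the padding and astype(int) undoes the float upcast it caused.
s = wide.stack().dropna().astype(int).rename('choice')

# Level 1 of the stacked index is the position in the list, i.e. the rank.
long = s.reset_index(level=1).rename(columns={'level_1': 'rank'})

# Join on the original row label; without a sort the rows keep the order of df.
result = df[['name']].join(long).reset_index(drop=True)[['choice', 'rank', 'name']]
print(result)
```

Because the join is keyed on the original row label, each name is repeated once per element of its list, and no explicit loop over rows is needed.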

This is related to this solution of mine, but in your case you use the index (the rank) instead of dropping it.
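
On pandas 0.25 or later, `DataFrame.explode` offers a shorter route to the same long format. The following is a minimal sketch rather than part of the answer above: the names `long` and `result` are illustrative, and it assumes every cell of `choices` is a non-empty list of integers (an empty list would explode to NaN and make the integer cast fail).

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['toto', 'tata', 'tati'],
    'choices': [[1, 2, 3], [5, 4, 3, 1], [6, 3, 2, 1, 5, 4]],
})

# explode() emits one row per list element and repeats the original row label,
# so the index of `long` is 0, 0, 0, 1, 1, 1, 1, 2, ...
long = df.explode('choices').rename(columns={'choices': 'choice'})

# The rank is the running count within each original row label.
long['rank'] = long.groupby(level=0).cumcount()

# explode() leaves an object column; cast back to integers (assumes no NaN).
long['choice'] = long['choice'].astype(int)

result = long.reset_index(drop=True)[['choice', 'rank', 'name']]
print(result)
```

Grouping on the repeated index label is what recovers the position of each element within its list, which plays the same role as the second index level in the `stack()` approach.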
