python-3.x - Flatten a dictionary using pandas

Question

   [{'name': 'Test Item1',
  'column_values': [{'title': 'col2', 'text': 'Oladimeji Olaolorun'},
   {'title': 'col3', 'text': 'Working on it'},
   {'title': 'col4', 'text': '2019-09-17'},
   {'title': 'col5', 'text': '1'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item2',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-20'},
   {'title': 'col5', 'text': '2'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Test Item3',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': None},
   {'title': 'col4', 'text': '2019-09-25'},
   {'title': 'col5', 'text': '3'}],
  'group': {'title': 'Group 1'}},
 {'name': 'Item 4',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Stuck'},
   {'title': 'col4', 'text': '2019-09-06'},
   {'title': 'col5', 'text': '4'}],
  'group': {'title': 'Group 2'}},
 {'name': 'Item 5',
  'column_values': [{'title': 'col2', 'text': 'David Binns'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2019-09-28'},
   {'title': 'col5', 'text': '5'}],
  'group': {'title': 'Group 2'}},
 {'name': 'item 6',
  'column_values': [{'title': 'col2', 'text': 'Lucie Phillips'},
   {'title': 'col3', 'text': 'Done'},
   {'title': 'col4', 'text': '2020-03-05'},
   {'title': 'col5', 'text': '76'}],
  'group': {'title': 'Group 2'}}]

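I am currently pulling data from the Monday.com API. My call returns the response above, a list of nested dicts. I am trying to find the best way to flatten it into a DataFrame.

I am currently using json_normalize(results['data']['boards'][0]['items']), which appears to give the result below.

The desired output is a table like the one below.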

score 0 · Accepted Answer

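With the glom module you can easily extract the required 'text' keys from the nested lists. Load the data into a pandas DataFrame, split the names column, and finally merge the result back into the parent DataFrame.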

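import pandas as pd
from glom import glom

# For each item: collect the 'text' of every column value, the group title, and the item name.
spec = {'names':('column_values',['text']),
        'group': 'group.title',
        'Name' : 'name'
        }

# Apply the spec to every item; `data` is the list shown in the question.
M = glom(data, [spec])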

This function replaces None entries with the string 'None'.

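def replace_none(val_list):
    # str.join below cannot handle None, so substitute the string 'None'.
    val_list = ['None' if v is None else v for v in val_list]
    return val_list

for i in M:
    i['names'] = replace_none(i['names'])

df = pd.DataFrame(M)

# Join each list into one string, then split it into separate columns Col0, Col1, ...
# Note: this breaks if any value itself contains a comma.
df_split = df['names'].str.join(',').str.split(',',expand=True).add_prefix('Col')

df = df.drop('names',axis=1)

# Put the split columns next to the group and Name columns.
pd.concat([df,df_split],axis=1)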

    group   Name         Col0                Col1              Col2   Col3
0   Group 1 Test Item1  Oladimeji Olaolorun Working on it   2019-09-17  1
1   Group 1 Test Item2  Lucie Phillips      Done            2019-09-20  2
2   Group 1 Test Item3  David Binns         None            2019-09-25  3
3   Group 2 Item 4      Lucie Phillips      Stuck           2019-09-06  4
4   Group 2 Item 5      David Binns         Done            2019-09-28  5
5   Group 2 item 6      Lucie Phillips      Done            2020-03-05  76
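
The join/split round trip above is only needed to turn each list into columns, and it would misbehave if a value contained a comma. As a minimal sketch of an alternative (not part of the original answer), the list column can be expanded directly with the DataFrame constructor. The `rows` variable below is a hypothetical stand-in for the glom output `M`, using two items copied from the question.

```python
import pandas as pd

# Hypothetical stand-in for the glom output M (values copied from the question).
rows = [
    {"group": "Group 1", "Name": "Test Item1",
     "names": ["Oladimeji Olaolorun", "Working on it", "2019-09-17", "1"]},
    {"group": "Group 1", "Name": "Test Item3",
     "names": ["David Binns", None, "2019-09-25", "3"]},
]

df = pd.DataFrame(rows)

# Expand each list into its own columns; no string join/split, so commas
# inside values are safe and None does not need to be replaced first.
expanded = pd.DataFrame(df["names"].tolist(), index=df.index).add_prefix("Col")

result = pd.concat([df.drop(columns="names"), expanded], axis=1)
print(result)
```

With this approach a missing value stays a missing value in the frame instead of becoming the string 'None'.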

Update: all of the code above is unnecessary. The code below is simpler, less verbose, and clearer.

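import pandas as pd

# Build one row per column value, tagged with its item's name and group title.
d=[]
for ent in data:
    for entry in ent['column_values']:
        # Copy the entry so the original `data` is not mutated.
        d.append({**entry, 'name':ent['name'], 'group':ent['group']['title']})

res = pd.DataFrame(d)

# Move the 'title' index level into the columns: one row per (name, group).
res.set_index(['name','group','title']).unstack()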

                                                               text
              title col2                col3            col4    col5
name         group              
Item 4      Group 2 Lucie Phillips      Stuck           2019-09-06  4
Item 5      Group 2 David Binns         Done            2019-09-28  5
Test Item1  Group 1 Oladimeji Olaolorun Working on it   2019-09-17  1
Test Item2  Group 1 Lucie Phillips      Done            2019-09-20  2
Test Item3  Group 1 David Binns         None            2019-09-25  3
item 6      Group 2 Lucie Phillips      Done            2020-03-05  76
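
Since the question started from json_normalize, here is a hedged sketch of how the same long-then-wide reshaping might be done with `pd.json_normalize` and its `record_path` and `meta` parameters, followed by `pivot`. It is an alternative illustration rather than the answer's own method; the `data` list is a two-item excerpt of the response in the question, and passing a list as the `pivot` index assumes pandas 1.1 or later.

```python
import pandas as pd

# Two-item excerpt of the response shown in the question.
data = [
    {"name": "Test Item1",
     "column_values": [{"title": "col2", "text": "Oladimeji Olaolorun"},
                       {"title": "col3", "text": "Working on it"},
                       {"title": "col4", "text": "2019-09-17"},
                       {"title": "col5", "text": "1"}],
     "group": {"title": "Group 1"}},
    {"name": "Item 4",
     "column_values": [{"title": "col2", "text": "Lucie Phillips"},
                       {"title": "col3", "text": "Stuck"},
                       {"title": "col4", "text": "2019-09-06"},
                       {"title": "col5", "text": "4"}],
     "group": {"title": "Group 2"}},
]

# One row per column value; the item name and group title are repeated as metadata.
flat = pd.json_normalize(
    data,
    record_path="column_values",
    meta=["name", ["group", "title"]],
)

# Spread the 'title' values into columns, one row per (name, group.title).
wide = (
    flat.pivot(index=["name", "group.title"], columns="title", values="text")
        .reset_index()
        .rename_axis(columns=None)
)
print(wide)
```

The nested meta path `["group", "title"]` ends up as a column named `group.title`, so it does not collide with the `title` key inside each column value.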
