python - Pandas for loop over a group by

Question

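I have a dataset with a categorical field, City, and two metrics, Age and Weight. I want to plot a scatter plot for each city using a loop. However, I am struggling to combine the group by and the loop I need into a single statement. If I just use a for loop, I get a graph for every record, and if I use a group by, I get the right number of graphs but with no values.

Here is my code using only the for loop, with the group by commented out: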

import pandas as pd
import numpy as np
import matplotlib.pylab as plt


d = {  'City': pd.Series(['London','New York', 'New York', 'London', 'Paris',
                        'Paris','New York', 'New York', 'London','Paris']),
       'Age' : pd.Series([36., 42., 6., 66., 38.,18.,22.,43.,34.,54]),
     'Weight': pd.Series([225,454,345,355,234,198,400, 256,323,310])
}

df = pd.DataFrame(d)

#for C in df.groupby('City'):
for C in df.City:
    fig = plt.figure(figsize=(5, 4))
    # Create an Axes object.
    ax = fig.add_subplot(1,1,1) # one row, one column, first plot
    # Plot the data.
    ax.scatter(df.Age,df.Weight, df.City == C, color="red", marker="^")

score 2 · Accepted Answer

Each call to plt.figure creates a new figure (roughly speaking, a window), so do not call it more than once if you want all cities on one plot.

import pandas as pd
import numpy as np
import matplotlib.pylab as plt

d = {'City': ['London', 'New York', 'New York', 'London', 'Paris',
                        'Paris', 'New York', 'New York', 'London', 'Paris'],
     'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
     'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310]}

df = pd.DataFrame(d)
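fig, ax = plt.subplots(figsize=(5, 4))    # 1
colors = ['red', 'blue', 'green']
# Plot each city's rows in its own color on the shared Axes.
for (city, grp), color in zip(df.groupby('City'), colors):
    grp.plot(kind='scatter', x='Age', y='Weight',
             ax=ax,          # 2
             color=color, label=city)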

plt.show()


1. plt.subplots returns a Figure, fig, and an Axes, ax.
2. If you pass ax=ax to pandas' plot method, all the plots appear on the same axes.
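As a rough check of the two points above, the sketch below reuses the question's data, creates the Figure and Axes once, and then counts what ended up where. The Agg backend and the names n_figures and n_collections are assumptions added for illustration so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'City': ['London', 'New York', 'New York', 'London', 'Paris',
             'Paris', 'New York', 'New York', 'London', 'Paris'],
    'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
    'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310],
})

plt.close('all')  # start from a clean slate

# One Figure and one Axes, created once, outside the loop.
fig, ax = plt.subplots(figsize=(5, 4))
for city, grp in df.groupby('City'):
    # ax=ax draws every group onto the same Axes instead of a new one.
    grp.plot(kind='scatter', x='Age', y='Weight', ax=ax, label=city)

n_figures = len(plt.get_fignums())    # figures currently open
n_collections = len(ax.collections)   # one scatter collection per city
print(n_figures, n_collections)
```

If plt.subplots were moved inside the loop instead, the figure count would grow with the number of groups.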

To create a separate figure for each city:

import pandas as pd
import numpy as np
import matplotlib.pylab as plt

d = {'City': ['London', 'New York', 'New York', 'London', 'Paris',
                        'Paris', 'New York', 'New York', 'London', 'Paris'],
     'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
     'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310]}

df = pd.DataFrame(d)
groups = df.groupby('City')
for city, grp in groups:                           # 1
    fig, ax = plt.subplots(figsize=(5, 4))
    grp.plot(kind='scatter', x='Age', y='Weight',  # 2
             ax=ax, title=city)

    plt.show()

This is probably all you were missing.
1. Iterating over a GroupBy object yields 2-tuples: the groupby key and the sub-DataFrame.
2. Inside the for loop, use grp, the sub-DataFrame, rather than df.
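To make the difference concrete, here is a minimal sketch using the question's data. It compares iterating over the City column with iterating over the GroupBy object. The names row_values and group_sizes are illustrative only, and grouping by the string 'City' rather than the list ['City'] is a choice made here so that the key is a plain string (recent pandas versions are understood to yield a one-element tuple when grouping by a list).

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['London', 'New York', 'New York', 'London', 'Paris',
             'Paris', 'New York', 'New York', 'London', 'Paris'],
    'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
    'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310],
})

# Iterating over the column yields one value per row, which is why
# looping over df.City produces one plot per record.
row_values = list(df.City)

# Iterating over the GroupBy object yields (key, sub-DataFrame) pairs,
# one per distinct city.
group_sizes = {city: len(grp) for city, grp in df.groupby('City')}

print(len(row_values))   # 10
print(group_sizes)       # {'London': 3, 'New York': 4, 'Paris': 3}
```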

score 2 · Accepted Answer

I took the group by from the other post, inserted it into my code, and generated a graph for each group as follows.

import pandas as pd
import numpy as np
import matplotlib.pylab as plt


d = {  'City': pd.Series(['London','New York', 'New York', 'London','Paris',
                        'Paris','New York', 'New York', 'London','Paris']),
       'Age' : pd.Series([36., 42., 6., 66., 38.,18.,22.,43.,34.,54]) ,
     'Weight': pd.Series([225,454,345,355,234,198,400, 256,323,310])

}

df = pd.DataFrame(d)

groups = df.groupby('City')
for city, grp in groups:
    fig = plt.figure(figsize=(5, 4))
    # Create an Axes object.
    ax = fig.add_subplot(1,1,1) # one row, one column, first plot
    # Plot only this city's rows (grp), not the whole DataFrame (df).
    ax.scatter(grp.Age, grp.Weight, color="red", marker="^")
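One way to confirm that a per-group loop gives one figure per city, each holding only that city's points, is sketched below. It assumes the Agg backend so that no window opens, and points_per_figure is a hypothetical helper dictionary added for the check. The scatter call uses the sub-DataFrame grp rather than df.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'City': ['London', 'New York', 'New York', 'London', 'Paris',
             'Paris', 'New York', 'New York', 'London', 'Paris'],
    'Age': [36., 42., 6., 66., 38., 18., 22., 43., 34., 54],
    'Weight': [225, 454, 345, 355, 234, 198, 400, 256, 323, 310],
})

plt.close('all')  # start from a clean slate

points_per_figure = {}
for city, grp in df.groupby('City'):
    # A new Figure and Axes for each city.
    fig, ax = plt.subplots(figsize=(5, 4))
    ax.scatter(grp.Age, grp.Weight, color="red", marker="^")
    ax.set_title(city)
    # Record how many points were drawn on this figure's Axes.
    points_per_figure[city] = len(ax.collections[0].get_offsets())

n_figures = len(plt.get_fignums())
print(n_figures)           # 3
print(points_per_figure)   # {'London': 3, 'New York': 4, 'Paris': 3}
```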
