python - How to perform an operation on specific indices across the sublists of a list and group the results by another index

Question

There is a lot of code below, to show the skill level I am required to use for this task. Beginner techniques only, please.

def get_monthly_averages(original_list):

#print(original_list)
daily_averages_list = [ ]
product_vol_close = [ ] # used for numerator
monthly_averages_numerator_list = [ ]
for i in range (0, len(original_list)):
    month_list = original_list[i][0][0:7]        #Cutting day out of the date leaving Y-M 
    volume_str = float(original_list[i][5])        #V
    adj_close_str = float(original_list[i][6])       #C
    daily_averages_sublists = [month_list,volume_str,adj_close_str]    #[Date,V,C]
    daily_averages_list.append(daily_averages_sublists)
for i in range (0, len(daily_averages_list)):      #Attempt at operation
    vol_close = daily_averages_list[i][1]*daily_averages_list[i][2]
    month_help = daily_averages_list[i][0]
    product_vol_sublists = [month_help,vol_close]
    product_vol_close.append(product_vol_sublists)
    print(product_vol_close)
    for i in range (0, len(product_vol_close)):     #<-------TROUBLE STARTS
        for product_vol_close[i][0]==product_vol_close[i][0]:  #When the month is the same
            monthly_averages_numerator = product_vol_close[i][1]+product_vol_close[i][1]
          # monthly_averages_numerator = sum(product_vol_close[i][1])         #tried both
            month_assn = product_vol_close[i][0]
            numerator_list_sublists = [month_assn,monthly_averages_numerator]                
            monthly_averages_numerator_list.append(numerator_list_sublists)
            print(monthly_averages_numerator_list)

The original list has the following form:

[['2004-08-30', '105.28', '105.49', '102.01', '102.01', '2601000', '102.01'],
['2004-08-27', '108.10', '108.62', '105.69', '106.15', '3109000', '106.15'], 
['2004-08-26', '104.95', '107.95', '104.66', '107.91', '3551000', '107.91'],
['2004-08-25', '104.96', '108.00', '103.88', '106.00', '4598900', '106.00'],
['2004-08-24', '111.24', '111.60', '103.57', '104.87', '7631300', '104.87'], 
['2004-08-23', '110.75', '113.48', '109.05', '109.40', '9137200', '109.40'], 
['2004-08-20', '101.01', '109.08', '100.50', '108.31', '11428600', '108.31'],
['2004-08-19', '100.00', '104.06', '95.96', '100.34', '22351900', '100.34']]

Index 0 is the date, index 5 is V, and index 6 is C.

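I need to perform the operation below for each month separately, ending up with a tuple of two elements for each month: element 0 is the month-year and element 1 is the 'average_price', as shown further down. I am trying to take the values at index 5 and index 6 from each sublist of the original list and operate on them as follows... (I have to use beginner techniques for my class... thank you for understanding)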

average_price = (V1 * C1 + V2 * C2 + ... + Vn * Cn) / (V1 + V2 + ... + Vn)

(V = the element at index 5 of each sublist, C = the element at index 6 of each sublist)
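For concreteness, here is a minimal sketch of that formula applied to one group of rows, using only a loop and running totals. The function name `weighted_average` and the variable `sample` are illustrative and not part of the original post; the sketch assumes every row has the seven-field layout shown above.

```python
def weighted_average(rows):
    # rows: sublists in the format shown above
    # (index 0 = date, index 5 = volume V, index 6 = adjusted close C).
    numerator = 0.0    # running V1*C1 + V2*C2 + ... + Vn*Cn
    denominator = 0.0  # running V1 + V2 + ... + Vn
    for row in rows:
        volume = float(row[5])
        close = float(row[6])
        numerator += volume * close
        denominator += volume
    return numerator / denominator


# Two rows taken from the sample data, both in 2004-08.
sample = [
    ['2004-08-30', '105.28', '105.49', '102.01', '102.01', '2601000', '102.01'],
    ['2004-08-27', '108.10', '108.62', '105.69', '106.15', '3109000', '106.15'],
]
print(weighted_average(sample))
```

This handles a single group only; splitting the full list into one group per month is the part the question is asking about.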

My problem is performing the above task over just one month at a time, rather than over the whole list, so that I end up with a result like this:

[('month1',average_price),('month2',average_price),...]

I wrote

for i in range (0, len(product_vol_close)):     #<-------TROUBLE STARTS
    for product_vol_close[i][0]==product_vol_close[i][0]:

in an attempt to group the entries that share the same month and year.

I am trying to show what I am attempting to do. I cannot find an answer on how to make this work the way I want.

If anything is still confusing, please comment! Thank you for your patience and understanding.

I am completely lost.

score 1 · Accepted Answer

The key here is to stop using only lists and to use a dictionary, which lets you group things.

Normally I would use defaultdict from the collections module, but since this looks like homework where that may not be allowed, here is the "long" way to do it.

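In your sample data there is only one row per date, so the code snippet assumes the same. To make our lives easier, we store the data under a year-month key, since that is what the calculation is based on: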

>>> date_scores = {}
>>> for i in data:
...    year_month = i[0][:7] # this will be our key for the dictionary
...    if year_month not in date_scores:
...         # Check whether the key exists yet; if it doesn't, initialize
...         # the entry with an empty list, to which we will add the
...         # data for each day.
...         date_scores[year_month] = []
...    
...    date_scores[year_month].append(i[1:]) # Add the data to the list
...                                          # for the year-month combination
... 
>>> date_scores
{'2004-08': [['105.28', '105.49', '102.01', '102.01', '2601000', '102.01'], ['108.10', '108.62', '105.69', '106.15', '3109000', '106.15'], ['104.95', '107.95', '104.66', '107.91', '3551000', '107.91'], ['104.96', '108.00', '103.88', '106.00', '4598900', '106.00'], ['111.24', '111.60', '103.57', '104.87', '7631300', '104.87'], ['110.75', '113.48', '109.05', '109.40', '9137200', '109.40'], ['101.01', '109.08', '100.50', '108.31', '11428600', '108.31'], ['100.00', '104.06', '95.96', '100.34', '22351900', '100.34']]}

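Now the dictionary holds one list for each year-month combination. That list contains a sublist for every day of the month for which we have data. With that in place, you can do things like this: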

>>> print('We have data for {} days for 2004-08'.format(len(date_scores['2004-08'])))
We have data for 8 days for 2004-08

I think this solves most of your problem with the loop.
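To go from the grouped dictionary to the list of (month, average_price) tuples, one possible continuation is sketched below. It is not part of the original answer: the function name is borrowed from the question, the output is sorted by month as an arbitrary choice, and it assumes every row has the seven-field layout shown in the question. Note that because the date is sliced off with `[1:]`, V and C move from index 5 and 6 to index 4 and 5 inside each stored sublist.

```python
def get_monthly_averages(data):
    # Step 1: group the rows by 'YYYY-MM', exactly as in the snippet above.
    date_scores = {}
    for row in data:
        year_month = row[0][:7]
        if year_month not in date_scores:
            date_scores[year_month] = []
        date_scores[year_month].append(row[1:])

    # Step 2: compute one volume-weighted average per month.
    monthly_averages = []
    for year_month in sorted(date_scores):  # sorted order is an assumption
        numerator = 0.0    # sum of V * C for this month
        denominator = 0.0  # sum of V for this month
        for day in date_scores[year_month]:
            # The date was removed, so V is at index 4 and C at index 5.
            volume = float(day[4])
            close = float(day[5])
            numerator += volume * close
            denominator += volume
        monthly_averages.append((year_month, numerator / denominator))
    return monthly_averages
```

Calling `get_monthly_averages(original_list)` on data in the question's format should return one tuple per month that appears in the data, regardless of the order of the rows.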

score 0 · Accepted Answer

My suggestion is to stick to a single main loop over the rows of your data. Something like this (pseudocode):

current_month = None
monthly_value = []
monthly_volume = []

for row in data:
    date, volume, price = parse(row) # you need to write this yourself
    month = month_from_date(date) # this too

    if month != current_month: # do initialization for each new month
        current_month = month
        monthly_value.append(0)
        monthly_volume.append(0)

    monthly_value[-1] += volume*price # indexing with -1 gives last value
    monthly_volume[-1] += volume

You can then do a second loop to compute the averages. Note that this requires that your data be grouped by month. If your data is not so nicely organized, you could replace the lists in the above code with dictionaries (indexed by month). Or you could use a defaultdict (from the collections module in the standard library) which wouldn't require any per-month initialization. But perhaps that's a little more advanced than you want.
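As a rough illustration, the pseudocode and the defaultdict alternative could be filled in as follows. This is only a sketch under stated assumptions: the function names and the extra `months` list are my own additions, the field positions (date at index 0, volume at index 5, price at index 6) are taken from the question, and the first function relies on rows of the same month being adjacent in the data.

```python
from collections import defaultdict


def monthly_averages_single_pass(data):
    # Assumes rows belonging to the same month are adjacent in data.
    months = []          # hypothetical addition: remembers each month's label
    monthly_value = []   # sum of volume * price per month
    monthly_volume = []  # sum of volume per month
    current_month = None

    for row in data:
        month = row[0][:7]
        volume = float(row[5])
        price = float(row[6])

        if month != current_month:  # initialization for each new month
            current_month = month
            months.append(month)
            monthly_value.append(0.0)
            monthly_volume.append(0.0)

        monthly_value[-1] += volume * price
        monthly_volume[-1] += volume

    # Second loop: divide each month's total value by its total volume.
    averages = []
    for i in range(len(months)):
        averages.append((months[i], monthly_value[i] / monthly_volume[i]))
    return averages


def monthly_averages_defaultdict(data):
    # Works even when the rows are not grouped by month.
    value = defaultdict(float)   # missing keys start at 0.0
    volume_total = defaultdict(float)
    for row in data:
        month = row[0][:7]
        volume = float(row[5])
        value[month] += volume * float(row[6])
        volume_total[month] += volume

    averages = []
    for month in sorted(value):  # sorted order is an arbitrary choice
        averages.append((month, value[month] / volume_total[month]))
    return averages
```

The first function returns the months in the order they appear in the data, while the second returns them sorted, so the two only agree in order when the input is already in ascending month order.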
