python - Python CSV homework program

Question

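I have a homework assignment that involves reading a file using csv and functions.

The basic idea is to calculate the rusher rating of football players over two years, using the data in a provided file. A sample of the file looks like this: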

name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year
A.J.,Feeley,QB,STL,5,3,4,0,0,1.3,3,2,20.3,2011
Aaron,Brown,RB,DET,1,1,0,0,0,0,0,0,0.9,2011
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011

So I need to skip the first line of the file and read the remaining lines, splitting each one on commas.

To calculate the rusher rating, the following values are needed:

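Yds is the average yards gained per attempt, computed as [total yards / (4.05 * attempts)]. If this number is greater than 2.375, then 2.375 should be used instead.

perTDs is the percentage of touchdowns per carry, computed as [(39.5 * touchdowns) / attempts]. If this number is greater than 2.375, then 2.375 should be used instead.

perFumbles is the percentage of fumbles per carry, computed as [2.375 - ((21.5 * fumbles) / attempts)].

The rusher rating is [Yds + perTDs + perFumbles] * (100 / 4.5).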
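
As a sanity check of the formulas above, here is a minimal sketch that applies them to the Adrian Peterson row from the sample (208 attempts, 970 yards, 12 touchdowns, 1 fumble). The function name `rusher_rating` and its parameter names are illustrative and not part of the assignment; the code is written in Python 3 syntax.

```python
def rusher_rating(attempts, total_yards, touchdowns, fumbles):
    """Return the rusher rating as described in the assignment text."""
    # The yards and touchdown components are each capped at 2.375.
    yds = min(total_yards / (4.05 * attempts), 2.375)
    per_tds = min(39.5 * touchdowns / attempts, 2.375)
    # The assignment text gives no cap for the fumble component.
    per_fumbles = 2.375 - (21.5 * fumbles / attempts)
    return (yds + per_tds + per_fumbles) * (100 / 4.5)


# Adrian Peterson, 2011: rush=208, ryds=970, rtd=12, fum=1
peterson = rusher_rating(208, 970, 12, 1)
print(round(peterson, 2))
```

For this row the three components are roughly 1.15, 2.28 and 2.27, which gives a rating of about 126.71. Using `min()` is one compact way to express the "use 2.375 instead" rule.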

Here is the code I have so far:

playerinfo = []
teaminfo10 = []
teaminfo11 = []

import csv

file = raw_input("Enter filename: ")
read = open(file,"rU")
read.readline()
fileread = csv.reader(read)

#Each line is iterated through, and if rush attempts are greater than 10, the
#player may be used for further statistics.
for playerData in fileread:
    if int(playerData[5]) > 10:
    
        attempts = int(playerData[5])
        totalYards = int(playerData[6])
        touchdowns = int(playerData[7])
        fumbles = int(playerData[10])
    
        #Rusher rating for each player is found. This rating, coupled with other
        #data about the player is formatted and appended into a list of players.
        rushRating = ratingCalc(attempts,totalYards,touchdowns,fumbles)
        rusherData = rushFunc(playerData,rushRating)
        playerinfo.append(rusherData)
    
        #Different data about the player is formatted and added to one of two
        #lists of teams, based on year. 
        teamData = teamFunc(playerData)
        if playerData[13] == '2010':
            teaminfo10.append(teamData)
        else:
            teaminfo11.append(teamData)

#The list of players is sorted in order of decreasing rusher rating.
playerinfo.sort(reverse = True)
#The two team lists of players are sorted by team.
teaminfo10.sort()
teaminfo11.sort()

print "The following statistics are only for the years 2010 and 2011."
print "Only those rushers who have rushed more than 10 times are included."
print
print "The top 50 rushers based on their rusher rating in individual years are:"

#50 players, in order of decreasing rusher ratings, are printed along with other
#data.
rushPrint(playerinfo,50)

#A similar list of running backs is created, in order of decreasing rusher
#ratings.
RBlist = []
for player in playerinfo:
    if player[2] == 'RB':
        RBlist.append(player)

print "\nThe top 20 running backs based on their rusher rating in individual\
years are:"
#The top 20 running backs on the RBlist are printed, with other data.
rushPrint(RBlist,20)


#The teams with the greatest overall rusher rating (if their attempts are
#greater than 10) are listed in order of decreasing rusher rating, for both 2010
#and 2011.
teamListFunc(teaminfo10,'2010')

teamListFunc(teaminfo11,'2011')

#The player(s) with the most yardage is printed.
yardsList = mostStat(6,fObj,False)
print "\nThe people who rushed for the most yardage are:"
for item in yardsList:
    print "%s rushing for %d yards for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most touchdowns is printed.
TDlist = mostStat(7,fObj,False)
print"\nThe people who have scored the most rushing touchdowns are:"
for item in TDlist:
    print "%s rushing for %d touchdowns for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most yardage per rushing attempt is printed.
ypaList = mostStat(6,fObj,True)
print"\nThe people who have the highest yards per rushing attempt with over 10\
rushes are:"
for item in ypaList:
    print "%s with a %.2f yards per attempt rushing average for %s in %s."\
    % (item[1],item[0],item[2],item[3])

#The player(s) with the most fumbles is printed.
fmblList = mostStat(10,fObj,False)
print"\nThere are %d people with the most fumbles. They are:" % (len(fmblList))
for item in fmblList:
    print "%s with %d fumbles for %s in %s." % (item[1],item[0],item[2],item[3])


def ratingCalc(atts,totalYrds,TDs,fmbls):
    """Calculates rusher rating."""
    yrds = totalYrds / (4.05 * atts)
    if yrds > 2.375:
        yrds = 2.375

    perTDs = 39.5 * TDs / atts
    if perTDs > 2.375:
        perTDs = 2.375

    perFumbles = 2.375 - (21.5 * fmbls / atts)

    rating = (yrds + perTDs + perFumbles) * (100/4.5)

    return rating    

def rushFunc(information,rRating):
    """Formats player info into [rating,name,pos,team,yr,atts]"""
    rusherInfo = []
    rusherInfo.append(rRating)
    name = information[0] + ' ' + information[1]
    rusherInfo.append(name)
    rusherInfo.append(information[2])
    rusherInfo.append(information[3])
    rusherInfo.append(information[13])
    rusherInfo.append(information[5])

    return rusherInfo


def teamFunc(plyrInfo):
    """Formats player info into [team,atts,yrds,TDs,fmbls] for team sorting"""
    teamInfo = []
    teamInfo.append(plyrInfo[3])
    teamInfo.append(plyrInfo[5])
    teamInfo.append(plyrInfo[6])
    teamInfo.append(plyrInfo[7])
    teamInfo.append(plyrInfo[10])

    return teamInfo

def rushPrint(lst,num):
    """Prints players and their data in order of rusher rating."""
    print "Name                           Pos   Year  Attempts   Rating  Team"
    count = 0
    while count < num:
        index = lst[count]
        print "%-30s %-5s %4s  %5s      %3.2f  %s"\
              % (index[1],index[2],index[4],index[5],index[0],index[3])
        count += 1

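So yes, there are still a lot of functions I have to define. But what do you think of the code so far? Is it inefficient? Can you tell me what is wrong with it? This code looks like it is going to be incredibly long (around 300 lines), but my teacher said it should be a relatively short project.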

score 3 · Accepted Answer

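Here is some code that will greatly simplify the whole project.

It may take a little while to get your head around the task at hand, but overall, working with the correct data types and "associative arrays" (dictionaries) will make your life much easier.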

import csv

with open('mycsv.txt', 'r') as csv_file:
    reader = csv.DictReader(csv_file)
    # reads each csv row as a dictionary keyed by the header names

    list_of_players = [dict(row) for row in reader]
    # stores every row's dictionary as a separate element in a list, so the
    # data is not a one-time iterator and stays accessible after the file closes

for i in list_of_players:
    for stat in ['rush','ryds','rtd','fum','fuml','year']:
        i[stat] = int(i[stat])
    # the loop above turns the values meant to be integers into ints (they start as strings)
    for stat in ['fpts','ravg','rtdr']:
        i[stat] = float(i[stat])
    # the loop above turns the values meant to be floats into floats (they start as strings)

for i in list_of_players:
    print i['name'], i[' '], i['fpts']
    # now you can access and loop through your players with meaningful names,
    # using keys such as 'fpts' rather than positional indexes such as [5]
    # (the last-name column has ' ' as its key because that is its header text)

This sample code shows how easy it is to work with the names and stats (first name, last name, fpts).

>>> 
A.J. Feeley 20.3
Aaron Brown 0.9
Aaron Rodgers 403.4
Adrian Peterson 188.9
Ahmad Bradshaw 156.6

Of course, it will take some tweaking to get all of the requested stats (the maximums and so on), but keeping the data types correct from the start makes those tasks much more straightforward.
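
As one possible way to get the "most of a stat" lists, the sketch below loads a few rows from the question's sample into dictionaries and returns every player tied for the maximum. The helper names `load_players` and `players_with_most` are hypothetical, the sample is inlined so that no file is needed, and the code uses Python 3 syntax rather than the Python 2 used elsewhere in this thread.

```python
import csv
import io

# A few rows from the question's sample, inlined so the sketch runs without a file.
SAMPLE = """name, ,pos,team,g,rush,ryds,rtd,rtdr,ravg,fum,fuml,fpts,year
Aaron,Rodgers,QB,GB,15,60,257,3,5,4.3,4,0,403.4,2011
Adrian,Peterson,RB,MIN,12,208,970,12,5.8,4.7,1,0,188.9,2011
Ahmad,Bradshaw,RB,NYG,12,171,659,9,5.3,3.9,1,1,156.6,2011
"""


def load_players(text):
    """Parse csv text into a list of dicts, converting the counting stats to int."""
    players = []
    for row in csv.DictReader(io.StringIO(text)):
        player = dict(row)
        for stat in ('rush', 'ryds', 'rtd', 'fum'):
            player[stat] = int(player[stat])
        players.append(player)
    return players


def players_with_most(players, stat):
    """Return every player whose value for `stat` equals the maximum (ties included)."""
    best = max(p[stat] for p in players)
    return [p for p in players if p[stat] == best]


players = load_players(SAMPLE)
most_yards = players_with_most(players, 'ryds')
for p in most_yards:
    print("%s %s rushed for %d yards for %s in %s."
          % (p['name'], p[' '], p['ryds'], p['team'], p['year']))
```

Because the stats are already integers, `max()` compares them numerically. Left as strings, '970' would compare lower than '99'.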

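This assignment can now be done (using these structures) in far fewer than 300 lines. The more you use Python, the more you will pick up the conventional idioms for doing these things. lambda and sorted() are examples of tools you will fall in love with... soon!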
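
To illustrate the `sorted()` and lambda idiom mentioned above, here is a small sketch that ranks players by a computed value. The `players` list is a hand-built stand-in for the converted dictionaries (values taken from the question's sample, with first and last name joined for brevity), and `top_n` is a hypothetical helper name.

```python
# Hypothetical, already-converted player dictionaries (values from the question's sample).
players = [
    {'name': 'Aaron Rodgers', 'pos': 'QB', 'rush': 60, 'ryds': 257},
    {'name': 'Adrian Peterson', 'pos': 'RB', 'rush': 208, 'ryds': 970},
    {'name': 'Ahmad Bradshaw', 'pos': 'RB', 'rush': 171, 'ryds': 659},
]


def top_n(players, key, n):
    """Return up to n players with the largest key(player), best first."""
    return sorted(players, key=key, reverse=True)[:n]


# Rank by yards per attempt, computed on the fly by a lambda.
by_ypa = top_n(players, lambda p: p['ryds'] / float(p['rush']), 2)

# Rank running backs only, by total yards.
running_backs = [p for p in players if p['pos'] == 'RB']
top_rbs = top_n(running_backs, lambda p: p['ryds'], 20)

for p in by_ypa:
    print("%-20s %.2f" % (p['name'], p['ryds'] / float(p['rush'])))
```

The same pattern works with any key function, including one that computes the rusher rating. Slicing with `[:n]` simply returns fewer items when the list is shorter than `n`, so asking for the top 20 of two running backs does not raise an error.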
