-2

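I have an Excel report that is published daily, and I need to summarize it and provide trend analysis. The report contains a list of work items, each with a created date and a work item type. How can I get the number of work items created in 2011 and in 2012? And how can I get a count for each work item type? So far I have been able to load the Excel data and get the number of rows like this: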

library(gdata)
wi20121812 = read.xls("WorkItemReport20121812.xls")
nrow(wi20121812)

Sample data

   > dput(head(workItemReport2))
structure(list(DocType = structure(c(6L, 7L, 6L, 6L, 8L, 6L), .Label = c("TYPE10WI", 
"TYPE11WI", "TYPE12WI", "TYPE13WI", "TYPE14WI", "TYPE1WI", "TYPE2WI", 
"TYPE3WI", "TYPE4WI", "TYPE5WI", "TYPE6WI", "TYPE7WI", "TYPE8WI", 
"TYPE9WI"), class = "factor"), CreatedDate = structure(c(7L, 
22L, 146L, 181L, 153L, 191L), .Label = c("1/10/12 15:43 AM/PM ", 
"1/10/12 16:06 AM/PM ", "1/10/12 5:28 AM/PM ", "1/10/12 5:56 AM/PM ", 
"1/11/12 19:51 AM/PM ", "1/11/12 5:26 AM/PM ", "1/12/11 21:58 AM/PM ", 
"1/12/12 11:08 AM/PM ", "1/12/12 5:41 AM/PM ", "1/12/12 9:56 AM/PM ", 
"1/13/12 14:01 AM/PM ", "1/13/12 15:08 AM/PM ", "1/13/12 15:11 AM/PM ", 
"1/13/12 8:51 AM/PM ", "1/16/12 10:27 AM/PM ", "1/16/12 10:28 AM/PM ", 
"1/16/12 16:37 AM/PM ", "1/16/12 7:52 AM/PM ", "1/18/12 15:02 AM/PM ", 
"1/18/12 16:03 AM/PM ", "1/18/12 16:13 AM/PM ", "1/19/11 19:23 AM/PM ", 
"1/20/12 10:48 AM/PM ", "1/20/12 12:23 AM/PM ", "1/20/12 8:38 AM/PM ", 
"1/23/12 5:53 AM/PM ", "1/24/12 15:18 AM/PM ", "1/24/12 8:23 AM/PM ", 
"1/24/12 8:58 AM/PM ", "1/25/12 11:38 AM/PM ", "1/25/12 5:28 AM/PM ", 
"1/26/12 13:48 AM/PM ", "1/26/12 15:53 AM/PM ", "1/26/12 15:58 AM/PM ", 
"1/26/12 16:13 AM/PM ", "1/26/12 16:18 AM/PM ", "1/26/12 7:33 AM/PM ", 
"1/27/12 7:48 AM/PM ", "1/3/12 17:48 AM/PM ", "1/3/12 18:33 AM/PM ", 
"1/3/12 9:07 AM/PM ", "1/30/12 11:22 AM/PM ", "1/30/12 22:52 AM/PM ", 
"1/30/12 23:10 AM/PM ", "1/31/12 19:54 AM/PM ", "1/31/12 20:39 AM/PM ", 
"1/31/12 5:42 AM/PM ", "1/31/12 9:42 AM/PM ", "1/4/12 14:02 AM/PM ", 
"1/4/12 9:52 AM/PM ", "1/5/12 13:42 AM/PM ", "1/5/12 17:42 AM/PM ", 
....
....
"9/6/12 9:02 AM/PM ", "9/7/12 11:48 AM/PM ", "9/7/12 12:58 AM/PM ", 
"9/7/12 13:52 AM/PM ", "9/7/12 15:07 AM/PM ", "9/7/12 15:12 AM/PM ", 
"9/7/12 15:22 AM/PM ", "9/7/12 15:47 AM/PM ", "9/7/12 15:52 AM/PM ", 
"9/7/12 8:42 AM/PM ", "9/7/12 9:32 AM/PM ", "9/8/11 23:43 AM/PM "
), class = "factor")), .Names = c("DocType", "CreatedDate"), row.names = c(NA, 
6L), class = "data.frame")
> 
4

2 Answers

1

The part of the question that has not been answered yet, how to get counts per work item type, is quite simple:

res <- table(wi20121812[, "WorkItemType"])

This gives you a simple table showing how often each WorkItemType occurs. If you need proportions rather than absolute counts, run prop.table() on the result:

prop.table(res)

Or do both in a single step:

res <- prop.table(table(wi20121812[, "WorkItemType"]))
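
As a small worked sketch of the same idea: the sample data in the question names the type column DocType rather than WorkItemType, so the example below uses that name. The six values are the DocType entries from the dput(head(...)) output; the variable names doc_type, type_counts and type_share are only illustrative.

```r
# The six DocType values shown in the question's dput(head(...)) output
doc_type <- factor(c("TYPE1WI", "TYPE2WI", "TYPE1WI",
                     "TYPE1WI", "TYPE3WI", "TYPE1WI"))

# Absolute counts per type (TYPE1WI occurs 4 times in these six rows)
type_counts <- table(doc_type)

# The same counts expressed as shares of the total
type_share <- prop.table(type_counts)

# Most frequent type first
print(sort(type_counts, decreasing = TRUE))
print(round(type_share, 2))
```

On the full report the same calls would presumably be applied to the type column of the loaded data frame, whichever name it has there.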
answered 2012-12-30T16:13:21.580
0

You can use ddply from the plyr package:

res = ddply(df, "year", summarise, amount = length(year))

Or use count from the same package (which is even simpler):

res = count(df, "year")

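Here df is the data.frame containing the data, and year is the name of a column holding a categorical variable that gives the year in which each row was created.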
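
The report in the question has no year column, only CreatedDate strings such as "1/12/11 21:58 AM/PM ", so the year has to be derived first. The following is a minimal base R sketch, assuming the dates are in month/day/two-digit-year order (the sample contains values like "1/13/12", which supports that reading). The data frame wi is a hypothetical stand-in: the date strings are copied from the factor levels in the sample, but the pairing of dates and types is made up for illustration.

```r
# Hypothetical stand-in for the report; date strings are taken from the
# sample's factor levels, the date/type pairing is invented for illustration
wi <- data.frame(
  DocType = c("TYPE1WI", "TYPE2WI", "TYPE1WI",
              "TYPE1WI", "TYPE3WI", "TYPE1WI"),
  CreatedDate = c("1/12/11 21:58 AM/PM ", "1/19/11 19:23 AM/PM ",
                  "9/8/11 23:43 AM/PM ", "1/10/12 15:43 AM/PM ",
                  "1/13/12 14:01 AM/PM ", "9/7/12 8:42 AM/PM "),
  stringsAsFactors = TRUE
)

# Parse only the date part; characters after the matched format
# (the time and the literal " AM/PM ") are ignored by the parser
created <- as.Date(as.character(wi$CreatedDate), format = "%m/%d/%y")

# Extract the four-digit year as a categorical column
wi$year <- format(created, "%Y")

# Work items per year
year_counts <- table(wi$year)
print(year_counts)

# Work items per year and type in one cross-tabulation
year_by_type <- table(wi$year, wi$DocType)
print(year_by_type)
```

Once the year column exists, count(wi, "year") from plyr should give the same per-year totals as table(wi$year).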

answered 2012-12-18T15:08:04.653