r - Lagging a variable over panels using ddply

Question

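I am trying to compute the change in precision (based on estimated confidence intervals) for what is essentially a panel data set.

As a simple example, here is the function I wrote, applied to a nonsensical example....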

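precision.gain <- function(x){
  # Treat x as a regular time series so that lag() can shift it
  x        <- ts(x, start=x[1])
  # Series shifted one step later, i.e. the previous observation at each time
  x.lag    <- lag(x, -1)
  # Percentage change relative to the current value (ts arithmetic aligns on overlapping times)
  x.gain   <- ((x - x.lag) * 100) / x
  # The first observation has no predecessor
  x.gain   <- c(NA, x.gain)
  x.gain
}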
t <- data.frame(x=1:20)
t <- cbind(t, precision.gain(t$x))
t
    x precision.gain(t$x)
1   1                  NA
2   2           50.000000
3   3           33.333333
4   4           25.000000
5   5           20.000000 
6   6           16.666667
7   7           14.285714
8   8           12.500000
9   9           11.111111
10 10           10.000000
11 11            9.090909
12 12            8.333333
13 13            7.692308
14 14            7.142857
15 15            6.666667
16 16            6.250000
17 17            5.882353
18 18            5.555556
19 19            5.263158
20 20            5.000000
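
For reference, the same percentage gain can be sketched without ts() and lag(), using base R's diff(). This is only an illustrative rewrite, not the function from the question; the name precision.gain2 is hypothetical.

```r
# Compact sketch of the same percentage gain using diff() on a plain vector.
# precision.gain2 is a hypothetical name introduced for this illustration.
precision.gain2 <- function(x) {
  # (x[i] - x[i-1]) * 100 / x[i] for i >= 2, with NA for the first element
  c(NA, diff(x) * 100 / x[-1])
}

gain <- precision.gain2(1:20)
head(gain, 5)
```

Because it works on plain numeric vectors, a version like this can be applied to each group of a panel without any time-series attributes getting in the way.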

That works, which is great, but my actual sample looks like this....

subset(results.normal.sum, n2 > 20 & n2 < 30, select=c(sd2, n2, ci.width1))
    sd2 n2 ci.width1
11  0.4 22 0.6528714
12  0.4 24 0.6167015
13  0.4 26 0.5895856
14  0.4 28 0.5658297
46  0.6 22 0.6529126
47  0.6 24 0.6196544
48  0.6 26 0.5922061
49  0.6 28 0.5642688
81  0.8 22 0.6513849
82  0.8 24 0.6194468
83  0.8 26 0.5923094
84  0.8 28 0.5636396
116 1.0 22 0.6522927
117 1.0 24 0.6191043
118 1.0 26 0.5900129
119 1.0 28 0.5652429
151 1.2 22 0.6518072
152 1.2 24 0.6193353
153 1.2 26 0.5892683
154 1.2 28 0.5632235
186 1.4 22 0.6527031
187 1.4 24 0.6191458
188 1.4 26 0.5899453
189 1.4 28 0.5640431
221 1.6 22 0.6521401
222 1.6 24 0.6191883
223 1.6 26 0.5893458
224 1.6 28 0.5637215
256 1.8 22 0.6512491
257 1.8 24 0.6180401
258 1.8 26 0.5905810
259 1.8 28 0.5647388
291 2.0 22 0.6515769
292 2.0 24 0.6183121
293 2.0 26 0.5896990
294 2.0 28 0.5663394

I tried using ddply() from Hadley Wickham's plyr package.....

ddply(results.normal.sum, .(sd2), precision.gain, x=ci.width1)
Error in .fun(piece, ...) : unused argument(s) (piece)

Using tapply() directly gets me most of the way there, but it does not return a data frame that I could cbind() to the original.

> tapply(results.normal.sum$ci.width1, sd2, precision.gain)
$`0.4`
 [1]          NA -771.332292  -68.852635  -30.514545  -19.877447  -14.515380
 [7]  -11.147183   -9.282641   -7.680418   -6.836209   -5.954992   -5.865053
[13]   -4.599158   -4.198409   -4.155838   -3.529773   -3.590234   -3.432364
[19]   -2.899601   -3.092533   -2.721967   -2.506706   -2.498318   -2.321500
[25]   -2.299822   -2.187855   -2.116990   -1.896162   -1.853487   -1.604902
[31]   -2.194138   -1.473042   -1.710051   -1.701994   -1.417754

$`0.6`
 [1]          NA -756.196418  -68.222048  -30.566420  -19.216860  -15.162929
 [7]  -10.645899   -9.628775   -7.326799   -7.178820   -5.770681   -5.367216
[13]   -4.634938   -4.951049   -3.949776   -3.761633   -3.326209   -3.387764
[19]   -3.009317   -3.074398   -2.397660   -2.678573   -2.626077   -2.268373
[25]   -2.426720   -1.956498   -2.119986   -1.859410   -1.992678   -1.707448
[31]   -1.991583   -1.595951   -1.765913   -1.415065   -1.655725
....

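I feel like I am close, but I am either missing or misunderstanding something.

I found a similar question here, but I do not understand the answer/solution that was provided.

Thanks in advance for any help,

slackline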

score 1 · Accepted Answer

If I have guessed correctly what you need, here is a solution that uses data.table and its grouping by sd2.

First, read in the sample data.

testData <- textConnection("sd2 n2 ci.width1
11  0.4 22 0.6528714
12  0.4 24 0.6167015
13  0.4 26 0.5895856
14  0.4 28 0.5658297
46  0.6 22 0.6529126
47  0.6 24 0.6196544
48  0.6 26 0.5922061
49  0.6 28 0.5642688
81  0.8 22 0.6513849
82  0.8 24 0.6194468
83  0.8 26 0.5923094
84  0.8 28 0.5636396
116 1.0 22 0.6522927
117 1.0 24 0.6191043
118 1.0 26 0.5900129
119 1.0 28 0.5652429
151 1.2 22 0.6518072
152 1.2 24 0.6193353
153 1.2 26 0.5892683
154 1.2 28 0.5632235
186 1.4 22 0.6527031
187 1.4 24 0.6191458
188 1.4 26 0.5899453
189 1.4 28 0.5640431
221 1.6 22 0.6521401
222 1.6 24 0.6191883
223 1.6 26 0.5893458
224 1.6 28 0.5637215
256 1.8 22 0.6512491
257 1.8 24 0.6180401
258 1.8 26 0.5905810
259 1.8 28 0.5647388
291 2.0 22 0.6515769
292 2.0 24 0.6183121
293 2.0 26 0.5896990
294 2.0 28 0.5663394")

Next, put the data into a data.table and apply precision.gain() within each sd2 group...

library(data.table)
dt <- data.table(read.table(testData, header = TRUE))
dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]

Here is the output:

> dt[, list(n2, ci.width1, prec.gain = precision.gain(ci.width1)), by = sd2]
   sd2 n2 ci.width1 prec.gain
   0.4 22 0.6528714        NA
   0.4 24 0.6167015 -5.865058
   0.4 26 0.5895856 -4.599146
   0.4 28 0.5658297 -4.198419
   0.6 22 0.6529126        NA
   0.6 24 0.6196544 -5.367218
   0.6 26 0.5922061 -4.634924
   0.6 28 0.5642688 -4.951062
   0.8 22 0.6513849        NA
   0.8 24 0.6194468 -5.155907
   0.8 26 0.5923094 -4.581626
   0.8 28 0.5636396 -5.086548
     1 22 0.6522927        NA
     1 24 0.6191043 -5.360712
     1 26 0.5900129 -4.930638
     1 28 0.5652429 -4.382187
   1.2 22 0.6518072        NA
   1.2 24 0.6193353 -5.243024
   1.2 26 0.5892683 -5.102430
   1.2 28 0.5632235 -4.624239
   1.4 22 0.6527031        NA
   1.4 24 0.6191458 -5.419935
   1.4 26 0.5899453 -4.949696
   1.4 28 0.5640431 -4.592238
   1.6 22 0.6521401        NA
   1.6 24 0.6191883 -5.321774
   1.6 26 0.5893458 -5.063666
   1.6 28 0.5637215 -4.545560
   1.8 22 0.6512491        NA
   1.8 24 0.6180401 -5.373276
   1.8 26 0.5905810 -4.649506
   1.8 28 0.5647388 -4.575956
     2 22 0.6515769        NA
     2 24 0.6183121 -5.379937
     2 26 0.5896990 -4.852153
     2 28 0.5663394 -4.124664
cn sd2 n2 ci.width1 prec.gain
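
As a possible complement, here is a hedged sketch of two other ways to get the same per-group column. The ddply() call in the question fails because ddply() passes each piece (a data frame) as the first argument to the function, so precision.gain(), which was already given x by name, receives an extra unused argument. Wrapping the call in transform() avoids that. The sketch assumes plyr and data.table are installed, uses only two of the sd2 groups from the sample, and replaces the ts-based function with a compact diff()-based rewrite (an assumption made for brevity, not the original definition).

```r
library(plyr)
library(data.table)

# Compact rewrite of the question's precision.gain() using diff();
# an assumption for this sketch, not the original ts-based definition.
precision.gain <- function(x) c(NA, diff(x) * 100 / x[-1])

# Two of the sd2 groups from the sample data above
df <- data.frame(
  sd2       = rep(c(0.4, 0.6), each = 4),
  n2        = rep(c(22, 24, 26, 28), times = 2),
  ci.width1 = c(0.6528714, 0.6167015, 0.5895856, 0.5658297,
                0.6529126, 0.6196544, 0.5922061, 0.5642688)
)

# plyr: transform() is applied to each sd2 piece, so ci.width1 is
# looked up inside the piece and the result comes back as a data frame
res.plyr <- ddply(df, .(sd2), transform,
                  prec.gain = precision.gain(ci.width1))

# data.table: := adds the column by reference, computed within each sd2 group
dt <- as.data.table(df)
dt[, prec.gain := precision.gain(ci.width1), by = sd2]

res.plyr
dt
```

Both forms keep every original column and add prec.gain alongside it, so no separate cbind() step should be needed.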
