I have several different tables and would like to write a function in R that does the following:
Table 1:
         coordinates var1.pred  var1.var observed    residual      zscore fold
1 (2579410, 1079720)  5.057024 0.4325275    5.468  0.41097625  0.62489903    1
2 (2579330, 1079730)  5.329797 0.3945041    4.498 -0.83179667 -1.32431534    2
3 (2579260, 1079770)  4.788211 0.5576228    5.114  0.32578861  0.43628035    3
4 (2579930, 1080030)  5.067753 0.4972365    4.764 -0.30375347 -0.43076434    4
5 (2579700, 1079770)  5.116632 0.5792768    4.626 -0.49063190 -0.64463327    5
6 (2579540, 1079640)  4.865667 0.6122453    6.522  1.65633254  2.11682434    6
7 (2579860, 1079880)  5.139779 0.4655840    4.856 -0.28377887 -0.41589245    7
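If the value of "observed" falls outside the range bounded by the following two values, I want to label it as an outlier:
var1.pred+(1.96*sqrt(var1.var))
var1.pred-(1.96*sqrt(var1.var))
In other words:
if
var1.pred-(1.96*sqrt(var1.var)) < observed < var1.pred+(1.96*sqrt(var1.var))
the result is normal; otherwise it is an outlier.
Also, the column names are the same in every table, and the tables are named 1, 2, 3, ....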
dat <- structure(list(coordinates = structure(c(3L, 2L, 1L, 7L, 5L,
4L, 6L), .Label = c("(2579260, 1079770)", "(2579330, 1079730)",
"(2579410, 1079720)", "(2579540, 1079640)", "(2579700, 1079770)",
"(2579860, 1079880)", "(2579930, 1080030)"), class = "factor"),
var1.pred = c(5.057024, 5.329797, 4.788211, 5.067753, 5.116632,
4.865667, 5.139779), var1.var = c(0.4325275, 0.3945041, 0.5576228,
0.4972365, 0.5792768, 0.6122453, 0.465584), observed = c(5.468,
4.498, 5.114, 4.764, 4.626, 6.522, 4.856), residual = c(0.41097625,
-0.83179667, 0.32578861, -0.30375347, -0.4906319, 1.65633254,
-0.28377887), zscore = c(0.62489903, -1.32431534, 0.43628035,
-0.43076434, -0.64463327, 2.11682434, -0.41589245), fold = 1:7), .Names = c("coordinates",
"var1.pred", "var1.var", "observed", "residual", "zscore", "fold"
), row.names = c(NA, -7L), class = "data.frame")
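
To make the rule above concrete, here is a rough sketch of the kind of function I have in mind. It is only an illustration under some assumptions: the names `label_outliers`, `status`, `tab1` and `tables` are hypothetical, the tables are assumed to be stored in a named list, and `tab1` is a cut-down stand-in holding rows 1, 2 and 6 of Table 1 with only the columns the rule needs.

```r
# Hypothetical helper: label each row "normal" when observed lies strictly
# inside var1.pred +/- z * sqrt(var1.var), and "outlier" otherwise.
label_outliers <- function(df, z = 1.96) {
  half_width <- z * sqrt(df$var1.var)
  lower <- df$var1.pred - half_width
  upper <- df$var1.pred + half_width
  # New column name "status" is an assumption, not part of the original tables.
  df$status <- ifelse(df$observed > lower & df$observed < upper,
                      "normal", "outlier")
  df
}

# Stand-in table with the same column names (rows 1, 2 and 6 of Table 1).
tab1 <- data.frame(
  var1.pred = c(5.057024, 5.329797, 4.865667),
  var1.var  = c(0.4325275, 0.3945041, 0.6122453),
  observed  = c(5.468, 4.498, 6.522)
)

# Assumed layout: all tables collected in a list named "1", "2", "3", ...
tables <- list(`1` = tab1)

# Apply the same rule to every table in the list.
labelled <- lapply(tables, label_outliers)
```

Because every table has the same column names, `lapply` can run one function over the whole list, and `labelled[["1"]]` would then be the first table with the extra `status` column. The same call should also work on the full `dat` object, e.g. `label_outliers(dat)`.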