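I want to fit a robust linear regression with an interaction term using the caret package in R, but I get the following error:
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
  There were missing values in resampled performance measures.
Here is my code: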
mod <- train(
  Pac ~ clearSkyPOA + clearSkyPOA*TotalCover + Temp2,
  data = training,
  method = "rlm",
  metric = "RMSE",
  preProc = c("center", "scale", "BoxCox"),
  trControl = trainControl(method = "cv", number = 5),
  na.action = na.omit
)
If I remove the interaction term "clearSkyPOA*TotalCover", it works as expected. For example, with the following code:
mod <- train(
  Pac ~ clearSkyPOA + TotalCover + Temp2,
  data = training,
  method = "rlm",
  metric = "RMSE",
  preProc = c("center", "scale", "BoxCox"),
  trControl = trainControl(method = "cv", number = 5),
  na.action = na.omit
)
I get the following result:
Robust Linear Model
4363 samples
3 predictor
Pre-processing: centered (3), scaled (3), Box-Cox transformation (2)
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 3490, 3490, 3491, 3491, 3490
Resampling results across tuning parameters:
intercept psi RMSE Rsquared
FALSE psi.huber 291.3261 0.7501889
FALSE psi.hampel 291.3261 0.7501889
FALSE psi.bisquare 291.3470 0.7499932
TRUE psi.huber 115.0178 0.7488397
TRUE psi.hampel 114.2018 0.7500523
TRUE psi.bisquare 115.4231 0.7483018
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were intercept = TRUE and psi = psi.hampel.
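As a sanity check on the formula itself, the sketch below uses base R's `model.matrix` to show which columns the interaction formula expands to. The data frame `toy` and its values are made up for illustration and are not taken from `training`. This does not explain why `rlm` fails; it only shows the design matrix that the formula produces.

```r
# Hypothetical toy data: column names match the question, values are made up.
toy <- data.frame(
  clearSkyPOA = c(63, 230, 473, 646),
  TotalCover  = c(0.60, 0.58, 0.51, 0.39),
  Temp2       = c(188, 194, 197, 198)
)

# Expand the right-hand side of the formula into a design matrix.
X <- model.matrix(~ clearSkyPOA + clearSkyPOA * TotalCover + Temp2, data = toy)

# a * b is shorthand for a + b + a:b, so TotalCover enters as a main effect
# and the interaction column is the product of the two main-effect columns.
print(colnames(X))
```

So the interaction model has one more predictor column than the model without the interaction, and that column is the product of clearSkyPOA and TotalCover.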
Am I missing something? Below is the output of dput(training) for 20 samples:
structure(list(Pac = c(3.42857142857143, 38.25, 120.916666666667,
258, 367.166666666667, 269.083333333333, 233.75, 112.416666666667,
21.9166666666667, 0.2, 1.5, 12.4166666666667, 134.916666666667,
104.333333333333, 394.583333333333, 342.5, 303.333333333333,
151.5, 42.0833333333333, 4.83333333333333), clearSkyPOA = c(63.0465796511235,
230.023517163135, 472.935466225438, 646.271261971453, 739.926063392829,
751.872076941902, 681.91937141018, 531.40317803238, 306.020562749019,
120.318359249055, 68.2689523552881, 229.800769386719, 473.162397232603,
647.082096293271, 741.364282016807, 753.955817698295, 684.656233771643,
534.787114500355, 309.953073794329, 114.55351678131), TotalCover = c(0.602923,
0.5798824, 0.5095124, 0.3896642, 0.2744389, 0.232004, 0.3052016,
0.4355463, 0.5392107, 0.5571411, 0.4599758, 0.4555472, 0.4434351,
0.41583, 0.3704268, 0.306295, 0.2271317, 0.1551105, 0.1170307,
0.1307881), Temp = c(13.72545, 13.91255, 14.04348, 14.06298,
13.98118, 13.82455, 13.61805, 13.3806, 13.12966, 12.87026, 12.37558,
12.76012, 13.12112, 13.37877, 13.5505, 13.67806, 13.7903, 13.86462,
13.86556, 13.76468), Temp2 = c(188.3879777025, 193.5590475025,
197.2193305104, 197.7674064804, 195.4733941924, 191.1181827025,
185.4512858025, 179.04045636, 172.3879717156, 165.6435924676,
153.1549803364, 162.8206624144, 172.1637900544, 178.9914867129,
183.61605025, 187.0893253636, 190.17237409, 192.2276877444, 192.2537541136,
189.4664155024)), .Names = c("Pac", "clearSkyPOA", "TotalCover",
"Temp", "Temp2"), row.names = c(NA, 20L), class = "data.frame")