r - Splitting a stacked dataset after MI into test and training datasets for MI Lasso / Elastic Net with the miselect package

Question

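I am new to R and need to run an MI Lasso / Elastic Net regression. I use the "mice" package for the multiple imputation (MI). To run the ML model with the "miselect" package, I need the MI data in stacked form. I obtain the stacked dataset after MI as follows: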

imputed_long <- complete(imputed, action = "long", include = FALSE)
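
For context, the long format returned by `complete()` stacks the imputed datasets on top of each other and adds an `.imp` column (imputation number) and an `.id` column (original row). The sketch below uses a small hand-made data frame in place of the real `imputed_long`; the values and the column name `age` are invented for illustration. It shows how such a stacked data frame could be turned into a list with one data frame per imputation, which is the shape the `dfs` object used further down appears to have.

```r
# Toy stand-in for complete(imputed, action = "long"): 3 imputations of 4 rows.
# All values are invented purely for illustration.
toy_long <- data.frame(
  .imp    = rep(1:3, each = 4),
  .id     = rep(1:4, times = 3),
  age     = c(50, 61, 47, 58,  50, 63, 47, 58,  50, 60, 47, 58),
  outcome = rep(c(0, 1, 0, 1), times = 3)
)

# One data frame per imputation, without the bookkeeping columns
keep    <- setdiff(names(toy_long), c(".imp", ".id"))
toy_dfs <- lapply(split(toy_long, toy_long$.imp), function(d) d[, keep])

length(toy_dfs)    # number of imputations
dim(toy_dfs[[1]])  # rows and columns of the first imputed dataset
```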

To check how well the model fits the data, I need to split the dataset into a test dataset and a training dataset.
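
For reference, a mechanical split of the stacked data might look like the sketch below. It draws 80% of the original row IDs once and reuses them for every imputation, so each imputed copy of a row lands on the same side of the split. The toy data frame and the column `x1` are made up. Whether splitting after imputation is statistically appropriate is a separate matter, because the imputation model has then already seen the test rows.

```r
set.seed(42)  # reproducible split

# Toy stand-in for the stacked data: 3 imputations of 10 rows each
toy_long <- data.frame(
  .imp    = rep(1:3, each = 10),
  .id     = rep(1:10, times = 3),
  x1      = rnorm(30),
  outcome = rep(rep(c(0, 1), 5), times = 3)
)

# Draw 80% of the original row IDs once, then reuse them for every imputation
ids       <- unique(toy_long$.id)
train_ids <- sample(ids, size = round(0.8 * length(ids)))

train_long <- toy_long[toy_long$.id %in% train_ids, ]
test_long  <- toy_long[!toy_long$.id %in% train_ids, ]

# Rows per imputation on each side of the split
table(train_long$.imp)
table(test_long$.imp)
```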

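I have two questions:

Should I split the data before MI and then run MI and the ML model separately on the test and training datasets? Or should I split the dataset after running MI? How can I split the stacked dataset into training and test datasets (ideally 80/20)?
How can I get the predictive performance of the ML model in "miselect"? I cannot find an example in the miselect package documentation. I can cross-validate alpha and lambda with the code below, but I do not know how to continue from there.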

library(miselect)

# dfs: list of the 15 imputed datasets (one data frame per imputation)
dim(dfs[[1]]) # 1348 x 12
vars  <- names(dfs[[1]])
xvars <- setdiff(vars, "outcome")

# Generate list of imputed design matrices and imputed responses
x <- list()
y <- list()
for (i in seq_along(dfs)) { # 15 imputed datasets
  x[[i]] <- as.matrix(dfs[[i]][, xvars])
  y[[i]] <- dfs[[i]]$outcome
}

dim(x[[1]])    # 1348 x 11
length(y[[1]]) # 1348

# Set seed to ensure reproducible results
set.seed(42)

p        <- ncol(x[[1]])
pf       <- rep(1, p) # Penalty factor. Can be used to differentially penalize certain variables
adWeight <- rep(1, p) # Numeric vector of length p representing the adaptive weights for the L1 penalty

# Simulations demonstrate that the "stacked" objective function approaches tend to be
# more computationally efficient and have better estimation and selection properties.
# Stacked, elastic net
# newdata: the un-imputed data (still containing NAs)
weights <- 1 - rowMeans(is.na(newdata)) # Numeric vector of length n: proportion observed (non-missing) for each row in the un-imputed data
alpha   <- c(.5, 1) # elastic net (0.5) and lasso (1)

fit.stacked <- cv.saenet(x, y, pf, adWeight, weights, family = "binomial",
                         alpha = alpha, nfolds = 5)

# Get selected variables from the 1 standard error rule
# (the original referred to an undefined object `fit` here)
coef(fit.stacked, lambda = fit.stacked$lambda.1se, alpha = fit.stacked$alpha.1se)
print(fit.stacked)

coefficients <- coef(fit.stacked)
coefficients
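
To make the second question more concrete, the sketch below shows the kind of held-out evaluation that is meant. It assumes, without this being confirmed by the miselect documentation, that `coef()` returns a numeric vector with the intercept first, followed by one coefficient per column of `x`. The coefficient values, the simulated test matrices, and the helper `auc_rank()` are all hypothetical. Predicted probabilities are computed with the inverse logit for each imputed copy of the test set and then averaged across imputations, which is only one possible choice.

```r
set.seed(42)

# Hypothetical coefficients: intercept first, then one per predictor.
# In practice these would come from coef(fit.stacked, ...).
beta <- c(-0.5, 1.2, 0.0)

# Hypothetical held-out data: 3 imputed versions of the same 20 test rows
n_test <- 20
x_test <- lapply(1:3, function(i) matrix(rnorm(n_test * 2), ncol = 2))
y_test <- rep(c(0, 1), each = n_test / 2)

# Predicted probability per imputation via the inverse logit, then averaged
p_each <- sapply(x_test, function(x) as.vector(plogis(cbind(1, x) %*% beta)))
p_hat  <- rowMeans(p_each)

# AUC from the Mann-Whitney rank-sum statistic (ties get mid-ranks)
auc_rank <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# The value is meaningless here because the toy predictors are pure noise
auc_rank(p_hat, y_test)
```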

Thank you so much in advance!
