r - Multilevel regression model on a multiply imputed data set in R (Amelia, zelig, lme4)

Question

I am trying to run a multilevel model on multiply imputed data (created with Amelia). The sample is a clustered sample with 24 groups and N = 150.

library("ZeligMultilevel")
ML.model.0 <- zelig(dv~1 + tag(1|group), model="ls.mixed",
data=a.out$imputations)
summary(ML.model.0)

This code produces the following error:

Error in object[[1]]$result$call : 
$ operator not defined for this S4 class

Running an OLS regression works, as follows:

model.0 <- zelig(dv~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

      Value Std. Error t-stat  p-value
[1,]    45       0.34    130 2.6e-285

Here is a working example.

require(Zelig)
require(Amelia)
require(ZeligMultilevel)

data(freetrade)
length(freetrade$country) #grouping variable

#Imputation of missing data

a.out <- amelia(freetrade, m=5, ts="year", cs="country")

# Models: (1) OLS; (2) multi-level 

model.0 <- zelig(polity~1, model="ls", data=a.out$imputations)
m.0 <- coef(summary(model.0)) 
print(m.0, digits = 2)

ML.model.0 <- zelig(polity~1 + tag(1|country), model="ls.mixed", data=a.out$imputations)
summary(ML.model.0)

I think the issue may lie in how Zelig interfaces with Amelia's mi class. I therefore turned to an alternative R package, lme4.

require(lme4)
write.amelia(obj=a.out, file.stem="inmi", format="csv", na="NA")
diff <-list(5)  # a list to store each model, 5 is the number of the imputed datasets

for (i in 1:5) {
file.name <- paste("inmi", 5 ,".csv",sep="")
data.to.use <- read.csv(file.name)
diff[[5]] <- lmer(polity ~ 1 + (1 | country),
data = data.to.use)}
diff

The result is the following:

[[1]]
[1] 5

[[2]]
NULL

[[3]]
NULL

[[4]]
NULL

[[5]]
Linear mixed model fit by REML 
Formula: polity ~ 1 + (1 | country) 
   Data: data.to.use 
  AIC  BIC logLik deviance REMLdev
 1006 1015 -499.9     1002   999.9
Random effects:
 Groups   Name        Variance Std.Dev.
 country  (Intercept) 14.609   3.8222  
 Residual             17.839   4.2236  
Number of obs: 171, groups: country, 9

Fixed effects:
            Estimate Std. Error t value
(Intercept)    2.878      1.314    2.19

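The results stay the same when I replace diff[[5]] with diff[[4]], diff[[3]], and so on. Still, I am wondering whether these are actually the results for the combined dataset or for a single imputed dataset. Any thoughts? Thanks!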
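On the lme4 attempt: the printed list is exactly what the loop as written produces. `list(5)` creates a one-element list holding the number 5, the file name is always built from the constant 5 rather than `i`, and every fit is stored in `diff[[5]]`. The output is therefore the model for the fifth imputed dataset alone, fitted five times, not a combined result. A minimal sketch of fitting each imputed dataset and then pooling follows. `pool_rubin` is a hypothetical helper written for this illustration (it is not part of Amelia or lme4), and the `coef(summary(fit))` call assumes lme4 1.0 or later.

```r
# Pool one coefficient across m imputed-data fits with Rubin's rules.
# est: vector of m point estimates; se: vector of m standard errors.
# NOTE: pool_rubin is an illustrative helper, not a function from any package.
pool_rubin <- function(est, se) {
  m     <- length(est)
  q_bar <- mean(est)                 # pooled point estimate
  u_bar <- mean(se^2)                # within-imputation variance
  b     <- var(est)                  # between-imputation variance
  total <- u_bar + (1 + 1 / m) * b   # total variance
  c(estimate = q_bar, std.error = sqrt(total))
}

# The model fitting only runs when both packages are installed.
if (requireNamespace("Amelia", quietly = TRUE) &&
    requireNamespace("lme4", quietly = TRUE)) {
  data(freetrade, package = "Amelia")
  a.out <- Amelia::amelia(freetrade, m = 5, ts = "year", cs = "country", p2s = 0)

  # One fit per imputed dataset; lapply keeps fit i paired with dataset i.
  fits <- lapply(a.out$imputations, function(d)
    lme4::lmer(polity ~ 1 + (1 | country), data = d))

  # Fixed-effect table of each fit (assumes lme4 >= 1.0).
  tabs <- lapply(fits, function(f) coef(summary(f)))
  est  <- sapply(tabs, function(tab) tab["(Intercept)", "Estimate"])
  se   <- sapply(tabs, function(tab) tab["(Intercept)", "Std. Error"])

  print(pool_rubin(est, se))
}
```

Using the in-memory `a.out$imputations` list avoids the round trip through CSV files. If the files written by `write.amelia` are preferred, the same idea applies as long as both the file name and the list index use the loop variable `i`. Amelia's `mi.meld()` performs the same combination if it is available in the installed version.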

score 6 · Accepted Answer

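I modified the summary function for this object (I fetched the source and opened the ./R/summary.R file). I added some curly braces to make the code flow and changed one call of getcoef to coef. This should work for this particular case, but I am not sure whether it is general. The function getcoef looks for a slot named coef3, which I have never seen. Perhaps @BenBolker can take a look here? I can't guarantee that this is what the result should look like, but the output looks legitimate to me. You could also contact the package authors so that this gets fixed in a future version.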

summary(ML.model.0)

  Model: ls.mixed
  Number of multiply imputed data sets: 5 

Combined results:

Call:
zelig(formula = polity ~ 1 + tag(1 | country), model = "ls.mixed", 
    data = a.out$imputations)

Coefficients:
        Value Std. Error   t-stat    p-value
[1,] 2.902863   1.311427 2.213515 0.02686218

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

Modified function:

summary.MI <- function (object, subset = NULL, ...) {
  if (length(object) == 0) {
    stop('Invalid input for "subset"')
  } else {
    if (length(object) == 1) {
      return(summary(object[[1]]))
    }
  }

  # Roman: This function isn't fetching coefficients robustly. Something goes wrong. Contact package author.
  getcoef <- function(obj) {
    # S4
    if (!isS4(obj)) {
      coef(obj)
    } else {
      if ("coef3" %in% slotNames(obj)) {
        obj@coef3
      } else {
        obj@coef
      }
    }
  }

    # Per-dataset summaries, indexed by imputation number
    res <- list()

    # Get indices
    subset <- if (is.null(subset)) {
      1:length(object)
    } else {
      c(subset)
    }

    # Compute the summary of all objects
    for (k in subset) {
      res[[k]] <- summary(object[[k]])
    }


    # Answer
    ans <- list(
      zelig = object[[1]]$name,
      call = object[[1]]$result@call,
      all = res
    )

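    # Coefficient estimates and standard errors, one column per imputed dataset
    coef1 <- se1 <- NULL

    # Fill the columns from each per-dataset summary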
    for (k in subset) {
#       tmp <-  getcoef(res[[k]]) # Roman: I changed this to coef, not 100% sure if the output is the same
      tmp <- coef(res[[k]])
      coef1 <- cbind(coef1, tmp[, 1])
      se1 <- cbind(se1, tmp[, 2])
    }

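    # Combine across imputations (Rubin's rules)
    rows <- nrow(coef1)
    Q <- apply(coef1, 1, mean)                          # pooled estimates
    U <- apply(se1^2, 1, mean)                          # within-imputation variance
    B <- apply((coef1-Q)^2, 1, sum)/(length(subset)-1)  # between-imputation variance
    var <- U+(1+1/length(subset))*B                     # total variance
    nu <- (length(subset)-1)*(1+U/((1+1/length(subset))*B))^2  # degrees of freedom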

    coef.table <- matrix(NA, nrow = rows, ncol = 4)
    dimnames(coef.table) <- list(rownames(coef1),
                                 c("Value", "Std. Error", "t-stat", "p-value"))
    coef.table[,1] <- Q
    coef.table[,2] <- sqrt(var)
    coef.table[,3] <- Q/sqrt(var)
    coef.table[,4] <- pt(abs(Q/sqrt(var)), df=nu, lower.tail=FALSE)*2
    ans$coefficients <- coef.table
    ans$cov.scaled <- ans$cov.unscaled <- NULL

    for (i in 1:length(ans)) {
      if (is.numeric(ans[[i]]) && !names(ans)[i] %in% c("coefficients")) {
        tmp <- NULL
        for (j in subset) {
          r <- res[[j]]
          tmp <- cbind(tmp, r[[pmatch(names(ans)[i], names(res[[j]]))]])
        }
        ans[[i]] <- apply(tmp, 1, mean)
      }
    }

    class(ans) <- "summaryMI"
    ans
  }
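
The pooling step in the function (`Q`, `U`, `B`, `var`, `nu`) follows Rubin's rules for combining estimates from m imputed datasets. Written out, with Q_k and SE_k the estimate and standard error from dataset k, and with the symbols T and t introduced here for the code's `var` and the reported t-stat:

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{k=1}^{m} Q_k
  && \text{(pooled estimate, \texttt{Q})} \\
\bar{U} &= \frac{1}{m}\sum_{k=1}^{m} \mathrm{SE}_k^{2}
  && \text{(within-imputation variance, \texttt{U})} \\
B &= \frac{1}{m-1}\sum_{k=1}^{m} \bigl(Q_k - \bar{Q}\bigr)^{2}
  && \text{(between-imputation variance, \texttt{B})} \\
T &= \bar{U} + \Bigl(1 + \frac{1}{m}\Bigr) B
  && \text{(total variance, \texttt{var})} \\
\nu &= (m-1)\left(1 + \frac{\bar{U}}{\bigl(1 + \frac{1}{m}\bigr) B}\right)^{2}
  && \text{(degrees of freedom, \texttt{nu})} \\
t &= \frac{\bar{Q}}{\sqrt{T}}, \qquad
p = 2\,\Pr\bigl(t_{\nu} > |t|\bigr)
  && \text{(reported t-stat and p-value)}
\end{align*}
```

Here m is `length(subset)`, so restricting `subset` changes both the averages and the degrees of freedom.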
