r - ロジスティック回帰のリストから係数のすべての標準誤差を抽出します

Question

ロジスティック回帰モデルのリストから標準誤差を抽出したいと思います。

これはロジスティック回帰関数であり、このように設計されているため、一度に複数の分析を実行できます。

glmfunk <- function(x) glm(  ldata$DFREE ~  x , family=binomial)

データフレームldataの変数のサブセットで実行します。

glmkort <- lapply(ldata[,c(2,3,5,6,7,8)],glmfunk)

次のように係数を抽出できます。

sapply(glmkørt, "[[", "coefficients")

しかし、係数の標準誤差を抽出するにはどうすればよいですか？str（glmkort）で見つからないようですが？

これがAGEのstr（glmkort）で、標準エラーを探しています。

str(glmkort)
List of 6
 $ AGE    :List of 30
  ..$ coefficients     : Named num [1:2] -1.17201 -0.00199
  .. ..- attr(*, "names")= chr [1:2] "(Intercept)" "x"
  ..$ residuals        : Named num [1:40] -1.29 -1.29 -1.29 -1.29 4.39 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ fitted.values    : Named num [1:40] 0.223 0.225 0.225 0.225 0.228 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ effects          : Named num [1:40] 3.2662 -0.0282 -0.4595 -0.4464 2.042 ...
  .. ..- attr(*, "names")= chr [1:40] "(Intercept)" "x" "" "" ...
  ..$ R                : num [1:2, 1:2] -2.64 0 -86.01 14.18
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:2] "(Intercept)" "x"
  .. .. ..$ : chr [1:2] "(Intercept)" "x"
  ..$ rank             : int 2
  ..$ qr               :List of 5
  .. ..$ qr   : num [1:40, 1:2] -2.641 0.158 0.158 0.158 0.159 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:40] "1" "2" "3" "4" ...
  .. .. .. ..$ : chr [1:2] "(Intercept)" "x"
  .. ..$ rank : int 2
  .. ..$ qraux: num [1:2] 1.16 1.01
  .. ..$ pivot: int [1:2] 1 2
  .. ..$ tol  : num 1e-11
  .. ..- attr(*, "class")= chr "qr"
  ..$ family           :List of 12
  .. ..$ family    : chr "binomial"
  .. ..$ link      : chr "logit"
  .. ..$ linkfun   :function (mu)  
  .. ..$ linkinv   :function (eta)  
  .. ..$ variance  :function (mu)  
  .. ..$ dev.resids:function (y, mu, wt)  
  .. ..$ aic       :function (y, n, mu, wt, dev)  
  .. ..$ mu.eta    :function (eta)  
  .. ..$ initialize:  expression({     if (NCOL(y) == 1) {         if (is.factor(y))              y <- y != levels(y)[1L]         n <- rep.int(1, nobs)         y[weights == 0] <- 0         if (any(y < 0 | y > 1))              stop("y values must be 0 <= y <= 1")         mustart <- (weights * y + 0.5)/(weights + 1)         m <- weights * y         if (any(abs(m - round(m)) > 0.001))              warning("non-integer #successes in a binomial glm!")     }     else if (NCOL(y) == 2) {         if (any(abs(y - round(y)) > 0.001))              warning("non-integer counts in a binomial glm!")         n <- y[, 1] + y[, 2]         y <- ifelse(n == 0, 0, y[, 1]/n)         weights <- weights * n         mustart <- (n * y + 0.5)/(n + 1)     }     else stop("for the binomial family, y must be a vector of 0 and 1's\n", "or a 2 column matrix where col 1 is no. successes and col 2 is no. failures") })
  .. ..$ validmu   :function (mu)  
  .. ..$ valideta  :function (eta)  
  .. ..$ simulate  :function (object, nsim)  
  .. ..- attr(*, "class")= chr "family"
  ..$ linear.predictors: Named num [1:40] -1.25 -1.24 -1.24 -1.24 -1.22 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ deviance         : num 42.7
  ..$ aic              : num 46.7
  ..$ null.deviance    : num 42.7
  ..$ iter             : int 4
  ..$ weights          : Named num [1:40] 0.173 0.174 0.174 0.174 0.176 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ prior.weights    : Named num [1:40] 1 1 1 1 1 1 1 1 1 1 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ df.residual      : int 38
  ..$ df.null          : int 39
  ..$ y                : Named num [1:40] 0 0 0 0 1 0 1 0 0 0 ...
  .. ..- attr(*, "names")= chr [1:40] "1" "2" "3" "4" ...
  ..$ converged        : logi TRUE
  ..$ boundary         : logi FALSE
  ..$ model            :'data.frame':   40 obs. of  2 variables:
  .. ..$ ldata$DFREE: int [1:40] 0 0 0 0 1 0 1 0 0 0 ...
  .. ..$ x          : int [1:40] 39 33 33 32 24 30 39 27 40 36 ...
  .. ..- attr(*, "terms")=Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
  .. .. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
  .. .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
  .. .. .. .. .. ..$ : chr "x"
  .. .. .. ..- attr(*, "term.labels")= chr "x"
  .. .. .. ..- attr(*, "order")= int 1
  .. .. .. ..- attr(*, "intercept")= int 1
  .. .. .. ..- attr(*, "response")= int 1
  .. .. .. ..- attr(*, ".Environment")=<environment: 0x017a5674> 
  .. .. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
  .. .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
  .. .. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
  ..$ call             : language glm(formula = ldata$DFREE ~ x, family = binomial)
  ..$ formula          :Class 'formula' length 3 ldata$DFREE ~ x
  .. .. ..- attr(*, ".Environment")=<environment: 0x017a5674> 
  ..$ terms            :Classes 'terms', 'formula' length 3 ldata$DFREE ~ x
  .. .. ..- attr(*, "variables")= language list(ldata$DFREE, x)
  .. .. ..- attr(*, "factors")= int [1:2, 1] 0 1
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:2] "ldata$DFREE" "x"
  .. .. .. .. ..$ : chr "x"
  .. .. ..- attr(*, "term.labels")= chr "x"
  .. .. ..- attr(*, "order")= int 1
  .. .. ..- attr(*, "intercept")= int 1
  .. .. ..- attr(*, "response")= int 1
  .. .. ..- attr(*, ".Environment")=<environment: 0x017a5674> 
  .. .. ..- attr(*, "predvars")= language list(ldata$DFREE, x)
  .. .. ..- attr(*, "dataClasses")= Named chr [1:2] "numeric" "numeric"
  .. .. .. ..- attr(*, "names")= chr [1:2] "ldata$DFREE" "x"
  ..$ data             :<environment: 0x017a5674> 
  ..$ offset           : NULL
  ..$ control          :List of 3
  .. ..$ epsilon: num 1e-08
  .. ..$ maxit  : num 25
  .. ..$ trace  : logi FALSE
  ..$ method           : chr "glm.fit"
  ..$ contrasts        : NULL
  ..$ xlevels          : Named list()
  ..- attr(*, "class")= chr [1:2] "glm" "lm"
 $ BECK   :List of 30

これが私が例に使用したデータです。私はこの質問の目的のためにそれを短縮しました：

> ldata
   ID AGE   BECK IVHX NDRUGTX RACE TREAT SITE DFREE
1   1  39  9.000    3       1    0     1    0     0
2   2  33 34.000    2       8    0     1    0     0
3   3  33 10.000    3       3    0     1    0     0
4   4  32 20.000    3       1    0     0    0     0
5   5  24  5.000    1       5    1     1    0     1
6   6  30 32.550    3       1    0     1    0     0
7   7  39 19.000    3      34    0     1    0     1
8   8  27 10.000    3       2    0     1    0     0
9   9  40 29.000    3       3    0     1    0     0
10 10  36 25.000    3       7    0     1    0     0
11 11  38 18.900    3       8    0     1    0     0
12 12  29 16.000    1       1    0     1    0     0
13 13  32 36.000    3       2    1     1    0     1
14 14  41 19.000    3       8    0     1    0     0
15 15  31 18.000    3       1    0     1    0     0
16 16  27 12.000    3       3    0     1    0     0
17 17  28 34.000    3       6    0     1    0     0
18 18  28 23.000    2       1    0     1    0     0
19 19  36 26.000    1      15    1     1    0     1
20 20  32 18.900    3       5    0     1    0     1
21 21  33 15.000    1       1    0     0    0     1
22 22  28 25.200    3       8    0     0    0     0
23 23  29  6.632    2       0    0     0    0     0
24 24  35  2.100    3       9    0     0    0     0
25 25  45 26.000    3       6    0     0    0     0
26 26  35 39.789    3       5    0     0    0     0
27 27  24 20.000    1       3    0     0    0     0
28 28  36 16.000    3       7    0     0    0     0
29 29  39 22.000    3       9    0     0    0     1
30 30  36  9.947    2      10    0     0    0     0
31 31  37  9.450    3       1    0     0    0     0
32 32  30 39.000    3       1    0     0    0     0
33 33  44 41.000    3       5    0     0    0     0
34 34  28 31.000    1       6    1     0    0     1
35 35  25 20.000    1       3    1     0    0     0
36 36  30  8.000    3       7    0     1    0     0
37 37  24  9.000    1       1    0     0    0     0
38 38  27 20.000    1       1    0     0    0     0
39 39  30  8.000    1       2    1     0    0     1
40 40  34  8.000    3       0    0     1    0     0

score 8 · Accepted Answer

からの例を使用して?glm

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())

## copy twice to a list to illustrate
lmod <- list(mod1 = glm.D93, mod2 = glm.D93)

次に、それらをそのまま計算するsummary()か、を呼び出した後に抽出することができsummary()ます。前者は、必要なものだけを計算するため、はるかに効率的です。後者は、標準誤差がどのように導出されるかを知ることに依存していません。

標準誤差を直接計算する

標準誤差は、モデルの分散共分散行列から計算できます。この行列の対角には係数の分散が含まれており、標準誤差は単純にこれらの分散の平方根です。抽出関数はvcov()分散共分散行列を取得し、対角線を次のように平方根しsqrt(diag())ます。

> lapply(lmod, function(x) sqrt(diag(vcov(x))))
$mod1
(Intercept)    outcome2    outcome3  treatment2  treatment3 
  0.1708987   0.2021708   0.1927423   0.2000000   0.2000000 

$mod2
(Intercept)    outcome2    outcome3  treatment2  treatment3 
  0.1708987   0.2021708   0.1927423   0.2000000   0.2000000

への呼び出しからそれらを抽出します`summary()`

または、標準誤差 (およびそれ以上)summary()を計算させてlapply()から、または(標準誤差が格納されている) 2 番目の列sapply()を抽出して取得する無名関数を適用することができます。coef(summary(x))

lapply(lmod, function(x) coef(summary(x))[,2])

どちらが与える

> lapply(lmod, function(x) coef(summary(x))[,2])
$mod1
(Intercept)    outcome2    outcome3  treatment2  treatment3 
  0.1708987   0.2021708   0.1927423   0.2000000   0.2000000 

$mod2
(Intercept)    outcome2    outcome3  treatment2  treatment3 
  0.1708987   0.2021708   0.1927423   0.2000000   0.2000000

一方、次のsapply()ようになります：

> sapply(lmod, function(x) coef(summary(x))[,2])
                 mod1      mod2
(Intercept) 0.1708987 0.1708987
outcome2    0.2021708 0.2021708
outcome3    0.1927423 0.1927423
treatment2  0.2000000 0.2000000
treatment3  0.2000000 0.2000000

やりたいことに応じて、1 回の呼び出しで係数と標準誤差の両方を抽出できます。

> lapply(lmod, function(x) coef(summary(x))[,1:2])
$mod1
                 Estimate Std. Error
(Intercept)  3.044522e+00  0.1708987
outcome2    -4.542553e-01  0.2021708
outcome3    -2.929871e-01  0.1927423
treatment2   1.337909e-15  0.2000000
treatment3   1.421085e-15  0.2000000

$mod2
                 Estimate Std. Error
(Intercept)  3.044522e+00  0.1708987
outcome2    -4.542553e-01  0.2021708
outcome3    -2.929871e-01  0.1927423
treatment2   1.337909e-15  0.2000000
treatment3   1.421085e-15  0.2000000

しかし、あなたはそれらを別々に好むかもしれませんか？

r - ロジスティック回帰のリストから係数のすべての標準誤差を抽出します

これはロジスティック回帰関数であり、このように設計されているため、一度に複数の分析を実行できます。

データフレームldataの変数のサブセットで実行します。

次のように係数を抽出できます。

これが私が例に使用したデータです。私はこの質問の目的のためにそれを短縮しました：

1 に答える 1

標準誤差を直接計算する

への呼び出しからそれらを抽出しますsummary()

Related

Reference

への呼び出しからそれらを抽出します`summary()`