r - Marginal effects of a probit regression in R

Question

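I am using R to replicate a study and I obtain almost the same results the author reported. At one point, however, I calculate marginal effects that look unrealistically small. I would greatly appreciate it if you could look at my reasoning and the code below and check whether I have made a mistake somewhere.

My sample contains 24535 observations. The dependent variable "x028bin" is a binary variable taking the values 0 and 1, and there are 10 explanatory variables. Nine of these independent variables are numeric; the independent variable "f025grouped" is a factor made up of different religious denominations.

I would like to run a probit regression that includes dummies for religious denomination and then compute marginal effects. To do so, I first remove missing values and use cross-tabs between the dependent and independent variables to verify that there are no small or empty cells. Then I run the probit model, which works fine and gives reasonable results: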

probit4AKIE <- glm(x028bin ~ x003 + x003squ + x025secv2 + x025terv2 + x007bin + x04chief + x011rec + a009bin + x045mod + c001bin + f025grouped, family=binomial(link="probit"), data=wvshm5red2delna, na.action=na.pass)

summary(probit4AKIE)

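However, when I calculate marginal effects from the probit coefficients and a scale factor, with all variables at their means, the marginal effects I obtain are much too small (e.g. 2.6042e-78). The code looks like this: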

ttt <- cbind(wvshm5red2delna$x003,
wvshm5red2delna$x003squ,
wvshm5red2delna$x025secv2,
wvshm5red2delna$x025terv2,
wvshm5red2delna$x007bin,
wvshm5red2delna$x04chief,
wvshm5red2delna$x011rec,
wvshm5red2delna$a009bin,
wvshm5red2delna$x045mod,
wvshm5red2delna$c001bin,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped) #I put variable "f025grouped" 9 times because this variable consists of 9 levels

ttt <- as.data.frame(ttt)

xbar <- as.matrix(mean(cbind(1,ttt[1:19]))) #1:19 position of variables in dataframe ttt

betaprobit4AKIE <- probit4AKIE$coefficients

zxbar <- t(xbar) %*% betaprobit4AKIE

scalefactor <- dnorm(zxbar)

marginprobit4AKIE <- scalefactor * betaprobit4AKIE[2:20] #2:20 are the positions of variables in the output of the probit model 'probit4AKIE' (variables need to be in the same ordering as in data.frame ttt), the constant in the model occupies the first position

marginprobit4AKIE #in this step I obtain values that are much too small

I apologize that I cannot provide a working example, as my dataset is much too large. Any comment would be greatly appreciated. Thanks a lot.

Best,

Tobias

score 1 · Accepted Answer

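@Gavin is right, and it would be better to ask this on the sister site.

In any case, here is my trick for interpreting probit coefficients.

Probit regression coefficients are the same as logit coefficients, up to a scale factor (1.6). So, if the fit of a probit model is Pr(y=1) = Phi(.5 - .3*x), where Phi is the standard normal CDF, this is approximately equivalent to the logistic model Pr(y=1) = invlogit(1.6(.5 - .3*x)).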
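The quality of that approximation can be checked numerically. The following is a minimal sketch, assuming base R's `plogis` as a stand-in for `invlogit` from `arm` (both compute the inverse logit); the grid range is an arbitrary choice for illustration.

```r
# Compare the probit curve with its logistic approximation on a grid.
z <- seq(-4, 4, by = 0.01)           # values of the linear predictor (arbitrary range)
probit_p <- pnorm(z)                 # probit: Pr(y = 1) = Phi(z)
logit_p  <- plogis(1.6 * z)          # logistic approximation: invlogit(1.6 * z)
max_gap  <- max(abs(probit_p - logit_p))
max_gap                              # largest absolute difference on the grid
```

If the two curves are close, `max_gap` should be small relative to the 0 to 1 probability scale.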

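I use this to make a graphic, using the function invlogit from the package arm. Another possibility is to multiply all coefficients (including the intercept) by 1.6 and then apply the "divide by 4 rule" (see the book by Gelman and Hill): divide the new coefficients by 4, and you get an upper bound on the predictive difference corresponding to a unit difference in x.

Here is an example.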

x1 = rbinom(100,1,.5)
x2 = rbinom(100,1,.3)
x3 = rbinom(100,1,.9)
ystar = -.5  + x1 + x2 - x3 + rnorm(100)
y = ifelse(ystar>0,1,0)
probit = glm(y~x1 + x2 + x3, family=binomial(link='probit'))
xbar <- colMeans(cbind(1, x1, x2, x3)) # means of the intercept column, x1, x2 and x3

# now the graphic, i.e., the marginal effect of x1, x2 and x3
library(arm)
curve(invlogit(1.6*(probit$coef[1] + probit$coef[2]*x + probit$coef[3]*xbar[3] + probit$coef[4]*xbar[4]))) #x1
curve(invlogit(1.6*(probit$coef[1] + probit$coef[2]*xbar[2] + probit$coef[3]*x + probit$coef[4]*xbar[4]))) #x2
curve(invlogit(1.6*(probit$coef[1] + probit$coef[2]*xbar[2] + probit$coef[3]*xbar[3] + probit$coef[4]*x))) #x3
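The "divide by 4" shortcut mentioned above can be sketched the same way. The simulated data, seed, and object names below are assumptions made for illustration. The comparison column uses the fact that the probit curve is steepest at a linear predictor of 0, where its slope is `dnorm(0)` times the coefficient, so the two columns should nearly agree.

```r
set.seed(42)                                  # arbitrary seed, for reproducibility only
n  <- 500
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.3)
y  <- as.integer(-0.5 + x1 + x2 + rnorm(n) > 0)
fit <- glm(y ~ x1 + x2, family = binomial(link = "probit"))

b <- coef(fit)[-1]                            # slopes only; drop the intercept
upper_div4  <- 1.6 * b / 4                    # rescale to logit scale, then divide by 4
upper_exact <- dnorm(0) * b                   # steepest slope of the probit curve
round(cbind(upper_div4, upper_exact), 3)
```

Both columns are bounds on the size of the effect, not estimates of the effect at any particular value of the covariates.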

score 1 · Accepted Answer

This will do the trick for probit or logit:

mfxboot <- function(modform,dist,data,boot=1000,digits=3){
  x <- glm(modform, family=binomial(link=dist),data)
  # get marginal effects
  pdf <- ifelse(dist=="probit",
                mean(dnorm(predict(x, type = "link"))),
                mean(dlogis(predict(x, type = "link"))))
  marginal.effects <- pdf*coef(x)
  # start bootstrap
  bootvals <- matrix(rep(NA,boot*length(coef(x))), nrow=boot)
  set.seed(1111)
  for(i in 1:boot){
    samp1 <- data[sample(1:dim(data)[1],replace=T,dim(data)[1]),]
    x1 <- glm(modform, family=binomial(link=dist),samp1)
    # use the bootstrap fit x1 (not the original fit x) for the scale factor
    pdf1 <- ifelse(dist=="probit",
                   mean(dnorm(predict(x1, type = "link"))),
                   mean(dlogis(predict(x1, type = "link"))))
    bootvals[i,] <- pdf1*coef(x1)
  }
  res <- cbind(marginal.effects,apply(bootvals,2,sd),marginal.effects/apply(bootvals,2,sd))
  if(names(x$coefficients[1])=="(Intercept)"){
    res1 <- res[2:nrow(res),]
    res2 <- matrix(as.numeric(sprintf(paste("%.",paste(digits,"f",sep=""),sep=""),res1)),nrow=dim(res1)[1])
    rownames(res2) <- rownames(res1)
  } else {
    res2 <- matrix(as.numeric(sprintf(paste("%.",paste(digits,"f",sep=""),sep=""),res)),nrow=dim(res)[1])
    rownames(res2) <- rownames(res)
  }
  colnames(res2) <- c("marginal.effect","standard.error","z.ratio")
  return(res2)
}

Source: http://www.r-bloggers.com/probitlogit-marginal-effects-in-r/
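For the factor in the question, one point worth checking is how the regressor means are built: calling `cbind` on a factor yields its integer level codes rather than the dummy columns the model actually uses. A minimal sketch, assuming simulated data and hypothetical variable names (`age`, `grp`), takes the means from `model.matrix` instead, so they line up with `coef(fit)`. It then compares effects evaluated at the mean with the averaged effects computed by the function above.

```r
set.seed(1)                                    # arbitrary seed, for reproducibility only
n   <- 1000
age <- rnorm(n, mean = 40, sd = 10)            # hypothetical numeric regressor
grp <- factor(sample(c("a", "b", "c"), n, replace = TRUE))  # hypothetical 3-level factor
y   <- as.integer(-1 + 0.02 * age + 0.5 * (grp == "b") + rnorm(n) > 0)
fit <- glm(y ~ age + grp, family = binomial(link = "probit"))

X    <- model.matrix(fit)                      # intercept, age, and 3 - 1 = 2 dummy columns
xbar <- colMeans(X)                            # means in the same order as coef(fit)
stopifnot(identical(names(xbar), names(coef(fit))))

# Marginal effects at the mean: density at xbar'beta times each slope.
mem <- dnorm(sum(xbar * coef(fit))) * coef(fit)[-1]
# Average marginal effects: mean density over observations times each slope.
ame <- mean(dnorm(predict(fit, type = "link"))) * coef(fit)[-1]
round(cbind(mem, ame), 4)
```

For the dummy columns this derivative is only an approximation to the discrete change in predicted probability between two levels.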
