r - Extract predictors from a constparty object (CHAID output) in R

Question

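I have a large dataset (survey results) consisting mostly of categorical variables. I tested the dependencies between the variables with chi-squared tests, and the number of dependencies is bewildering. I used the chaid() function from the CHAID package to detect interactions and (hopefully) isolate the structure underlying these dependencies for each variable. Typically, the chi-squared tests reveal a large number of dependencies for a variable (say 10 to 20), and chaid() reduces this to something more comprehensible (say 3 to 5). What I want to do is extract the names of the variables that the chaid() results show to be relevant.

The chaid() output is a constparty object. My question is how to extract the variable names associated with the nodes of such an object.

Here is a self-contained code example: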

library(evtree) # for the ContraceptiveChoice dataset
library(CHAID)
library(vcd)
library(MASS)

data("ContraceptiveChoice")
longform = formula(contraceptive_method_used ~ wifes_education + 
                 husbands_education +  wifes_religion + wife_now_working + 
                 husbands_occupation + standard_of_living_index + media_exposure)
z = chaid(longform, data = ContraceptiveChoice)
# plot(z)
z
# This is the part I want to do programmatically
shortform = formula(contraceptive_method_used ~ wifes_education + husbands_occupation)
# The thing I want is a programmatic way to extract 'shortform' from 'z'

# Example of use of 'shortform'
loglm(shortform, data = ContraceptiveChoice)

score 0 · Accepted Answer

One possible solution:

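nn <- nodeapply(z)                   # list holding the root node of the tree
n.names <- names(unlist(nn[[1]]))    # flatten the node and keep the element names
# keep only the names that refer to a split variable
ext <- unlist(sapply(n.names, function(x) grep("split.varid.", x, value = TRUE)))
# strip the prefixes so that only the variable names remain
ext <- gsub("kids.split.varid.", "", ext)
ext <- gsub("split.varid.", "", ext)
ext <- unique(ext)                   # a variable may be used in more than one split
dep.var <- as.character(terms(z)[1][[2]]) # get the dependent variable
plus <- paste(ext, collapse = " + ")
mul <- paste(ext, collapse = " * ")
shortform <- as.formula(paste(dep.var, plus, sep = " ~ "))  # main effects only
satform <- as.formula(paste(dep.var, mul, sep = " ~ "))     # with all interactions
mosaic(shortform, data = ContraceptiveChoice)
#stp <- step(glm(satform, data=ContraceptiveChoice, family=binomial), direction="both")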
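An alternative that avoids parsing element names might be to use the partykit accessors directly. The following is only a sketch under some assumptions: split_vars() is a hypothetical helper (it is not part of partykit or CHAID), the split variable ids are assumed to index the columns of tree$data, and ctree() on the iris data stands in for the chaid() tree so that the example runs with partykit alone.

```r
library(partykit)  # nodeids(), nodeapply(), split_node(), varid_split(), ctree()

# Hypothetical helper: names of all variables used in at least one split
# of a party/constparty tree.
split_vars <- function(tree) {
  # inner nodes = all node ids minus the terminal node ids
  inner <- setdiff(nodeids(tree), nodeids(tree, terminal = TRUE))
  if (length(inner) == 0) return(character(0))  # tree has no splits
  # column index of the split variable at each inner node
  ids <- nodeapply(tree, ids = inner,
                   FUN = function(node) varid_split(split_node(node)))
  # assumption: the indices refer to the columns of tree$data
  unique(names(tree$data)[unlist(ids)])
}

# Stand-in tree; any party object should be usable here
tree <- ctree(Species ~ ., data = iris)
vars <- split_vars(tree)

# Build the reduced formula from the extracted names
response <- "Species"
short_form <- reformulate(vars, response = response)
short_form
```

With the tree from the question, split_vars(z) together with dep.var from the code above should give the same kind of formula via reformulate(split_vars(z), response = dep.var), though I have not checked this against every CHAID version.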
