
The data is stored in a .txt file, and the same file holds 200 words. How can I read this raw data into R and run a binary logistic regression for each of these words?

num 0 0.010752688172
num 0 0.003300330033

thanksgiving 0 0.0123456790123
thanksgiving 0 0.0016339869281
thanksgiving 0 0.00338983050847

off 0 0.00431034482759
off 0 0.00302114803625
off 1 0.001100110011
off 0 0.00377358490566
off 1 0.00166112956811
off 1 0.00281690140845
off 0 0.00564971751412
off 0 0.00112994350282
off 0 0.003300330033
off 0 0.0042735042735
off 1 0.00326797385621
off 0 0.00159489633174
off 0 0.00378787878788

2 Answers


Well, I'm lazy, so:

allwords <- unique(dataframe[,1])
firstword <- dataframe[dataframe[,1]==allwords[1],]

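and so on would split the data up by word. But you don't need to create firstword, secondword, ... by hand, because it is just as easy to use one of the apply functions to run the regression for each value of allwords.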
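A minimal base-R sketch of that idea, in case it helps. It assumes the three columns are named word, y and x (those names are not in the question), and it uses a small made-up data frame in place of the real file, so the numbers are purely illustrative:

```r
# Hypothetical toy data in the question's three-column layout.
# The column names (word, y, x) and the values are assumptions for illustration.
dataframe <- data.frame(
  word = rep(c("off", "on"), each = 6),
  y    = c(0, 0, 1, 0, 1, 1,
           1, 0, 1, 0, 0, 1),
  x    = c(0.0043, 0.0030, 0.0011, 0.0038, 0.0017, 0.0035,
           0.0020, 0.0051, 0.0045, 0.0044, 0.0018, 0.0012)
)

# split() returns a named list with one data frame per distinct word,
# which replaces building firstword, secondword, ... by hand.
by_word <- split(dataframe, dataframe$word)

# Fit one binary logistic regression per word.
models <- lapply(by_word, function(d) glm(y ~ x, family = binomial, data = d))

# Collect the intercept and slope of every model: one row per word.
coefs <- t(sapply(models, coef))
```

Individual fits are then available by name, for example models$off. A word whose y values are all 0 or all 1, as in the short excerpts shown for num and thanksgiving, cannot give a meaningful fit, so it may be worth filtering those out first.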

answered 2012-06-15T19:29:07.807

Here's how I would do it with the plyr package:

# Load the plyr library
library(plyr)

# Read in the data
allwords <- read.table("words.txt")

# Name the variables more meaningfully than this
names(allwords) <- c("word", "y", "x")

# dlply iterates over the data.frame, splitting by "word", 
# and running a glm with the arguments formula = y ~ x and family = binomial
# and returns a list of the resulting glm objects
models <- dlply(allwords,
                .variables = "word",
                .fun = glm, formula = y ~ x, family = binomial)

# It's then easy to iterate over that list using lapply, llply, ldply, etc.
# (depending on what you want back out)
# Summarize:
llply(models, summary)

# Get all the coefficients
ldply(models, coef)

# Get AICs
# Not that you can compare these among word-models, but you get the idea.
ldply(models, AIC)

# Or, if you want to work with a particular model
models$num
answered 2012-06-15T19:50:13.790