r - R: How to perform more complex calculations from combinations of a data set?

Question

I currently have the pairwise combinations of columns from the built-in iris data set. So far, I have been able to find the lm() coefficients for each pair of variables.

myPairs <- combn(names(iris[1:4]), 2)

formula <- apply(myPairs, MARGIN=2, FUN=paste, collapse="~")

model <- lapply(formula, function(x) lm(formula=x, data=iris)$coefficients[2])

model

However, I would like to go a few steps further and use the lm() coefficients in further calculations. I would like to do something like this:

# Pseudocode for a single pair x; not runnable as written
coefficient <- lm(formula=x, data=iris)$coefficients[2]
Spread <- myPairs[1] - coefficient*myPairs[2]
library(tseries)
adf.test(Spread)

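The procedure itself is simple, but I have not found a way to do this for each combination in the data set. (As a side note, adf.test would not normally be applied to data like this; I am only using the iris data set for demonstration.) Would it be better to write a loop for this procedure?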

score 2 · Accepted Answer

You can do all of this within combn.

If you want to run the regression for every combination and extract the second coefficient, you can do:

fun <- function(x) coef(lm(paste(x, collapse="~"), data=iris))[2]
combn(names(iris[1:4]), 2, fun)
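The vector returned by combn carries no labels, so it is not obvious which slope belongs to which pair. Here is a minimal sketch of one way to attach them. The names `vars`, `slope_for_pair`, and `slopes` are illustrative and not part of the original answer.

```r
vars <- names(iris)[1:4]

# Slope from regressing the first variable of the pair on the second
slope_for_pair <- function(x) {
  coef(lm(as.formula(paste(x, collapse = "~")), data = iris))[[2]]
}

# combn() applies the function to each pair, in column order
slopes <- as.vector(combn(vars, 2, slope_for_pair))

# Build matching labels from the same pairs, in the same order
names(slopes) <- as.vector(combn(vars, 2, paste, collapse = "~"))

slopes
```

Both calls to combn walk the pairs in the same order, so each label lines up with its slope.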

You can then extend the function to calculate the spread:

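library(tseries)  # provides adf.test()

fun <- function(x) {
  # slope from regressing the first variable of the pair on the second
  est <- coef(lm(paste(x, collapse="~"), data=iris))[2]
  spread <- iris[,x[1]] - est*iris[,x[2]]
  adf.test(spread)
}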

out <- combn(names(iris[1:4]), 2, fun, simplify=FALSE)
out[[1]]

#   Augmented Dickey-Fuller Test

#data:  spread
#Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
#alternative hypothesis: stationary
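Because simplify=FALSE keeps each result as a full test object, the six results can also be gathered into a single table instead of being printed one at a time. The following is a sketch only, assuming the tseries package is installed. The names `adf_for_pair` and `results` are illustrative, and the code relies on adf.test returning an object with `statistic` and `p.value` components.

```r
library(tseries)  # provides adf.test()

vars <- names(iris)[1:4]

# Same steps as above: slope, spread, then the ADF test on the spread
adf_for_pair <- function(x) {
  est <- coef(lm(as.formula(paste(x, collapse = "~")), data = iris))[[2]]
  spread <- iris[[x[1]]] - est * iris[[x[2]]]
  adf.test(spread)
}

out <- combn(vars, 2, adf_for_pair, simplify = FALSE)

# One row per pair: label, test statistic, p-value
results <- data.frame(
  pair      = as.vector(combn(vars, 2, paste, collapse = "~")),
  statistic = vapply(out, function(h) h$statistic[[1]], numeric(1)),
  p_value   = vapply(out, function(h) h$p.value[[1]], numeric(1)),
  stringsAsFactors = FALSE
)

results
```

adf.test may warn when a p-value falls outside the range of its lookup table; the warning does not stop the table from being built.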

Compare the first result with running it manually:

est <- coef(lm(Sepal.Length ~ Sepal.Width, data=iris))[2]
spread <- iris[,"Sepal.Length"] - est*iris[,"Sepal.Width"]
adf.test(spread)

#   Augmented Dickey-Fuller Test

# data:  spread
# Dickey-Fuller = -3.879, Lag order = 5, p-value = 0.01707
# alternative hypothesis: stationary
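Since the question asks whether a loop would be better, here is a sketch of the same spread calculation written as an explicit for loop in base R, for comparison. The variable names (`vars`, `pairs`, `spreads`) are illustrative, and the test step is left out so the block runs without tseries.

```r
vars  <- names(iris)[1:4]
pairs <- combn(vars, 2)   # 2 x 6 character matrix, one pair per column

spreads <- vector("list", ncol(pairs))
for (j in seq_len(ncol(pairs))) {
  y <- pairs[1, j]        # response variable of this pair
  x <- pairs[2, j]        # predictor variable of this pair
  est <- coef(lm(reformulate(x, response = y), data = iris))[[2]]
  spreads[[j]] <- iris[[y]] - est * iris[[x]]
}
names(spreads) <- paste(pairs[1, ], pairs[2, ], sep = "~")

str(spreads, max.level = 1)
```

The loop produces the same spreads as the combn version but needs more bookkeeping. With tseries loaded, each element could then be passed to adf.test, for example with lapply(spreads, adf.test).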
