r - Weighted Pearson's correlation?

Question

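I have a 2396x34 double matrix named y, in which each of the 2396 rows represents a separate situation consisting of 34 consecutive time segments.

I also have a numeric[34] named x that represents a single situation of 34 consecutive time segments.

Currently I am calculating the correlation between each row of y and x like this: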

crs[,2] <- cor(t(y),x)

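What I need is to replace the cor function in the statement above with a weighted correlation. The weight vector xy.wt is 34 elements long, so a different weight can be assigned to each of the 34 consecutive time segments.

I found the Weighted Covariance Matrix function cov.wt and thought that if I first scale the data it should work just like the cor function. In fact, you can tell the function to return a correlation matrix as well. Unfortunately, it does not seem that I can use it in the same way, because I cannot supply my two variables (x and y) separately.

Does anyone know of a way to get a weighted correlation in the manner I described without sacrificing much speed?

Edit: Perhaps some mathematical function could be applied to y before the cor function to get the result I am looking for. Maybe if I multiply each element by xy.wt/sum(xy.wt)?

Edit #2: I found another function, corr, in the boot package.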

corr(d, w = rep(1, nrow(d))/nrow(d))

d   
A matrix with two columns corresponding to the two variables whose correlation we wish to calculate.

w   
A vector of weights to be applied to each pair of observations. The default is equal weights for each pair. Normalization takes place within the function so sum(w) need not equal 1.

This is also not quite what I need, but it is closer.

Edit #3: Here is some code to generate the type of data I am working with:

x<-cumsum(rnorm(34))
y<- t(sapply(1:2396,function(u) cumsum(rnorm(34))))
xy.wt<-1/(34:1)

crs<-cor(t(y),x) #this works but I want to use xy.wt as weight

score 23 · Accepted Answer

Unfortunately, the accepted answer is wrong when y is a matrix with more than one row. The error is in the line

vy <- rowSums( w * y * y )

We want to multiply the columns of y by w, but this multiplies the rows by the elements of w, recycled as necessary. Thus

> f(x, y[1, , drop = FALSE], xy.wt)
[1] 0.103021

is correct, because in this case the multiplication is performed element-wise, which here is equivalent to column-wise multiplication, but

> f(x, y, xy.wt)[1]
[1] 0.05463575

gives the wrong answer because of the row-wise multiplication.
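The recycling behaviour described above can be seen on a toy matrix. This is only an illustrative sketch: the 2x3 matrix and the weights are made-up values, not data from the question.

```r
# Toy data: 2 rows (situations), 3 columns (time segments);
# w holds one weight per column.
y <- matrix(1, nrow = 2, ncol = 3)
w <- c(1, 2, 3)

# Plain `*` recycles w down the columns (column-major order), so the
# weights do not line up with the columns of y.
wrong <- w * y

# Two ways to apply w[j] to column j instead.
by_sweep     <- sweep(y, 2, w, `*`)
by_transpose <- t(t(y) * w)

print(wrong)     # rows: 1 3 2 / 2 1 3
print(by_sweep)  # rows: 1 2 3 / 1 2 3
```

With a single row, as in y[1, , drop = FALSE], the two orders coincide, which is why that call returns the right value.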

We can correct the function as follows

f2 <- function( x, y, w = rep(1,length(x))) {
  stopifnot(length(x) == dim(y)[2] )
  w <- w / sum(w)
  # Center x and y, using the weighted means
  x <- x - sum(x * w)
  ty <- t(y - colSums(t(y) * w))
  # Compute the variance
  vx <- sum(w * x * x)
  vy <- colSums(w * ty * ty)
  # Compute the covariance
  vxy <- colSums(ty * x * w)
  # Compute the correlation
  vxy / sqrt(vx * vy)
}

and check the results against those produced by corr from the boot package:

> res1 <- f2(x, y, xy.wt)
> res2 <- sapply(1:nrow(y), 
+                function(i, x, y, w) corr(cbind(x, y[i,]), w = w),
+                x = x, y = y, w = xy.wt)
> all.equal(res1, res2)
[1] TRUE

which in itself provides another way to solve this problem.
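In the same spirit, the cov.wt function mentioned in the question can serve as a further cross-check if x and the rows of y are bound into a single matrix. The following is a minimal sketch, not part of the original answer: wcor_covwt is a hypothetical helper name, the seed and the reduced row count (5 instead of 2396) are assumptions made to keep the example small.

```r
set.seed(1)  # assumed seed, only for reproducibility
x     <- cumsum(rnorm(34))
y     <- t(sapply(1:5, function(u) cumsum(rnorm(34))))  # 5 rows, not 2396
xy.wt <- 1 / (34:1)

# Hypothetical helper: weighted correlation of x with every row of y.
# cbind() puts x in column 1 and the rows of y in the remaining columns;
# cov.wt() normalizes the weights internally.
wcor_covwt <- function(x, y, w) {
  cw <- cov.wt(cbind(x, t(y)), wt = w, cor = TRUE)
  unname(cw$cor[1, -1])
}

res <- wcor_covwt(x, y, xy.wt)

# Direct computation from the definition for the first row, as a check.
w  <- xy.wt / sum(xy.wt)
xc <- x - sum(w * x)
yc <- y[1, ] - sum(w * y[1, ])
manual <- sum(w * xc * yc) / sqrt(sum(w * xc^2) * sum(w * yc^2))
print(all.equal(res[1], manual))
```

Note that this builds the full correlation matrix of all columns, so for 2396 rows it computes far more than is needed and is likely slower than a dedicated function.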

score 3 · Accepted Answer

You can go back to the definition of the correlation.

f <- function( x, y, w = rep(1,length(x))) {
  stopifnot( length(x) == dim(y)[2] )
  w <- w / sum(w)
  # Center x and y, using the weighted means
  x <- x - sum(x*w)
  y <- y - apply( t(y) * w, 2, sum )
  # Compute the variance
  vx <- sum( w * x * x )
  vy <- rowSums( w * y * y ) # Incorrect: see Heather's remark, in the other answer
  # Compute the covariance
  vxy <- colSums( t(y) * x * w )
  # Compute the correlation
  vxy / sqrt(vx * vy)
}
f(x,y)[1]
cor(x,y[1,]) # Identical
f(x, y, xy.wt)
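For reference, the definition that the function above works from can be written as follows. This is a sketch in notation chosen here (the symbols m_w, s_w and r_w do not appear in the original answer), assuming the weights have been normalized so that they sum to 1:

```latex
\[
m_w(x) = \sum_{i} w_i x_i, \qquad
s_w(x, y) = \sum_{i} w_i \,\bigl(x_i - m_w(x)\bigr)\bigl(y_i - m_w(y)\bigr), \qquad
r_w(x, y) = \frac{s_w(x, y)}{\sqrt{s_w(x, x)\, s_w(y, y)}}
\]
```

In the code, vx and vy correspond to s_w(x, x) and s_w(y, y), and vxy to s_w(x, y), with y standing for one row of the matrix.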

score 3 · Accepted Answer

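This is a generalization that computes the weighted Pearson correlation between two matrices (rather than between a vector and a matrix, as in the original question):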

matrix.corr <- function (a, b, w = rep(1, nrow(a))/nrow(a)) 
{
    # normalize weights
    w <- w / sum(w)

    # center matrices
    a <- sweep(a, 2, colSums(a * w))
    b <- sweep(b, 2, colSums(b * w))

    # compute weighted correlation
    t(w*a) %*% b / sqrt( colSums(w * a**2) %*% t(colSums(w * b**2)) )
}

Using the example above and Heather's correlation function, we can verify that:

> sum(matrix.corr(as.matrix(x, nrow=34),t(y),xy.wt) - f2(x,y,xy.wt))
[1] 1.537507e-15

In terms of calling syntax, this resembles the unweighted cor:

> a <- matrix( c(1,2,3,1,3,2), nrow=3)
> b <- matrix( c(2,3,1,1,7,3,5,2,8,1,10,12), nrow=3)
> matrix.corr(a,b)
     [,1]      [,2] [,3]      [,4]
[1,] -0.5 0.3273268  0.5 0.9386522
[2,]  0.5 0.9819805 -0.5 0.7679882
> cor(a, b)
     [,1]      [,2] [,3]      [,4]
[1,] -0.5 0.3273268  0.5 0.9386522
[2,]  0.5 0.9819805 -0.5 0.7679882
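The comparison above uses the default (equal) weights. A similar check with unequal weights can be sketched against cov.wt from base R. The weight vector c(1, 2, 3) is an assumed example value, and matrix.corr is copied from the answer only so that the snippet runs on its own.

```r
# matrix.corr as given in the answer, repeated here for self-containment.
matrix.corr <- function(a, b, w = rep(1, nrow(a)) / nrow(a)) {
  w <- w / sum(w)                       # normalize weights
  a <- sweep(a, 2, colSums(a * w))      # center columns at weighted means
  b <- sweep(b, 2, colSums(b * w))
  t(w * a) %*% b / sqrt(colSums(w * a**2) %*% t(colSums(w * b**2)))
}

a <- matrix(c(1, 2, 3, 1, 3, 2), nrow = 3)
b <- matrix(c(2, 3, 1, 1, 7, 3, 5, 2, 8, 1, 10, 12), nrow = 3)
w <- c(1, 2, 3)  # assumed example weights, one per row (observation)

weighted <- matrix.corr(a, b, w)

# cov.wt() returns the correlation of all six columns; keep the a-vs-b block.
reference <- cov.wt(cbind(a, b), wt = w, cor = TRUE)$cor[1:2, 3:6]

print(all.equal(weighted, reference, check.attributes = FALSE))
```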
