r - Count shared occurrences and remove duplicates

Question

I have this data.frame:

df <- read.table(text= "   section to from    time
                             a     1  5        9       
                             a     2  5        9        
                             a     1  5        10       
                             a     2  6        10       
                             a     2  7        11       
                             a     2  7        12       
                             a     3  7        12       
                             a     4  7        12
                             a     4  6        13  ", header = TRUE)

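Each row identifies the co-occurrence of an id in `to` and an id in `from` at a given `time`. Essentially, it is a time-explicit network of the ids in `to` and `from`.

I want to know which ids in `to` shared an id in `from` within a particular time range, which is 2. In other words, I want to know whether ids 1 and 2 in `to` both went to coffee shop 5 in `from` within 2 days of each other.

Ids 1 and 2 in `to` shared id 5 in `from` at `time` 9 and 10 respectively, so they share 1 event within the time window of 2.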

                             a     1  5        9       
                             a     2  5        9        
                             a     1  7        13       
                             a     2  7        13

With the four rows above, ids 1 and 2 would share 2 events.

So the final output I would like from `df` is:

                           section to.a to.b    noShared
                             a     1    2        1       
                             a     2    3        1        
                             a     2    4        1       
                             a     3    4        1
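
As a rough illustration of the counting rule described above, here is a minimal base R sketch. It rests on two assumptions that the post does not confirm: "within the time window" is read as an absolute time difference of at most 2, and one shared event is counted as one distinct `from` id per pair. The names `window`, `m`, `events` and `res` are made up for this example.

```r
# Example data from the question
df <- read.table(text = "section to from time
a 1 5 9
a 2 5 9
a 1 5 10
a 2 6 10
a 2 7 11
a 2 7 12
a 3 7 12
a 4 7 12
a 4 6 13", header = TRUE)

window <- 2  # assumed meaning: |time.a - time.b| <= 2

# Pair up rows that share the same section and `from` id
m <- merge(df, df, by = c("section", "from"), suffixes = c(".a", ".b"))

# Keep each unordered pair once (to.a < to.b) and only visits that are
# at most `window` time units apart
m <- m[m$to.a < m$to.b & abs(m$time.a - m$time.b) <= window, ]

# Assumption: one shared event = one distinct `from` id per pair
events <- unique(m[, c("section", "to.a", "to.b", "from")])

# Count shared events per pair
res <- aggregate(from ~ section + to.a + to.b, data = events, FUN = length)
names(res)[names(res) == "from"] <- "noShared"
res <- res[order(res$section, res$to.a, res$to.b), ]
rownames(res) <- NULL
res
```

Under those assumptions `res` has the same four pairs as the desired output, each with `noShared` equal to 1. A different definition of "event" (for example, counting separate time bins) would need a different grouping key.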

I can get some of the way there:

library(plyr)                            
library(tnet)


a <- ddply(df, .(section,to,time), function(x)  
          data.frame(from = unique(x$from)) )

b <- ddply(a, .(section,time), function(x) {

            b <- as.tnet(x[, c("to","from")], type="binary two-mode tnet")
            b <- projecting_tm(b, method="sum")
            return(b)

       })

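This gives me which ids in `to` shared an id in `from` within each `time` point.

However, there are two main problems with `b`.

First, within each time point the pairs of ids appear twice, once in each direction: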

 1  2  5  9 # id 1 and 2 went to coffee shop 5  at time 9
 2  1  5  9 # id 2  and 1 went to coffee shop 5 at time 9

 I only want each combination to appear once:

  1  2  5  # id 1 and 2 went to coffee shop 5  at time 9

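~~Second, I need to bin the results within the time window so that the final result has no time column, just the number of shared events.~~

EDIT

The time problem has more issues than I anticipated. The first problem is enough for this question.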

score 2 · Accepted Answer

Generating b (the first part of the question)

I modify the code of `projecting_tm`, which is the function that performs the network transformation.

b <- ddply(a, .(section,time), function(x) {
  ## first I create the origin network
  net2 <- x[, c("to","from")]
  colnames(net2) <- c('i','p')
  net2 <- net2[order(net2[, "i"], net2[, "p"]), ]
  np <- table(net2[, "p"])
  net2 <- merge(net2, cbind(p = as.numeric(rownames(np)),np = np))
  ## transformed network
  net1 <- merge(net2, cbind(j = net2[, "i"], p = net2[, "p"]))
  net1 <- net1[net1[, "i"] != net1[, "j"], c("i", "j","np")]
  net1 <- net1[order(net1[, "i"], net1[, "j"]), ]
  index <- !duplicated(net1[, c("i", "j")])
  net1 <- cbind(net1[index, c("i", "j")])
  net1
})

So here you get your `b` without warnings:

> b
  section time i j
1       a    9 1 2
2       a    9 2 1
3       a   12 2 3
4       a   12 2 4
5       a   12 3 2
6       a   12 3 4
7       a   12 4 2
8       a   12 4 3
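
For comparison, the same table can also be reached without `plyr` or `tnet` by a plain self-join. This is only a sketch of the idea, not the code used in this answer; the names `pairs` and `b2` are invented for the example.

```r
# Example data from the question
df <- read.table(text = "section to from time
a 1 5 9
a 2 5 9
a 1 5 10
a 2 6 10
a 2 7 11
a 2 7 12
a 3 7 12
a 4 7 12
a 4 6 13", header = TRUE)

# Drop repeated (section, to, from, time) rows, like the first ddply() step
a <- unique(df[, c("section", "to", "from", "time")])

# Self-join: pair every row with every row that has the same
# section, time and `from` id
pairs <- merge(a, a, by = c("section", "time", "from"),
               suffixes = c(".i", ".j"))

# Keep pairs of different ids, once per (section, time, i, j)
pairs <- pairs[pairs$to.i != pairs$to.j, c("section", "time", "to.i", "to.j")]
names(pairs) <- c("section", "time", "i", "j")
b2 <- unique(pairs)
b2 <- b2[order(b2$section, b2$time, b2$i, b2$j), ]
rownames(b2) <- NULL
b2
```

On the example data `b2` contains the same eight rows as `b` above, with each pair still listed in both directions.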

For the second part of the question, you want to remove the duplicates from `b`?

b[!duplicated(t(apply(b[3:4], 1, sort))), ]
  section time i j
1       a    9 1 2
3       a   12 2 3
4       a   12 2 4
6       a   12 3 4

For this part I use the answer to this question.
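
One thing to keep in mind: the one-liner above compares only columns `i` and `j`, so a pair that co-occurs at two different time points would be kept only once overall. If each pair should survive once per time point, a possible variant is to build the key from `section`, `time` and the ordered pair. The sketch below rebuilds `b` by hand so that it runs on its own; `lo`, `hi`, `key` and `b_unique` are names chosen for the example.

```r
# `b` as printed above, rebuilt by hand so the snippet is self-contained
b <- data.frame(section = "a",
                time = c(9, 9, 12, 12, 12, 12, 12, 12),
                i = c(1, 2, 2, 2, 3, 3, 4, 4),
                j = c(2, 1, 3, 4, 2, 4, 2, 3))

# Canonical (unordered) form of each pair: smaller id first
lo <- pmin(b$i, b$j)
hi <- pmax(b$i, b$j)

# Include section and time in the key so that the same pair at a
# different time point is not treated as a duplicate
key <- data.frame(section = b$section, time = b$time, lo = lo, hi = hi)
b_unique <- b[!duplicated(key), ]
b_unique
```

On this example the result is the same four rows (1, 3, 4 and 6) as with the one-liner, because no pair repeats across time points here.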
