r - Merging a list of vectors, rbind.fill style

Question

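I have a list of named vectors (see below, and the end of the question, for a dput version). I'd like to "merge" them together into a matrix, filling in zeros wherever a vector does not contain a given name (a character, in this case). This doesn't seem that hard, but I haven't found a base solution to the problem. I thought about using match, but that seems very time-consuming when I'm sure there is a fancy way to use do.call and rbind together.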

The list of named vectors:

$greg

e i k l 
1 2 1 1 

$sam

! c e i t 
1 1 1 2 1 

$teacher

? c i k l 
1 1 1 1 1

The final desired output:

           !  ?  c  e  i  k  l  t
greg       0  0  0  1  2  1  1  0 
sam        1  0  1  1  2  0  0  1 
teacher    0  1  1  0  1  1  1  0

This is probably the output people will give, and filling the NAs with 0 is easy:

           !  ?  c  e  i  k  l  t
greg      NA NA NA  1  2  1  1 NA 
sam        1 NA  1  1  2 NA NA  1 
teacher   NA  1  1 NA  1  1  1 NA
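
As a minimal sketch of that last step, the NA entries of a matrix can be replaced by indexing with the logical matrix returned by is.na(). The small matrix m below is a hypothetical stand-in covering only two rows and two columns of the output above, not the full result.

```r
# Hypothetical 2 x 2 excerpt of the NA-filled output (columns "!" and "e")
m <- matrix(c(NA, 1, 1, 1), nrow = 2,
            dimnames = list(c("greg", "sam"), c("!", "e")))

# is.na(m) is a logical matrix; assigning through it replaces every NA in place
m[is.na(m)] <- 0
m
```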

Sample data:

L2 <- structure(list(greg = structure(c(1L, 2L, 1L, 1L), .Dim = 4L, .Dimnames = structure(list(
        c("e", "i", "k", "l")), .Names = ""), class = "table"), sam = structure(c(1L, 
    1L, 1L, 2L, 1L), .Dim = 5L, .Dimnames = structure(list(c("!", 
    "c", "e", "i", "t")), .Names = ""), class = "table"), teacher = structure(c(1L, 
    1L, 1L, 1L, 1L), .Dim = 5L, .Dimnames = structure(list(c("?", 
    "c", "i", "k", "l")), .Names = ""), class = "table")), .Names = c("greg", 
    "sam", "teacher"))

score 6 · Accepted Answer

Here is a fairly simple base solution.

# first determine all possible column names
cols <- sort(unique(unlist(lapply(L2,names), use.names=FALSE)))
# initialize the output
out <- matrix(0, length(L2), length(cols), dimnames=list(names(L2),cols))
# loop over list and fill in the matrix
for(i in seq_along(L2)) {
  out[names(L2)[i], names(L2[[i]])] <- L2[[i]]
}
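
The key step in the loop is the assignment by row and column name. The sketch below isolates one such assignment; the list L is a simplified, hypothetical stand-in for L2 that uses plain named integer vectors rather than table objects.

```r
# Simplified stand-in for L2 (assumption: plain named vectors, not tables)
L <- list(greg = c(e = 1L, i = 2L),
          sam  = c("!" = 1L, e = 1L))

# Union of all names gives the columns; the matrix starts out filled with 0
cols <- sort(unique(unlist(lapply(L, names), use.names = FALSE)))
out <- matrix(0, length(L), length(cols), dimnames = list(names(L), cols))

# Character indices pick the row and only the columns present in this vector;
# columns missing from the vector keep their initial 0
out["greg", names(L$greg)] <- L$greg
out
```

Only the "greg" row is filled here; the loop in the answer repeats the same assignment for every list element.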

Updated with benchmarks:

f1 <- function(L2) {
  cols <- sort(unique(unlist(lapply(L2,names), use.names=FALSE)))
  out <- matrix(0, length(L2), length(cols), dimnames=list(names(L2),cols))
  for(i in seq_along(L2)) out[names(L2)[i], names(L2[[i]])] <- L2[[i]]
  out
}   
f2 <- function(L2) {
  L.names <- sort(unique(unlist(sapply(L2, names))))
  L3 <- t(sapply(L2, function(x) x[L.names]))
  colnames(L3) <- L.names
  L3[is.na(L3)] <- 0
  L3
}
f3 <- function(L2) {
  m <- do.call(rbind, lapply(L2, as.data.frame))
  m$row <- sub("[.].*", "", rownames(m))
  m$Var1 <- factor(as.character(m$Var1))
  xtabs(Freq ~ row + Var1, m)
}
library(rbenchmark)
benchmark(f1(L2), f2(L2), f3(L2), order="relative")[,1:5]
#     test replications elapsed relative user.self
# 1 f1(L2)          100   0.022    1.000     0.020
# 2 f2(L2)          100   0.051    2.318     0.052
# 3 f3(L2)          100   0.788   35.818     0.760
set.seed(21)
L <- replicate(676, {n=sample(10,1); l=sample(26,n);
  setNames(sample(6,n,TRUE), letters[l])}, simplify=FALSE)
names(L) <- levels(interaction(letters,LETTERS))
benchmark(f1(L), f2(L), order="relative")[,1:5]
#    test replications elapsed relative user.self
# 1 f1(L)          100    1.84    1.000     1.828
# 2 f2(L)          100    4.24    2.304     4.220

score 4 · Accepted Answer

I would think of something like this:

names <- sort(unique(unlist(lapply(L2, names), use.names=FALSE)))
L3 <- t(vapply(L2, function(x) x[names], FUN.VALUE=numeric(length(names))))
colnames(L3) <- names
L3[is.na(L3)] <- 0

score 3 · Accepted Answer

reshape2 solution. This is easy to do with the reshape2 package: melt the list into long form, then use dcast to cast it back into wide form.

> library(reshape2)
> m <- melt(L2)
> m$Var.1 <- factor(as.character(m$Var.1)) # optional - if columns should be sorted
> dcast(m, L1 ~ Var.1, fill = 0)
       L1 ! ? c e i k l t
1    greg 0 0 0 1 2 1 1 0
2     sam 1 0 1 1 2 0 0 1
3 teacher 0 1 1 0 1 1 1 0

Base solution. Here is the corresponding base solution: the first two lines perform the melt, the next line sorts the columns, and the last line reshapes from long to wide.

> m <- do.call(rbind, lapply(L2, as.data.frame))
> m$row <- sub("[.].*", "", rownames(m))
> m$Var1 <- factor(as.character(m$Var1))
> xtabs(Freq ~ row + Var1, m)
         Var1
row       ! ? c e i k l t
  greg    0 0 0 1 2 1 1 0
  sam     1 0 1 1 2 0 0 1
  teacher 0 1 1 0 1 1 1 0
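
To make the "melt" step easier to follow, here is a small sketch of the intermediate long-format data frame. The list L is a hypothetical two-element input built with table(), not the L2 from the question.

```r
# Hypothetical input; as.data.frame() on a 1-d table yields the columns
# Var1 (the names) and Freq (the counts)
L <- list(greg = table(c("e", "i", "i")),
          sam  = table(c("!", "e")))

# Stack the per-element data frames into one long data frame
long <- do.call(rbind, lapply(L, as.data.frame))

# rbind() prefixes row names with the list names (e.g. "greg.1"),
# so stripping everything from the first dot recovers the source element
long$row <- sub("[.].*", "", rownames(long))
long[, c("row", "Var1", "Freq")]
```

Each row of long holds one (row, name, count) triple, which is the shape xtabs needs for the final cross-tabulation.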

Edit: added the base solution and changed the sorting line.

score 1 · Accepted Answer

While typing this up I came up with this solution, but I wonder whether there is a more efficient one:

chars <- sort(unique(unlist(lapply(L2, names))))
L3 <- lapply(L2, function(x){
   nots <- chars[!chars %in% names(x)]
   new <- rev(c(x, rep(0, length(nots))))
   names(new)[seq_along(nots)] <- nots  # seq_along() is safe when nots is empty
   new[order(names(new))]
})
do.call(rbind, L3)

This yields:

        ! ? c e i k l t
greg    0 0 0 1 2 1 1 0
sam     1 0 1 1 2 0 0 1
teacher 0 1 1 0 1 1 1 0
