r - Reshape a table in "long" format and separate the columns

Question

I have a table like this:

|Condition|Session|Time|
|        A|      1| 100|
|        A|      1| 200|
|        B|      2| 200|
|        B|      2| 300|
|        B|      2| 500|
|        A|      3| 300|
|        A|      4| 200|

I would like to convert it to the following form:

|   A|   B|   A|   A|
|   1|   2|   3|   4|
| 100| 200| 300| 200|
| 200| 300|    |    |
|    | 500|    |    |

That is, the first two rows hold "Condition" and "Session", and the remaining rows hold the "Time" values (a variable number of rows per column).

How can I achieve this in R?

score 1 · Accepted Answer

First of all, all elements in a column of a data.frame have the same type, so you probably want the transposed form of your desired table.

Perhaps you can do it like this:

foo = data.frame(Condition=c("A","A","B","B","B","A","A"), 
                 Session=c(1,1,2,2,2,3,4), 
                 Time = c(1,2,2,3,5,3,2)*100)
bar = aggregate(Time~Condition+Session, foo, identity)
bar
#   Condition Session          Time
# 1         A       1      100, 200
# 2         B       2 200, 300, 500
# 3         A       3           300
# 4         A       4           200
bar[1,3]
# $`0`
# [1] 100 200
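The type constraint mentioned at the top of this answer can be checked with a small sketch. The names `mixed` and `df` are illustrative only and are not part of the answer's code; the point is that putting a Condition label and numeric Time values in the same column turns the whole column into character.

```r
# A vector that mixes a label with numbers is coerced to character.
mixed <- c("A", 1, 100, 200)
class(mixed)

# A data.frame column is a single vector, so it is coerced the same way.
df <- data.frame(V1 = c("A", 1, 100, 200), stringsAsFactors = FALSE)
class(df$V1)

# Keeping Condition, Session and Time as separate columns preserves numeric types.
ok <- data.frame(Condition = "A", Session = 1, Time = 100)
sapply(ok, class)
```

This is why keeping one row per Condition/Session pair, with the Time values spread to the right, tends to be easier to work with than the layout in the question.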

score 1 · Accepted Answer

Here is one option, with the (potentially large) caveat that it uses a (very handy) but non-standard custom function, cbind.fill:

dat <- read.table(text = "|Condition|Session|Time|
|        A|      1| 100|
|        A|      1| 200|
|        B|      2| 200|
|        B|      2| 300|
|        B|      2| 500|
|        A|      3| 300|
|        A|      4| 200|",header = TRUE,sep = "|")
dat$X <- dat$X.1 <- NULL

dat$Condition <- factor(dat$Condition,labels = LETTERS[1:2])

tmp <- with(dat,split(Time,list(Condition,Session)))
tmp <- tmp[sapply(tmp,function(x) length(x) > 0)]
res <- do.call(cbind.fill,tmp)

nm <- strsplit(names(tmp),split="\\.")

res <- rbind(as.numeric(sapply(nm,'[',2)),res)
colnames(res) <- sapply(nm,'[',1)
res
#        A   B   A   A
# [1,]   1   2   3   4
# [2,] 100 200 300 200
# [3,] 200 300  NA  NA
# [4,]  NA 500  NA  NA

The core idea for cbind.fill comes from this question. However, I am using a heavily modified version of that code, so I can't promise identical results.
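Since cbind.fill is not defined in this answer, here is a minimal stand-in for experimenting. It is a sketch under assumptions, not the author's modified version: the name `cbind_fill_sketch` is hypothetical, and it only handles plain vectors by padding each one with NA to the longest length before calling `cbind`.

```r
# Hypothetical stand-in for cbind.fill (not the author's version):
# pad every vector with NA up to the longest length, then bind as columns.
cbind_fill_sketch <- function(...) {
  cols <- list(...)
  n <- max(lengths(cols))
  padded <- lapply(cols, function(x) c(x, rep(NA, n - length(x))))
  do.call(cbind, padded)
}

# Same shape as `tmp` in the answer after the empty groups are dropped.
tmp <- list(A.1 = c(100, 200), B.2 = c(200, 300, 500), A.3 = 300, A.4 = 200)
res <- do.call(cbind_fill_sketch, tmp)
res
#      A.1 B.2 A.3 A.4
# [1,] 100 200 300 200
# [2,] 200 300  NA  NA
# [3,]  NA 500  NA  NA
```

The column names here come straight from the list names; the answer's last steps (splitting the names on "." and adding the Session row with `rbind`) would apply on top of this unchanged.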

score 1 · Accepted Answer

Here is another possible solution, using ddply from the plyr package and dcast from reshape2:

library(reshape2)
library(plyr)

dat = read.table(text=gsub("\\|", " ", "|Condition|Session|Time|
|        A|      1| 100|
|        A|      1| 200|
|        B|      2| 200|
|        B|      2| 300|
|        B|      2| 500|
|        A|      3| 300|
|        A|      4| 200|"), header=TRUE)

# Add column 'Rank' for each combination of Condition by Session.
dat = ddply(dat, .(Condition, Session), .fun=summarise, 
            Rank=rank(Time), Time=Time)

res = dcast(dat, Condition + Session ~ Rank, value.var="Time")

# Sort by 'Session'.
res = res[order(res$Session), ]

# As @Ali pointed out, you may want to leave the results as
# an un-transposed data.frame.
res

#   Condition Session   1   2   3
# 1         A       1 100 200  NA
# 4         B       2 200 300 500
# 2         A       3 300  NA  NA
# 3         A       4 200  NA  NA

# Transposing will coerce the data.frame to a character matrix.
t(res)

#           1     4     2     3    
# Condition "A"   "B"   "A"   "A"  
# Session   "1"   "2"   "3"   "4"  
# 1         "100" "200" "300" "200"
# 2         "200" "300" NA    NA   
# 3         NA    "500" NA    NA
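One thing to watch in the 'Rank' step: `rank()` averages tied values by default, so two equal Time values in the same Condition/Session group would receive the same rank. The sketch below uses made-up data with a tie (`tied` and `idx` are illustrative names, not from the answer) and shows a base R within-group counter built with `ave()` and `seq_along` that could serve as the column key instead.

```r
# Made-up data with a tied Time value inside the first group.
tied <- data.frame(Condition = c("A", "A", "B"),
                   Session   = c(1, 1, 2),
                   Time      = c(100, 100, 200))

# Default ties.method = "average": both tied values get rank 1.5.
rank(c(100, 100))

# A within-group running index is unaffected by ties.
idx <- ave(tied$Time, tied$Condition, tied$Session, FUN = seq_along)
idx
```

With the question's data there are no ties within a group, so `rank(Time)` and this index agree there.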

score 1 · Accepted Answer

dat <- read.table(text="Condition|Session|Time
A|      1| 100
A|      1| 200
B|      2| 200
B|      2| 300
B|      2| 500
A|      3| 300
A|      4| 200", header=TRUE, sep="|")
tapply(dat$Time, paste(dat$Condition, dat$Session, sep="_"), list)
#----------
# $A_1
# [1] 100 200
#
# $A_3
# [1] 300
#
# $A_4
# [1] 200
#
# $B_2
# [1] 200 300 500
#--------------------
# .Last.value holds the tapply() result only at the interactive prompt.
tdat <- .Last.value
lmax <- max(sapply(tdat, length))
as.data.frame(lapply(tdat, function(x) c(x, rep(NA, lmax - length(x)))))
#---------------------
#   A_1 A_3 A_4 B_2
# 1 100 300 200 200
# 2 200  NA  NA 300
# 3  NA  NA  NA 500
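The columns above come out in alphabetical order of the pasted key (A_1, A_3, A_4, B_2), whereas the question lists them in order of appearance (A, B, A, A). A possible variation, sketched here with base R only, is to split on a factor whose levels follow first appearance. The names `key`, `groups` and `wide` are illustrative and not part of the answer.

```r
dat <- data.frame(Condition = c("A", "A", "B", "B", "B", "A", "A"),
                  Session   = c(1, 1, 2, 2, 2, 3, 4),
                  Time      = c(100, 200, 200, 300, 500, 300, 200))

# Build the grouping key and fix its level order to first appearance.
key <- paste(dat$Condition, dat$Session, sep = "_")
groups <- split(dat$Time, factor(key, levels = unique(key)))

# Pad each group with NA to the longest length; sapply() returns a matrix.
lmax <- max(lengths(groups))
wide <- sapply(groups, function(x) c(x, rep(NA, lmax - length(x))))
wide
#      A_1 B_2 A_3 A_4
# [1,] 100 200 300 200
# [2,] 200 300  NA  NA
# [3,]  NA 500  NA  NA
```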
