How to make tuples unique (rle-unique) in a df like this

structure(c("M01", "M01", "M01", "M01", "M01", "M02", "M02", 
"M02", "M02", "M03", "M03", "F04", "F04", "F02", "F02", "F04", 
"F10", "F10", NA, "F10", "F01", "F01"), .Dim = c(11L, 2L), .Dimnames = list(
    NULL, c("a", "b")))

> sample
      a     b    
 [1,] "M01" "F04"
 [2,] "M01" "F04"
 [3,] "M01" "F02"
 [4,] "M01" "F02"
 [5,] "M01" "F04"
 [6,] "M02" "F10"
 [7,] "M02" "F10"
 [8,] "M02" NA   
 [9,] "M02" "F10"
[10,] "M03" "F01"
[11,] "M03" "F01"

To get this:

structure(c("M01", "M01", "M01", "M02", "M02", "M03", "F04", 
"F02", "F04", "F10", "F10", "F01"), .Dim = c(6L, 2L), .Dimnames = list(
    NULL, c("d", "c")))
> output
     d     c    
[1,] "M01" "F04"
[2,] "M01" "F02"
[3,] "M01" "F04"
[4,] "M02" "F10"
[5,] "M02" "F10"
[6,] "M03" "F01"

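So the idea is to get a df of the tuples that is unique row-wise, but only relative to the previous element, which is why unique(sample) does not give what I need. Can I run rle on this df in a way that takes the tuples into account and keeps a df as output? Or is there a better approach?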

rle(sample[,2])$values

This gives the desired result, but obviously loses the valuable information in column 1.
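
To make the problem concrete, here is a minimal sketch of what rle returns for column b alone. The name m is an assumption used here so that base::sample is not masked; the data is the matrix posted above.

```r
# The matrix from the question, stored as `m` (hypothetical name)
m <- structure(c("M01", "M01", "M01", "M01", "M01", "M02", "M02",
"M02", "M02", "M03", "M03", "F04", "F04", "F02", "F02", "F04",
"F10", "F10", NA, "F10", "F01", "F01"), .Dim = c(11L, 2L), .Dimnames = list(
    NULL, c("a", "b")))

r <- rle(m[, 2])   # run-length encoding of column b only

r$values
# "F04" "F02" "F04" "F10" NA    "F10" "F01"
r$lengths
# 2 2 1 2 1 1 2
```

The values follow the run order of column b, with NA forming a run of its own, but nothing in the result says which value of column a belongs to each run.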

1 Answer

How about this?

# dd is the matrix structure you posted in the question
dd <- as.data.frame(dd)                     ## convert to data.frame
dd[] <- lapply(dd, as.character)            ## change columns to character
na.omit(dd[cumsum(rle(dd$b)$lengths), ])    ## get indices by cumsum'ing rle-lengths 
                                            ## wrap with na.omit to remove NA rows
#      a   b
# 2  M01 F04
# 4  M01 F02
# 5  M01 F04
# 7  M02 F10
# 9  M02 F10
# 11 M03 F01
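
The code above finds runs in column b only. If a run should also break when column a changes, one possible variant is to run rle on a key built from both columns. This is only a sketch under assumptions: the names m and key are made up here, and the separator "|" is assumed not to occur in the data.

```r
# The matrix from the question, stored as `m` (hypothetical name)
m <- structure(c("M01", "M01", "M01", "M01", "M01", "M02", "M02",
"M02", "M02", "M03", "M03", "F04", "F04", "F02", "F02", "F04",
"F10", "F10", NA, "F10", "F01", "F01"), .Dim = c(11L, 2L), .Dimnames = list(
    NULL, c("a", "b")))

dd <- as.data.frame(m, stringsAsFactors = FALSE)   # keep columns as character

# One key per row from both columns; paste() turns NA into the string "NA",
# so a row with a missing b still forms its own run
key <- paste(dd$a, dd$b, sep = "|")

last_of_run <- cumsum(rle(key)$lengths)   # index of the last row of each run
result <- na.omit(dd[last_of_run, ])      # drop runs that contain NA

rownames(result)
# "2" "4" "5" "7" "9" "11"
```

On the sample data this selects the same rows as the column-b version, because a never changes without b changing too. The two would differ only on data where a changes while b stays the same.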
Answered 2013-03-19T15:36:34.357