r - Remove columns when both rows contain the same letter

Question

I have a matrix of letter-and-number combinations, and I need to remove every column in which the same letter appears in both rows. A simple example:

> chars <- c("A1","A2","B1","B2")
> charsmat <- combn(chars, 2)
> charsmat
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A1" "A1" "A1" "A2" "A2" "B1"
[2,] "A2" "B1" "B2" "B1" "B2" "B2"

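When both rows of a column contain the same letter (columns 1 and 6 in this case), that column should be removed. I assume I would use gsub() or str_extract() to isolate the letters and then test for a match between the rows, but I'm stuck on how to actually implement it. Thanks for your help.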

score 3 · Accepted Answer

First, create a new matrix that holds only the alphabetic part of each entry.

> (charsmat.alpha <- substr(charsmat, 1, 1))
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "A"  "A"  "A"  "A"  "A"  "B" 
[2,] "A"  "B"  "B"  "B"  "B"  "B"

Next, subset charsmat to the columns where the two rows of charsmat.alpha are not the same.

> charsmat[,(charsmat.alpha[1,] != charsmat.alpha[2,])]
     [,1] [,2] [,3] [,4]
[1,] "A1" "A1" "A2" "A2"
[2,] "B1" "B2" "B1" "B2"
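
As a minimal sketch, the two steps above can be wrapped in a small helper. The name drop_same_letter is hypothetical (it is not part of the original answer), and the function assumes every entry begins with exactly one letter, as in the example data. Passing drop = FALSE keeps the result a matrix even if only one column survives.

```r
## Hypothetical helper (not from the original answer): assumes every entry
## begins with exactly one letter, as in the example data.
drop_same_letter <- function(mat) {
    ## First character of each entry; substr() keeps the matrix dimensions
    first <- substr(mat, 1, 1)
    ## Keep only the columns whose two rows start with different letters
    mat[, first[1, ] != first[2, ], drop = FALSE]
}

chars <- c("A1", "A2", "B1", "B2")
charsmat <- combn(chars, 2)
kept <- drop_same_letter(charsmat)
kept
```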

score 1 · Accepted Answer

Here is a more general solution, which removes any column in which a letter from the row 1 entry matches a letter in the row 2 entry.

## Your data
chars <- c("A1","A2","B1","B2")
charsmat <- combn(chars, 2)

vetMatrix <- function(mat) {
    ## Remove non-alpha characters from matrix entries
    mm <- gsub("[^[:alpha:]]", "", mat)    
    ## Construct character class regex patterns from first row
    patterns <- paste0("[", mm[1,], "]")
    xs <- mm[2,]    
    ## Extract columns in which no character in first row is found in second
    mat[,!mapply("grepl", patterns, xs), drop=FALSE]
}

## Try it with your matrix ...
vetMatrix(charsmat)
#      [,1] [,2] [,3] [,4]
# [1,] "A1" "A1" "A2" "A2"
# [2,] "B1" "B2" "B1" "B2"

## ... and with a different matrix
mat <- matrix(c("AB1", "B1", "AA11", "BB22", "this", "that"), ncol=3) 
mat
#      [,1]  [,2]   [,3]  
# [1,] "AB1" "AA11" "this"
# [2,] "B1"  "BB22" "that"
vetMatrix(mat)
#     [,1]  
# [1,] "AA11"
# [2,] "BB22"
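
For comparison, the same check can be sketched without building regex patterns from the data, by splitting each entry into its letters and testing whether the two sets overlap. The names sharesLetter and vetMatrix2 are hypothetical and introduced only for this illustration; this is one possible variant, not a replacement for vetMatrix().

```r
## Hypothetical alternative to vetMatrix(): compare letter sets directly
## instead of building regex character classes.
sharesLetter <- function(a, b) {
    ## Strip non-letters, then split what remains into single characters
    la <- strsplit(gsub("[^[:alpha:]]", "", a), "")[[1]]
    lb <- strsplit(gsub("[^[:alpha:]]", "", b), "")[[1]]
    ## TRUE if at least one letter occurs in both entries
    length(intersect(la, lb)) > 0
}

vetMatrix2 <- function(mat) {
    ## One TRUE/FALSE per column: do rows 1 and 2 share a letter?
    shared <- mapply(sharesLetter, mat[1, ], mat[2, ])
    mat[, !shared, drop = FALSE]
}

mat <- matrix(c("AB1", "B1", "AA11", "BB22", "this", "that"), ncol = 3)
res <- vetMatrix2(mat)
res
```

On the second example matrix this should keep only the "AA11"/"BB22" column, matching the vetMatrix() output shown above.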
