r - Conditional calculation based on data in other columns in R

Question

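Beginner here: I have a data table with three columns of categorical values, and I would like to add a fourth column whose value is calculated row by row from the values in the first three columns. So far I have: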

tC <- textConnection("Visit1    Visit2  Visit3
yes no  no
yes no  yes
yes yes yes")
data1 <- read.table(header=TRUE, tC)
close.connection(tC)
rm(tC)
data1["pattern"] <- NA

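Next, I want to fill in column 4 so that, for example, if the values of Visit1, Visit2 and Visit3 are "yes", "no" and "no", the NA in the pattern column for that row is replaced by "1". In other languages this would be a FOR loop with a few IF statements. I have looked at the apply family, but I am still not sure about the best approach and syntax for this in R.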

score 3 · Accepted Answer

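I'm not sure this is the most efficient way, but you can find the unique rows and then, for each row of the data.frame, find which unique row it matches. That number is then the pattern ID. You do have to collapse each row into a single string element, though; otherwise R's vectorisation gets in the way, because the comparison would be made element by element rather than row by row. The example below uses slightly extended sample data.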

#  Visit1 Visit2 Visit3
#1    yes     no     no
#2    yes     no    yes
#3    yes    yes    yes
#4     no    yes     no
#5    yes     no    yes

#  Get unique combinations
pats <- unique( data1 )

#  Collapse each row to a single string element
pats <- apply( pats , 1 , paste , collapse = " " )

#do the same to your data and compare with the patterns
data1$pattern <- apply( data1 , 1 , function(x) match( paste( x , collapse = " " ) , pats ) )
#  Visit1 Visit2 Visit3 pattern
#1    yes     no     no       1
#2    yes     no    yes       2
#3    yes    yes    yes       3
#4     no    yes     no       4
#5    yes     no    yes       2
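
As a possible variation on the same idea, the row-wise apply can be avoided by collapsing the columns with do.call(paste, ...) and matching each string against the unique strings. This is only a sketch: the data frame is rebuilt by hand from the extended sample data above (assuming character columns), and the name keys is made up for illustration.

```r
# Extended sample data from above, rebuilt by hand (assumption: character columns)
data1 <- data.frame(
  Visit1 = c("yes", "yes", "yes", "no", "yes"),
  Visit2 = c("no", "no", "yes", "yes", "no"),
  Visit3 = c("no", "yes", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Collapse each row to one string; paste() is vectorised over the columns
keys <- do.call(paste, data1[c("Visit1", "Visit2", "Visit3")])

# The position of each key among the unique keys is the pattern ID
data1$pattern <- match(keys, unique(keys))
data1$pattern
# [1] 1 2 3 4 2
```

The IDs follow the order in which each combination first appears, so they agree with the apply version above.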

score 2 · Accepted Answer

Assuming you are using @SimonO101's extended sample data, one option is expand.grid and factor.

First, create all the combinations of "yes" and "no" for the three columns.

facLevs <- expand.grid(c("yes", "no"), c("yes", "no"), c("yes", "no"))
facLevs
#   Var1 Var2 Var3
# 1  yes  yes  yes
# 2   no  yes  yes
# 3  yes   no  yes
# 4   no   no  yes
# 5  yes  yes   no
# 6   no  yes   no
# 7  yes   no   no
# 8   no   no   no

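Next, convert the column combinations to a factor, using the combinations from facLevs as the levels. With do.call(paste, ...) this is easier to do than with apply(mydf, ...). Wrap the result in as.numeric to get the numeric groups.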

mydf$pattern <- as.numeric(factor(do.call(paste, mydf[1:3]), 
                                  do.call(paste, facLevs)))
mydf
#   Visit1 Visit2 Visit3 pattern
# 1    yes     no     no       7
# 2    yes     no    yes       3
# 3    yes    yes    yes       1
# 4     no    yes     no       6
# 5    yes     no    yes       3

As you can see, pattern = 7 corresponds to the values in the 7th row of the facLevs data.frame created above.

For convenience, here is mydf:

mydf <- structure(list(Visit1 = c("yes", "yes", "yes", "no", "yes"), 
                       Visit2 = c("no", "no", "yes", "yes", "no"), 
                       Visit3 = c("no", "yes", "yes", "no", "yes")), 
                  .Names = c("Visit1", "Visit2", "Visit3"), 
                  class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
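
To check which combination a pattern number stands for, the pattern column can be used to index back into facLevs. The following is a minimal sketch under the same assumptions as above; mydf is rebuilt with data.frame() so the block runs on its own, and the name decoded is made up for illustration.

```r
# All 2 x 2 x 2 combinations, as above
facLevs <- expand.grid(c("yes", "no"), c("yes", "no"), c("yes", "no"))

# Same sample data as mydf above (assumption: character columns)
mydf <- data.frame(
  Visit1 = c("yes", "yes", "yes", "no", "yes"),
  Visit2 = c("no", "no", "yes", "yes", "no"),
  Visit3 = c("no", "yes", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)
mydf$pattern <- as.numeric(factor(do.call(paste, mydf[1:3]),
                                  do.call(paste, facLevs)))

# Row 7 of facLevs is the combination behind pattern 7
facLevs[7, ]
#   Var1 Var2 Var3
# 7  yes   no   no

# Look up the combination for every row of mydf at once
decoded <- facLevs[mydf$pattern, ]
```

Because the levels come from facLevs rather than from the data, a given combination gets the same number whatever rows happen to be present.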

score 0 · Accepted Answer

Updated

An answer using a for loop:

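updateRow <- function(rIndex, data1) {
  if ((data1[rIndex, 1] == "yes") &&
      (data1[rIndex, 2] == "no") &&
      (data1[rIndex, 3] == "no")) {
        data1[rIndex, 4] <- 1
  }
  data1  # return the modified copy; the function only changes its own copy
}

# Loop over every row and reassign, otherwise the update is lost.
# Column 4 is the pattern column; change the index if yours differs.
for (i in seq_len(nrow(data1))) data1 <- updateRow(i, data1)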

You can change the if condition as needed. I hope this is what you want.
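
For comparison, the same single condition can be written without a loop by using a logical vector to index the pattern column. This is a sketch only: the three-row data frame is rebuilt by hand from the question's sample data, and the name hit is made up for illustration.

```r
# Sample data from the question, rebuilt by hand (assumption: character columns)
data1 <- data.frame(
  Visit1 = c("yes", "yes", "yes"),
  Visit2 = c("no", "no", "yes"),
  Visit3 = c("no", "yes", "yes"),
  stringsAsFactors = FALSE
)
data1$pattern <- NA

# TRUE for rows where all three conditions hold; & works element-wise
hit <- data1$Visit1 == "yes" & data1$Visit2 == "no" & data1$Visit3 == "no"

# Replace NA with 1 only in the matching rows
data1$pattern[hit] <- 1
data1$pattern
# [1]  1 NA NA
```

Further patterns can be added with more assignments of the same shape, one per condition.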
