r - Apply a correction factor to one column based on the values in a second column

Question

Sample data

A<-c(1,4,5,6,2,3,4,5,6,7,8,7)
B<-c(4,6,7,8,2,2,2,3,8,8,7,8)
DF<-data.frame(A,B)

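What I would like to do is apply a correction factor to column A based on the values in column B. The rules are as follows:

If B is less than 4  <- multiply A by 1
If B is greater than or equal to 4 and less than 6  <- multiply A by 2
If B is equal to or greater than 6  <- multiply A by 4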

I suppose I could write an "if" statement (a good example would be welcome), but I am also interested in using square-bracket indexing to speed things up.

The final result would look like this:

etc.

score 2 · Accepted Answer

Use this:

within(DF, A <- ifelse(B>=6, 4, ifelse(B<4, 1, 2)) * A)

Or this (fixed by @agstudy):

within(DF, {A[B>=6] <- A[B>=6]*4; A[B>=4 & B<6] <- A[B>=4 & B<6]*2})
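Since the question also asks for an explicit "if" example, here is a minimal sketch of what that could look like, next to the vectorised ifelse form for comparison. The helper name factor_for and the result names res_loop and res_vec are hypothetical, chosen only for illustration.

```r
# Sample data from the question
A <- c(1, 4, 5, 6, 2, 3, 4, 5, 6, 7, 8, 7)
B <- c(4, 6, 7, 8, 2, 2, 2, 3, 8, 8, 7, 8)
DF <- data.frame(A, B)

# Hypothetical helper: pick the factor for a single value of B
factor_for <- function(b) {
  if (b >= 6) {
    4
  } else if (b >= 4) {
    2
  } else {
    1
  }
}

# Call the scalar helper once per row, then scale A
res_loop <- DF
res_loop$A <- DF$A * vapply(DF$B, factor_for, numeric(1))

# Vectorised ifelse version, for comparison
res_vec <- within(DF, A <- ifelse(B >= 6, 4, ifelse(B < 4, 1, 2)) * A)

all.equal(res_loop, res_vec)
```

A plain if only tests a single value, which is why the helper has to be called once per element; ifelse and bracket indexing work on the whole column at once.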

Benchmark:

DF <- data.frame(A=rpois(1e4, 5), B=rpois(1e4, 5))
a <- function(DF) within(DF, A <- ifelse(B>=6, 4, ifelse(B<4, 1, 2)) * A)
b <- function(DF) within(DF, {A[B>=6] <- A[B>=6]*4; A[B>=4 & B<6] <- A[B>=4 & B<6]*2})

identical(a(DF), b(DF))
#[1] TRUE

library(microbenchmark)
microbenchmark(a(DF), b(DF), times=1000)
#Unit: milliseconds
#  expr      min        lq   median        uq      max neval
# a(DF) 8.603778 10.253799 11.07999 11.923116 53.91140  1000
# b(DF) 3.763470  3.889065  5.34851  5.480294 39.72503  1000

score 1 · Accepted Answer

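I prefer to use findInterval as an index into a set of factors for operations like this. The proliferation of nested test-condition vectors and result vectors that comes with multiple ifelse calls offends my sense of efficiency.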

 DF$A <- DF$A * c(1,2,4)[findInterval(DF$B, c(-Inf,4,6,Inf) ) ]
 DF
    A B
1   2 4
2  16 6
3  20 7
4  24 8
snipped ....
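To make the lookup step easier to follow, here is a small sketch of what findInterval returns for column B of the sample data and how that index selects a factor. The variable names breaks, factors and idx are only illustrative.

```r
B <- c(4, 6, 7, 8, 2, 2, 2, 3, 8, 8, 7, 8)
breaks  <- c(-Inf, 4, 6, Inf)
factors <- c(1, 2, 4)

# Intervals are closed on the left by default:
# [-Inf, 4) -> 1, [4, 6) -> 2, [6, Inf) -> 3
idx <- findInterval(B, breaks)
idx
# [1] 2 3 3 3 1 1 1 1 3 3 3 3

# The interval number is then used as a position in the factor vector
factors[idx]
# [1] 2 4 4 4 1 1 1 1 4 4 4 4
```

Because the intervals are closed on the left, B equal to 4 falls in the second interval and B equal to 6 in the third, which matches the rules in the question.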

Benchmark:

DF <- data.frame(A=rpois(1e4, 5), B=rpois(1e4, 5))
a <- function(DF) within(DF, A <- ifelse(B>=6, 4, ifelse(B<4, 1, 2)) * A)
b <- function(DF) within(DF, {A[B>=6] <- A[B>=6]*4; A[B>=4 & B<6] <- A[B>=4 & B<6]*2})
ccc <- function(DF) within(DF, {A <- A * c(1,2,4)[findInterval(B, c(-Inf,4,6,Inf) ) ]})
library(microbenchmark)
microbenchmark(a(DF), b(DF), ccc(DF), times=1000)
#-----------
Unit: microseconds
    expr      min        lq    median        uq      max neval
   a(DF) 7616.107 7843.6320 8105.0340 8322.5620 93549.85  1000
   b(DF) 2638.507 2789.7330 2813.8540 3072.0785 92389.57  1000
 ccc(DF)  604.555  662.5335  676.0645  698.8665 85375.14  1000

Note: I would not have used within if I had been coding my own function.
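As a rough sketch of what a version without within might look like, the function below indexes the data frame columns directly. The name apply_correction and its breaks and factors arguments are hypothetical; the defaults simply encode the rules from the question.

```r
# Hypothetical helper; breaks and factors default to the question's rules
apply_correction <- function(df, breaks = c(4, 6), factors = c(1, 2, 4)) {
  stopifnot(length(factors) == length(breaks) + 1)
  # findInterval returns 0 below the first break, so add 1 to index factors
  df$A <- df$A * factors[findInterval(df$B, breaks) + 1]
  df
}

DF <- data.frame(A = c(1, 4, 5, 6, 2, 3, 4, 5, 6, 7, 8, 7),
                 B = c(4, 6, 7, 8, 2, 2, 2, 3, 8, 8, 7, 8))
apply_correction(DF)$A
# [1]  2 16 20 24  2  3  4  5 24 28 32 28
```

Passing only the finite break points and adding 1 to the index avoids the -Inf and Inf sentinels, at the cost of being slightly less self-explanatory.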

score 1 · Accepted Answer

Similar to @Ferdinand's solution, but using transform:

transform(DF, newcol = ifelse(B<4,  A,
                                   ifelse(B>=6,4*A,2*A)))
   A B newcol
1  1 4      2
2  4 6     16
3  5 7     20
4  6 8     24
5  2 2      2
6  3 2      3
7  4 2      4
8  5 3      5
9  6 8     24
10 7 8     28
11 8 7     32
12 7 8     28
