r - R-異なるデータフレームに列を追加（追加）するための最良の方法

Question

私は次のような3つの異なるデータフレームを持っています：

V1.x<-c(1,2,3,4,5)
V2.x<-c(2,2,7,3,1)
V3.x<-c(2,4,3,2,9)
D1<-data.frame(ID=c("A","B","C","D","E"),V1.x=V1.x,V2.x=V2.x,V3.x=V3.x)

V1.y<-c(2,3,3,3,5)
V2.y<-c(1,2,3,3,5)
V3.y<-c(6,4,3,2,2)
D2<-data.frame(ID=c("A","B","C","D","E"),V1.y=V1.y,V2.y=V2.y,V3.y=V3.y)

V1<-c(3,2,4,4,5)
V2<-c(3,7,3,4,5)
V3<-c(5,4,3,6,3)
D3<-data.frame(ID=c("A","B","C","D","E"),V1=V1,V2=V2,V3=V3)

すべてのV1列、すべてのV2列、およびすべてのV3列を合計したいと思います。

V1_Add<-D1$V1.x+D2$V1.y+D3$V1
V2_Add<-D1$V2.x+D2$V2.y+D3$V2
V3_Add<-D1$V3.x+D2$V3.y+D3$V3

個々の列の合計を取得するのに問題なく機能しますが、実際のデータでは、列番号はV1：V80から取得されるため、各列を個別に入力する必要がないのは素晴らしいことです。また、次のように最終的な合計をすべて含む1つのデータフレームを作成することをお勧めします。

  ID  V1  V2  V3
1  A  6  6   13
2  B  7  11  12
3  C  10 13  9
4  D  11 10  10
5  E  15 11  14

score 2 · Accepted Answer

これはあなたが望むもののようなものですか？

D.Add <- data.frame(D1[,1],(D1[,-1]+D2[,-1]+D3[,-1]))
colnames(D.Add)<-colnames(D3)

score 2 · Accepted Answer

これはおそらくやり過ぎのアプローチですが、任意の数の列と任意の数の「インデックス」列にもかなり一般化できるはずです。すべての data.frames に同じ数の列があり、それらが正しい順序になっていると想定しています。まず、すべての data.frames からリストオブジェクトを作成します。この質問を参照して、プログラムでそれを行いました。

ClassFilter <- function(x, class) inherits(get(x), "data.frame")
Objs <- Filter( ClassFilter, ls() )
Objs <- lapply(Objs, "get")

次に、を使用してすべての数値列を追加Reduceし、最後に数値以外の列と一緒に戻す関数を作成しました。

FUN <- function(x){
  colsToProcess <- lapply(x, function(y) y[, unlist(sapply(y, is.numeric))])
  result <- Reduce("+", colsToProcess)
  #Get the non numeric columns
  nonNumericCols <- x[[1]]  
  nonNumericCols <- nonNumericCols[, !(unlist(sapply(nonNumericCols, is.numeric)))]
  return(data.frame(Index = nonNumericCols, result))
}

そして最後に、実際に：

> FUN(Objs)
  Index V1.x V2.x V3.x
1     A    6    6   13
2     B    7   11   12
3     C   10   13    9
4     D   11   10   10
5     E   15   11   14

score 2 · Accepted Answer

library(reshape2)
library(plyr)

# First let's standardize column names after ID so they become V1 through Vx. 
# I turned it into a function to make this easy to do for multiple data.frames
standardize_col_names <- function(df) {
# First column remains ID, then I name the remaining V1 through Vn-1 
# (since first column is taken up by the ID)
names(df) <- c("ID", paste("V",1:(dim(df)[2]-1),sep=""))
return(df)
}

D1 <- standardize_col_names(D1)
D2 <- standardize_col_names(D2)
D3 <- standardize_col_names(D3)

# Next, we melt the data and bind them into the same data.frame
# See one example with melt(D1, id.vars=1). I just used rbind to combine those
melted_data <- rbind(melt(D1, id.vars=1), melt(D2, id.vars=1), melt(D3, id.vars=1))
# note that the above step can be folded into the function as well. 
# Then you throw all the data.frames into a list and ldply through this function.

# Finally, we cast the data into what you need which is the sum of the columns
 dcast(melted_data, ID~variable, sum)
  ID V1 V2 V3
1  A  6  6 13
2  B  7 11 12
3  C 10 13  9
4  D 11 10 10
5  E 15 11 14



 # Combined everything above more efficiently :

   standardize_df <- function(df) {
    names(df) <- c("ID", paste("V",1:(dim(df)[2]-1),sep=""))
    return(melt(df, id.vars = 1))
    }

   all_my_data <- list(D1,D2,D3)
   melted_data <- ldply(all_my_data, standardize_df)
   summarized_data <- dcast(melted_data, ID~variable, sum)

score 0 · Accepted Answer

ブロック全体を合計するだけで、これはどうですか？:

D1[,2:4] + D3[,2:4] + D2[,2:4]

...結果は...

  V1.x V2.x V3.x
1    6    6   13
2    7   11   12
3   10   13    9
4   11   10   10
5   15   11   14

すべての変数が同じ順序であると想定していますが、それ以外の場合はうまく機能するはずです。

r - R-異なるデータフレームに列を追加（追加）するための最良の方法

4 に答える 4

Related

Reference