My question comprises two parts. I have a matrix with IDs and several columns (representing time) of values from 0-180. I'd like to summarize these into subgroups, then compare across the columns. For example, how many IDs switch from 0-10 in column 5 to 11+ in column 6?

My first thought was a SAS-style format command, which would let me group integers into different blocks (0-10, 11-20, 21-30, etc.). But it seems that this doesn't exist.

My solution has been to loop through all values of this matrix (two nested for loops), check which range each value falls into (a chain of if statements), and then write the class into a new matrix that keeps track of classes only. Example:

# search through columns
for (j in 2:(dim(Tab2)[2])){ 
    # search through lines
    for (i in 1:dim(Tab2)[1]){ 
        if (is.na(Tab2[i,j])){
            tempGliss[i,j] <- "NA"}
        else if (Tab2[i,j]==0){
            tempGliss[i,j] <- "Zero"}
        else if (Tab2[i,j]>0 & Tab2[i,j]<=7){
            tempGliss[i,j] <- "1-7"}
        else if (Tab2[i,j]>=7 & Tab2[i,j]<=14){
            tempGliss[i,j] <- "7-14"}
        else if (Tab2[i,j]>=15 & Tab2[i,j]<=30){
            tempGliss[i,j] <- "15-30"}
        else if (Tab2[i,j]>=31 & Tab2[i,j]<=60){
            tempGliss[i,j] <- "31-60"}
        else if (Tab2[i,j]>=61 & Tab2[i,j]<=90){
            tempGliss[i,j] <- "61-90"}
        else if (Tab2[i,j]>=91 & Tab2[i,j]<=120){
            tempGliss[i,j] <- "91-120"}
        else if (Tab2[i,j]>=121 & Tab2[i,j]<=150){
            tempGliss[i,j] <- "121-150"}
        else if (Tab2[i,j]>=151 & Tab2[i,j]<=180){
            tempGliss[i,j] <- "151-180"}
        else if (Tab2[i,j]>180){
            tempGliss[i,j] <- ">180"}
    }
}

Here Tab2 is my original matrix, and tempGliss is what I'm creating as a class. This takes a VERY LONG TIME! It doesn't help that my file is quite large. Is there any way I can speed this up? Alternatives to the for loops or the if statements?

1 Answer

You can probably use cut:

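# Example data: column a holds the IDs, b and c hold the values to classify.
Tab2 <- data.frame(a = 1:9,
                   b = c(0, 7, 14, 30, 60, 90, 120, 150, 155),
                   c = c(0, 1, 7, 15, 31, 61, 91, 121, 155))
# One label per interval defined by the breaks below.
repla <- c("Zero", "1-7", "7-14", "15-30", "31-60", "61-90", "91-120", "121-150", "151-180", ">180")

# Replace each value column (all but the first) with its class.
for (j in 2:ncol(Tab2)) {
  # Intervals are right-closed: (-Inf,0], (0,7], (7,14], ...
  Tab2[, j] <- cut(Tab2[, j], c(-Inf, 0, 7, 14, 30, 60, 90, 120, 150, 180, Inf),
                   labels = repla)
}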

> Tab2
  a       b       c
1 1    Zero    Zero
2 2     1-7     1-7
3 3    7-14     1-7
4 4   15-30   15-30
5 5   31-60   31-60
6 6   61-90   61-90
7 7  91-120  91-120
8 8 121-150 121-150
9 9 151-180 151-180
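
For the second part of the question (counting IDs that switch class between two columns), one possible approach is to cross-tabulate two binned columns with table(). This is only a sketch on the same toy data; the names tab, binned and trans are made up for illustration, and lapply() stands in for the explicit loop.

```r
# Toy data with the same values as the example above (names are illustrative).
tab <- data.frame(
  id = 1:9,
  t1 = c(0, 7, 14, 30, 60, 90, 120, 150, 155),
  t2 = c(0, 1, 7, 15, 31, 61, 91, 121, 155)
)
breaks <- c(-Inf, 0, 7, 14, 30, 60, 90, 120, 150, 180, Inf)
labs <- c("Zero", "1-7", "7-14", "15-30", "31-60", "61-90",
          "91-120", "121-150", "151-180", ">180")

# Bin every value column in one pass; column 1 (the IDs) is left untouched.
binned <- tab
binned[-1] <- lapply(tab[-1], cut, breaks = breaks, labels = labs)

# Cross-tabulate two time points: rows are the class at t1, columns the class at t2.
trans <- table(from = binned$t1, to = binned$t2)

# Number of IDs that moved from "7-14" at t1 to "1-7" at t2 (1 in this toy data).
n_switch <- trans["7-14", "1-7"]

# Number of IDs whose class changed at all: everything off the diagonal.
n_changed <- sum(trans) - sum(diag(trans))
```

Because both columns share the same factor levels, the table is square and the diagonal holds the IDs that stayed in the same class.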

I haven't looked at this in great detail, so you may need to adjust the bands slightly.
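
On the bands: cut() builds right-closed intervals by default (right = TRUE), so the breaks 7 and 14 cover (7, 14], which for integer data means 8 to 14, and a value of exactly 7 lands in "1-7". The sketch below assumes integer values and a plain numeric matrix as in the question; the matrix m and the name class_mat are hypothetical, and the second label is renamed to "8-14" to match what the interval actually contains.

```r
# Hypothetical numeric matrix: IDs in column 1, values in the remaining columns.
m <- cbind(id = 1:4,
           t1 = c(0, 7, 8, 181),
           t2 = c(NA, 14, 15, 180))

breaks <- c(-Inf, 0, 7, 14, 30, 60, 90, 120, 150, 180, Inf)
labs <- c("Zero", "1-7", "8-14", "15-30", "31-60", "61-90",
          "91-120", "121-150", "151-180", ">180")

# cut() is vectorized, so all value columns can be classified in a single call.
vals <- m[, -1]
classes <- cut(vals, breaks = breaks, labels = labs)

# Rebuild a character matrix with the original shape and column names.
# Missing values stay NA rather than becoming the string "NA".
class_mat <- matrix(as.character(classes), nrow = nrow(vals),
                    dimnames = dimnames(vals))
```

If the lower bound of each band should be inclusive instead, cut() accepts right = FALSE, in which case the breaks and labels would need to be shifted accordingly.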

answered 2013-06-11T11:58:15.417