Let's say I have a factor variable with numerous levels and I am trying to group them into several groups.
> levels(dat$years_continuously_insured_order2)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18"
[19] "19" "20"
> levels(dat$age_of_oldest_driver)
[1] "-16" "1" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
[22] "34" "35" "36" "37" "38" "39" "40
I have a script which runs through these variables and groups them into several categories. However, the number of levels could (and usually is) different each time my script runs. Therefore, if my original code to group the variables was the following (see below), it wouldn't be of use if in an hour later, my script runs and the levels are different. Instead of 15 levels, I could now have 25 levels and the values are different, but I still need to group them into specific categories.
dat$years_continuously_insured2 <- NA
dat$years_continuously_insured2[dat$years_continuously_insured %in% levels(dat$years_continuously_insured)[1]] <- NA
dat$years_continuously_insured2[dat$years_continuously_insured %in% levels(dat$years_continuously_insured)[2:3]] <- "1 or less"
dat$years_continuously_insured2[dat$years_continuously_insured %in% levels(dat$years_continuously_insured)[4]] <- "2"
dat$years_continuously_insured2[dat$years_continuously_insured %in% levels(dat$years_continuously_insured)[5:7]] <- "3 +"
dat$years_continuously_insured2 <- factor(dat$years_continuously_insured2)
How can I find a more elegant way to group variables into segments? Are there better ways to do this in R?
Thanks!