Actually, what I did was loop through the time zones instead of the rows of the data set, and then it's much, much faster. I'll post code tomorrow.
In general, that's a lesson for R: don't loop through the rows of a big data frame; loop through the (much shorter) vector of categories and use which() to grab all the rows for each category at once.
As there are only 5 time zones, the loop only takes a few seconds now.
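As a rough illustration of that pattern, here is a minimal base R sketch on made-up data. The names df, grp, value and offset are hypothetical and only stand in for a big data frame, its category column, and some per-category setting (such as a time zone).

```r
# Toy data: 9 rows, 3 categories (all names here are illustrative only)
df <- data.frame(grp = rep(c("a", "b", "c"), times = c(3, 2, 4)),
                 value = 1:9,
                 stringsAsFactors = FALSE)

# One setting per category, looked up by name
offset <- c(a = 10, b = 20, c = 30)

# Pre-allocate the result column, then loop over categories, not rows
df$adjusted <- NA_real_
for (g in unique(df$grp)) {
  rows <- which(df$grp == g)                       # all rows of this category
  df$adjusted[rows] <- df$value[rows] + offset[[g]]  # one vectorised assignment
}
```

The loop body runs three times here instead of nine, and the saving grows with the number of rows because each pass handles a whole category in one vectorised step.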
One other caveat: even if you put it into POSIXct format, it will still graph the times in your machine's local time zone. So you need an extra step to then convert it into local time using force_tz().
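To see why the extra step matters, here is a small sketch of the difference between with_tz() and force_tz() from lubridate. The timestamp and the America/New_York zone are made-up values for illustration, not taken from cap.

```r
library(lubridate)

# One instant, stored in UTC (illustrative value only)
utc <- as.POSIXct("2020-01-15 12:00:00", tz = "UTC")

# with_tz() keeps the same instant and only changes the zone used for display
ny <- with_tz(utc, tzone = "America/New_York")

# force_tz() keeps the clock reading and relabels the zone, so the instant moves
ny_as_utc <- force_tz(ny, tzone = "UTC")

# Gap between the relabelled clock time and the true instant, in hours
offset_hours <- as.numeric(difftime(ny_as_utc, utc, units = "hours"))
```

Here ny and utc are the same moment in time, while ny_as_utc is five hours earlier, which is New York's offset from UTC in January. A date-time vector carries a single time zone label for all of its elements, so relabelling the clock readings like this is one way to get rows from different zones onto a common axis.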
cap$tdiff is really just created to make sure that the code is doing what it says it should be doing.
library("lubridate")
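# One entry per distinct time zone (only a handful, versus many rows)
tzs <- as.character(unique(cap$timezone))
# Placeholder POSIXlt column, overwritten zone by zone below
cap$localtimes <- as.POSIXlt(0, origin = "1970-01-01")
# Now loop through by time zone instead of by rows of cap
for (i in seq_along(tzs)) {
  whichrows <- which(cap$timezone == tzs[i])
  cap[whichrows, "localtimes"] <-
    with_tz(cap[whichrows, "UTC"], tzone = tzs[i])
}
remove(i, whichrows)
# Sanity check: offset of each local clock time from UTC, in hours
cap$tdiff <- as.numeric(difftime(force_tz(cap$localtimes, "UTC"), cap$UTC, units = "hours"))
# force_tz() with no tzone relabels the times in the machine's time zone for plotting
cap$localtime <- as.POSIXct(force_tz(cap$localtimes))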