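All of the dates that I manipulate in the Azure Machine Learning Execute R module are written out as blanks in the output. That is, the date columns exist, but they contain no values.
The source variables that hold the date information I am reading into the data frame come in two different date formats: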
usage$Date1 = c('8/6/2015', '8/20/2015', '7/9/2015')
usage$Date2 = c('4/16/2015 0:00', '7/1/2015 0:00', '7/1/2015 0:00')
I inspected the log file in AML, and AML can't find the local time zone. The specific warnings in the log file are:
[ModuleOutput] 1: In strptime(x, format, tz = tz) :
[ModuleOutput] unable to identify current timezone 'C':
[ModuleOutput] please set environment variable 'TZ'
[ModuleOutput]
[ModuleOutput] 2: In strptime(x, format, tz = tz) : unknown timezone 'localtime'
I referred to another answer about setting the default time zone for strptime, "unknown timezone name in R strptime/as.POSIXct".
I changed my code to set the TZ environment variable explicitly and to pass the time zone to each conversion:
Sys.setenv(TZ = 'GMT')
#### Data frame usage cleanup, format and labeling
usage <- as.data.frame(usage)
usage$Date1 <- as.character(usage$Date1)
usage$Date1 <- as.POSIXct(usage$Date1, format = "%m/%d/%Y", tz = "GMT")
usage$Date1 <- format(usage$Date1, "%m/%d/%Y")
usage$Date1 <- as.Date(usage$Date1, "%m/%d/%Y")
usage <- as.data.frame(usage)
usage$Date2 <- as.POSIXct(usage$Date2, format = "%m/%d/%Y", tz = "GMT")
usage$Date2 <- format(usage$Date2, "%m/%d/%Y")
usage$Date2 <- as.Date(usage$Date2, "%m/%d/%Y")
usage <- as.data.frame(usage)
The problem persists: Azure ML still does not write the values of these variables out, and the columns come through as blanks.
(This code works in RStudio, where I presume the local time zone is taken from the system.)
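For anyone who wants to reproduce the local behaviour, here is a minimal self-contained sketch using the three sample values above. The data frame construction is a stand-in for my real input, and it parses with as.Date directly rather than going through the POSIXct round trip, so treat it as an approximation of my code rather than an exact copy.

```r
# Assumption: TZ is set before any date parsing happens.
Sys.setenv(TZ = "GMT")

# Hypothetical stand-in for the real input, built from the sample values.
usage <- data.frame(
  Date1 = c("8/6/2015", "8/20/2015", "7/9/2015"),
  Date2 = c("4/16/2015 0:00", "7/1/2015 0:00", "7/1/2015 0:00"),
  stringsAsFactors = FALSE
)

# Parse straight to Date. The trailing " 0:00" in Date2 is ignored
# because parsing stops once the format has been matched.
usage$Date1 <- as.Date(usage$Date1, format = "%m/%d/%Y")
usage$Date2 <- as.Date(usage$Date2, format = "%m/%d/%Y")

# Inspect the column classes and count missing values per column.
str(usage)
print(colSums(is.na(usage)))
```

Run locally, this lets me confirm whether the parsing step itself yields any NA values before the data frame ever reaches the module output.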
After reading two blog posts on this problem, it seems that Azure ML doesn't support some date-time formats:
http://www.mikelanzetta.com/2015/01/data-cleaning-with-azureml-and-r-dates/
So I tried to convert to POSIXct before sending the data to the output stream, as follows:
tenantusage$Date1 = as.POSIXct(tenantusage$Date1, "%m/%d/%Y", tz = "EST5EDT")
tenantusage$Date2 = as.POSIXct(tenantusage$Date2, "%m/%d/%Y", tz = "EST5EDT")
But I encounter the same problem. The values in these variables are not written to the output, and the Date1 and Date2 columns are blank.
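To narrow down whether the values are lost during parsing or only when the data frame is written out, a check like the following could be run just before the output step. describe_columns is a hypothetical helper of my own, not an Azure ML function, the sample data frame is a stand-in, and I use "GMT" here only to keep the sketch independent of the time zone database. The final step, keeping a character copy of the dates, is an untested assumption about what the output port accepts, not a confirmed fix.

```r
# Hypothetical helper (not part of Azure ML): report each column's
# first class and its number of missing values.
describe_columns <- function(df) {
  data.frame(
    column = names(df),
    class = vapply(df, function(col) class(col)[1], character(1)),
    n_missing = vapply(df, function(col) sum(is.na(col)), integer(1)),
    stringsAsFactors = FALSE,
    row.names = NULL
  )
}

# Stand-in data frame with one POSIXct column.
tenantusage <- data.frame(
  Date1 = as.POSIXct(c("8/6/2015", "8/20/2015"),
                     format = "%m/%d/%Y", tz = "GMT")
)

# If n_missing is 0 here, parsing succeeded and any blanks arise later.
print(describe_columns(tenantusage))

# Assumption to test: a plain character copy of the dates may survive
# the output step even if the date-typed column does not.
tenantusage$Date1_chr <- format(tenantusage$Date1, "%Y-%m-%d")
```

If the character column comes through while the POSIXct column is blank, that would point at how the date types are serialized rather than at the time zone warnings.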
Please advise!
thanks