r - Extracting data using matching pairs of matrix data in R

Question

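I have two datasets of latitude, longitude, and temperature data. One dataset corresponds to a geographic region of interest and holds the lat/long pairs that form the boundary and contents of the region (matrix dimension = 4518x2).

The other dataset contains lat/long and temperature data for a larger region that envelops the region of interest (matrix dimension = 10875x3).

My question is: how do I extract the appropriate rows (lat, long, temperature) from the second dataset that match the lat/long data in the first dataset?

I have tried a variety of "for loops", "subset", and "unique" commands, but I can't obtain the matching temperature data.

Thanks in advance!

10/31 Edit: I forgot to mention that I am using "R" to process this data.

The lat/long data for the region of interest was provided as a list of 4,518 files, with the lat/long coordinates contained in the name of each file.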

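x <- dir()                       # file names; each one contains the lat/long coordinates
lenx <- length(x)

g <- strsplit(x, "_")            # split each file name on "_"
parts <- unlist(g)               # assumes every name splits into exactly 3 pieces

coord1 <- matrix(NA, nrow = lenx, ncol = 1)
coord2 <- matrix(NA, nrow = lenx, ncol = 1)

for (i in seq_len(lenx)) {
    coord1[i, 1] <- parts[2 + 3 * (i - 1)]   # 2nd piece of file i
    coord2[i, 1] <- parts[3 + 3 * (i - 1)]   # 3rd piece of file i
}

coord1 <- as.numeric(coord1)
coord2 <- as.numeric(coord2)

coord <- cbind(coord1, coord2)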
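For reference, the same parsing can be done without indexing into unlist(g) by position. This is only a minimal sketch: it assumes every file name splits on "_" into pieces with the coordinates in positions 2 and 3, and the file names below are invented for illustration.

```r
# Hypothetical file names standing in for dir(); the real naming scheme may differ
x <- c("data_35.125_-120.375", "data_35.375_-120.125")

g <- strsplit(x, "_")                      # list of character vectors, one per file
coord1 <- as.numeric(sapply(g, `[`, 2))    # second piece of each name
coord2 <- as.numeric(sapply(g, `[`, 3))    # third piece of each name

coord <- cbind(coord1, coord2)             # numeric matrix, one row per file
```

Picking the pieces per file name this way does not depend on every name having the same number of pieces.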

The lat/long and temperature data came from an NCDF file that contains temperature data for 10,875 lat/long pairs.

# Dimension sizes of the "Temp" variable
long <- tempcd$var[["Temp"]]$size[1]
lat  <- tempcd$var[["Temp"]]$size[2]
time <- tempcd$var[["Temp"]]$size[3]
proj <- tempcd$var[["Temp"]]$size[4]

temp   <- matrix(NA, nrow = lat * long, ncol = time)
lat_c  <- matrix(NA, nrow = lat * long, ncol = 1)   # not populated in the code as posted
long_c <- matrix(NA, nrow = lat * long, ncol = 1)   # not populated in the code as posted

counter <- 1

# Read the full time series for each grid cell, one row per lat/long pair
for (i in 1:lat) {
    for (j in 1:long) {
        temp[counter, ] <- get.var.ncdf(tempcd, varid = "Temp", count = c(1, 1, time, 1), start = c(j, i, 1, 1))
        counter <- counter + 1
    }
}

temp_gcm <- cbind(lat_c, long_c, temp)

So the question is: how do I extract the values from "temp_gcm" that correspond to the lat/long pairs in "coord"?

score 2 · Accepted Answer

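Noe,

I can think of a number of ways to do this. The simplest, though not the most efficient, is to use R's which() function, which takes a logical argument, while iterating over the data frame you want to apply the matches to. Of course, this assumes that there is at most one match in the larger dataset. Based on your datasets, I would do something like this: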

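# Convert to data frames and reference columns explicitly
# (attach() does not work on a matrix)
temp_gcm <- as.data.frame(temp_gcm)
names(temp_gcm)[1:3] <- c("lat_c", "long_c", "temp")   # assumes a single temperature column
coord <- as.data.frame(coord)                          # columns: coord1, coord2

matched.temp <- rep(NA_real_, nrow(coord))   # To store matching results
for (i in seq_len(nrow(coord))) {
   idx <- which(temp_gcm$lat_c == coord$coord1[i] & temp_gcm$long_c == coord$coord2[i])
   if (length(idx) == 1) matched.temp[i] <- temp_gcm$temp[idx]   # stays NA unless exactly one row matches
}

# Now add the results column to the coord data frame (indexes match)
coord$temperature <- matched.temp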

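The call which(lat_c == coord1[i] & long_c == coord2[i]) returns a vector of all rows in the data frame temp_gcm whose lat_c and long_c match coord1 and coord2, respectively, from row i of the iteration (note: I am assuming this vector has length 1, i.e. there is only one possible match). matched.temp[i] is then assigned the value of the temp column in temp_gcm for the row that satisfies the logical condition. Note that the goal here is to build a vector of matched values that corresponds by index to the rows of the data frame coord.

I hope this helps. Note that this is a rudimentary approach; I would advise looking into the functions merge() and apply() to do this in a more succinct manner.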
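To make the behaviour of which() concrete, here is a small sketch on made-up data. The data frame name big and all values are purely illustrative and do not come from the question.

```r
# Toy stand-in for temp_gcm; values are invented for illustration
big <- data.frame(lat_c  = c(35.125, 35.125, 35.375),
                  long_c = c(-120.375, -120.125, -120.375),
                  temp   = c(14.2, 15.1, 13.8))

# The logical vector is TRUE only where both coordinates match
hit <- big$lat_c == 35.125 & big$long_c == -120.125

idx <- which(hit)            # positions of the TRUE entries
matched <- big$temp[idx]     # temperature for the matching row(s)
```

Keep in mind that == compares the numbers exactly, so the coordinates in both datasets need to have the same numeric value. If one source was rounded differently from the other, rounding both to a common number of decimals before comparing may be necessary.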
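As a sketch of the merge() route, assuming both objects are data frames with the column names used in this answer (lat_c, long_c, temp in temp_gcm; coord1, coord2 in coord). The values are invented for illustration.

```r
# Toy versions of the two datasets; values are invented for illustration
temp_gcm <- data.frame(lat_c  = c(35.125, 35.125, 35.375, 35.375),
                       long_c = c(-120.375, -120.125, -120.375, -120.125),
                       temp   = c(14.2, 15.1, 13.8, 16.0))
coord <- data.frame(coord1 = c(35.125, 35.375),
                    coord2 = c(-120.125, -120.375))

# Inner join: keep only the rows of temp_gcm whose lat/long pair appears in coord
matched <- merge(coord, temp_gcm,
                 by.x = c("coord1", "coord2"),
                 by.y = c("lat_c", "long_c"))
```

By default merge() sorts the result on the key columns, so the row order may differ from coord. Passing all.x = TRUE keeps coordinates that have no match and fills their temperature with NA.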

score 0 · Accepted Answer

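I added an additional column filled with zeros to hold the result of an IF statement. "x" is the number of rows in temp_gcm. "y" is the number of columns (representing timesteps). "temp_s" is the standardized temperature data.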

indicator <- matrix(0, nrow = x, ncol = 1)

precip_s <- cbind(precip_s, indicator)
temp_s   <- cbind(temp_s, indicator)

for (aa in 1:x) {
    current_lat  <- latitudes[aa, 1]    # latitude of this row in the larger area
    current_long <- longitudes[aa, 1]   # longitude of this row in the larger area

    for (ab in 1:lenx) {                # lenx corresponds to nrow(coord)
        if (current_lat == coord[ab, 1] && current_long == coord[ab, 2]) {
            precip_s[aa, (y/12 + 1)] <- 1   # y/12 + 1 is the "indicator" column
            temp_s[aa, (y/12 + 1)]   <- 1
        }
    }
}

precip_s <- precip_s[precip_s[, (y/12 + 1)] > 0, ]   # drops rows still holding 0 in the "indicator" column
temp_s   <- temp_s[temp_s[, (y/12 + 1)] > 0, ]

precip_s <- precip_s[, -(y/12 + 1)]   # removes the "indicator" column
temp_s   <- temp_s[, -(y/12 + 1)]
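The nested loops and the temporary indicator column can also be replaced by a single logical index. This is a minimal sketch, not the code I actually ran: it assumes latitudes and longitudes are one-column matrices aligned row by row with temp_s, and it uses invented values.

```r
# Toy inputs; values are invented for illustration
latitudes  <- matrix(c(35.125, 35.125, 35.375), ncol = 1)
longitudes <- matrix(c(-120.375, -120.125, -120.375), ncol = 1)
temp_s     <- matrix(c(0.1, 0.2, 0.3,
                       0.4, 0.5, 0.6), nrow = 3)   # 3 grid cells x 2 time steps
coord      <- cbind(coord1 = c(35.125, 35.375),
                    coord2 = c(-120.125, -120.375))

# Build one text key per lat/long pair, then test membership
big_key   <- paste(latitudes[, 1], longitudes[, 1])
small_key <- paste(coord[, 1], coord[, 2])
keep      <- big_key %in% small_key

# drop = FALSE keeps the result a matrix even if only one row matches
temp_s <- temp_s[keep, , drop = FALSE]
```

The same keep vector can be reused to subset precip_s, since both matrices share the row order of the larger grid. Note that the comparison here happens on the text form of the coordinates rather than on the raw numbers.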
