r - Why does geom_tile plot a subset of my data, but not more?

Question

I am trying to plot a map, but I cannot figure out why the following does not work.

Here is a minimal example:

testdf <- structure(list(x = c(48.97, 44.22, 44.99, 48.87, 43.82, 43.16, 38.96, 38.49, 44.98, 43.9), y = c(-119.7, -113.7, -109.3, -120.6,  -109.6, -121.2, -114.2, -118.9, -109.7, -114.1), z = c(0.001216,  0.001631, 0.001801, 0.002081, 0.002158, 0.002265, 0.002298, 0.002334, 0.002349, 0.00249)), .Names = c("x", "y", "z"), row.names = c(NA, 10L), class = "data.frame")

This works for rows 1 through 8:

ggplot(data = testdf[1,], aes(x,y,fill = z)) + geom_tile()
ggplot(data = testdf[1:8,], aes(x,y,fill = z)) + geom_tile()

But not with row 9 included:

ggplot(data = testdf[1:9,], aes(x,y,fill = z)) + geom_tile()

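Ultimately, I am looking for a way to plot data on a non-regular grid. Using geom_tile is not essential; any interpolation that fills the space between the points would do.

The full dataset is available as a gist.

The testdf above is a small subset of the full dataset, a high-resolution raster of the US (> 7500 rows).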

require(RCurl) # requires libcurl; sudo apt-get install libcurl4-openssl-dev
tmp <- getURL("https://gist.github.com/raw/4635980/f657dcdfab7b951c7b8b921b3a109c7df1697eb8/test.csv")
testdf <- read.csv(textConnection(tmp)) # read from tmp, the string downloaded above

What I have tried:

Using geom_point works, but does not give the desired effect.
```
ggplot(data = testdf, aes(x,y,color=z)) + geom_point()
```

If I replace either x or y with the vector 1:10, the plot works as expected.

newdf <- transform(testdf, y =1:10)

ggplot(data = newdf[1:9,], aes(x,y,fill = z)) + geom_tile()

newdf <- transform(testdf, x =1:10)
ggplot(data = newdf[1:9,], aes(x,y,fill = z)) + geom_tile()

sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-pc-linux-gnu (64-bit)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] reshape2_1.2.2 maps_2.3-0     betymaps_1.0   ggmap_2.2      ggplot2_0.9.3

loaded via a namespace (and not attached):
 [1] colorspace_1.2-0    dichromat_1.2-4     digest_0.6.1        grid_2.15.2         gtable_0.1.2        labeling_0.1
 [7] MASS_7.3-23         munsell_0.4         plyr_1.8            png_0.1-4           proto_0.3-10        RColorBrewer_1.0-5
[13] RgoogleMaps_1.2.0.2 rjson_0.2.12        scales_0.2.3        stringr_0.6.2       tools_2.15.2

score 11 · Accepted Answer

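The reason you can't use geom_tile() (or the more appropriate geom_raster()) is that these two geoms rely on the tiles being evenly spaced, and yours are not. You will need to coerce your data to points and resample them onto an evenly spaced raster, which you can then plot with geom_raster(). To plot the data the way you want, you have to accept that the original data will be resampled slightly.

You should also read up on raster:::projection and rgdal:::spTransform for more information on map projections.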

require( RCurl )
require( raster )
require( sp )
require( ggplot2 )
tmp <- getURL("https://gist.github.com/geophtwombly/4635980/raw/f657dcdfab7b951c7b8b921b3a109c7df1697eb8/test.csv")
testdf <- read.csv(textConnection(tmp))
spdf <- SpatialPointsDataFrame( data.frame( x = testdf$y , y = testdf$x ) , data = data.frame( z = testdf$z ) )

# Plotting the points reveals the unevenly spaced nature of the points
spplot(spdf)


# You can see the uneven nature of the data even better here via the moire pattern
plot(spdf)


# Make an evenly spaced raster, the same extent as original data
e <- extent( spdf )

# Determine ratio between x and y dimensions
ratio <- ( e@xmax - e@xmin ) / ( e@ymax - e@ymin )

# Create template raster to sample to
r <- raster( nrows = 56 , ncols = floor( 56 * ratio ) , ext = extent(spdf) )
rf <- rasterize( spdf , r , field = "z" , fun = mean )

# Attributes of our new raster (# cells quite close to original data)
rf
class       : RasterLayer 
dimensions  : 56, 135, 7560  (nrow, ncol, ncell)
resolution  : 0.424932, 0.4248191  (x, y)
extent      : -124.5008, -67.13498, 25.21298, 49.00285  (xmin, xmax, ymin, ymax)

# We can then plot this using `geom_tile()` or `geom_raster()`
rdf <- data.frame( rasterToPoints( rf ) )    
ggplot( NULL ) + geom_raster( data = rdf , aes( x , y , fill = layer ) )


# And as the OP asked for geom_tile, this would be...
ggplot( NULL ) + geom_tile( data = rdf , aes( x , y , fill = layer ) , colour = "white" )


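Of course, I should add that this data is fairly meaningless as it stands. What you really need to do is take the SpatialPointsDataFrame, assign the correct projection information to it, transform it to lat-long coordinates via spTransform, and then rasterize the transformed points. Really, you need more information about your raster data. What you have here is a close approximation, but ultimately it is not a true reflection of the data.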

score 9 · Accepted Answer

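This is not an answer to the geom_tile() problem, but another way to plot the data.

Because the x and y coordinates lie on a 30 km grid (assuming they are the cell centres), you can plot the data with geom_point(). You need to choose an appropriate shape= value; shape 15 plots filled squares.

Another problem is the x and y values. When plotting the data, you should swap them (x=y and y=x) so that the axes correspond to longitude and latitude.

coord_equal() ensures the correct aspect ratio (I found this solution, including the ratio, as an example on the net).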

ggplot(data = testdf, aes(y,x,colour=z)) + geom_point(shape=15)+
  coord_equal(ratio=1/cos(mean(testdf$x)*pi/180))
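
The ratio passed to coord_equal() reflects the fact that one degree of longitude covers roughly cos(latitude) times the ground distance of one degree of latitude. A small sketch of that arithmetic, using the latitudes from the question's 10-row sample (stored in column x); the variable names here are only illustrative:

```r
# Latitudes (column x) from the 10-row sample in the question
lat <- c(48.97, 44.22, 44.99, 48.87, 43.82, 43.16, 38.96, 38.49, 44.98, 43.9)

# Mean latitude in degrees
mean_lat <- mean(lat)

# Relative ground length of one degree of longitude at that latitude
lon_scale <- cos(mean_lat * pi / 180)

# coord_equal() takes the ratio as y units per x unit, so latitude (y)
# is stretched by 1 / cos(latitude) relative to longitude (x)
ratio <- 1 / lon_scale
print(c(mean_lat = mean_lat, lon_scale = lon_scale, ratio = ratio))
```

A single mean latitude is only an approximation; over an area as tall as the continental US the true scale varies from south to north.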


score 4 · Accepted Answer

Answer:

The data are being plotted, but the tiles are very small.

From here:

"Tile plot as densely as possible, assuming that every tile is the same size."

Consider this plot:

ggplot(data = testdf[1:2,], aes(x,y,fill = z)) + geom_tile()


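There are two tiles in the plot above. geom_tile tries to make the plot as dense as possible, given that every tile must be the same size. Here, the two tiles are made as large as they can be without overlapping, which leaves room for 4 tiles.

Try the following plots and see what each one shows.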

df1 <- data.frame(x=c(1:3),y=(1:3))
#     df1
#  x   y
#1 1   1
#2 2   2
#3 3   3
ggplot(data = df1[1,], aes(x,y)) + geom_tile()   
ggplot(data = df1[1:2,], aes(x,y)) + geom_tile() 
ggplot(data = df1[1:3,], aes(x,y)) + geom_tile()

Compare with this example:

 df2 <- data.frame(x=c(1:3),y=c(1,20,300))
 df2
 # x   y
#1 1   1
#2 2  20
#3 3 300

 ggplot(data = df2[1,], aes(x,y)) + geom_tile()
 ggplot(data = df2[1:2,], aes(x,y)) + geom_tile()
 ggplot(data = df2[1:3,], aes(x,y)) + geom_tile()

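Note that the first two plots are the same for df1 and df2, but the third plot for df2 is different. This is because the largest tiles that can be drawn are the ones that fit between (x[1],y[1]) and (x[2],y[2]). Any larger and they would overlap, so a lot of empty space is left between these two tiles and the third tile at y=300.

There is also a width parameter in geom_tile, although I am not sure how sensible it is to use it here. Are you sure you don't fancy another option for such sparse data?

(Your full data is still being plotted: see ggplot(data = testdf, aes(x,y)) + geom_tile(width=1000))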
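
The shrinking tiles can also be checked numerically. As far as I understand it, when no width or height is given, geom_tile() sizes every tile from the smallest gap between distinct coordinate values. The base-R sketch below computes that gap for the question's testdf with and without row 9; min_gap is a made-up helper name, not a ggplot2 function.

```r
# The 10-row sample from the question (z is not needed here)
testdf <- data.frame(
  x = c(48.97, 44.22, 44.99, 48.87, 43.82, 43.16, 38.96, 38.49, 44.98, 43.9),
  y = c(-119.7, -113.7, -109.3, -120.6, -109.6, -121.2, -114.2, -118.9, -109.7, -114.1)
)

# Hypothetical helper: smallest gap between distinct values of a vector
min_gap <- function(v) min(diff(sort(unique(v))))

# Compare the smallest gaps for rows 1:8 and rows 1:9
gaps <- data.frame(
  rows  = c("1:8", "1:9"),
  x_gap = c(min_gap(testdf$x[1:8]), min_gap(testdf$x[1:9])),
  y_gap = c(min_gap(testdf$y[1:8]), min_gap(testdf$y[1:9]))
)
print(gaps)
```

Row 9 (x = 44.98) sits very close to row 3 (x = 44.99), so the smallest x gap drops sharply once it is included, and every tile shrinks with it while the axis ranges stay about the same.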

score 1 · Accepted Answer

If you want to use geom_tile, I think you will need to aggregate first.

# NOTE: tmp.csv downloaded from https://gist.github.com/geophtwombly/4635980/raw/f657dcdfab7b951c7b8b921b3a109c7df1697eb8/test.csv
testdf <- read.csv("~/Desktop/tmp.csv") 

# combine x,y coordinates by rounding
testdf$x2 <- round(testdf$x, digits=0)
testdf$y2 <- round(testdf$y, digits=0)

# aggregate on combined coordinates
library(plyr)
testdf <- ddply(testdf, c("x2", "y2"), summarize,
                z = mean(z))

# plot aggregated data using geom_tile
ggplot(data = testdf, aes(y2,x2,fill=z)) +
  geom_tile() +
  coord_equal(ratio=1/cos(mean(testdf$x2)*pi/180)) # copied from @Didzis Elferts answer--nice!

After doing all this, you may well conclude that geom_point() is the better option, as @Didzis Elferts suggests.
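
For readers without plyr, roughly the same round-then-average step can be sketched with base R's aggregate(). This is an alternative to the ddply() call above, not the answerer's code, and it runs on the 10-row sample from the question rather than the full CSV:

```r
# The 10-row sample from the question
testdf <- data.frame(
  x = c(48.97, 44.22, 44.99, 48.87, 43.82, 43.16, 38.96, 38.49, 44.98, 43.9),
  y = c(-119.7, -113.7, -109.3, -120.6, -109.6, -121.2, -114.2, -118.9, -109.7, -114.1),
  z = c(0.001216, 0.001631, 0.001801, 0.002081, 0.002158,
        0.002265, 0.002298, 0.002334, 0.002349, 0.00249)
)

# Snap coordinates to a 1-degree grid by rounding
testdf$x2 <- round(testdf$x)
testdf$y2 <- round(testdf$y)

# Average z over all points that fall into the same grid cell
agg <- aggregate(z ~ x2 + y2, data = testdf, FUN = mean)
print(agg)
```

In this sample, rows 2 and 10 round to the same cell, so agg has one row fewer than testdf and that cell holds the mean of their two z values.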
