chord-diagram - Circlize: chord diagram with multiple levels of data

Question

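I'm a bit stuck. I want to show the flow of trafficked species between regions with a circlize chord diagram, where columns 1 and 2 represent the "connection", column 3 is the "factor", and column 4 is the value. I've included a sample of the data below (yes, I'm aware that Indonesia is treated as a region). As you can see, each species is not unique to a particular region. I'd like to produce a plot similar to the one linked below, but with the "countries" within each region replaced by "species". Is this possible?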

import_region    export_region  species                flow
North America    Europe         Acanthosaura armata     0.0104
Southeast Asia   Europe         Acanthosaura armata     0.0022
Indonesia        Europe         Acanthosaura armata     0.1971
Indonesia        Europe         Acrochordus granulatus  0.7846
Southeast Asia   Europe         Acrochordus granulatus  0.1101
Indonesia        Europe         Acrochordus javanicus   2.00E-04
Southeast Asia   Europe         Acrochordus javanicus   0.0015
Indonesia        North America  Acrochordus javanicus   0.0024
East Asia        Europe         Acrochordus javanicus   0.0028
Indonesia        Europe         Ahaetulla prasina       4.00E-04
Southeast Asia   Europe         Ahaetulla prasina       4.00E-04
Southeast Asia   East Asia      Amyda cartilaginea      0.0027
Indonesia        East Asia      Amyda cartilaginea      5.00E-04
Indonesia        Europe         Amyda cartilaginea      0.004
Indonesia        Southeast Asia Amyda cartilaginea      0.0334
Europe           North America  Amyda cartilaginea      4.00E-04
Indonesia        North America  Amyda cartilaginea      0.1291
Southeast Asia   Southeast Asia Amyda cartilaginea      0.0283
Indonesia        West Asia      Amyda cartilaginea      0.7614
South Asia       Europe         Amyda cartilaginea      2.8484
Australasia      Europe         Apodora papuana         0.0368
Indonesia        North America  Apodora papuana         0.324
Indonesia        Europe         Apodora papuana         0.0691
Europe           Europe         Apodora papuana         0.0106
Indonesia        East Asia      Apodora papuana         0.0129
Europe           North America  Apodora papuana         0.0034
East Asia        East Asia      Apodora papuana         2.00E-04
Indonesia        Southeast Asia Apodora papuana         0.0045
East Asia        North America  Apodora papuana         0.0042

For an example of a figure similar to what I want, see the link below: chord diagram

score 5 · Accepted Answer

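In the circlize package, the chordDiagram() function only accepts a "from" column, a "to" column, and an optional "value" column. In your case, however, the original data frame can be transformed into a three-column data frame that meets this requirement.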

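In your example you want to distinguish, say, Acanthosaura_armata in North America from Acanthosaura_armata in Europe. One solution is to merge the region name and the species name into a unique identifier such as Acanthosaura_armata|North_America. What follows shows how to visualize this dataset with the circlize package.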
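
To make the merged-identifier idea concrete, here is a minimal base-R sketch on a two-row toy data frame. The toy values and the names `toy`, `ids`, and `parts` are illustrative assumptions, not part of the original data. It builds `species|region` identifiers with `paste()` and splits them back with `strsplit(..., fixed = TRUE)`, which matters because `|` is a regular-expression metacharacter.

```r
# Toy rows: the same species imported by two different regions (made-up example).
toy = data.frame(import_region = c("North_America", "Indonesia"),
                 species = c("Acanthosaura_armata", "Acanthosaura_armata"),
                 stringsAsFactors = FALSE)

# Merge species and region into one identifier per sector.
ids = paste(toy$species, toy$import_region, sep = "|")

# Split the identifiers back; fixed = TRUE treats "|" literally, not as regex "or".
parts = do.call(rbind, strsplit(ids, "|", fixed = TRUE))
colnames(parts) = c("species", "region")

print(ids)
print(parts)
```

Keeping the separator out of the species and region names themselves (underscores are used there instead) is what makes the split unambiguous.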

Read in the data. Note that I replaced the spaces with underscores.

df = read.table(textConnection(
"import_region    export_region  species                flow
North_America    Europe         Acanthosaura_armata     0.0104
Southeast_Asia   Europe         Acanthosaura_armata     0.0022
Indonesia        Europe         Acanthosaura_armata     0.1971
Indonesia        Europe         Acrochordus_granulatus  0.7846
Southeast_Asia   Europe         Acrochordus_granulatus  0.1101
Indonesia        Europe         Acrochordus_javanicus   2.00E-04
Southeast_Asia   Europe         Acrochordus_javanicus   0.0015
Indonesia        North_America  Acrochordus_javanicus   0.0024
East_Asia        Europe         Acrochordus_javanicus   0.0028
Indonesia        Europe         Ahaetulla_prasina       4.00E-04
Southeast_Asia   Europe         Ahaetulla_prasina       4.00E-04
Southeast_Asia   East_Asia      Amyda_cartilaginea      0.0027
Indonesia        East_Asia      Amyda_cartilaginea      5.00E-04
Indonesia        Europe         Amyda_cartilaginea      0.004
Indonesia        Southeast_Asia Amyda_cartilaginea      0.0334
Europe           North_America  Amyda_cartilaginea      4.00E-04
Indonesia        North_America  Amyda_cartilaginea      0.1291
Southeast_Asia   Southeast_Asia Amyda_cartilaginea      0.0283
Indonesia        West_Asia      Amyda_cartilaginea      0.7614
South_Asia       Europe         Amyda_cartilaginea      2.8484
Australasia      Europe         Apodora_papuana         0.0368
Indonesia        North_America  Apodora_papuana         0.324
Indonesia        Europe         Apodora_papuana         0.0691
Europe           Europe         Apodora_papuana         0.0106
Indonesia        East_Asia      Apodora_papuana         0.0129
Europe           North_America  Apodora_papuana         0.0034
East_Asia        East_Asia      Apodora_papuana         2.00E-04
Indonesia        Southeast_Asia Apodora_papuana         0.0045
East_Asia        North_America  Apodora_papuana         0.0042"),
header = TRUE, stringsAsFactors = FALSE)

I also removed the rows with very small values (a flow of 0.01 or less).

df = df[df[[4]] > 0.01, ]
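
Since colors and sectors are derived after this filter, it can be worth checking which species disappear entirely. The following is only a sketch on made-up toy data; `toy`, `kept`, and `dropped_species` are hypothetical names.

```r
# Toy data: species "B" only has a flow below the threshold (made-up values).
toy = data.frame(species = c("A", "A", "B", "C"),
                 flow = c(0.0104, 0.0022, 2e-04, 0.7846),
                 stringsAsFactors = FALSE)

# Same rule as above: keep rows whose flow is larger than 0.01.
kept = toy[toy$flow > 0.01, ]

# Species that no longer appear in any remaining row.
dropped_species = setdiff(unique(toy$species), unique(kept$species))

print(nrow(kept))
print(dropped_species)
```

A species listed in `dropped_species` gets no sector and no legend entry in the final plot, so the threshold is a visible design choice rather than a pure clean-up step.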

Assign colors to the species and the regions.

library(circlize)
library(RColorBrewer)
all_species = unique(df[[3]])
color_species = structure(brewer.pal(length(all_species), "Set1"), names = all_species)
all_regions = unique(c(df[[1]], df[[2]]))
color_regions = structure(brewer.pal(length(all_regions), "Set2"), names = all_regions)
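
One caveat: `brewer.pal()` expects at least 3 colors and cannot return more than the palette holds (9 for "Set1", 8 for "Set2"), so the assignment above fits this filtered sample but can break or misbehave with more species or regions. A possible fallback, sketched here with base R's `hcl.colors()` (available from R 3.6.0) instead of RColorBrewer. The helper name `named_palette` and the demo labels are hypothetical.

```r
# Hypothetical helper: one color per unique label, for any number of labels.
named_palette = function(labels, palette = "Dark 3") {
    labels = unique(labels)
    # hcl.colors() generates as many colors as requested from an HCL palette.
    cols = grDevices::hcl.colors(length(labels), palette = palette)
    structure(cols, names = labels)
}

# Twelve labels would exceed both "Set1" and "Set2".
color_demo = named_palette(paste0("species_", 1:12))
print(length(color_demo))
print(head(color_demo, 3))
```

The result is a named character vector, so it can be indexed by name in the same way as `color_species` and `color_regions`.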

Group by species

First, I'll show how to group the chord diagram by species.

As mentioned above, species|region is used as the unique identifier.

df2 = data.frame(from = paste(df[[3]], df[[1]], sep = "|"),
                 to = paste(df[[3]], df[[2]], sep = "|"),
                 value = df[[4]], stringsAsFactors = FALSE)

Next, adjust the order of all sectors so that they are sorted first by species and then by region.

combined = unique(data.frame(regions = c(df[[1]], df[[2]]), 
    species = c(df[[3]], df[[3]]), stringsAsFactors = FALSE))
combined = combined[order(combined$species, combined$regions), ]
order = paste(combined$species, combined$regions, sep = "|")
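
A small base-R sketch of the two-key sort, on made-up toy data (`toy` and `sector_order` are hypothetical names). It uses `sector_order` rather than `order` for the result only to avoid confusion with the `order()` function; R still finds the function when it is called, so the naming in the answer works too.

```r
# Unsorted toy combinations of region and species (made-up values).
toy = data.frame(regions = c("Europe", "Indonesia", "Europe", "Indonesia"),
                 species = c("B_sp", "A_sp", "A_sp", "B_sp"),
                 stringsAsFactors = FALSE)

# Primary key: species; secondary key: region.
toy = toy[order(toy$species, toy$regions), ]

# Sector names in plotting order, species first.
sector_order = paste(toy$species, toy$regions, sep = "|")
print(sector_order)
```

All sectors of one species end up next to each other, which is what makes the per-species highlight track possible later.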

I want the links to have the same colors as the regions, so each sector is colored by its region.

grid.col = structure(color_regions[combined$regions], names = order)

Because the chord diagram is grouped by species, the gaps between species should be larger than the gaps within each species.

gap = rep(1, length(order))
gap[which(!duplicated(combined$species, fromLast = TRUE))] = 5
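
To see what `!duplicated(..., fromLast = TRUE)` selects, here is a sketch on a made-up, already sorted species vector (`species_demo`, `is_last`, and `gap_demo` are hypothetical names). It marks the last sector of each species, which is where the wider gap goes.

```r
# Sorted toy vector: one entry per sector, grouped by species (made-up values).
species_demo = c("A_sp", "A_sp", "B_sp", "B_sp", "B_sp", "C_sp")

# TRUE for the last occurrence of each species in the vector.
is_last = !duplicated(species_demo, fromLast = TRUE)

# 1 degree inside a group, 5 degrees after the last sector of a group.
gap_demo = rep(1, length(species_demo))
gap_demo[is_last] = 5
print(gap_demo)

# The gaps have to fit on the circle with room left for the sectors.
stopifnot(sum(gap_demo) < 360)
```

With many sectors the total gap can grow quickly, so the final check is a cheap guard before handing the vector to `circos.par(gap.degree = ...)`.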

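With all the settings ready, we can now make the chord diagram.

In the following code, preAllocateTracks is set so that the circular lines representing the species can be added afterwards.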

circos.par(gap.degree = gap)
chordDiagram(df2, order = order, annotationTrack = c("grid", "axis"),
    grid.col = grid.col, directional = TRUE,
    preAllocateTracks = list(
        track.height = 0.04,
        track.margin = c(0.05, 0)
    )
)

The circular lines are added to represent the species.

for(species in unique(combined$species)) {
    l = combined$species == species
    sn = paste(combined$species[l], combined$regions[l], sep = "|")
    highlight.sector(sn, track.index = 1, col = color_species[species], 
        text = species, niceFacing = TRUE)
}
circos.clear()

And the legends for the regions and the species:

legend("bottomleft", pch = 15, col = color_regions, 
    legend = names(color_regions), cex = 0.6)
legend("bottomright", pch = 15, col = color_species, 
    legend = names(color_species), cex = 0.6)

The plot looks like this:

Group by regions

The code is similar, so I won't explain it and will simply attach it here. The plot looks like this:

## group by regions
df2 = data.frame(from = paste(df[[1]], df[[3]], sep = "|"),
                 to = paste(df[[2]], df[[3]], sep = "|"),
                 value = df[[4]], stringsAsFactors = FALSE)

combined = unique(data.frame(regions = c(df[[1]], df[[2]]), 
    species = c(df[[3]], df[[3]]), stringsAsFactors = FALSE))
combined = combined[order(combined$regions, combined$species), ]
order = paste(combined$regions, combined$species, sep = "|")
grid.col = structure(color_species[combined$species], names = order)

gap = rep(1, length(order))
# sectors are grouped by region here, so the larger gap follows the last sector of each region
gap[which(!duplicated(combined$regions, fromLast = TRUE))] = 5

circos.par(gap.degree = gap)
chordDiagram(df2, order = order, annotationTrack = c("grid", "axis"),
    grid.col = grid.col, directional = TRUE,
    preAllocateTracks = list(
        track.height = 0.04,
        track.margin = c(0.05, 0)
    )
)
for(region in unique(combined$regions)) {
    l = combined$regions == region
    sn = paste(combined$regions[l], combined$species[l], sep = "|")
    highlight.sector(sn, track.index = 1, col = color_regions[region], 
        text = region, niceFacing = TRUE)
}
circos.clear()

legend("bottomleft", pch = 15, col = color_regions, 
    legend = names(color_regions), cex = 0.6)
legend("bottomright", pch = 15, col = color_species, 
    legend = names(color_species), cex = 0.6)
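
Since the two groupings differ only in which column comes first, the shared preparation steps could be wrapped in one function. This is a sketch under assumptions, not part of the original answer: the helper name `build_chord_inputs`, its `group_by` argument, and the toy data are all hypothetical, and it only prepares the inputs in base R without drawing anything.

```r
# Hypothetical helper: build the chordDiagram() inputs for either grouping.
# `df` is assumed to hold: import region, export region, species, value.
build_chord_inputs = function(df, group_by = c("species", "regions")) {
    group_by = match.arg(group_by)
    combined = unique(data.frame(regions = c(df[[1]], df[[2]]),
                                 species = c(df[[3]], df[[3]]),
                                 stringsAsFactors = FALSE))
    # Sort by the grouping column first, then by the other column.
    inner = setdiff(c("species", "regions"), group_by)
    combined = combined[order(combined[[group_by]], combined[[inner]]), ]
    # The grouping column comes first in the identifier.
    make_id = function(region, species) {
        if (group_by == "species") paste(species, region, sep = "|")
        else paste(region, species, sep = "|")
    }
    # Larger gap after the last sector of each group.
    gap = rep(1, nrow(combined))
    gap[!duplicated(combined[[group_by]], fromLast = TRUE)] = 5
    list(links = data.frame(from = make_id(df[[1]], df[[3]]),
                            to = make_id(df[[2]], df[[3]]),
                            value = df[[4]], stringsAsFactors = FALSE),
         order = make_id(combined$regions, combined$species),
         groups = combined[[group_by]],
         gap = gap)
}

# Made-up toy data with the same column layout as df.
toy = data.frame(import_region = c("Indonesia", "Indonesia", "Europe"),
                 export_region = c("Europe", "Europe", "North_America"),
                 species = c("A_sp", "B_sp", "A_sp"),
                 flow = c(0.2, 0.7, 0.1), stringsAsFactors = FALSE)
inputs = build_chord_inputs(toy, "regions")
print(inputs$order)
print(inputs$gap)
```

The returned `gap`, `links`, and `order` are meant to be passed to `circos.par(gap.degree = ...)` and `chordDiagram(links, order = ...)` in the same way as in the code above.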
