1

I strive to produce a visual dynamic animation of timestamped transactions, where each transaction represents a contribution of a person to an artifact/file. To this end, I am using the R packages networkDynamic, network and ndtv.

The transactions have (in contrast to the examples in the networkDynamic package vignette) "real" timestamps. I want to wrap the rendering process inside a function that

  • starts rendering at the beginning of a "natural time frame" such as a day or a week (which most probably is not the timestamp of the first event)
  • renders "natural" labels to the players timeline instead of integers
  • uses "natural" slices such as a week/month/year based on the input data

I think I have managed to make the first slice start at the beginning of the week of the first event using lubridates floor_date. I have not looked into the last issue yet (labelling), because unfortunately, I have troubles to determine proper slicing parameters for my data set.

Please find below a reproducible example for RStudio. The example includes three lists named slice.par, one that does work, and two that don't. Simply hardcoding a parameter configuration that (only) works with the concrete example is not my goal, firstly because my real data set is much bigger (and therefore 'playing around' with the parameters costs much time) and secondly because I would like to have a function that works with many different data sets.

if (!require("pacman")) install.packages("pacman")
library("pacman")
pacman::p_load(network, networkDynamic, ndtv, lubridate)

UtilNumericAsDate <- function(nuUnixTimestamp) {
    return(as.POSIXct(nuUnixTimestamp, origin = "1970-01-01 00:00.00 UTC", tz = "UTC"))
}

UtilDateAsNumeric <- function(oTimestamp) {
    return(as.numeric(as.POSIXct(oTimestamp)))
}

stTransac <- "
'contributorId', 'artifactId', 'weight', 'instantId'
'A', 'a1', '1', '2003-06-01 23:09:40'
'A', 'a2', '1', '2004-02-27 11:48:41'
'A', 'a1', '2', '2006-06-25 20:36:49'
'A', 'a3', '1', '2007-01-28 00:35:31'
'A', 'a3', '2', '2007-04-25 16:03:57'
'A', 'a3', '3', '2007-07-19 19:43:49'
'B', 'a1', '1', '2008-02-06 12:37:56'
'C', 'a3', '1', '2008-04-07 02:27:36'
'C', 'a2', '1', '2008-06-01 02:15:35'
'C', 'a2', '2', '2008-10-05 02:32:45'
'B', 'a1', '2', '2009-06-22 01:57:45'
'C', 'a4', '1', '2009-09-15 02:56:33'
'C', 'a5', '1', '2010-06-30 19:42:25'
'C', 'a6', '1', '2011-06-12 23:58:17'
'B', 'a3', '1', '2013-08-30 19:34:28'
'C', 'a1', '1', '2014-10-23 20:49:54'
'C', 'a1', '2', '2014-10-24 16:46:07'
'A', 'a2', '2', '2015-09-26 16:58:17'
'A', 'a7', '1', '2015-10-04 17:40:12'
'A', 'a8', '1', '2015-12-02 10:55:47'
"

dfTransac <- read.csv(text = stTransac, sep = "," , quote = '\'' , strip.white = TRUE, stringsAsFactors = FALSE)

dfEdges <- unique(dfTransac[,1:2])
veUniqueContributors   <- unique(dfEdges[[1]])
veUniqueArtifacts      <- unique(dfEdges[[2]])
nuNrUniqueContributors <- length(veUniqueContributors)
nuNrUniqueArtifacts    <- length(veUniqueArtifacts)

net <- network.initialize(0, directed = TRUE, bipartite = length(veUniqueContributors))

add.vertices.networkDynamic(net, nuNrUniqueContributors, vertex.pid = veUniqueContributors)
add.vertices.networkDynamic(net, nuNrUniqueArtifacts, vertex.pid = veUniqueArtifacts)

net %v% "vertex.names" <- c(veUniqueContributors, veUniqueArtifacts)
net %v% "vertex.type"  <- c(rep("p", length(veUniqueContributors)), rep("a", length(veUniqueArtifacts)))
net %v% "vertex.col"   <- c(rep("blue", length(veUniqueContributors)), rep("gray", length(veUniqueArtifacts)))
net %v% "vertex.sides" <- c(rep(8, length(veUniqueContributors)), rep(4, length(veUniqueArtifacts)))
net %v% "vertex.rot"   <- c(rep(0, length(veUniqueContributors)), rep(45, length(veUniqueArtifacts)))
net %v% "vertex.lwd"   <- c(rep(1, length(veUniqueContributors)), rep(0, length(veUniqueArtifacts)))
net %v% "vertex.cex"   <- c(rep(2, length(veUniqueContributors)), rep(1, length(veUniqueArtifacts)))
set.network.attribute(net,'vertex.pid','vertex.names')
set.network.attribute(net,'edge.pid','edge.names')

add.edges.networkDynamic(net,
                         tail = get.vertex.id(net, dfEdges[[1]]),
                         head = get.vertex.id(net, dfEdges[[2]]),
                         edge.pid = paste0(dfEdges[[1]], "->", dfEdges[[2]]))

activate.edges(net,
               e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
               at = UtilDateAsNumeric(dfTransac$instantId))

activate.edge.attribute(net,
                        prefix = "weight",
                        value = dfTransac$weight,
                        e = get.edge.id(net, paste0(dfTransac[["contributorId"]], "->", dfTransac[["artifactId"]])),
                        at = UtilDateAsNumeric(dfTransac$instantId))

reconcile.vertex.activity(net = net, mode = "encompass.edges", edge.active.default = FALSE)

nuStart      <- range(get.change.times(net, ignore.inf = FALSE))[1]
nuEnd        <- range(get.change.times(net, ignore.inf = FALSE))[2]

nuWeekStart  <- UtilDateAsNumeric(floor_date(UtilNumericAsDate(nuStart), "week"))
nuWeekEnd    <- UtilDateAsNumeric(ceiling_date(UtilNumericAsDate(nuEnd), "week"))

# This doesn't work: "Monthly" slices, 5 year aggregation
# Error: Attribute 'vertex.sides' had illegal missing values for vertex.sides or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5,
                          aggregate.dur = 1*60*60*24*7*52*5,
                          rule = "any")

# This doesn't work either: "Bimonthly" slices, "Bimonthly" aggregation
# Error: Attribute 'weight' had illegal missing values for edge.lwd or was not present in plot.network.default.
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5*2,
                          aggregate.dur = 1*60*60*24*7*4.5*2,
                          rule = "any")

# This works: "Bimonthly" slices, 5 year aggregation
slice.par <- list(start = nuWeekStart,
                          end = nuWeekEnd,
                          interval = 1*60*60*24*7*4.5*2,
                          aggregate.dur = 1*60*60*24*7*52*5,
                          rule = "any")

compute.animation(net, animation.mode = "kamadakawai", slice.par = slice.par, default.dist = 10)

render.d3movie(net,
               slice.par = slice.par,
               displaylabels = TRUE,
               output.mode = "htmlWidget",
               usearrows = TRUE,
               vertex.col = 'vertex.col',
               vertex.sides = 'vertex.sides',
               vertex.cex = 'vertex.cex',
               vertex.rot = 'vertex.rot',
               edge.lwd = 'weight',
               render.par = list(tween.frames = 10, show.time = TRUE))

How can I derive proper slicing parameters from the data set so that the rendering process does not choke on individual slices that miss attributes or edges without simply increasing the aggregation duration?

4

1 に答える 1

1

すでに確立されているように、関数にはバグがありrender.d3movieます。「空の」スライス (アクティブな頂点を含まない時間範囲) の値を検索しようとしています。Github のバグ レポートを参照してください。(上記のコードで実際にエラーを再現することはできませんが、間違いなくバグです。報告していただきありがとうございます)

データセットから適切なスライシング パラメータを導き出すにはどうすれば、単純に集計時間を長くすることなく、属性やエッジを欠いている個々のスライスでレンダリング プロセスが停止しないようにすることができますか?

バグが修正されるまで (できればすぐに)、次のことができます。

a)render.animation代わりに使用

b) スライス パラメータを選択して、ネットワークにアクティブな頂点がないスライスがないようにします。この機能を使用してtimeline、スライスが着地する場所を確認できます。たとえば、ノード (青) とエッジ (紫) のアクティビティ スペルをスライス ビン (垂直の灰色のバー) と共に表示するには、次のようにします。

timeline(net,slice.par=slice.par,main='timeline plot of activity spells')

ここに画像の説明を入力

c) 最適な解決策は、頂点アクティビティを微調整して、各タイム スライスでアクティブな頂点が常に存在するようにすることです。reconcile.vertex.activityこの場合、非常に短い持続時間のエッジのみを含むため、によって非常に短い持続時間が割り当てられた頂点がいくつかあります。別のルールを使用すると、それを回避するか、頂点が表示されたら常にアクティブになるように設定することができます (データにとって意味がある場合)。

その他の注意事項:

エッジで動的属性をビニングするときに複数の可能な値に遭遇したときに、どれを選択するかを知るために、slice.par$rule値をearliestの代わりにに設定する必要もあります。anyweight

networkDynamicところで、ユーティリティ関数を使用して を渡してネットワークを構築するよりコンパクトな方法があると思います。stTransacおそらく、大規模なデータセットの読み込みが高速になります。

于 2016-09-02T18:38:38.623 に答える