r - Tibble with list columns: convert to array if possible

Question

I have a tibble like this:

uuu <- structure(list(IsCharacter = c("a", "b"),
                      ShouldBeCharacter = list("One", "Another"),
                      IsList = list("Element1", c("Element2", "Element3"))
               ),
           .Names = c("IsCharacter", "ShouldBeCharacter", "IsList"),
            row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))
uuu
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>            <list>    <list>
#1           a         <chr [1]> <chr [1]>
#2           b         <chr [1]> <chr [2]>

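I would like to convert columns like ShouldBeCharacter, whose elements all have the same length and type, into columns like IsCharacter, leaving the remaining columns as they are.

So far I have the following function, which solves the problem but looks rather hacky. I would like to know whether there is a better solution that I have not thought of: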

lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 & sapply(data, class) == "list"]
  # Do the conversion
  data[,to_simplify] <- tibble::as_tibble(lapply(as.list(data[,to_simplify]), function(x) {do.call(c, x)}))
  return(data)  
}

This is the result I get. Note how the type of ShouldBeCharacter has changed.

lists_to_atomic(uuu)
## A tibble: 2 × 3
#  IsCharacter ShouldBeCharacter    IsList
#        <chr>             <chr>    <list>
#1           a               One <chr [1]>
#2           b           Another <chr [2]>

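The as_tibble(lapply(as.list(... do.call(c, ...))) line looks too complicated to me, but I cannot find a simpler alternative.

Is there a simplification that would make the lists_to_atomic function more reliable?

Update

I had not thought of using tidyr::unnest on list columns whose elements all have length 1, but following @taavi-p's answer, I was able to simplify the function to this: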

lists_to_atomic <- function(data) {
  # Elements of length larger than one should be kept as lists.
  # So we compute the maximum length for each column
  length_column_elements <- apply(data, 2,
                                  function(x) max(sapply(x, function(y) length(y))))
  # to_simplify will contain column names of class list and with all elements of length 1
  to_simplify <- colnames(data)[length_column_elements == 1 & 
                                vapply(data,
                                       FUN = function(x) "list" %in% class(x),
                                       FUN.VALUE = logical(1))]

  # Do the conversion
  data2 <- tidyr::unnest_(data, unnest_cols = to_simplify)
  data2 <- data2[, colnames(data)] # Preserve original column order
  return(data2)
}

score 3 · Accepted Answer

You can try:

     library(tidyr)
     uuu %>% unnest(ShouldBeCharacter)

More examples of how to work with list columns can be found in "R for Data Science": http://r4ds.had.co.nz/many-models.html#list-columns-1
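
For comparison, the same idea can be sketched in base R without naming each column by hand. This is only a rough sketch under some assumptions: the helper name simplify_list_cols is hypothetical (it is not part of tidyr or of the original post), and it treats a column as convertible only when it is a non-empty list whose elements are all length-1 atomic values of a single type. The usage example builds a plain data frame mirroring uuu so that it runs without any packages; the function should behave the same way on a tibble.

```r
# Hypothetical helper: the name and the exact checks are assumptions,
# not something taken from tidyr or from the original post.
simplify_list_cols <- function(data) {
  # A column qualifies if it is a non-empty list whose elements are all
  # length-1 atomic values sharing a single type.
  can_simplify <- vapply(data, function(col) {
    is.list(col) &&
      length(col) > 0L &&
      all(lengths(col) == 1L) &&
      all(vapply(col, is.atomic, logical(1))) &&
      length(unique(vapply(col, typeof, character(1)))) == 1L
  }, logical(1))

  # Flatten each qualifying column; all other columns are left untouched.
  for (nm in names(data)[can_simplify]) {
    data[[nm]] <- unlist(data[[nm]], use.names = FALSE)
  }
  data
}

# Usage on a plain data frame that mirrors the columns of uuu
df <- data.frame(IsCharacter = c("a", "b"), stringsAsFactors = FALSE)
df$ShouldBeCharacter <- list("One", "Another")
df$IsList <- list("Element1", c("Element2", "Element3"))

out <- simplify_list_cols(df)
str(out)
```

Assigning column by column keeps the original column order, so no reordering step is needed afterwards.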
