これは私のために働く:
df %>%
filter(complete.cases(df))
またはもう少し一般的に:
library(dplyr) # 0.4
df %>% filter(complete.cases(.))
これには、フィルターに渡す前にチェーン内でデータを変更できるという利点があります。
より多くの列を持つ別のベンチマーク:
set.seed(123)
x <- sample(1e5,1e5*26, replace = TRUE)
x[sample(seq_along(x), 1e3)] <- NA
df <- as.data.frame(matrix(x, ncol = 26))
library(microbenchmark)
microbenchmark(
na.omit = {df %>% na.omit},
filter.anonymous = {df %>% (function(x) filter(x, complete.cases(x)))},
rowSums = {df %>% filter(rowSums(is.na(.)) == 0L)},
filter = {df %>% filter(complete.cases(.))},
times = 20L,
unit = "relative")
#Unit: relative
# expr min lq median uq max neval
# na.omit 12.252048 11.248707 11.327005 11.0623422 12.823233 20
#filter.anonymous 1.149305 1.022891 1.013779 0.9948659 4.668691 20
# rowSums 2.281002 2.377807 2.420615 2.3467519 5.223077 20
# filter 1.000000 1.000000 1.000000 1.0000000 1.000000 20