regex - Split on the first comma in a string

Question

How can I efficiently split the following string on the first comma using base R?

x <- "I want to split here, though I don't want to split elsewhere, even here."
strsplit(x, ???)

Desired outcome (2 strings):

[[1]]
[1] "I want to split here"   "though I don't want to split elsewhere, even here."

Thank you in advance.

EDIT: I didn't think to mention this. The solution needs to generalize to a column, i.e. a vector of strings, as in:

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

The outcome can be two columns, one long vector (from which I can take every other element), or a list in which each index ([[n]]) holds two strings.

Apologies for the lack of clarity.

score 13 · Accepted Answer

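Here's what I'd probably do. It may look hacky, but since sub() and strsplit() are both vectorized, it also works smoothly when handed multiple strings. sub() replaces only the first match, so only the first comma (plus any whitespace after it) is swapped for the placeholder that strsplit() then splits on.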

XX <- "SoMeThInGrIdIcUlOuS"
strsplit(sub(",\\s*", XX, x), XX)
# [[1]]
# [1] "I want to split here"                               
# [2] "though I don't want to split elsewhere, even here."
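
To check the vectorization claim against the vector y from the question's edit, here is a minimal sketch. It assumes the placeholder never occurs in the real data, and the names parts and res are only illustrative. fixed = TRUE is added so the placeholder is treated as a literal string rather than a regex.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Placeholder assumed never to appear in the data itself
XX <- "SoMeThInGrIdIcUlOuS"

# sub() replaces only the first ",<whitespace>" in each element;
# strsplit() then splits each element on the literal placeholder
parts <- strsplit(sub(",\\s*", XX, y), XX, fixed = TRUE)

# Optional: bind the list into a two-column character matrix
res <- do.call(rbind, parts)
res
```

parts is a list with one length-2 character vector per input string, and res has one row per input string.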

score 9 · Accepted Answer

From the stringr package:

library(stringr)
str_split_fixed(x, pattern = ', ', n = 2)
#      [,1]                  
# [1,] "I want to split here"
#      [,2]                                                
# [1,] "though I don't want to split elsewhere, even here."

(That's a matrix with one row and two columns.)

score 4 · Accepted Answer

Here's yet another solution, using a regular expression to capture what comes before and after the first comma.

x <- "I want to split here, though I don't want to split elsewhere, even here."
library(stringr)
str_match(x, "^(.*?),\\s*(.*)")[,-1] 
# [1] "I want to split here"                              
# [2] "though I don't want to split elsewhere, even here."
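
Since the question asked for base R, the same capture-group idea can be sketched without stringr using regexec() and regmatches(). This is only an illustration: it assumes every string contains at least one comma, and the names pattern, m and res are made up here.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Same pattern as above: lazy group up to the first comma, then the rest
pattern <- "^(.*?),\\s*(.*)"

# regexec() records the full match plus each capture group;
# regmatches() extracts them as one character vector per string
m <- regmatches(y, regexec(pattern, y, perl = TRUE))

# Drop the full match (first element) and keep the two captured groups,
# giving one row per input string
res <- t(sapply(m, function(p) p[-1]))
res
```

A string with no comma yields an empty match, so the sapply() step would need extra handling in that case.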

score 3 · Accepted Answer

library(stringr)

str_sub(x,end = min(str_locate(string=x, ',')-1))

This gets the first bit that you want. Change start= and end= in str_sub to get whatever else you need.

Such as:

str_sub(x,start = min(str_locate(string=x, ',')+1 ))

Wrap it in str_trim to get rid of the leading space:

str_trim(str_sub(x,start = min(str_locate(string=x, ',')+1 )))
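
Note that min() collapses the located positions to a single number, so the calls above suit one string at a time. A rough base R analogue of the same locate-then-substring idea that works per element might look like the sketch below. It assumes each string contains a comma, and the names pos, first and rest are illustrative.

```r
y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")

# Position of the first comma in each string (-1 where there is none)
pos <- as.integer(regexpr(",", y, fixed = TRUE))

# Everything before the first comma
first <- substr(y, 1, pos - 1)

# Everything after the first comma, with surrounding whitespace trimmed
rest <- trimws(substring(y, pos + 1))

cbind(first, rest)
```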

score 2 · Accepted Answer

This works, but I like Josh O'Brien's better:

y <- strsplit(x, ",")
sapply(y, function(x) data.frame(x = x[1],
    z = paste(x[-1], collapse = ",")), simplify = FALSE)

Inspired by Chase's response.

A number of people gave non-base approaches, so I figured I'd add the one I usually use (though in this case I needed a base answer):

y <- c("Here's comma 1, and 2, see?", "Here's 2nd sting, like it, not a lot.")
library(reshape2)
colsplit(y, ",", c("x","z"))
