regex - Extracting part of the text from a file using a regular expression

Question

I am trying to use the following code:

x <- scan("myfile.txt", what="", sep="\n")

b <- grep('/^one/(.*?)/^four/', x, ignore.case = TRUE, perl = TRUE, value = TRUE,
     fixed = FALSE, useBytes = FALSE, invert = FALSE)

to extract a portion of the text from myfile.txt:

zero
one
two
three
four
five

The output I expect is:

one
two
three
four

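I want to include "one" and "four" in the result; I do not want to discard them :)

But for some reason the regex is not working. There are no errors in the console output, but there is no text either...?

I am using print(b) to show the result.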

score 2 · Accepted Answer

I'm not sure exactly what you are looking for, but just for fun...

R> x
[1] "zero"  "one"   "two"   "three" "four"  "five" 

R> grep("one|four", x) # get the position of "one" and "four"
[1] 2 5

Subset x to include only the elements from "one" through "four":

R> x[do.call(seq, as.list(grep("one|four", x)))]
[1] "one"   "two"   "three" "four"
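As for why the original call returns nothing: grep tests each element of x (each line of the file) separately, so a pattern meant to span several lines cannot match any single element, and the slashes in '/^one/(.*?)/^four/' are treated as literal characters in R rather than as delimiters. The index-based approach above can be sketched a little more defensively as follows. This is only a sketch: the anchored patterns and the names from and to are my own illustrative choices, and x is assumed to hold the six lines from the question.

```r
# Assumed input: the six lines of myfile.txt from the question
x <- c("zero", "one", "two", "three", "four", "five")

# Anchor the patterns so that only whole lines match
# (a plain "one|four" would also hit a line such as "someone")
from <- grep("^one$",  x, ignore.case = TRUE)[1]  # first line equal to "one"
to   <- grep("^four$", x, ignore.case = TRUE)[1]  # first line equal to "four"

# Keep everything from the start line through the end line, inclusive
b <- x[from:to]
print(b)
# [1] "one"   "two"   "three" "four"
```

If either word is missing, from or to will be NA, so a real script would want to check for that before subsetting.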

score 1 · Accepted Answer

gsub('one(.*)four','\\1',paste(x,collapse=''))
[1] "zerotwothreefive"

Or, to keep spaces between the words:

gsub('one(.*)four','\\1',paste(x,collapse=' '))
[1] "zero  two three  five"

Edit after Gsee's comment:

gsub('.*(one.*four).*','\\1',paste(x,collapse=' '))
[1] "one two three four"
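If the result should come back as separate lines rather than one string, the same collapse-then-match idea can be sketched with regexpr and regmatches. This is an illustration under assumptions, not part of the original answer: x is assumed to hold the six lines from the question, the lines are joined with "\n" instead of a space, and the names txt and m are made up for the example.

```r
# Assumed input: the six lines of myfile.txt from the question
x <- c("zero", "one", "two", "three", "four", "five")

# Join the lines into one string so that a single pattern can span them
txt <- paste(x, collapse = "\n")

# (?s) lets "." match newlines; (?m) makes ^ and $ match at line boundaries,
# so only whole lines "one" and "four" delimit the block
m <- regmatches(txt, regexpr("(?ms)^one$.*?^four$", txt, perl = TRUE))

# Split the matched block back into one element per line
b <- strsplit(m, "\n", fixed = TRUE)[[1]]
print(b)
# [1] "one"   "two"   "three" "four"
```

The lazy quantifier .*? stops at the first "four" line after the first "one" line; with a greedy .* it would run to the last one.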

But I don't think you need a regular expression here:

x[seq(which(x == 'one'),which(x == 'four'))]
[1] "one"   "two"   "three" "four"

Of course, if the indices above do not come out in the right order, you can use min.
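One possible reading of the remark about min, sketched under assumptions: seq() needs a single start and a single end, so if a word occurs more than once, which() returns several indices and min can pick the first. The input below is hypothetical (a second "one" has been appended), and the names i and j are illustrative only.

```r
# Hypothetical input: "one" appears a second time at the end
x <- c("zero", "one", "two", "three", "four", "five", "one")

i <- min(which(x == "one"))   # index of the first "one"
j <- min(which(x == "four"))  # index of the first "four"

# Subset from the first "one" through the first "four", inclusive
b <- x[seq(i, j)]
print(b)
# [1] "one"   "two"   "three" "four"
```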
