javascript - Split text after a maximum of X words

Question

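I had a hard time finding a solution to my problem, so I am glad to share it. The problem is as follows:

I have a text that may contain any kind of punctuation. I want to split it into two parts:

At most X words
- including any punctuation attached to the last word, such as a period or comma
The rest of the text
- starting with the whitespace between the two parts

Here are some examples: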

str = "one two, three, quatro 5! : six sept  ocho nine 10!"

splitAfterXWords(str, 2)
// ["one two,", " three, quatro 5! : six sept  ocho nine 10!"]

splitAfterXWords(str, 5)
// ["one two, three, quatro 5!", " : six sept  ocho nine 10!"]

splitAfterXWords(str, 20)
// ["one two, three, quatro 5! : six sept  ocho nine 10!", ""]

splitAfterXWords(str, 6)
// ["one two, three, quatro 5! : six", " sept  ocho nine 10!"]

score 3 · Accepted Answer

Here is a function that does the job.

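function splitAfterXWords(to_split, words) {
    // One "word" = optional leading whitespace/punctuation, then a run of non-whitespace.
    const regex = new RegExp("(([\\s;:!,.?\"'’]*[^\\s]+){" + words + "})(.*)");
    const result = regex.exec(to_split);
    // result[1] holds the first `words` words; the remainder starts right after it.
    return result ? [result[1], to_split.slice(result[1].length)] : [to_split, ''];
}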

You can see it working in this fiddle.

Improvements and comments are welcome!
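
One possible concern with the `{n}` quantifier is that, when `words` is larger than the number of words in the text, a backtracking regex engine may start splitting single words into smaller pieces to reach the requested count, which can get slow on long input. A minimal sketch of a variant that keeps the same punctuation class but matches one word at a time with a sticky regex; the name `splitAfterXWordsLoop` is hypothetical and not part of the answer:

```javascript
// Hypothetical variant of splitAfterXWords: same punctuation class,
// but matched one word at a time with a sticky regex instead of {n}.
function splitAfterXWordsLoop(text, words) {
  // Optional leading whitespace/punctuation, then a run of non-whitespace.
  const word = /[\s;:!,.?"'’]*\S+/y;
  let end = 0;
  for (let i = 0; i < words; i++) {
    word.lastIndex = end;                 // sticky: the match must start exactly here
    if (word.exec(text) === null) break;  // fewer than `words` words remain
    end = word.lastIndex;                 // exec moved lastIndex past the match
  }
  return [text.slice(0, end), text.slice(end)];
}

const str = "one two, three, quatro 5! : six sept  ocho nine 10!";
console.log(splitAfterXWordsLoop(str, 5));
// ["one two, three, quatro 5!", " : six sept  ocho nine 10!"]
```

Because the loop simply stops when no further word is found, asking for more words than exist returns the whole text and an empty second part.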

score 3 · Accepted Answer

This is my attempt at getting the first words of a given sentence.

var regexp = /\s*\S+/; // optional leading whitespace, then one run of non-whitespace
function truncateToNWords(s, n) {
   var l = 0;
   var match;
   if (s == null || n <= 0) return l;
   // Consume one word per iteration and add its length to the running offset.
   for (var i = 0; i < n && (match = regexp.exec(s)) != null; i++) {
      s = s.substring(match[0].length);
      l += match[0].length;
   }
   return l; // index at which the first n words end
}

// your sentence
var s = "one two, three, quatro 5!: six sept  ocho nine 10!";

var l = truncateToNWords(s, 2);
console.log([s.substring(0, l), s.substring(l)]);

l = truncateToNWords(s, 5);
console.log([s.substring(0, l), s.substring(l)]);

l = truncateToNWords(s, 6);
console.log([s.substring(0, l), s.substring(l)]);

l = truncateToNWords(s, 20);
console.log([s.substring(0, l), s.substring(l)]);

Output:

["one two,", " three, quatro 5!: six sept  ocho nine 10!"]
["one two, three, quatro 5!:", " six sept  ocho nine 10!"]
["one two, three, quatro 5!: six", " sept  ocho nine 10!"]
["one two, three, quatro 5!: six sept  ocho nine 10!", ""]
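
Note that the sample sentence here has "5!:" with no space, whereas the question's has "5! :". Since this approach treats every run of non-whitespace as a word, a standalone ":" counts as a word of its own. A small sketch of the same idea returning both parts directly, run on the question's string; the name `splitOnWhitespaceWords` is hypothetical, and it assumes `s` is a string:

```javascript
// Hypothetical helper: same whitespace-delimited notion of a "word" as the
// loop above, but returning [head, tail] directly. Assumes `s` is a string.
function splitOnWhitespaceWords(s, n) {
  let end = 0;
  let count = 0;
  // Each match is optional leading whitespace plus one run of non-whitespace.
  for (const m of s.matchAll(/\s*\S+/g)) {
    if (count >= n) break;
    end = m.index + m[0].length; // offset just past the word counted so far
    count++;
  }
  return [s.slice(0, end), s.slice(end)];
}

const question = "one two, three, quatro 5! : six sept  ocho nine 10!";
console.log(splitOnWhitespaceWords(question, 6));
// ["one two, three, quatro 5! :", " six sept  ocho nine 10!"]
```

With the question's string, the sixth "word" is the lone ":", so "six" ends up in the second part; the question's expected result keeps "six" in the first part instead.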
