node.js - Split and count the words in a string in Node JS

Question

Suppose I have two strings like these:

s1 = "This is a foo bar sentence ."
s2 = "This sentence is similar to a foo bar sentence ."

I want to split each string so that the result has this form:

x1 = ["This":1,"is":1,"a":1,"bar":1,"sentence":1,"foo":1]
x2 = ["This":1,"is":1,"a":1,"bar":1,"sentence":2,"similar":1,"to":1,"foo":1]

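That is, split the string into words and count them, producing pairs in which the string is the word and the number is how many times that word occurs in the string.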

score 8 · Accepted Answer

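Remove punctuation, normalize whitespace, convert to lowercase, split on spaces, then loop over the words and count each occurrence in an index object. Note that lowercasing means "This" is counted under the key "this".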

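function countWords(sentence) {
  var index = {},
      words = sentence
              .replace(/[.,?!;()"'-]/g, " ") // replace punctuation with spaces
              .replace(/\s+/g, " ")          // collapse runs of whitespace
              .trim()                        // drop the leading/trailing space left by punctuation
              .toLowerCase()
              .split(" ");

  words.forEach(function (word) {
    if (word === "") return; // an empty sentence still splits into [""]
    if (!Object.prototype.hasOwnProperty.call(index, word)) {
      index[word] = 0;
    }
    index[word]++;
  });

  return index;
}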
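As a quick check, the function can be run against the two strings from the question. This is only a sketch: the function is repeated here in compact form so the snippet runs on its own, and `JSON.stringify` is used purely to get a predictable one-line printout.

```javascript
// Same steps as the answer: strip punctuation, normalize whitespace,
// lowercase, split on spaces, count into a plain object.
function countWords(sentence) {
  var index = {};
  sentence
    .replace(/[.,?!;()"'-]/g, " ")
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase()
    .split(" ")
    .forEach(function (word) {
      if (word === "") return; // skip the empty string produced by an empty sentence
      if (!Object.prototype.hasOwnProperty.call(index, word)) index[word] = 0;
      index[word]++;
    });
  return index;
}

var s1 = "This is a foo bar sentence .";
var s2 = "This sentence is similar to a foo bar sentence .";

console.log(JSON.stringify(countWords(s1)));
// {"this":1,"is":1,"a":1,"foo":1,"bar":1,"sentence":1}
console.log(JSON.stringify(countWords(s2)));
// {"this":1,"sentence":2,"is":1,"similar":1,"to":1,"a":1,"foo":1,"bar":1}
```

The keys come out lowercased ("this" rather than "This"), and the trailing " ." in each sentence does not produce an empty-string entry.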

The same function in ES6 arrow-function style:

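const countWords = sentence => sentence
  .replace(/[.,?!;()"'-]/g, " ") // replace punctuation with spaces
  .replace(/\s+/g, " ")          // collapse runs of whitespace
  .trim()                        // drop the leading/trailing space left by punctuation
  .toLowerCase()
  .split(" ")
  .filter(word => word !== "")   // an empty sentence still splits into [""]
  .reduce((index, word) => {
    if (!Object.prototype.hasOwnProperty.call(index, word)) index[word] = 0;
    index[word]++;
    return index;
  }, {});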
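The question's expected output keeps "This" capitalized, whereas the answer lowercases every word. If the original casing matters, one possible variation is sketched below. It is not part of the accepted answer: the name `countWordsCaseSensitive` is hypothetical, the punctuation set is simply reused from the answer, and a `Map` is swapped in for the plain object so that words such as "constructor" or "__proto__" cannot collide with built-in object properties.

```javascript
// Hypothetical variant: keeps the original casing and counts into a Map.
const countWordsCaseSensitive = sentence => {
  const counts = new Map();
  const words = sentence
    .replace(/[.,?!;()"'-]/g, " ")  // same punctuation set as the answer
    .trim()
    .split(/\s+/)                   // split on any run of whitespace
    .filter(word => word !== "");   // "".split(/\s+/) yields [""]
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
};

const x2 = countWordsCaseSensitive("This sentence is similar to a foo bar sentence .");
console.log(x2.get("This"));     // 1
console.log(x2.get("sentence")); // 2
console.log(x2.has("this"));     // false
```

A `Map` iterates in insertion order, so `[...x2]` gives the word/count pairs in the order the words first appear.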
