これは、単語の頻度を計算するための私のコードです
word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay", "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]
arr_stop_kwd=["a","and"]
frequencies = Hash.new(0)
word_arr.each { |word|
if !arr_stop_kwd.include?(word.downcase) && !word.match('&&')
frequencies["#{word.downcase}"] += 1
end
}
100k のデータがある場合、9.03 秒かかります。それは、他の方法で計算できますか?
事前にThx