I have a textarea with unfiltered user input, which includes line breaks, spaces, punctuation marks, etc. I would like to get all the distinct lowercased words and their occurrence counts, sorted by count. I haven't found a straightforward way to extract the words when the characters to strip out vary. Any ideas how to achieve this?
For example:
WORD1 Word2 word1 Word1, ...
word2 HELLO ...
. . hello .hi
would become
var array = {
word1 : 3,
word2 : 2,
hello : 2,
hi : 1
};
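To make the goal concrete, here is a rough sketch of the kind of thing I'm after, assuming JavaScript and assuming a "word" is any run of ASCII letters and digits (so every other character counts as a separator). The name countWords is just a placeholder, and it returns sorted [word, count] pairs rather than an object so the ordering is explicit. It may well not be the best way to do this:

```javascript
// Hypothetical helper: counts lowercased words in free-form text.
function countWords(text) {
  // Assumption: a word is a run of ASCII letters/digits; anything else is a separator.
  const words = text.toLowerCase().match(/[a-z0-9]+/g) || [];

  // Tally each word; a Map remembers the order in which words were first seen.
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }

  // Sort by count, highest first. The sort is stable in modern engines,
  // so words with equal counts stay in first-seen order.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const sample = "WORD1 Word2 word1 Word1, ...\nword2 HELLO ...\n. . hello .hi";
const result = countWords(sample);
console.log(result);
```

With a real textarea, the input would come from its value property instead of the sample string. Accented or other non-ASCII letters would need a wider pattern than the one assumed here.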
Thanks for your help!