I have a list of bigrams for each sentence and a separate list of relevant bigrams (relevantbigrams). If any of the relevant bigrams occurs in a sentence, I want to return that sentence. I was thinking of implementing it as follows: map each bigram to the sentence it comes from, then look up each relevant bigram as a key and return the corresponding value.
Example:
relevantbigrams = List("This is", "is not", "not what")
bigrams = List(List("This of", "of no", "no the"), List("not what", "what is"))
Each inner list holds the bigrams of a separate sentence. Here "not what" from the second sentence matches, so I would like to return the second sentence. I am planning to build a map such as Map("This of" -> "This of no the", "of no" -> "This of no the", "not what" -> "not what is", ...) and return the sentences that match a relevant bigram, so here I would return "not what is".
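The mapping described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `tokenize` is a hypothetical whitespace tokenizer standing in for whatever tokenizer is actually used, and `BigramLookup`, `bigramToSentences` and `matchingSentences` are illustrative names. Because the same bigram can occur in more than one sentence, the sketch maps each bigram to a list of sentences rather than to a single one.

```scala
object BigramLookup {
  // Hypothetical tokenizer: splits on whitespace (an assumption, not the original Tokenize).
  def tokenize(sentence: String): Vector[String] =
    sentence.trim.split("\\s+").toVector.filter(_.nonEmpty)

  // All adjacent word pairs of a sentence, each joined by a single space.
  def bigrams(sentence: String): List[String] =
    tokenize(sentence).sliding(2).collect { case Vector(a, b) => s"$a $b" }.toList

  // Map every bigram to the (distinct) sentences it occurs in.
  def bigramToSentences(sentences: List[String]): Map[String, List[String]] =
    sentences
      .flatMap(s => bigrams(s).map(b => b -> s))
      .groupBy(_._1)
      .map { case (b, pairs) => b -> pairs.map(_._2).distinct }

  // Sentences that contain at least one relevant bigram, without duplicates.
  def matchingSentences(relevant: List[String], sentences: List[String]): List[String] = {
    val lookup = bigramToSentences(sentences)
    relevant.flatMap(b => lookup.getOrElse(b, Nil)).distinct
  }
}
```

With the example data, BigramLookup.matchingSentences(List("This is", "is not", "not what"), List("This of no the", "not what is")) yields List("not what is").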
This is my code:
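// Map each bigram to the tweet it comes from (assumes Tokenize returns a Vector[String])
val bigram = usableTweets.flatMap(x => Tokenize(x).sliding(2).collect { case Vector(a, b) => (a + " " + b) -> x }).toMap
// Collect the tweet for every relevant bigram that is present; fall back to the first tweet
val matches = relevantbigram.toList.flatMap(b => bigram.get(b))
if (matches.nonEmpty) matches else List(usableTweets.head)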
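The same result might also be reached without building an intermediate map, by keeping the relevant bigrams in a Set and filtering the sentences directly. The following is a hedged sketch, not a definitive implementation: `tokenize` is again a hypothetical whitespace tokenizer, and `BigramFilter`, `containsRelevant` and `filterSentences` are names made up for illustration.

```scala
object BigramFilter {
  // Hypothetical tokenizer: splits on whitespace (an assumption, not the original Tokenize).
  def tokenize(sentence: String): Vector[String] =
    sentence.trim.split("\\s+").toVector.filter(_.nonEmpty)

  // True if any adjacent word pair of the sentence is one of the relevant bigrams.
  def containsRelevant(sentence: String, relevant: Set[String]): Boolean =
    tokenize(sentence).sliding(2).exists {
      case Vector(a, b) => relevant.contains(s"$a $b")
      case _            => false // sentences with fewer than two tokens have no bigram
    }

  // Keep only the sentences that contain at least one relevant bigram.
  def filterSentences(sentences: List[String], relevant: Set[String]): List[String] =
    sentences.filter(s => containsRelevant(s, relevant))
}
```

This keeps every matching sentence in its original order, and a sentence is never lost when two sentences happen to share a bigram.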