Extract similar terms with Solr


I want to extract from a data set all the terms that are similar, and then ask Solr to return them. For example:

For an indexed set, how would I find out that BlackBerry and Nokia are two similar terms? Or, say, that the two terms have something in common?

Can this be obtained through Solr? These are not synonyms, but I need Solr to tell me how similar the terms are.

I'm not sure exactly what you are looking for, but you could check out Mahout.
Mahout provides support for topic modeling, which would help you cluster groups of related terms from your dataset.

A topic model is, roughly, a hierarchical Bayesian model that associates with each document a probability distribution over "topics", which are in turn distributions over words. For example, in a newswire collection, a topic might include words about "sports", such as "baseball", "home run", and "player", and a document about steroid use in baseball might include the topics "sports", "drugs", and "politics". Note that the labels "sports", "drugs", and "politics" are post-hoc labels assigned by a human; the algorithm itself only associates words with probabilities. The task of parameter estimation in these models is to learn both what the topics are and in what proportions each document uses them.
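In symbols, this amounts to a mixture over topics. The notation here is an assumption made for illustration and is not taken from the Mahout documentation: K is the number of topics, \theta_{d,k} is the proportion of topic k in document d, and \phi_{k,w} is the probability of word w under topic k.

```latex
p(w \mid d) \;=\; \sum_{k=1}^{K} p(w \mid z = k)\, p(z = k \mid d)
           \;=\; \sum_{k=1}^{K} \phi_{k,w}\, \theta_{d,k}
```

Two words count as "related" in this sense when some topic k gives both of them a high \phi_{k,w}.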

So if you have documents about mobile phones, you would find a group of words such as blackberry, iphone, mobile, and so on.
These may not be similar words, but they would be related to the same topic.
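
To make the idea concrete, here is a minimal sketch of a topic model in plain Python, using collapsed Gibbs sampling for LDA. It is not Mahout's implementation or API. The function names (`lda_gibbs`, `top_words`), the toy documents, and the hyperparameter values (`alpha`, `beta`, the number of iterations) are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling.

    docs is a list of token lists. Returns one dict per topic that maps
    every vocabulary word to its estimated probability under that topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k overall
    assignments = []                                            # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed word distribution for each topic.
    topics = []
    for k in range(num_topics):
        denom = topic_total[k] + vocab_size * beta
        topics.append({w: (topic_word[k][w] + beta) / denom for w in vocab})
    return topics


def top_words(topic, n=5):
    """Return the n most probable words of one topic."""
    ranked = sorted(topic.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]


# Toy corpus (made up for this example): three documents about phones,
# three about baseball.
docs = [
    "blackberry nokia iphone mobile phone battery".split(),
    "nokia mobile phone screen blackberry keyboard".split(),
    "iphone mobile app phone nokia".split(),
    "baseball player home run sports league".split(),
    "sports baseball steroid player season".split(),
    "player league season baseball sports".split(),
]

for k, topic in enumerate(lda_gibbs(docs, num_topics=2)):
    print(f"topic {k}: {top_words(topic)}")
```

Words that rank high within the same topic are the "related terms" described above. On a corpus this small the grouping depends on the random seed and the hyperparameters, so treat the output as illustrative only. For a real Solr index you would export the indexed text and run a proper implementation such as Mahout's over it.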
