Home
Shiv Prasad edited this page Jan 19, 2021 · 1 revision
For a given text, retrieve the associated topic and the top N similar texts using a topic modelling approach (Latent Dirichlet Allocation, LDA).
For a given set of documents:
- Find the ideal model parameters for topic modelling (LDA), i.e. the number of topics and the learning decay.
- Generate the document-word matrix with the weight of each word.
- Generate the topic-word matrix, limiting the number of words kept for each topic.
Predict:
- For a given text, retrieve the best topic.
- Get the dominant word in the predicted topic.
- The dominant word serves as the topic tag.
Get similar documents:
- For a given text, compute its distance to every document.
- Return the top N documents with the smallest distance.
- Predict a topic (the dominant word associated with the topic) for a given text.
- Find the N most similar texts for a given text and set of documents.
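
The similar-document lookup (distance to every document, then the top N) could look like the sketch below. The page does not say which distance is used or in which space; Euclidean distance between document-topic vectors is an assumption here, and `similar_documents` and the toy corpus are hypothetical.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import euclidean_distances

# Hypothetical toy corpus and a small fitted model.
docs = [
    "the cat sat on the mat",
    "dogs and cats are popular pets",
    "stock markets fell as investors sold shares",
    "investors bought shares when markets rose",
]
vectorizer = CountVectorizer(stop_words="english")
lda = LatentDirichletAllocation(n_components=2, random_state=0)
# Document-topic matrix: one topic distribution per training document.
doc_topic = lda.fit_transform(vectorizer.fit_transform(docs))


def similar_documents(text, top_n=3):
    """Return the top_n (document, distance) pairs closest to text."""
    query = lda.transform(vectorizer.transform([text]))
    # Assumption: Euclidean distance in topic space.
    distances = euclidean_distances(query, doc_topic)[0]
    nearest = np.argsort(distances)[:top_n]
    return [(docs[i], float(distances[i])) for i in nearest]


results = similar_documents("investors traded shares", top_n=2)
for doc, dist in results:
    print(f"{dist:.3f}  {doc}")
```

Other measures, such as cosine or Jensen-Shannon distance, can be swapped in by replacing the `euclidean_distances` call.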