Can Flair be used to train a classifier with data from only one class to predict the likelihood that new text belongs to that class? I currently use a two-class classifier that differentiates between my target documents and a random assortment of Wikipedia articles as the second class. However, this method seems wrong, as it requires generating an exhaustive list of counterexamples. Would modeling this as an anomaly detection problem be more appropriate?
Hi @quantarb
Flair doesn't support an anomaly detection model. I think the two-class approach is already a good solution if you combine it with a sampling strategy:
1. Train a classifier with all the positive examples you have plus a few negatives that you have chosen by hand.
2. Predict the whole corpus, or a subset that is large enough. Sort by the model's confidence (highest confidence for anomaly first) and manually label the first N (I would take around 100) as anomaly/not-anomaly.
3. If the newly labeled examples contain too many not-anomalies, start at step 1 again.
However, if you don't find that sufficient, I suppose you will be happier with approaches that are not supported here, and you may need to do more research.
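The ranking and stopping check in the sampling strategy above can be sketched in plain Python. This is only an illustration under assumptions: `anomaly_score` is a hypothetical stand-in for whatever confidence your trained classifier returns for the anomaly class, the helper names are made up for this example, and the 20% threshold for "too many not-anomalies" is an arbitrary choice rather than a recommendation.

```python
from typing import Callable, List, Tuple


def top_candidates(
    texts: List[str],
    anomaly_score: Callable[[str], float],  # hypothetical: wraps your trained classifier
    n: int = 100,
) -> List[Tuple[float, str]]:
    """Score every text and return the n with the highest anomaly confidence."""
    scored = [(anomaly_score(text), text) for text in texts]
    # Highest confidence first, so the most likely anomalies are reviewed first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def needs_another_round(manual_labels: List[bool], max_false_rate: float = 0.2) -> bool:
    """Return True if too many reviewed candidates turned out not to be anomalies.

    manual_labels holds one bool per reviewed text (True = really an anomaly).
    max_false_rate is an assumed threshold; tune it for your data.
    """
    if not manual_labels:
        return False
    not_anomalies = sum(1 for is_anomaly in manual_labels if not is_anomaly)
    return not_anomalies / len(manual_labels) > max_false_rate
```

In use, you would pass the candidates returned by `top_candidates` to a human reviewer, add the newly labeled texts to the training set, and retrain whenever `needs_another_round` returns `True`.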