[Question]: Anomaly Detection / One Class Classification #3411

quantarb · 2024-02-27T19:14:30Z

Question

Can Flair be used to train a classifier with data from only one class to predict the likelihood that new text belongs to that class? I currently use a two-class classifier that differentiates between my target documents and a random assortment of Wikipedia articles as the second class. However, this method seems wrong, as it requires generating an exhaustive list of counterexamples. I think modeling this as an anomaly detection problem would be more appropriate?

helpmefindaname · 2024-03-01T09:50:30Z

Hi @quantarb
Flair doesn't support an anomaly detection model. I think the two-class approach is already a good solution if you combine it with a sampling strategy:

1. Train a classifier with all the positive examples you have plus a few negatives that you have chosen by hand.
2. Predict on the whole corpus, or on a subset that is large enough. Sort by the model's confidence (highest confidence for anomaly first) and manually label the first N (I would take around 100) as anomaly/not-anomaly.
3. If the newly labeled examples contain too many not-anomalies, start at step 1 again.
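
The selection and stopping logic in the steps above could be sketched roughly as follows. This is only an illustration and not Flair code: `select_for_review`, `needs_another_round`, and `toy_score` are hypothetical names, the 0.2 threshold for "too many not-anomalies" is an arbitrary assumption, and `toy_score` is a stand-in for the confidence of whatever classifier you trained in step 1. The sketch also assumes that the manually labeled examples are added to the training data before retraining, which the steps do not state explicitly.

```python
from typing import Callable, List, Tuple


def select_for_review(
    texts: List[str],
    score_anomaly: Callable[[str], float],
    n: int = 100,
) -> List[Tuple[str, float]]:
    """Return the n texts with the highest anomaly confidence, highest first."""
    scored = [(text, score_anomaly(text)) for text in texts]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def needs_another_round(
    manual_labels: List[bool],
    max_false_alarm_rate: float = 0.2,  # assumed threshold, tune for your data
) -> bool:
    """manual_labels[i] is True if the reviewer confirmed item i as an anomaly.

    Returns True when the share of not-anomalies among the reviewed items
    exceeds the threshold, i.e. when the classifier should be retrained.
    """
    if not manual_labels:
        return False
    false_alarms = sum(1 for is_anomaly in manual_labels if not is_anomaly)
    return false_alarms / len(manual_labels) > max_false_alarm_rate


def toy_score(text: str) -> float:
    """Stand-in scorer: fraction of words outside a tiny 'target' vocabulary.

    Replace this with the anomaly-class confidence of a trained classifier.
    """
    known = {"stock", "bond", "yield", "market"}
    words = text.lower().split()
    if not words:
        return 1.0
    return sum(word not in known for word in words) / len(words)
```

In practice you would call `select_for_review` on the unlabeled corpus, label the returned items by hand, and retrain while `needs_another_round` keeps returning True.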

However, if you don't find that sufficient, I suppose you will be happier with approaches that are not supported here, and you might need to do more research.

quantarb added the "question" label (Further information is requested) on Feb 27, 2024

jeffpicard mentioned this issue Jul 8, 2024

[Feature]: Anomaly Detection / One Class Classification #3496

Open
