Spam Mail Detection w/Tensorflow (DistilBERT)

(Kaggle link: https://www.kaggle.com/code/banddaniel/spam-mail-detection-w-tensorflow-distilbert)

I tried to detect spam mail by fine-tuning a DistilBERT-based TensorFlow model.

  • I applied several preprocessing operations (cleaning, dropping stop words),
  • I used a tf.data pipeline for efficient training,
  • I used a maximum sequence length of only 20 (BERT models support input lengths up to 512),
  • Only 18000 samples were used for training (12000 samples for validation and 20000 samples for testing).
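
The cleaning, stop-word removal, and 18000/12000/20000 split listed above could look roughly like the sketch below. It is a minimal illustration and not the notebook's actual code: the names `STOP_WORDS`, `clean_text`, and `split_samples` are hypothetical, the stop-word list is a tiny stand-in, and the exact cleaning rules are assumptions.

```python
import random
import re

# Hypothetical, deliberately tiny stop-word list; the notebook may use a
# fuller list (for example from NLTK), which this README does not show.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and",
              "in", "for", "on", "you", "your"}


def clean_text(text: str) -> str:
    """Lowercase, strip URLs and non-letter characters, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)  # keep letters and whitespace only
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


def split_samples(samples, n_train=18000, n_val=12000, n_test=20000, seed=42):
    """Shuffle the samples and cut them into train/validation/test lists."""
    if len(samples) < n_train + n_val + n_test:
        raise ValueError("not enough samples for the requested split")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:n_train + n_val + n_test]
    return train, val, test
```

For example, `clean_text("You WON a FREE prize!!! Visit http://spam.example.com now")` returns `"won free prize visit now"`. The cleaned strings would then be tokenized (with the maximum length of 20 mentioned above) before being fed to the model.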

My Other Projects

References

  1. https://towardsdatascience.com/hugging-face-transformers-fine-tuning-distilbert-for-binary-classification-tasks-490f1d192379
  2. https://www.kaggle.com/code/preatcher/emotion-detection-by-using-bert