This project is a Streamlit application that classifies an SMS message as spam or non-spam using a pre-trained deep learning model. The model was trained on SMS data and relies on tokenization and word embeddings for prediction. The interface lets users enter a message and view the predicted label.
- Interactive Interface: Enter SMS messages to classify them as spam or non-spam.
- Real-time Predictions: Displays the predicted label along with the confidence score.
- Visualization: Pie chart showing the proportion of spam vs non-spam predictions.
- Prediction History: Maintains a log of all predictions made during the session.
- Custom Styling: User-friendly interface with CSS enhancements.
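As a rough illustration of how a predicted label, its confidence score, and the session log could fit together, the sketch below assumes the model outputs a single spam probability between 0 and 1 and that 0.5 is the decision threshold. Neither detail is stated here, the function name `label_with_confidence` is hypothetical, and the probabilities are made-up values rather than real model output.

```python
SPAM_THRESHOLD = 0.5  # assumed cut-off; the app may use a different value


def label_with_confidence(spam_probability, threshold=SPAM_THRESHOLD):
    """Map a spam probability to a (label, confidence) pair."""
    if not 0.0 <= spam_probability <= 1.0:
        raise ValueError("spam_probability must be within [0, 1]")
    if spam_probability >= threshold:
        return "spam", spam_probability
    # For the non-spam class, confidence is the complement of the spam probability.
    return "non-spam", 1.0 - spam_probability


# Illustrative probabilities only; in the app these would come from the model.
history = []
for message, probability in [("WIN a free prize now", 0.97), ("See you at 6?", 0.08)]:
    label, confidence = label_with_confidence(probability)
    history.append({"message": message, "label": label, "confidence": confidence})
```

Keeping each prediction as a small dictionary makes the log easy to show as a table and to count per label for the pie chart.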
You can try the application live:
- Docker Deployment on Hugging Face Spaces: SMS Spam Detector - Docker
- Streamlit Cloud Deployment: SMS Spam Detector - Streamlit
- Python 3.8 or later
- pip
- A GPU-enabled machine (optional but recommended for TensorFlow)
- Clone the repository:
      git clone https://github.com/<your-username>/sms-spam-detector.git
      cd sms-spam-detector
- Create a virtual environment:
      python -m venv env
      source env/bin/activate  # On Windows: .\env\Scripts\activate
- Install dependencies:
      pip install -r requirements.txt
- Download the pre-trained model and tokenizer:
  - Place the model_wordembed.keras file in the models/ directory.
  - Place the tokenizer_word_index.npy file in the same directory.
- Run the Streamlit app:
      streamlit run app.py
- Access the app in your browser at http://localhost:8501.
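Before launching the app, it can help to confirm that both files are where they are expected. The sketch below is one possible check, followed by a loader. It assumes that tokenizer_word_index.npy holds a pickled Python dict saved with `numpy.save`, which this README does not confirm, and the helper names `missing_artifacts` and `load_artifacts` are hypothetical.

```python
from pathlib import Path

MODELS_DIR = Path("models")
REQUIRED_FILES = ("model_wordembed.keras", "tokenizer_word_index.npy")


def missing_artifacts(models_dir=MODELS_DIR):
    """Return the names of required files that are not present in models_dir."""
    return [name for name in REQUIRED_FILES if not (Path(models_dir) / name).is_file()]


def load_artifacts(models_dir=MODELS_DIR):
    """Load the Keras model and the word index from models_dir."""
    missing = missing_artifacts(models_dir)
    if missing:
        raise FileNotFoundError(f"Missing in {models_dir}: {', '.join(missing)}")

    # Imported lazily so the file check above works without TensorFlow installed.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.load_model(str(Path(models_dir) / "model_wordembed.keras"))
    # Assumption: the .npy file wraps a pickled dict, so .item() unwraps it.
    word_index = np.load(
        Path(models_dir) / "tokenizer_word_index.npy", allow_pickle=True
    ).item()
    return model, word_index
```

Calling `missing_artifacts()` from the repository root returns an empty list once both files are in place.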
├── app.py # Main Streamlit application
├── models/ # Directory for the model and tokenizer
│ ├── model_wordembed.keras
│ └── tokenizer_word_index.npy
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Start the application by running streamlit run app.py.
- Enter a message in the text box.
- Click the "Predict" button to view the classification result.
- Check the visualization for the proportion of predictions.
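The flow above (text box, "Predict" button, pie chart) could be wired up roughly as follows. This is a sketch and not the contents of app.py: `render_app`, `pie_inputs`, and the `predict` callable are hypothetical names, and `predict` is assumed to take a message and return a `(label, confidence)` pair.

```python
def pie_inputs(history):
    """Return (names, values) for a spam vs non-spam pie chart."""
    labels = [entry["label"] for entry in history]
    names = ["spam", "non-spam"]
    return names, [labels.count(name) for name in names]


def render_app(predict):
    """Draw the page; `predict` is a hypothetical callable: str -> (label, confidence)."""
    # Imported here so pie_inputs can be used without the UI libraries installed.
    import plotly.express as px
    import streamlit as st

    st.title("SMS Spam Detector")

    # st.session_state survives Streamlit's reruns, so the log lasts for the session.
    if "history" not in st.session_state:
        st.session_state["history"] = []
    history = st.session_state["history"]

    message = st.text_area("Enter an SMS message")
    if st.button("Predict") and message.strip():
        label, confidence = predict(message)
        history.append({"message": message, "label": label, "confidence": confidence})
        st.write(f"Prediction: {label} (confidence {confidence:.2%})")

    if history:
        names, values = pie_inputs(history)
        st.plotly_chart(px.pie(names=names, values=values, title="Spam vs non-spam"))
        st.dataframe(history)
```

Streamlit reruns the whole script on every interaction, which is why the history is kept in `st.session_state` and not in a plain module-level list.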
- Framework: Streamlit
- Machine Learning: TensorFlow, Keras
- Visualization: Plotly
- NLP: Tokenizer, Embedding layers
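To make the tokenizer step concrete, the sketch below shows one way a message could be turned into the fixed-length sequence of integer ids that an embedding layer consumes. It assumes a Keras-style word index (words mapped to positive integers, 0 reserved for padding), lowercasing, dropping of unknown words, and left padding and truncation. The real preprocessing must match whatever was used in training, and `text_to_padded_ids`, `MAX_LEN`, and the toy vocabulary are illustrative only.

```python
import re

MAX_LEN = 10  # placeholder; must match the sequence length used in training


def text_to_padded_ids(text, word_index, max_len=MAX_LEN):
    """Convert a message to a fixed-length list of word ids (0 = padding)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Words missing from the vocabulary are skipped (an assumption).
    ids = [word_index[token] for token in tokens if token in word_index]
    ids = ids[-max_len:]                     # keep only the last max_len ids
    return [0] * (max_len - len(ids)) + ids  # left-pad with zeros


toy_index = {"free": 1, "prize": 2, "call": 3, "now": 4}  # not the real vocabulary
print(text_to_padded_ids("Call now for a FREE prize!", toy_index, max_len=6))
# prints [0, 0, 3, 4, 1, 2]
```

The embedding layer then maps each id to a dense vector, with id 0 acting as the padding slot.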
- Add support for additional languages.
- Include training scripts for fine-tuning the model.
- Enhance visualizations with detailed analytics.
- Allow users to choose between multiple pre-trained models within the application.
This project is licensed under the MIT License - see the LICENSE file for details.
- Streamlit: Licensed under the Apache 2.0 License. For details, see Streamlit's GitHub repository.
- TensorFlow: Licensed under the Apache 2.0 License. For details, see TensorFlow's license.
- Plotly: Licensed under the MIT License. For details, see Plotly's GitHub repository.