Project Apero is a comprehensive web application—originally developed as a university project—designed to help individuals assess their stroke risk and receive personalized recommendations. It leverages data analysis, machine learning, and a conversational chatbot interface to provide actionable health insights.
- Project Description
- Screenshots
- Key Features
- Installation
- Data Overview
- Usage
- Project Structure
- Disclaimer
- Repository Visualization
- License
- Contact
Project Apero provides an intuitive interface for users to input personal health metrics—such as age, gender, BMI, blood glucose levels, etc.—and receive an estimated probability of experiencing a stroke. It then offers evidence-based recommendations and insights to help reduce stroke risk.
The application is built using:
- Streamlit for the web front end
- Scikit-Learn for machine learning model training and prediction
- Rasa for an integrated chatbot
- Docker for containerization and ease of deployment
Why "Apero"?
The project name is derived from a playful acronym referencing a healthy approach to “Assessing Personal Event Risks Online,” aiming to make preventative health measures more approachable.
- Personalized Stroke Risk Assessment
  - Input Health Metrics: Users provide personal data (e.g., age, gender, BMI, blood glucose levels, and more).
  - Risk Probability: The machine learning model calculates a stroke risk score (0.0 to 1.0).
  - Actionable Recommendations: Users receive suggestions tailored to their risk factors.
- Interactive Data Analysis & Visualization
  - Data Exploration: Filter and explore health-related metrics to reveal trends and correlations.
  - Visual Insights: Interactive charts, heatmaps, and summary statistics enhance comprehension of stroke risk factors.
- Conversational Chatbot Assistance
  - Real-Time Queries: A Rasa-powered chatbot answers questions about stroke risk, data insights, and wellness tips.
  - Guided Interaction: The chatbot can direct users to relevant pages, simplify data analysis steps, and deliver personalized feedback.
- Robust Data Handling and Augmentation
  - Outlier Detection: Statistical methods to detect and remove anomalies.
  - Synthetic Data Generation: Augments the dataset by ~30% to improve model performance and generalization.
- Docker: Make sure Docker is installed and running on your machine.
- Git: Required to clone the repository.
- Python 3.9+ (Optional): If you want to run or modify components without Docker.
- Clone the Repository
    git clone https://github.com/HlexNC/Project-Arepo.git
    cd Project-Arepo
- Build and Run Docker Services
    docker-compose up --build
  - This command builds Docker images, trains machine learning models, and starts all necessary services.
  - The initial startup may take several minutes as the environment sets up.
- Access the Web Application
  - Open a web browser and go to: http://localhost:8501
  - You will see the main page with navigation options for data analysis, personalized recommendations, and chatbot interaction.
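Because the first startup can take several minutes, it may help to poll the app until it responds. The following is a minimal sketch, assuming the Streamlit front end answers plain HTTP GET requests at the URL given above; the helper names `is_up` and `wait_for_app` are hypothetical and not part of the repository.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to `url` succeeds with a 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_for_app(url: str, attempts: int = 30, delay: float = 10.0, probe=is_up) -> bool:
    """Probe `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_for_app("http://localhost:8501")` after `docker-compose up --build` returns `True` once the page is reachable, or `False` if every probe fails.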
Tip
If you want to run or modify components without Docker, refer to the comments in the docker-compose.yml or optional instructions in the docs/ folder (if provided).
If you make changes to the chatbot’s training data or if the chatbot fails to respond:
- Open a terminal inside the Rasa server container
    docker exec -it rasa_server bash
- Train the Rasa Model
    rasa train
- Restart Services
    exit
    docker-compose down
    docker-compose up --build -d
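The retraining steps above could also be scripted. This is only a sketch: it assumes the container is named `rasa_server` as in the commands above, and that `rasa train` works when run non-interactively through `docker exec` from the container's default working directory. The function name `retrain_chatbot` is hypothetical.

```python
import subprocess

# Assumption: `rasa train` can run non-interactively inside the `rasa_server`
# container, replacing the interactive bash session described above.
RETRAIN_STEPS = [
    ["docker", "exec", "rasa_server", "rasa", "train"],
    ["docker-compose", "down"],
    ["docker-compose", "up", "--build", "-d"],
]


def retrain_chatbot(run=subprocess.run) -> None:
    """Run each step in order; check=True stops at the first failing command."""
    for command in RETRAIN_STEPS:
        print("Running:", " ".join(command))
        run(command, check=True)
```

Run `retrain_chatbot()` from the repository root so that `docker-compose` finds `docker-compose.yml`.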
We utilize the publicly available Stroke Prediction Dataset. This dataset includes key health metrics:
- Age, BMI, Glucose Levels
- Hypertension, Heart Disease
- Smoking Status, etc.
Note
This dataset is used strictly for educational and demonstration purposes.
- Outlier Detection: Employs a Z-score based method to filter anomalous data points.
- Synthetic Data Augmentation: Adds ~30% realistic synthetic data to boost model robustness and representation.
- Feature Scaling: Numerical features are normalized or standardized, aligning with best practices from Google’s ML Crash Course.
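The three preprocessing steps above can be sketched with NumPy. This is an illustration under stated assumptions, not the code in `data_preprocessor.py` or `data_augmentation.py`: the Z-score threshold of 3, the Gaussian jitter used to create synthetic rows, and all function names are assumptions, and the project may generate synthetic data differently.

```python
import numpy as np


def remove_outliers_zscore(X: np.ndarray, threshold: float = 3.0) -> np.ndarray:
    """Keep rows whose every feature is within `threshold` std devs of its column mean."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    z = np.abs((X - X.mean(axis=0)) / std)
    return X[(z < threshold).all(axis=1)]


def augment_with_jitter(X: np.ndarray, fraction: float = 0.3,
                        noise_scale: float = 0.05, seed: int = 0) -> np.ndarray:
    """Append round(fraction * len(X)) rows: resampled originals plus small Gaussian noise."""
    rng = np.random.default_rng(seed)
    n_new = int(round(fraction * len(X)))
    picks = rng.integers(0, len(X), size=n_new)
    noise = rng.normal(0.0, noise_scale * X.std(axis=0), size=(n_new, X.shape[1]))
    return np.vstack([X, X[picks] + noise])


def standardize(X: np.ndarray) -> np.ndarray:
    """Rescale each column to zero mean and unit variance."""
    std = X.std(axis=0)
    std[std == 0] = 1.0
    return (X - X.mean(axis=0)) / std


# Made-up stand-in data; columns: age, BMI, average glucose level
rng = np.random.default_rng(42)
X = rng.normal(loc=[50.0, 28.0, 105.0], scale=[15.0, 5.0, 30.0], size=(200, 3))
X[0] = [50.0, 95.0, 105.0]  # inject one implausible BMI value

X_clean = remove_outliers_zscore(X)
X_aug = augment_with_jitter(X_clean)
X_scaled = standardize(X_aug)
print(len(X), len(X_clean), len(X_aug))
```

Filtering before augmenting keeps the synthetic rows from being resampled from anomalous points, and scaling last means the final statistics reflect the data the model actually sees.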
- Navigate to "Personalized Recommendations"
  From the sidebar, select Personalized Recommendations.
- Provide Personal Health Metrics
  Enter details such as age, gender, BMI, glucose levels, hypertension status, etc.
- Receive Risk Probability & Recommendations
  The system displays a stroke risk (0.0 to 1.0) and personalized tips (e.g., diet, exercise, medical follow-up).
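A risk value between 0.0 and 1.0 is what a Scikit-Learn classifier's `predict_proba` returns for the positive class. The sketch below shows that idea on synthetic data. The feature set, the labelling rule, the `assess` helper, and the tip thresholds are all hypothetical and purely illustrative; they are not the project's model and not clinical guidance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-in data; columns: age, BMI, average glucose level, hypertension (0/1)
X = np.column_stack([
    rng.uniform(20, 90, n),
    rng.normal(28, 5, n),
    rng.normal(105, 30, n),
    rng.integers(0, 2, n),
])
# Made-up labelling rule so the toy model has a signal to learn
logit = 0.06 * (X[:, 0] - 60) + 0.01 * (X[:, 2] - 105) + 0.8 * X[:, 3] - 1.0
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X, y)


def assess(age: float, bmi: float, glucose: float, hypertension: int):
    """Return (risk probability in [0, 1], list of tips) for one person."""
    # Column 1 of predict_proba is the probability of the positive class
    risk = float(model.predict_proba([[age, bmi, glucose, hypertension]])[0, 1])
    tips = []
    # Illustrative thresholds only
    if bmi >= 30:
        tips.append("Discuss diet and exercise options for weight management.")
    if glucose >= 140:
        tips.append("Consider a medical follow-up on blood glucose.")
    if hypertension:
        tips.append("Keep blood pressure monitored with your doctor.")
    return risk, tips


risk, tips = assess(age=80, bmi=33, glucose=200, hypertension=1)
print(f"Estimated risk: {risk:.2f}")
for tip in tips:
    print("-", tip)
```

Wrapping the scaler and the classifier in one pipeline ensures that a single user's input is scaled with the same statistics that were learned during training.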
- Go to "Data Analysis"
  Choose Data Analysis from the sidebar.
- Explore and Visualize
  - Filter or query the dataset to inspect correlations and distributions.
  - View generated charts, tables, or heatmaps.
- Model Training & Evaluation
  Train or re-train the underlying machine learning models (Logistic Regression, Random Forest, SVM) within the app. Evaluate performance via metrics like Accuracy, Precision, Recall, etc.
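The training and evaluation step can be approximated outside the app as follows. This is a sketch on generated data, assuming default-ish hyperparameters; the app's actual settings, train/test split, and dataset handling are not shown in this README and may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Generated, imbalanced stand-in data (about 20% positives); not the stroke dataset
X, y = make_classification(n_samples=1000, n_features=8, n_informative=5,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
    }
    print(name, {metric: round(value, 3) for metric, value in results[name].items()})
```

With imbalanced labels, a model can score high accuracy while missing most positive cases, so precision and recall are worth reading alongside accuracy.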
- Select "Chatbot"
  Click on Chatbot in the sidebar to open the conversational assistant interface.
- Interact with the Rasa-Powered Chatbot
  - Ask questions about stroke risk or how to interpret certain metrics.
  - Get immediate recommendations and clarifications on data analysis results.
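The chatbot can in principle also be queried without the web interface through Rasa's REST channel. The sketch below assumes the Rasa server is exposed on its default port 5005 and that the REST channel is enabled in the Rasa credentials; neither is confirmed by this README, and the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Rasa listens on localhost:5005 with the REST channel enabled
RASA_WEBHOOK_URL = "http://localhost:5005/webhooks/rest/webhook"


def build_request(message: str, sender: str = "demo-user") -> urllib.request.Request:
    """Create the POST request the REST channel expects: a JSON body with sender and message."""
    payload = json.dumps({"sender": sender, "message": message}).encode("utf-8")
    return urllib.request.Request(
        RASA_WEBHOOK_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_replies(response_body: bytes) -> list:
    """Collect the text of each bot message from the JSON array in the response."""
    return [item["text"] for item in json.loads(response_body) if "text" in item]


def ask_chatbot(message: str, sender: str = "demo-user", timeout: float = 10.0) -> list:
    """Send one message to the chatbot and return its text replies."""
    with urllib.request.urlopen(build_request(message, sender), timeout=timeout) as response:
        return extract_replies(response.read())
```

For example, `ask_chatbot("How can I lower my stroke risk?")` returns a list of reply strings when the server is running. Reusing the same `sender` value across calls lets Rasa keep the conversation context.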
Below is a simplified overview of the repository layout:
Project-Arepo/
├── actions/
│ ├── actions.py # Custom Rasa actions
│ ├── Dockerfile
│ └── requirements-actions.txt
├── data/
│ ├── data_analysis.py
│ ├── data_augmentation.py
│ ├── data_loader.py
│ ├── data_preprocessor.py
│ ├── raw/
│ │ └── healthcare-dataset-stroke-data.csv
│ └── processed/
├── src/
│ ├── app.py # Streamlit main entry point
│ ├── chatbot/
│ │ └── rasa_chatbot.py
│ └── web/
│   ├── home.py
│   ├── recommendations.py
│   ├── chatbot_page.py
│   └── data_analysis_page.py
├── models/
├── docs/
│ ├── img/
│ │ └── ASP_Banner.png
│ └── ...
├── docker-compose.yml
├── Dockerfile
├── requirements.txt
├── LICENSE
└── README.md
Warning
Project Apero is not a substitute for professional medical advice, diagnosis, or treatment.
The stroke risk assessments and recommendations are for educational and research purposes only. Always seek the advice of qualified healthcare professionals for any medical concerns.
Use of this application is entirely at your own risk, and the developers assume no liability for any actions taken based on its output.
The repository includes an automatically generated structure diagram, which is updated whenever changes are pushed to the main branch.
This project is licensed under the GNU General Public License v3.
See the LICENSE file for details.
For inquiries, feedback, or suggestions:
- GitHub Issues: Project Apero Repository
- Email: Please refer to the repository maintainers’ profiles.
We welcome contributions! Feel free to open a pull request or start a discussion in our GitHub repository.