Second work of the UB course "Introduction to Machine Learning", implementing classification with Lazy Learning and SVM
Eva Veli, Andras Kasa and Niklas Long Schiefelbein
- PyCharm IDE (Professional or Community Edition)
- Python 3.9 installed on your system
- Open the project work2 in PyCharm
- Open the terminal in PyCharm (View > Tool Windows > Terminal)
- Optional: Verify that the current location is work2 with pwd
- Optional: Navigate to work2 with cd
- Create a virtual environment:
# Windows
py -3.9 -m venv venv
# macOS/Linux
python3.9 -m venv venv
- Activate the virtual environment:
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
The terminal prompt should now be prefixed with (venv).
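The (venv) prefix is only a visual hint. As a rough cross-check, the following sketch asks the interpreter itself whether it runs inside a virtual environment and whether it is a Python 3.9 interpreter, as listed in the prerequisites. The function names are illustrative and not part of the project.

```python
import sys

EXPECTED_VERSION = (3, 9)  # taken from the prerequisites of this README


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def matches_expected_version(version_info=sys.version_info,
                             expected=EXPECTED_VERSION) -> bool:
    """Compare only the major and minor version numbers."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python 3.9 interpreter:", matches_expected_version())
```

Run from the activated terminal, both lines are expected to report True.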
With the virtual environment activated, install the dependencies:
pip install -r requirements.txt
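To confirm that the installation succeeded, one option is to compare the entries in requirements.txt against the distributions visible to the active interpreter. The sketch below is a minimal version of that idea. It assumes a simple requirements file with one package per line, and the helper names are hypothetical rather than taken from the project.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract bare distribution names from simple requirements lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):   # skip blanks and pip options
            continue
        # Keep only the leading name, stopping at version specifiers or extras.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_distributions(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Calling `missing_distributions(requirement_names(open("requirements.txt")))` from the work2 directory should return an empty list once every dependency is installed.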
From here you can directly jump to Run app.py
With the virtual environment activated, run the following to deactivate it:
deactivate
The (venv) prefix in front of the terminal prompt should now be gone.
To get to the work2 directory, follow the optional steps 3 and 4 from the Manual Virtual Environment Setup. Then activate the virtual environment:
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
The terminal prompt should now be prefixed with (venv).
python app.py
The first execution takes longer than usual because the whole project is compiled first. Once compiled, the program prompts the user to choose either the hepatitis or the pen-based dataset for the analysis. Pressing Enter without typing anything selects the hepatitis dataset by default.
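The prompt behaviour described above could look roughly like the following sketch. The prompt text, the accepted spellings, and the function names are assumptions made for illustration; the actual logic in app.py may differ.

```python
DATASETS = ("hepatitis", "pen-based")
DEFAULT_DATASET = "hepatitis"  # selected when the user just presses Enter


def resolve_dataset_choice(raw: str) -> str:
    """Map raw user input to a dataset name, falling back to the default."""
    choice = raw.strip().lower()
    if choice == "":
        return DEFAULT_DATASET
    if choice in DATASETS:
        return choice
    raise ValueError(f"unknown dataset {raw!r}, expected one of {DATASETS}")


def prompt_for_dataset(read=input) -> str:
    """Ask once for a dataset; `read` is injectable so it can be tested."""
    # Hypothetical prompt wording, not copied from the project.
    return resolve_dataset_choice(read("Dataset [hepatitis/pen-based]: "))
```

Keeping the parsing in `resolve_dataset_choice` separate from the call to `input` makes the default easy to check without an interactive terminal.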
Now the entire project pipeline executes, including data preprocessing, KNN and SVM analyses, various reduction techniques, and final report generation. Progress is printed to the console, but because of the frequent calculations and multithreading, the output can be hard to follow in real time. It is recommended to refer to the final reports for evaluation. The program is finished once the Nemenyi test report is generated.
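For readers unfamiliar with lazy learning, the following sketch shows the core idea behind a KNN classifier: nothing is fitted in advance, and all work happens when a query arrives. It is a generic illustration in plain Python, assuming Euclidean distance and an unweighted majority vote. It is not the implementation found in classifiers/, whose distance metrics and voting schemes may differ.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Lazy learning: the training data is only stored, and distances are
    # computed at query time.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data made up for this example: two well-separated clusters.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_X, train_y, (0.5, 0.5)))  # prints "a"
print(knn_predict(train_X, train_y, (5.5, 5.5)))  # prints "b"
```

Because every prediction scans the whole training set, the cost grows with the number of stored instances, which is the usual motivation for instance reduction techniques.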
For deeper insights, please refer to the project report.
work2/
├── classifiers/ # SVM and KNN classifiers
├── csv-results/ # Performance metrics and results
├── datasetsCBR/ # Dataset files
├── metrics/ # Performance metric calculations
├── preprocessing/ # Data preprocessing scripts
├── reduction_techniques/ # Instance reduction algorithms
├── reporting/ # Reporting and analysis scripts
├── reports/ # Generated reports
├── venv/ # Virtual environment
├── app.py # Main application script
├── README.md # This file
├── requirements.txt # Dependencies
└── utils.py # Utility functions
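If app.py fails to start, a quick way to rule out an incomplete checkout is to compare the working directory against the layout above. The sketch below checks only the source entries from the tree. Leaving out venv/, csv-results/ and reports/ is an assumption, on the grounds that they may be created or filled later. The helper name is illustrative.

```python
from pathlib import Path

# Source entries taken from the project tree in this README.
EXPECTED_ENTRIES = (
    "classifiers",
    "datasetsCBR",
    "metrics",
    "preprocessing",
    "reduction_techniques",
    "reporting",
    "app.py",
    "requirements.txt",
    "utils.py",
)


def missing_entries(root="."):
    """Return the expected files and folders that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    print("missing:", missing_entries("."))
```

Run from inside work2, an empty list means all listed entries are present.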