This project implements a linear regression model to predict house prices based on various features.
The project contains the following files and directories:
- data/: Directory containing the dataset file(s).
- lin_utils.py: Utility module with functions for training and testing the linear regression model.
- train.py: Script to train the linear regression model.
- test.py: Script to test the trained model on new data.
- Python 3.x
- pandas 1.2.4
- numpy 1.23.5
- matplotlib 3.5.1
- Model Training: Run the train.py script to prepare the training data and train the linear regression model:
  python train.py
- Model Testing: After training the model, run the test.py script to test the trained model on new data and evaluate its performance:
  python test.py
The project expects the dataset file data.csv to be present in the data/ directory. The dataset should include the following columns:
- sqft_lot: Square footage of the lot.
- sqft_living: Square footage of the living area.
- bathrooms: Number of bathrooms.
- bedrooms: Number of bedrooms.
- condition: Condition rating of the house.
- price: Sale price of the house.
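Before training, it can help to confirm that the dataset has the columns listed above. The following is a minimal sketch, not part of the project's code: the names REQUIRED_COLUMNS, missing_columns, and check_dataset are hypothetical, and the check uses only the standard library so it reads just the header row of data/data.csv.

```python
import csv
from pathlib import Path

# Column names taken from the list in this README.
REQUIRED_COLUMNS = [
    "sqft_lot",
    "sqft_living",
    "bathrooms",
    "bedrooms",
    "condition",
    "price",
]


def missing_columns(header):
    """Return the required column names that are absent from `header`."""
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


def check_dataset(path="data/data.csv"):
    """Read only the header row of the CSV and raise if a column is missing."""
    with Path(path).open(newline="") as handle:
        header = next(csv.reader(handle), [])
    missing = missing_columns(header)
    if missing:
        raise ValueError(f"{path} is missing required columns: {missing}")
    return header
```

Calling check_dataset() from the project root either returns the header row or raises a ValueError naming the missing columns. Extra columns in the file are ignored.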
The dataset was originally sourced from Kaggle and is available for download here.
After training for 50 iterations with an initial learning rate of 2.0 and a momentum coefficient of 0.9, the model achieved a Mean Percent Difference of 28.9% against the true values in the training set and 27.1% on the test set (as of 05/29/2023).
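To illustrate the training setup described above, here is a rough sketch of gradient descent with momentum for linear regression, written with NumPy. It is not the code in lin_utils.py, and several things are assumptions: the function names (add_bias, train_momentum, predict, mean_percent_difference) are hypothetical, the features are assumed to be scaled already, and Mean Percent Difference is assumed to mean the average of |predicted - actual| / |actual|, expressed as a percentage. The README gives 2.0 as an initial learning rate without describing how it changes during training, so the sketch uses a smaller constant rate by default.

```python
import numpy as np


def add_bias(X):
    """Prepend a column of ones so the first weight acts as the intercept."""
    X = np.asarray(X, dtype=float)
    return np.hstack([np.ones((X.shape[0], 1)), X])


def train_momentum(X, y, iterations=50, learning_rate=0.1, momentum=0.9):
    """Fit linear-regression weights by gradient descent with momentum.

    Minimizes half the mean squared error. X should be feature-scaled;
    the learning rate is held constant (an assumption of this sketch).
    """
    Xb = add_bias(X)
    y = np.asarray(y, dtype=float)
    weights = np.zeros(Xb.shape[1])
    velocity = np.zeros_like(weights)
    for _ in range(iterations):
        residual = Xb @ weights - y
        gradient = Xb.T @ residual / len(y)
        # The velocity accumulates a decaying sum of past gradient steps.
        velocity = momentum * velocity - learning_rate * gradient
        weights = weights + velocity
    return weights


def predict(X, weights):
    """Return predictions for X using weights from train_momentum."""
    return add_bias(X) @ weights


def mean_percent_difference(predicted, actual):
    """Average absolute difference relative to the true value, in percent."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)) * 100.0)
```

With these helpers, a training-set score would be computed as mean_percent_difference(predict(X_train, weights), y_train), and likewise for the test set. Because the metric divides by the true value, it is only defined when no true price is zero.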
Contributions to this project are welcome. If you have any suggestions, improvements, or bug fixes, please submit a pull request.
This project is licensed under the MIT License.