Task:
Using Automated Surface Observing System (ASOS) data from NCDC, build a supervised classification algorithm to predict frozen vs. liquid precipitation type.
Notebooks:
DataProcessing: Parses the METAR data and stores readings for all years in a single CSV file.
LogisticRegression: Baseline model; scikit-learn's LogisticRegression with default parameters.
GradientBoostedDecisionTree: Model built with scikit-learn's GradientBoostingClassifier, including hyperparameter optimisation.
RandomForest: Model built with scikit-learn's RandomForestClassifier, including hyperparameter optimisation.
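As a rough illustration of how the modelling notebooks fit together, the sketch below trains a default-parameter logistic regression baseline and runs a small grid search over a random forest. It is not the notebooks' actual code: the synthetic data from `make_classification`, the parameter grid, and the choice of `GridSearchCV` with `neg_brier_score` scoring are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the processed METAR features (assumption, not the real data).
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: logistic regression with default parameters.
baseline = LogisticRegression().fit(X_train, y_train)

# Tuned model: a small illustrative grid (hypothetical search space).
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    scoring="neg_brier_score",  # select by (negated) Brier score
    cv=3,
)
search.fit(X_train, y_train)

# Predicted probability of the positive class on the held-out test split.
p_baseline = baseline.predict_proba(X_test)[:, 1]
p_forest = search.best_estimator_.predict_proba(X_test)[:, 1]
```

The same pattern would apply to `GradientBoostingClassifier` by swapping the estimator and its grid.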
Results:
(All results are for test data)
Baseline model Brier Skill Score: 0.8607
Gradient Boosted Decision Tree Brier Skill Score: 0.9821
Random Forest Brier Skill Score: 0.9845
The best model was the Random Forest, with an improvement of 0.1238 in Brier Skill Score over the baseline model!
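For readers unfamiliar with the metric, here is a minimal sketch of how a Brier Skill Score can be computed. It assumes the common definition BSS = 1 - BS / BS_ref, with the reference forecast being the observed base rate (climatology); the reference forecast used in the notebooks is not stated here, so treat that choice and the function names as assumptions.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(y_prob, y_true)) / len(y_true)


def brier_skill_score(y_true, y_prob, y_ref=None):
    """BSS = 1 - BS / BS_ref; 1 is a perfect forecast, 0 matches the reference."""
    if y_ref is None:
        # Assumed reference: always forecast the observed base rate.
        base_rate = sum(y_true) / len(y_true)
        y_ref = [base_rate] * len(y_true)
    return 1.0 - brier_score(y_true, y_prob) / brier_score(y_true, y_ref)


# Toy example: 1 = frozen, 0 = liquid (made-up values, not project data).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.1, 0.8, 0.2]
bss = brier_skill_score(y_true, y_prob)  # approximately 0.9
```

Under this definition, a higher score means the model's probabilities improve more on the reference forecast, which is why the scores above can be compared directly across models.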
Our presentation slideshow can be found here