-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add SGD #118
Add SGD #118
Conversation
I should probably have mentioned this before, but we strictly follow the PEP8 style guidelines for our code. I know you're using PyCharm, and there's supposed to be a way to enable PEP8 checking while you're editing. |
@@ -35,6 +35,7 @@ | |||
from sklearn.preprocessing import StandardScaler | |||
from sklearn.svm.base import BaseLibLinear | |||
from sklearn.utils import shuffle as sk_shuffle | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for this extra blank line here.
To ensure that there are unit tests that use I also just realized you haven't add a @rescaled
class RescaledSGDRegressor(SGDRegressor):
pass All that's different with the |
Not that it has to be done for this PR, but one thing we also talked about as necessary for really being able to use |
and added the Rescaled SGDRegressor
@@ -6,7 +6,7 @@ task=evaluate | |||
train_location= | |||
test_location= | |||
featuresets=[["test_sparse"]] | |||
learners=['LogisticRegression', 'LinearSVC', 'SVC', 'MultinomialNB', 'DecisionTreeClassifier', 'RandomForestClassifier', 'GradientBoostingClassifier'] | |||
learners=['LogisticRegression', 'LinearSVC', 'SVC', 'MultinomialNB', 'DecisionTreeClassifier', 'RandomForestClassifier', 'GradientBoostingClassifier', 'SGDClassifier''] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've got an extra '
at the end of this line before the ]
that is causing this unit test to fail. Take that out and everything should be fine, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have corrected that error.
Hope that now works.
@@ -427,13 +444,13 @@ def __init__(self, model_type, probability=False, feature_scaling='none', | |||
elif self._model_type == 'SVR': | |||
self._model_kwargs['cache_size'] = 1000 | |||
self._model_kwargs['kernel'] = 'linear' | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like somewhere around here you're going to want to set self._model_kwargs['loss']
to either log
or modified_huber
for SGDClassifier
, because getting probabilities is only supported with those loss types, and we typically want the prediction probabilities. It's currently failing unit tests for this reason, as you can see here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oks.
Added. So now, our default loss function will be ‘log’
SGD Classifier and Regressor added
The default parameters are: