MAINT, BUG, TST: incremental API cleaning #406
Conversation
* Rename `history_results_` => `history_`, and change it in the docs too
* Provide the complete model history and make it public (otherwise boilerplate is needed to formulate `model_history_` from `history_`: loop over the items in the history and put them in a dict, `{model_id: hist}`; see the sketch after this list)
This mirrors scikit-learn's `cv_results_`, with one important distinction: this implementation only tests on one training set. This means there is a `test_score` key, not `mean_test_score` or `test_score0`.
Before, `BaseIncrementalSearchCV` assumed `_additional_calls` returned one model and returned that to the user. Now, `BaseIncrementalSearchCV` chooses the model with the highest score returned by `_additional_calls`. This matters if a random search is desired, or if `max_iter` is hit.
* MAINT: cleaner separation with `_adapt` and `_stop_on_plateau` functions (this separates the complex adaptive algorithm from stopping on a plateau, and allows overriding `_adapt` for other adaptive algorithms that want to stop on a plateau)
* TST: implement tests for the `patience` and `tol` parameters
* MAINT: define `patience` to be the number of `partial_fit` calls, not the number of score calls
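For illustration, here is a minimal sketch of the boilerplate referred to above: grouping a flat `history_` list into a per-model `model_history_` dict, then picking the model with the highest (latest) score. The record keys (`model_id`, `partial_fit_calls`, `score`) are assumptions for this example and may not match the actual field names.

```python
from collections import defaultdict

# Assumed record structure: each entry in history_ is a dict with at least
# "model_id" and "score" keys. The exact keys are an assumption for illustration.
history_ = [
    {"model_id": 0, "partial_fit_calls": 1, "score": 0.60},
    {"model_id": 1, "partial_fit_calls": 1, "score": 0.55},
    {"model_id": 0, "partial_fit_calls": 2, "score": 0.72},
    {"model_id": 1, "partial_fit_calls": 2, "score": 0.64},
]

# Boilerplate to build model_history_ from history_: group records by model id.
model_history_ = defaultdict(list)
for record in history_:
    model_history_[record["model_id"]].append(record)

# Choosing "the model with the highest score": compare each model's latest score.
final_scores = {mid: hist[-1]["score"] for mid, hist in model_history_.items()}
best_model_id = max(final_scores, key=final_scores.get)
print(best_model_id)  # -> 0 in this toy example
```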
Merged master.
How should we proceed here? I'd like to do a release sooner rather than later, but I'm not sure how much time we'll have to review (let alone work on) this PR in the short term. Personally, I think that if we do decide to change the default search strategy, we'll be able to do it with a deprecation cycle (if we deem that necessary). So my main concern with releasing master as-is is that the default in master is "bad" and turns people off this estimator. I'm not especially worried about that, though, so my vote would be to release soon and update this after the release.
There aren't many user-facing changes in this PR, so I think merging it could go into a point release. It improves on the implementation. The largest user-facing issue is the bug this PR solves: it removes an infinite loop when `decay_rate=0`.
I think that infinite loop was fixed in #373
I don't think so. It's with `decay_rate` specifically, and involves the property that for all x > 0, 1 / x**0 == 1.
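For intuition, here is a hypothetical sketch of an inverse-decay schedule of the form `n_initial / k ** decay_rate`; the function name and exact formula are assumptions for illustration, not the dask-ml implementation. With `decay_rate=0` the divisor `k ** 0` is always 1, so the candidate set never shrinks and any loop that waits for it to shrink never terminates.

```python
import math

def n_models_to_keep(n_initial, k, decay_rate):
    # Hypothetical inverse-decay schedule: keep roughly n_initial / k**decay_rate
    # models after the k-th round. For illustration only.
    return max(1, math.ceil(n_initial / k ** decay_rate))

for decay_rate in (1.0, 0.0):
    kept = [n_models_to_keep(10, k, decay_rate) for k in range(1, 6)]
    print(decay_rate, kept)

# decay_rate=1.0 -> [10, 5, 4, 3, 2]       (the candidate set shrinks each round)
# decay_rate=0.0 -> [10, 10, 10, 10, 10]   (1 / k**0 == 1, so nothing ever shrinks)
```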
Gotcha. I'll try splitting that bug fix out.
I don't think this PR fixes the `decay_rate=0` bug:

```diff
diff --git a/tests/model_selection/test_incremental.py b/tests/model_selection/test_incremental.py
index 5d828b7..c52b06a 100644
--- a/tests/model_selection/test_incremental.py
+++ b/tests/model_selection/test_incremental.py
@@ -196,9 +196,10 @@ def test_explicit(c, s, a, b):
 @gen_cluster(client=True)
 def test_search_basic(c, s, a, b):
     for decay_rate in {0, 1}:
-        _test_search_basic(decay_rate, c, s, a, b)
+        yield _test_search_basic(decay_rate, c, s, a, b)
 
+@gen.coroutine
 def _test_search_basic(decay_rate, c, s, a, b):
     X, y = make_classification(n_samples=1000, n_features=5, chunks=(100, 5))
     model = SGDClassifier(tol=1e-3, loss="log", penalty="elasticnet")
```
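For context on why the `yield` matters (assuming `_test_search_basic` contains `yield` points, which makes it a generator function): calling it without yielding only creates a generator object, so its body never runs and the test silently passes. A generic Python illustration:

```python
def helper():
    # The body of a generator function does not run until it is iterated
    # (or, in a Tornado coroutine test, yielded from the enclosing coroutine).
    print("running")
    yield

gen_obj = helper()  # nothing is printed yet; the body has not started
list(gen_obj)       # "running" is printed only once the generator is driven
```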
Alrighty, I had a look and I think these changes look good. I like this way of thinking about it. Planning to merge this evening.
Thanks @stsievert!
This PR:

* Fixes a bug in `IncrementalSearchCV` where it would be stuck in an infinite loop when `decay_rate=0`.
* Defines `patience` to be the number of calls to `partial_fit` (a minimal sketch of this stopping rule follows below).
* Adds tests for the `tol` and `patience` parameters.

This PR depends on #404. A clean diff can be found at stsievert#2.
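For concreteness, here is a minimal, hypothetical sketch of the plateau-stopping rule these parameters describe: stop once the score has improved by less than `tol` over the last `patience` calls to `partial_fit`. The function name, the assumption of one recorded score per `partial_fit` call, and the exact comparison are all illustrative, not the dask-ml implementation.

```python
def stop_on_plateau(scores, patience, tol):
    """Hypothetical plateau check, assuming one score recorded per partial_fit call.

    Returns True when the latest score is not at least `tol` better than the
    score from `patience` partial_fit calls ago.
    """
    if len(scores) <= patience:
        return False
    return scores[-1] - scores[-1 - patience] < tol

# Toy usage: the scores flatten out, so training would halt once there is no
# tol-sized improvement over the last `patience` partial_fit calls.
scores = [0.50, 0.60, 0.65, 0.66, 0.66, 0.66]
print(stop_on_plateau(scores, patience=3, tol=0.02))  # True: ~0.01 gain < tol
```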