
Fix the link in dask documentation #479

Closed
wants to merge 160 commits into from

Conversation

MichaelSchroter

Hi All,

I have found a page similar to the one behind the missing link in #478. You can find it here.

Hope this is of some value.

Thanks

Michael

mrocklin and others added 30 commits September 10, 2018 08:11
Currently builds are using an older theme with some errors
* TST: Try numba RC

* Remove RC
* Remove remaining notebooks

* Updated examples
* DOC: Added IncrementalSearch to the api docs
* Adds pip upgrade to CI

* Set max version number for testpath

* Format with new release 18.9b0 of black

* Add LogisticRegression solver to fix docs build

* Removes filterwarnings from setup.cfg
* Support dataframes for k-means

Fixes #390
* Support dataframes in _partial.py::fit/predict

Previously these functions would fail on dask dataframes.
Now they coerce the input to dask arrays, and predict also converts the result back for dataframe inputs.
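A minimal sketch of the coercion described above (the helper name, the dtype, and the round-trip back to a dataframe are illustrative, not the actual dask-ml internals):

```python
import dask.array as da
import dask.dataframe as dd

def _predict(estimator, X):
    # Hypothetical sketch: coerce a dask DataFrame to a dask array
    # before calling the wrapped estimator, then convert predictions
    # back so dataframe inputs round-trip.
    is_frame = isinstance(X, dd.DataFrame)
    if is_frame:
        X = X.to_dask_array(lengths=True)  # compute concrete chunk sizes
    result = X.map_blocks(estimator.predict, dtype="int64", drop_axis=1)
    if is_frame:
        result = dd.from_dask_array(result)
    return result
```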
* Don't use auto chunking with unknown chunk sizes

* add test
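A hedged sketch of the guard this change implies; in dask, unknown chunk sizes appear as NaN in `.chunks`, and automatic rechunking cannot be applied to them (the function names here are illustrative):

```python
import math
import dask.array as da

def _has_unknown_chunks(x):
    # Unknown chunk sizes (e.g. from a filtered dask DataFrame)
    # show up as NaN entries in x.chunks.
    return any(math.isnan(c) for dim in x.chunks for c in dim)

def maybe_rechunk(x):
    # Only request "auto" chunking when sizes are known;
    # rechunking to "auto" raises on NaN chunk sizes.
    if _has_unknown_chunks(x):
        return x
    return x.rechunk({0: "auto"})
```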
Previously we would pass around Estimator.predict methods.  These methods are
opaque to the serialization heuristics dask.distributed uses to decide what
should move and how to serialize it.

Now we pass around bare functions that take in estimators as parameters.

* switch out transform as well
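The refactor described above, as a minimal sketch (names are hypothetical):

```python
def _predict(part, estimator):
    # A module-level function: dask.distributed can inspect and
    # serialize the estimator argument on its own, unlike a bound
    # method such as `estimator.predict`.
    return estimator.predict(part)

# before: X.map_blocks(estimator.predict, dtype=..., drop_axis=1)
# after:  X.map_blocks(_predict, estimator=estimator, dtype=..., drop_axis=1)
```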
* Allow compute=False in ParallelPostFit.score

* cleanup tests
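Usage sketch for the new keyword (the data here is random filler, just to make the example self-contained):

```python
import dask.array as da
import numpy as np
from sklearn.linear_model import SGDClassifier
from dask_ml.wrappers import ParallelPostFit

X, y = np.random.rand(100, 5), np.random.randint(0, 2, 100)
clf = ParallelPostFit(SGDClassifier(tol=1e-3, max_iter=1000))
clf.fit(X, y)

dX = da.from_array(np.random.rand(100, 5), chunks=50)
dy = da.from_array(np.random.randint(0, 2, 100), chunks=50)

# compute=False returns a lazy dask object instead of a float,
# so scoring can be fused with other work in one compute() call.
lazy_score = clf.score(dX, dy, compute=False)
score = lazy_score.compute()
```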
* Rename history_results_ => history_
* Provide the complete model history, and make it public
  (otherwise users need boilerplate to build model_history_ from history_,
  looping over the records in history and grouping them into a dict, {model_id: hist})
This mirrors scikit-learn's cv_results_, with one important distinction: this implementation only tests on one training set.

This means that there's a `test_score` key, not `mean_test_score`
or `test_score0`.
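Illustrative sketch of the structures described above (the record fields are assumptions, not the exact schema):

```python
history_ = [
    {"model_id": 0, "partial_fit_calls": 1, "score": 0.71},
    {"model_id": 1, "partial_fit_calls": 1, "score": 0.69},
    {"model_id": 0, "partial_fit_calls": 2, "score": 0.80},
]

# model_history_ is just history_ grouped by model, saving users
# the boilerplate mentioned above:
model_history_ = {}
for record in history_:
    model_history_.setdefault(record["model_id"], []).append(record)

# With a single train/test split there is one score per model, hence
# a plain "test_score" key rather than "mean_test_score".
```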
Before, BaseIncrementalSearchCV assumed _additional_calls
returned one model and returned that to the user.

Now, BaseIncrementalSearchCV chooses the model with the highest
score returned by _additional_calls.

This matters when doing a random search, or when `max_iter` is hit.
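A minimal sketch of the new selection rule (the `scores` mapping is hypothetical):

```python
# One final score per model, e.g. collected from _additional_calls:
scores = {0: 0.80, 1: 0.69, 2: 0.84}

# Pick the best-scoring model rather than assuming a single survivor:
best_model_id = max(scores, key=scores.get)  # -> 2
```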
* MAINT: cleaner separation with _adapt and _stop_on_plateau functions
  (separates the complex adaptive algorithm from stopping on plateau,
   and allows overriding _adapt for other adaptive algorithms
   that want to stop on a plateau)
* TST: implement tests for patience and tolerance parameters
* MAINT: define "patience" to be the number of partial_fit calls, not
  the number of score calls
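A hedged sketch of plateau stopping under that definition of patience (the helper and its defaults are illustrative, not the dask-ml implementation):

```python
def _stop_on_plateau(scores, patience=10, tol=1e-3):
    # `scores` holds one score per partial_fit call, oldest first.
    # Stop when none of the last `patience` calls improved on the
    # best earlier score by more than `tol`.
    if len(scores) <= patience:
        return False
    best_before = max(scores[:-patience])
    return max(scores[-patience:]) < best_before + tol
```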
stsievert and others added 26 commits February 19, 2019 21:17
MAINT: add distributed as a dependency
Replace `da.atop`, which has been renamed to `da.blockwise` in newer dask releases
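The rename is mechanical; a small example of the renamed API (this transpose example mirrors the dask docs):

```python
import numpy as np
import dask.array as da

x = da.ones((4, 4), chunks=2)
# da.atop was renamed to da.blockwise; arguments still pair each
# array with an index string:
y = da.blockwise(np.transpose, "ji", x, "ij", dtype=x.dtype)
```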
Add CI job for oldest supported dependencies
* Add drop option to OneHotEncoder (see the usage sketch after this list)

* Update QuantileTransformer internals

* Fix commented out code

* Remove print lines in test

* Add sklearn version check for OneHotEncoder

* Add allowed tolerance for QuantileTransformer test

* Update OneHotEncoder drop sklearn version to 0.21.0

* Increase test data size for TestQuantileTransformer

* Increase QuantileTransformer test coverage

* Include transform in test
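Regarding the `drop` option added above: it mirrors scikit-learn >= 0.21 (hence the version check). A usage sketch with the scikit-learn estimator; dask-ml's encoder is assumed to accept the same keyword:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

X = np.array([["a"], ["b"], ["a"]])
enc = OneHotEncoder(drop="first")          # requires sklearn >= 0.21
enc.fit_transform(X).toarray()
# array([[0.], [1.], [0.]]) -- the "a" column is dropped
```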
* Update indexable() to just yield dask dataframes, as mentioned in issue #324
* Fix `high is out of bounds for int32` for k_means

Fixes #378
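Sketch of the failure mode behind #378 and the usual fix (the exact call site inside k_means is an assumption): on platforms where the default integer is 32-bit (e.g. Windows), `np.random.randint` with a large `high` raises this error unless a 64-bit dtype is requested.

```python
import numpy as np

rng = np.random.RandomState(0)
high = 2**40  # larger than int32 can hold

# rng.randint(0, high)  # raises on 32-bit-default platforms:
#                       # "high is out of bounds for int32"
idx = rng.randint(0, high, size=5, dtype="int64")  # explicit 64-bit dtype
```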
@TomAugspurger
Member

TomAugspurger commented Mar 11, 2019 via email

@TomAugspurger
Member

Superseded by #483
