Fix the link in dask documentation #479
Closed
Currently builds are using an older theme with some errors
* TST: Try numba RC
* Remove RC
* Remove remaining notebooks
* Updated examples
* DOC: Added IncrementalSearch to the api docs
* Adds pip upgrade to CI
* Set max version number for testpath
* Format with new release 18.9b0 of black
* Add LogisticRegression solver to fix docs build
* Removes filterwarnings from setup.cfg
* Support dataframes for k-means (fixes #390)
* Support dataframes in _partial.py::fit/predict. Previously these functions would fail on dask dataframes. Now they coerce to dask arrays, and predict also converts back.
* Don't use auto chunking with unknown chunk sizes
* Add test
Previously we would pass around Estimator.predict methods. These methods are opaque to the serialization heuristics that dask.distributed uses to determine what should move and how to serialize it. Now we pass around bare functions that take estimators as parameters.
* Switch out transform as well
* Allow compute=False in ParallelPostFit.score
* Clean up tests
and change it in the docs too
* Rename history_results_ => history_
* Provide the complete model history and make it public (otherwise users need boilerplate to build model_history_ from history_: looping over the items in history_ and putting them in a dict, {model_id: hist})
This mirrors scikit-learn's cv_results_, with one important distinction: this implementation only tests on one training set. This means that there's a `test_score` key, not `mean_test_score` or `test_score0`.
Before, BaseIncrementalSearchCV assumed _additional_calls returned one model and returned that to the user. Now, BaseIncrementalSearchCV chooses the model with the highest score returned by _additional_calls. This matters when doing a random search, or if `max_iter` is hit.
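Choosing the highest-scoring model among several can be sketched as follows. This is not the code from the commit: the `history` layout (a dict mapping a model id to a list of records with `partial_fit_calls` and `score` keys) and the function name `best_model_id` are assumptions made for this illustration.

```python
def best_model_id(history):
    """Return the id of the model whose most recent score is highest.

    `history` is assumed to map model_id -> list of records, oldest first,
    where each record is a dict with a "score" key.
    """
    latest = {model_id: records[-1]["score"] for model_id, records in history.items()}
    return max(latest, key=latest.get)


history = {
    "model-0": [
        {"partial_fit_calls": 1, "score": 0.61},
        {"partial_fit_calls": 2, "score": 0.70},
    ],
    "model-1": [
        {"partial_fit_calls": 1, "score": 0.65},
        {"partial_fit_calls": 2, "score": 0.68},
    ],
}

best = best_model_id(history)
```

With several models still alive at the end of a search (for example when every model was trained to `max_iter`), the selection depends only on the final recorded scores.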
* MAINT: cleaner separation with _adapt and _stop_on_plateau functions (separates the complex adaptive algorithm from stopping on plateau, and allows overriding _adapt for other adaptive algorithms that want to stop on plateau)
* TST: implement tests for patience and tolerance parameters
* MAINT: define "patience" to be the number of partial_fit calls, not the number of score calls
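A plateau check in which patience is counted in partial_fit calls might look like the sketch below. It is a simplified illustration, not the implementation in this PR: the record layout, the function name `should_stop`, and the exact comparison against `tol` are all assumptions.

```python
def should_stop(records, patience, tol):
    """Decide whether a model has plateaued.

    `records` is assumed to be a list of dicts, oldest first, each with
    "partial_fit_calls" and "score" keys. `patience` is measured in
    partial_fit calls, not in the number of score calls.
    """
    current = records[-1]["partial_fit_calls"]
    if current < patience:
        # Not enough training yet to judge a plateau.
        return False
    # Scores recorded within the last `patience` partial_fit calls.
    window = [
        r["score"] for r in records if current - r["partial_fit_calls"] <= patience
    ]
    start = window[0]
    # Stop only if no score in the window beats the window's start by more than tol.
    return all(score <= start + tol for score in window)


flat = [
    {"partial_fit_calls": n, "score": s}
    for n, s in [(1, 0.5), (2, 0.7), (3, 0.7), (4, 0.7), (5, 0.7)]
]
improving = [
    {"partial_fit_calls": n, "score": s}
    for n, s in [(1, 0.5), (2, 0.6), (3, 0.7), (4, 0.8), (5, 0.9)]
]

stop_flat = should_stop(flat, patience=3, tol=0.01)
stop_improving = should_stop(improving, patience=3, tol=0.01)
```

Because the window is defined by `partial_fit_calls` rather than by list position, a model that is scored only every few calls is judged over the same amount of training as one that is scored after every call.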
remove failing tests
MAINT: add distributed as a dependency
Change `da.atop` that has been replaced by `da.blockwise` in
Add CI job for oldest supported dependencies
Minor updates to .gitignore
* Add drop option to OneHotEncoder
* Update QuantileTransformer internals
* Fix commented out code
* Remove print lines in test
* Add sklearn version check for OneHotEncoder
* Add allowed tolerance for QuantileTransformer test
* Update OneHotEncoder drop sklearn version to 0.21.0
* Increase test data size for TestQuantileTransformer
* Increase QuantileTransformer test coverage
* Include transform in test
Looks like there are unrelated commits in this PR. You probably need to fetch upstream, merge master, and push again.
…On Sun, Mar 10, 2019 at 6:02 PM MichaelSchroter ***@***.***> wrote:
Hi All,
I have found a page that would be similar to the missing link in #478. You can find it here: https://scikit-learn.org/0.15/modules/scaling_strategies.html
Hope this is of some value.
Thanks
Michael
Commit Summary
- update dask-sphinx-theme (#361)
- TST: Try numba RC (#363)
- Update README.rst
- RLS: 0.10.0
- ENH: IncrementalSearch (#356)
- Examples update (#369)
- Add IncrementalSearch to api.rst (#371)
- Adds sklearn version check for ColumnTransformer import (#374)
- [skip ci] Change links to dask.org (#375)
- Auto-rechunk input arrays (#377)
- Fix CI issues (#382)
- dask.pydata.org -> dask.org
- Update joblib documentation for scikit-learn 0.20 (#387)
- Bump sklearn min version to 0.20 (#392)
- Support dataframes for k-means (#393)
- Fixes ShuffleSplit random seed generation bug (#381)
- Support dataframes in _partial.py::fit/predict (#395)
- Don't use auto chunking with unknown chunk sizes (#398)
- Pass models in predict rather than their methods (#400)
- Allow compute=False in ParallelPostFit.score (#402)
- API: rename IncrementalSearch => IncrementalSearchCV
- API: rename history_results_, and format differently
- API: add cv_results_
- MAINT: allow _additional_calls to return multiple models
- [MRG+1] Poly trans: Issue #347 (#367)
- TST, MAINT: clean stopping on plateau (see notes below)
- BUG: ∞ loop in IncrementalSearchCV if decay_rate=0
- TST: perform basic search (decay_rate=0) in test_search_basic
- MAINT: collapse _adapt and _stop_on_plateau into one function
- Fix test formatting
- Closes #385 (#407)
- DOC: update changelog
- Use scipy to rank
- Replace cv_results asserts with sanity checks
- IncrementalSearch edge cases (#373)
- Filter warning from dask dataframe concat (#408)
- Merge branch 'search-api' into search-bug
- TST: if passive, return highest scoring model else sanity checks
- API: improve and clean IncrementalSearch API (#404)
- DOC: IncrementalSearchCV (#405)
- Merge remote-tracking branch 'upstream/master' into stsievert-search-bug
- Merge remote-tracking branch 'upstream/master' into whatsnew
- lint
- fix link
- DOC: update changelog (#409)
- failing test
- maybe fix?
- Merge pull request #411 from TomAugspurger/decay-loop
- Merge remote-tracking branch 'upstream/master' into stsievert-search-bug
- run tests
- Merge pull request #406 from stsievert/search-bug
- doc fixup
- Merge pull request #413 from TomAugspurger/doc-fixup
- RLS: 0.11.0
- typo [skip ci]
- Merge pull request #414 from TomAugspurger/typo
- Replace get with scheduler
- Roll back changes to test_scheduler_param
- try pinning CPython
- fix warnings
- Specify dask array chunksizes
- Merge pull request #418 from jrbourbeau/fix_get_tests
- Bug-Fix in Polynomial-Features
- make transformer params more general
- Minor LogisticRegression updates
- Merge pull request #422 from jrbourbeau/logistic_reg_cleanup
- Merge pull request #417 from datajanko/poly-bug-fix
- API: Lazy score, predict for IncrementalSearchCV
- lint
- Add version number to conf.py
- Use X.Y.Z version format
- Merge pull request #426 from jrbourbeau/docs_version
- Fix typo in docstring
- Merge pull request #429 from rmsare/docs-split-typo
- Bug: Handle a value not being passed for Y in euclidian distance.
- RFC: Adjusts regex test
- STY: Flake8
- RFC: Adjusts for pytest 4.0.0
- BUG: Changes order for make_column_transformer
- format conf.py
- ignore conf.py
- RFC: Comment out version
- RFC: Resolves FutureWarning
- Merge pull request #431 from thomasjpfan/issue/427
- allow_unknown_chunksizes=True in dask_ml.compose.ColumnTransformer._hstack
- Use high-level graphs
- Include our dsk in the graphs
- avoid assert_true
- Changes for PR 437 comments.
- Trigger CI
- Merge pull request #439 from TomAugspurger/ci-fix
- Merge remote-tracking branch 'upstream/master' into ZEFR-INC-master
- black formatting
- isort
- removed dead code path, updated test to provide confirmation.
- updated test with sklearn 1-D check
- Merge pull request #437 from ZEFR-INC/master
- COMPAT: fix warning
- Merge pull request #441 from TomAugspurger/collections-warning
- Merge remote-tracking branch 'upstream/master' into asgersoerensen-patch-1
- Added test for no Y
- Merge remote-tracking branch 'upstream/master' into ppf-incsearchcv
- lint
- Merge pull request #430 from asgersoerensen/patch-1
- Merge pull request #424 from TomAugspurger/ppf-incsearchcv
- fix typo: model_selectoin -> model_selection (#442)
- Update preprocessing.rst
- Merge pull request #445 from teoguso/master
- Updated joblib.rst
- Decrease n_splits
- Add SparseDtype case to test
- Merge pull request #450 from jrbourbeau/fix-sklearn-dev
- Merge pull request #449 from suamin/patch-1
- Fix input to sharedict.merge
- Merge pull request #455 from jrbourbeau/fix_partial_fit
- Add graphviz to dev environment
- Merge pull request #459 from jrbourbeau/add_graphviz
- Add oldest_supported CircleCI job
- Add conda list
- Add package uninstalls
- Update minimum versions
- Fix pandas SparseDtype failures
- "oldest supported" -> "earliest supported"
- Remove defaults conda channel
- Uninstall pypi numpy
- Remove type comments
- Update earliest scikit-learn from 0.20 to 0.20.0
- MAINT: loud warning if ImportError with IncrementalSearchCV
- TST: add circleci test for no distributed
- Revert "TST: add circleci test for no distributed"
- Pin NumPy version to avoid FutureWarnings in sklearn
- Update test_column_transformer
- TST: rework tests to standalone file
- Typo
- Change deprecated `da.atop` to `da.blockwise`
- Revert trying to make distributed a soft dependency
- MAINT: add distributed as a dependency
- MAINT: require specific version
- Change `da.atop` for `da.blockwise` in data.py
- remove failing tests
- Merge pull request #469 from TomAugspurger/test-fixup
- Merge branch 'master' into import-incremental
- Merge branch 'master' into patch-1
- Merge pull request #466 from stsievert/import-incremental
- Merge pull request #468 from jjerphan/patch-1
- Merge remote-tracking branch 'upstream/master' into update_ci
- Fix blockwise failing tests
- Add defaults channel back
- Move blockwise to _compat
- Sort imports with isort v4.3.8
- Merge pull request #461 from jrbourbeau/update_ci
- Minor updates to .gitignore [skip ci]
- Merge pull request #472 from jrbourbeau/update_gitignore
- RLS: 0.12.0
- Fix sklearn dev tests (#474)
- Fix imports for isort 4.3.10 (#476)
- update indexable() to just yield dask dataframes (issue #324) (#471)
- Fix #378: high is out of bounds for int32 for k_means (#462)
- Use scikit-learn nightly wheels (#477)
File Changes
- *M* .circleci/config.yml
<https://github.com/dask/dask-ml/pull/479/files#diff-0> (39)
- *M* .gitignore
<https://github.com/dask/dask-ml/pull/479/files#diff-1> (2)
- *M* .travis.yml
<https://github.com/dask/dask-ml/pull/479/files#diff-2> (1)
- *M* README.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-3> (4)
- *M* ci/environment-2.7.yml
<https://github.com/dask/dask-ml/pull/479/files#diff-4> (11)
- *M* ci/environment-3.6.yml
<https://github.com/dask/dask-ml/pull/479/files#diff-5> (17)
- *M* ci/install-circle.sh
<https://github.com/dask/dask-ml/pull/479/files#diff-6> (1)
- *M* dask_ml/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-7> (2)
- *M* dask_ml/_compat.py
<https://github.com/dask/dask-ml/pull/479/files#diff-8> (15)
- *M* dask_ml/_partial.py
<https://github.com/dask/dask-ml/pull/479/files#diff-9> (20)
- *M* dask_ml/_utils.py
<https://github.com/dask/dask-ml/pull/479/files#diff-10> (16)
- *M* dask_ml/cluster/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-11> (2)
- *M* dask_ml/cluster/k_means.py
<https://github.com/dask/dask-ml/pull/479/files#diff-12> (18)
- *M* dask_ml/cluster/spectral.py
<https://github.com/dask/dask-ml/pull/479/files#diff-13> (6)
- *M* dask_ml/compose/_column_transformer.py
<https://github.com/dask/dask-ml/pull/479/files#diff-14> (24)
- *M* dask_ml/decomposition/pca.py
<https://github.com/dask/dask-ml/pull/479/files#diff-15> (11)
- *M* dask_ml/decomposition/truncated_svd.py
<https://github.com/dask/dask-ml/pull/479/files#diff-16> (2)
- *M* dask_ml/feature_extraction/text.py
<https://github.com/dask/dask-ml/pull/479/files#diff-17> (16)
- *M* dask_ml/impute.py
<https://github.com/dask/dask-ml/pull/479/files#diff-18> (10)
- *M* dask_ml/linear_model/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-19> (6)
- *M* dask_ml/linear_model/glm.py
<https://github.com/dask/dask-ml/pull/479/files#diff-20> (30)
- *M* dask_ml/metrics/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-21> (7)
- *M* dask_ml/metrics/pairwise.py
<https://github.com/dask/dask-ml/pull/479/files#diff-22> (22)
- *M* dask_ml/metrics/regression.py
<https://github.com/dask/dask-ml/pull/479/files#diff-23> (2)
- *M* dask_ml/metrics/scorer.py
<https://github.com/dask/dask-ml/pull/479/files#diff-24> (18)
- *M* dask_ml/model_selection/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-25> (13)
- *M* dask_ml/model_selection/_incremental.py
<https://github.com/dask/dask-ml/pull/479/files#diff-26> (575)
- *M* dask_ml/model_selection/_search.py
<https://github.com/dask/dask-ml/pull/479/files#diff-27> (140)
- *M* dask_ml/model_selection/_split.py
<https://github.com/dask/dask-ml/pull/479/files#diff-28> (18)
- *M* dask_ml/model_selection/methods.py
<https://github.com/dask/dask-ml/pull/479/files#diff-29> (3)
- *M* dask_ml/model_selection/utils.py
<https://github.com/dask/dask-ml/pull/479/files#diff-30> (53)
- *M* dask_ml/model_selection/utils_test.py
<https://github.com/dask/dask-ml/pull/479/files#diff-31> (8)
- *M* dask_ml/naive_bayes.py
<https://github.com/dask/dask-ml/pull/479/files#diff-32> (2)
- *M* dask_ml/preprocessing/__init__.py
<https://github.com/dask/dask-ml/pull/479/files#diff-33> (25)
- *M* dask_ml/preprocessing/_encoders.py
<https://github.com/dask/dask-ml/pull/479/files#diff-34> (28)
- *M* dask_ml/preprocessing/data.py
<https://github.com/dask/dask-ml/pull/479/files#diff-35> (114)
- *M* dask_ml/preprocessing/label.py
<https://github.com/dask/dask-ml/pull/479/files#diff-36> (1)
- *M* dask_ml/wrappers.py
<https://github.com/dask/dask-ml/pull/479/files#diff-37> (141)
- *M* docs/source/changelog.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-38> (31)
- *M* docs/source/conf.py
<https://github.com/dask/dask-ml/pull/479/files#diff-39> (24)
- *M* docs/source/examples.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-40> (35)
- *A* docs/source/examples/.gitignore
<https://github.com/dask/dask-ml/pull/479/files#diff-41> (0)
- *D* docs/source/examples/tensorflow.ipynb
<https://github.com/dask/dask-ml/pull/479/files#diff-42> (410)
- *D* docs/source/examples/text-vectorization.ipynb
<https://github.com/dask/dask-ml/pull/479/files#diff-43> (202)
- *M* docs/source/hyper-parameter-search.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-44> (112)
- *M* docs/source/incremental.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-45> (25)
- *M* docs/source/index.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-46> (20)
- *M* docs/source/joblib.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-47> (70)
- *M* docs/source/modules/api.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-48> (18)
- *A* docs/source/modules/generted/dask_ml.compose.ColumnTransformer.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-49> (16)
- *A* docs/source/modules/generted/dask_ml.compose.make_column_transformer.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-50> (6)
- *M* docs/source/preprocessing.rst
<https://github.com/dask/dask-ml/pull/479/files#diff-51> (5)
- *M* setup.cfg
<https://github.com/dask/dask-ml/pull/479/files#diff-52> (6)
- *M* setup.py
<https://github.com/dask/dask-ml/pull/479/files#diff-53> (7)
- *M* tests/compose/test_column_transformer.py
<https://github.com/dask/dask-ml/pull/479/files#diff-54> (152)
- *M* tests/linear_model/test_stochastic_gradient.py
<https://github.com/dask/dask-ml/pull/479/files#diff-55> (2)
- *M* tests/metrics/test_metrics.py
<https://github.com/dask/dask-ml/pull/479/files#diff-56> (25)
- *M* tests/model_selection/dask_searchcv/test_model_selection.py
<https://github.com/dask/dask-ml/pull/479/files#diff-57> (9)
- *M* tests/model_selection/dask_searchcv/test_model_selection_sklearn.py
<https://github.com/dask/dask-ml/pull/479/files#diff-58> (62)
- *M* tests/model_selection/test_incremental.py
<https://github.com/dask/dask-ml/pull/479/files#diff-59> (290)
- *M* tests/model_selection/test_split.py
<https://github.com/dask/dask-ml/pull/479/files#diff-60> (26)
- *M* tests/preprocessing/test_data.py
<https://github.com/dask/dask-ml/pull/479/files#diff-61> (108)
- *M* tests/preprocessing/test_encoders.py
<https://github.com/dask/dask-ml/pull/479/files#diff-62> (36)
- *M* tests/test_impute.py
<https://github.com/dask/dask-ml/pull/479/files#diff-63> (20)
- *M* tests/test_incremental.py
<https://github.com/dask/dask-ml/pull/479/files#diff-64> (12)
- *M* tests/test_kmeans.py
<https://github.com/dask/dask-ml/pull/479/files#diff-65> (15)
- *M* tests/test_parallel_post_fit.py
<https://github.com/dask/dask-ml/pull/479/files#diff-66> (42)
- *M* tests/test_partial.py
<https://github.com/dask/dask-ml/pull/479/files#diff-67> (25)
- *M* tests/test_pca.py
<https://github.com/dask/dask-ml/pull/479/files#diff-68> (52)
Patch Links:
- https://github.com/dask/dask-ml/pull/479.patch
- https://github.com/dask/dask-ml/pull/479.diff
Superseded by #483