Add facility for adding a data front-end to a model implementation #76

Merged 6 commits on Jan 4, 2021

@ablaom (Member) commented Dec 3, 2020

A PR to support JuliaAI/MLJBase.jl#492.

Adds:

  • reformat method/fallback
  • selectrows(model::Model, I, args...) -> data fallback

From the new doc-strings (edited):


MLJModelInterface.reformat(model, args...) -> data

Models optionally overload reformat to define transformations of
user-supplied data into some model-specific representation (e.g., from
a table to a matrix). When implemented, the MLJ user can avoid
repeating such transformations unnecessarily, and can additionally
make use of more efficient row subsampling, which is then based on the
model-specific representation of data, rather than the
user-representation. When reformat is overloaded,
selectrows(::Model, ...) must be overloaded as well (see
selectrows). Furthermore, the model's fit method(s) and its
operations, such as predict and transform, must be refactored to
act on the model-specific representations of the data.

To implement the reformat data front-end for a model, refer to
"Implementing a data front-end" in the MLJ manual.
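
The contract described above can be sketched without any package dependencies. In the block below, reformat, selectrows, fit and predict are local stand-ins that only mirror the shape of the MLJModelInterface methods a real implementation would extend; ToyRegressor, the column-table input (a NamedTuple of vectors) and the least-squares fit are all assumptions made for illustration, not code from this PR.

```julia
using LinearAlgebra  # stdlib; used for the least-squares solve via `\`

# Hypothetical model type. A real model would subtype an MLJModelInterface
# model type and extend MLJModelInterface.reformat / selectrows instead of
# the local stand-in functions defined here.
struct ToyRegressor end

# reformat: user-supplied data -> model-specific data, always as a tuple.
# The "table" here is assumed to be a NamedTuple of equal-length columns.
reformat(::ToyRegressor, X, y) = (hcat(values(X)...), y)
reformat(::ToyRegressor, X) = (hcat(values(X)...),)

# selectrows: row subsampling on the model-specific representation.
selectrows(::ToyRegressor, I, Xmatrix, y) = (Xmatrix[I, :], y[I])
selectrows(::ToyRegressor, I, Xmatrix) = (Xmatrix[I, :],)

# fit and predict now consume the matrix directly, not the table.
fit(::ToyRegressor, verbosity, Xmatrix, y) = (Xmatrix \ y, nothing, NamedTuple())
predict(::ToyRegressor, fitresult, Xmatrix) = Xmatrix * fitresult

X = (a = [1.0, 2.0, 3.0, 4.0], b = [1.0, 0.0, 1.0, 0.0])
y = 2 .* X.a .+ 3 .* X.b

model = ToyRegressor()
data = reformat(model, X, y)                  # table -> matrix, done once
train = selectrows(model, 1:3, data...)       # subsample the matrix, not the table
fitresult, _, _ = fit(model, 0, train...)
test = selectrows(model, 4:4, reformat(model, X)...)
yhat = predict(model, fitresult, test...)
```

The point of the design is that the table-to-matrix conversion happens once, after which repeated subsampling (for example during resampling) only indexes into the matrix.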

selectrows(::Model, I, data...) -> sampled_data

A model overloads selectrows whenever it buys into the optional
reformat front-end for data preprocessing. See reformat
for details. The fallback assumes data is a tuple and calls
selectrows(X, I) for each X in data, returning the results in a
new tuple of the same length. This call makes sense when X is a
table, abstract vector or abstract matrix. In the last two cases, a
new object and not a view is returned.
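
A minimal sketch of the fallback behaviour described above, covering only the abstract vector and abstract matrix cases. The names select and fallback_selectrows are hypothetical stand-ins, not the MLJModelInterface source, and table inputs are deliberately left out.

```julia
# Hypothetical per-object row selection: plain indexing, which in Julia
# allocates a new array rather than returning a view.
select(X::AbstractVector, I) = X[I]
select(X::AbstractMatrix, I) = X[I, :]

# Stand-in for the fallback: apply the selection to each element of
# `data` and return the results as a tuple of the same length.
fallback_selectrows(model, I, data...) = map(X -> select(X, I), data)

X = [1 2; 3 4; 5 6]
y = [10, 20, 30]

Xs, ys = fallback_selectrows(nothing, 2:3, X, y)

# Mutating the sample leaves the original untouched, because a copy
# was returned and not a view.
Xs[1, 1] = 0
```

A model that overloads selectrows itself is free to return views instead, which avoids the copy at the cost of aliasing the original data.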

@ablaom marked this pull request as draft December 3, 2020 23:03
codecov-io commented Dec 3, 2020

Codecov Report

Merging #76 (9c937d8) into dev (2c358e9) will decrease coverage by 0.46%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##              dev      #76      +/-   ##
==========================================
- Coverage   99.04%   98.57%   -0.47%     
==========================================
  Files           9        9              
  Lines         209      211       +2     
==========================================
+ Hits          207      208       +1     
- Misses          2        3       +1     
Impacted Files              Coverage Δ
src/MLJModelInterface.jl    100.00% <ø> (ø)
src/model_api.jl            83.33% <83.33%> (-16.67%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 2c358e9...9c937d8.

@ablaom changed the title from "Preparation for MLJBase performance improvements" to "Add facility for adding a data front-end to a model implementation" Jan 4, 2021
@ablaom marked this pull request as ready for review January 4, 2021 03:33
@ablaom merged commit 1bd56dc into dev Jan 4, 2021