-
Notifications
You must be signed in to change notification settings - Fork 564
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Create include folder for C++ API distribution #1129
Conversation
@dantegd never mind my previous request. I think I have a working version after merging the latest of branch-0.11. Will update this PR soon. |
@@ -68,41 +68,6 @@ bool get_arg(char **begin, char **end, const std::string &arg) { | |||
return false; | |||
} | |||
|
|||
class cachingDeviceAllocator : public ML::deviceAllocator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most changes are just moving around headers... out of curiosity, why do these cachingDeviceAllocator blocks get removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm not mistaken its because they were part of the examples instead of inside cuML C++ library itself
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's true. cachingDeviceAllocator
is a wrapper over the corresponding allocator inside cub. We have cub as our dependency anyways. So, made little sense to keep it inside examples. Thus, moved this inside our include folders.
It also helped resolve a cuml -> ml-prims -> cuml circular dependency on header files.
In general, I like the approach. I can't say I read every single header. My only concern is that sometimes the python and c++ code directories differ arbitrarily, and C++ users now have to understand the python layout. Maybe we should (in a follow on change?) update the locations of C++ source files as well? I'm thinking like |
@JohnZed the problem with that and why I don't like the idea for the source files is that then folders can contain way more files in the source folders. For example cluster would include all the kmeans alongside all the dbscan folders, which would either require added folder sublevels (not ideal) or just living with mixed sources of different algorithms (even less ideal). One folder per algorithm source seems like the best way to avoid this mixing (with perhaps some exceptions). On the other hand where to put the header can be as simple as just asking one of the Python devs, or creating a first PR with the proposed API in its location which will then get feedback by us, which is a nice practice unto itself tbh |
Note: PR will require new |
@dantegd @teju85 This is no-longer blocked, packages are published: https://anaconda.org/rapidsai-nightly/libcumlprims/files |
rerun tests |
Thanks @mike-wendt ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read over all the utilities and cmake code but only spot-checked the headers + source files.
My comments are minor and address the python utilities. I don't think they are really functional problems, so I'm ok pushing then updating the source (as long as that really happens soon! ;))
PR closes #964 and proposes folder structure for C++ API
include
folder for distribution.Proposed folder structure is to mirror cuML’s Python API structure (which itself mirror’s Scikit-learn). CuDF has a flat structure, so this PR can be used to discuss what path we would prefer to take. Important observations:
include
folder (that forms the consumable client C++ API). Should improve clarity and consistency with the consumable Python API.src
folder to the same structure is such a good idea, many algorithms have multiple files (which could even have naming clashes), a good example aredbscan
andkmeans
which each have quite a few files of their own, which would make a hypotheticalcluster
folder there be very big unless we introduce even more nested-ness to our folder structure. So I propose we stick to one folder per algo for thesrc
folder, should help devs keep somewhat lean folders, and with the changes toCMakeLists
in this PR is transparent to both users and almost entirely transparent to devs.CUML_INCLUDE_DIR
cmake variable that makes it so that our codebase can include the headers ininclude
independent of its structure, which will also minimize code changes if we decide to move any API around in the future or the results of the discussion of this PR are that we prefer a flatinclude/cuml
foldersrc_prims
(i.e. cuml prims) are needed in the API, so those are installed at theINSTALL
step of CMake/make. This way the prims still stay header only independent of the algorithms, and the consumable C++ API has the needed contents for consumption.Tasks:
include
folder and sub foldersinclude
folders for initial discussion (Most have been moved, enough to start discussion in PR)CMakeLists
and header includes in C++ code to test that things workcuML
andprims
headers needed