-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compile the accelerated code as a separate step #61
Conversation
… smaller value for CPUs
…_VawNormalizedWeightedTask out of the class
…d the vaw variant
…c_task_noclass.cpp
…ute_stripes_totals functions
This PR mostly includes wrappers that were needed to achieve the clear separation. |
* Create concrete implementation wrappers | ||
*/ | ||
|
||
template<> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't encountered this before, no type needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...is it because the called function is meta?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how you force the generation of a specific type when you do separate compilation.
At least in this case...
In general, you can have different implementations for different types, and this is how you specify what that is.
filled_embs, | ||
start_idx, stop_idx, n_samples, n_samples_r, | ||
lengths, embedded_proportions, embedded_counts, sample_total_counts, | ||
dm_stripes_buf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for consideration in this PR, and unsure whether this would be useful to pursue, but it seems like a lot of content here follows a similar pattern. I wonder whether it would be more maintainable to generate the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The templating is what "generates the code".
This is just a way to tell the compiler what variant we need.
Unfortunately, I cannot think of a simpler way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, but there is also a lot of pattern here. What I was thinking of, and this may be crazy, but jinja2 templates and just generate it with python -- I saw something like that here, but again, it may not actually make sense in this repo for many reasons
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right it would make longer-term maintenance easier.
(Even though it would make code harder to read)
Will merge this PR, and do the code generation variant as a follow-up PR.
The accelerated code was included in a unifrac_cmp before, mixing CPU-only OMP and acc code paths.
Now it it clearly separated.