GH-39357: [C++] Reduce function.h includes #39312

Merged
merged 5 commits into from
Dec 26, 2023

zanmato1984
Contributor

@zanmato1984 zanmato1984 commented Dec 20, 2023

Rationale for this change

As proposed in #36246, splitting the function option structs out of function.h reduces the number of places that include function.h, which in turn reduces the total build time.

In my measurement the total parse time dropped from 722.3s to 709.7s, and function.h, along with its transitive inclusion of kernel.h, no longer shows up among the expensive headers.

The detailed analysis results before and after this PR are attached:
analyze-before.txt
analyze-after.txt

Disclaimer (quote from #36246 (comment)):

Note that the time diff is not absolute. The ClangBuildAnalyzer result differs from time to time. I guess it depends on the idle-ness of the building machine when doing the experiment. But the time reduction is almost certain, though sometimes more sometimes less. And the inclusion times of the questioning headers are reduced for sure, as shown in the attachments in my other comment.

What changes are included in this PR?

Move the function option structs into their own header, compute/options.h, and replace includes of function.h with options.h wherever that is sufficient.
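To illustrate the idea, here is a minimal single-file sketch of the layering, not the actual Arrow headers. Everything lives in a hypothetical `sketch` namespace, and the three sections stand in for the lightweight options header, an `api_xxx.h`-style header, and a source file. The `RoundOptions` and `Round` shown here are simplified stand-ins invented for this example.

```cpp
#include <cmath>
#include <string>

// ---- Stand-in for the new lightweight header: option types only ----
namespace sketch {

// Base class for option structs. It mentions no kernel or registry types,
// so a header that needs only this stays cheap to parse.
struct FunctionOptions {
  virtual ~FunctionOptions() = default;
  virtual std::string type_name() const = 0;
};

}  // namespace sketch

// ---- Stand-in for an api_xxx.h header: needs only the base class ----
namespace sketch {

// Hypothetical, simplified option struct (assumption for illustration).
struct RoundOptions : FunctionOptions {
  explicit RoundOptions(int ndigits = 0) : ndigits(ndigits) {}
  std::string type_name() const override { return "RoundOptions"; }
  int ndigits;
};

// Declared here; the definition lives in a source file, which is the only
// place that would need the heavier function/kernel headers.
double Round(double value, const RoundOptions& options);

}  // namespace sketch

// ---- Stand-in for the source file ----
namespace sketch {

double Round(double value, const RoundOptions& options) {
  const double scale = std::pow(10.0, options.ndigits);
  return std::round(value * scale) / scale;
}

}  // namespace sketch
```

Callers that only construct options and call the convenience wrapper, for example `sketch::Round(3.14159, sketch::RoundOptions(2))`, never need the heavy declarations to be visible, which is the property this PR relies on.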

Are these changes tested?

Yes, the build itself tests them.

Are there any user-facing changes?

User code could fail to build (quoting #36246 (comment)):

The header function.h remains in compute/api.h, with and without this PR. The proposed PR removes function.h from api_xxx.h (then includes options.h instead), as proposed in the initial description of this issue. This results in compile failures for user code which includes only compute/api_xxx.h but not compute/api.h, and meanwhile uses CallFunction which is declared in function.h.

But I think it's OK as described in #36246 (comment).
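The failure mode can be modeled in a single file. This is a toy sketch under stated assumptions: the `sketch` namespace, the simplified `CallFunction` signature, and `SumAll` are all hypothetical stand-ins, not Arrow's real declarations. Block (2) plays the role of what user code gets only by including compute/api.h, which still includes function.h after this PR.

```cpp
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// (1) Stand-in for what an api_xxx.h header still provides: option structs.
namespace sketch {

struct FunctionOptions {
  virtual ~FunctionOptions() = default;
};

struct ScalarAggregateOptions : FunctionOptions {
  bool skip_nulls = true;
};

}  // namespace sketch

// (2) Stand-in for what now arrives only through the catch-all header.
//     Deleting this block models user code that includes only api_xxx.h:
//     the call in SumAll() then fails with an undeclared-identifier error.
namespace sketch {

// Toy dispatcher (assumption): knows only "sum" and ignores the options.
inline std::int64_t CallFunction(const std::string& name,
                                 const std::vector<std::int64_t>& args,
                                 const FunctionOptions* options = nullptr) {
  (void)options;
  if (name == "sum") {
    return std::accumulate(args.begin(), args.end(), std::int64_t{0});
  }
  return 0;  // unknown function name in this toy model
}

}  // namespace sketch

// (3) User code: compiles only because block (2) is visible.
inline std::int64_t SumAll(const std::vector<std::int64_t>& values) {
  sketch::ScalarAggregateOptions options;
  return sketch::CallFunction("sum", values, &options);
}
```

In real user code, the equivalent of keeping block (2) visible would be adding `#include "arrow/compute/api.h"` next to the existing `api_xxx.h` include.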

⚠️ GitHub issue #36246 has been automatically assigned in GitHub to PR creator.

@zanmato1984 zanmato1984 changed the title from "GH-36246: [C++] Reduce function.h include" to "GH-39357: [C++] Reduce function.h includes" Dec 22, 2023

⚠️ GitHub issue #39357 has been automatically assigned in GitHub to PR creator.

@zanmato1984 zanmato1984 marked this pull request as ready for review December 22, 2023 21:59
@zanmato1984
Contributor Author

cc @pitrou @felipecrv

Contributor

@felipecrv felipecrv left a comment

Very nice improvement! Only two comments about options.h.

Comment on lines 29 to 31
/// \addtogroup compute-functions
/// @{

Contributor

A better name for this header is function_options.h to communicate better what the options are about and have it close to function.h when both are included.

Contributor

And since api.h is a catch-all include for end-user's convenience, we should have #include "arrow/compute/function_options.h" in there.

Contributor Author

@zanmato1984 zanmato1984 Dec 23, 2023

A better name for this header is function_options.h to communicate better what the options are about and have it close to function.h when both are included.

Sure, will do.

And since api.h is a catch-all include for end-user's convenience, we should have #include "arrow/compute/function_options.h" in there.

Did you mean in compute/api.h? If so, it's already there (of course I will change the file name).

Or did you mean api.h in the Arrow root include directory? If so, I don't think we should include compute stuff in there.

Could you please confirm? Thanks.

Contributor Author

options.h renamed to function_options.h.

Contributor

Sorry. I meant compute/api.h and for some reason I didn't notice the include was already there.

@github-actions github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label Dec 23, 2023
Comment on lines +18 to +20
// NOTE: API is EXPERIMENTAL and will change without going through a
// deprecation cycle.

Contributor

@felipecrv felipecrv Dec 26, 2023

What does this mean here?

Oh. I see it now. There is a comment like this in function.h as well :D

@felipecrv felipecrv merged commit cf44793 into apache:main Dec 26, 2023
37 of 38 checks passed
@felipecrv felipecrv removed the "awaiting committer review" label Dec 26, 2023

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit cf44793.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
* Closes: apache#39357

Authored-by: zanmato <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024