Create writer with arrow::ipc::IPCWriteOptions #4730

Merged
merged 2 commits into from
Dec 26, 2022

@askoa (Contributor) commented Dec 25, 2022

Which issue does this PR close?

Closes #4708

@github-actions bot added the "core" (Core DataFusion crate) label Dec 25, 2022

@alamb (Contributor) left a comment

Thank you @askoa

I wonder if it might be worth adding a test to ensure this API is not accidentally broken or removed as part of some future refactoring.

@askoa (Contributor, Author) commented Dec 26, 2022

> I wonder if it might be worth adding a test to ensure this API is not accidentally broken or removed as part of some future refactoring.

The API does not add new functionality. It provides a bridge to the compression support in arrow-rs. I don't think "someone might remove it" is a good reason to add a test: the person removing the code might also remove the test. If anyone accidentally removes it, the integration with arrow-ballista will fail. I don't see a need to add a new test for this.

@alamb merged commit 9331ee3 into apache:master Dec 26, 2022

@ursabot commented Dec 26, 2022

Benchmark runs are scheduled for baseline = 01d00fd and contender = 9331ee3. 9331ee3 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2]
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm]
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x]
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q]
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@Dandandan (Contributor) commented

> > I wonder if it might be worth adding a test to ensure this API is not accidentally broken or removed as part of some future refactoring.
>
> The API does not add new functionality. It provides a bridge to the compression support in arrow-rs. I don't think "someone might remove it" is a good reason to add a test: the person removing the code might also remove the test. If anyone accidentally removes it, the integration with arrow-ballista will fail. I don't see a need to add a new test for this.

Thank you very much for adding this feature, and no worries about the test!

In some cases, features get lost when someone moves or refactors the code as part of another change. This has happened quite a few times before in arrow-rs / arrow-datafusion! Tests covering those code paths help avoid issues like that. In ballista, the breakage might only be discovered once the dependency version is updated (which may be only after a new version is released).

@askoa (Contributor, Author) commented Dec 26, 2022

> In some cases, features get lost when someone moves or refactors the code as part of another change. This has happened quite a few times before in arrow-rs / arrow-datafusion! Tests covering those code paths help avoid issues like that. In ballista, the breakage might only be discovered once the dependency version is updated (which may be only after a new version is released).

I don't know the scenarios you are referring to. But during such a refactoring, the person might remove the test along with the code. So, in my opinion, the best way to catch such issues is when arrow-ballista consumes the changes, rather than by adding tests.

@alamb (Contributor) commented Dec 27, 2022

> But during such a refactoring, the person might remove the test along with the code.

This is definitely true. Tests such as the ones @Dandandan and I are referring to do not guarantee that the code isn't broken.

However, when reviewing PRs, I for one look quite carefully at the tests that are changed or removed, so having them acts as a "second check".

I think it is fine for this PR to not have added tests, but I do think they serve an important purpose, which is why I am belaboring this point.
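
For readers following the thread, here is a minimal sketch of the kind of test being discussed. It is not code from this PR: `Compression`, `WriteOptions`, `Writer`, and `new_with_options` are hypothetical stand-ins, assumed only for illustration, and they do not mirror the real DataFusion or arrow-rs types. The idea is that a small test which calls the pass-through constructor stops compiling if that constructor is removed, and fails if the options are no longer forwarded, so the change becomes visible in review.

```rust
// Hypothetical stand-ins for illustration only; these are NOT the real
// DataFusion or arrow-rs types.

/// Compression codecs a writer could be asked to use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    None,
    Lz4Frame,
    Zstd,
}

/// Options forwarded to the underlying writer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WriteOptions {
    pub compression: Compression,
}

impl Default for WriteOptions {
    fn default() -> Self {
        Self { compression: Compression::None }
    }
}

/// A writer that stores the options it was created with.
#[derive(Debug)]
pub struct Writer {
    options: WriteOptions,
}

impl Writer {
    /// Constructor without explicit options; falls back to the defaults.
    pub fn new() -> Self {
        Self::new_with_options(WriteOptions::default())
    }

    /// Thin pass-through constructor: the kind of API that is easy to drop
    /// by accident when code is moved around.
    pub fn new_with_options(options: WriteOptions) -> Self {
        Self { options }
    }

    /// Read back the options this writer was created with.
    pub fn options(&self) -> &WriteOptions {
        &self.options
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    /// Stops compiling if `new_with_options` is removed, and fails at run
    /// time if the options stop being forwarded.
    #[test]
    fn options_are_forwarded_to_the_writer() {
        let options = WriteOptions { compression: Compression::Zstd };
        let writer = Writer::new_with_options(options.clone());
        assert_eq!(writer.options(), &options);
    }
}
```

As noted above, such a test does not prevent someone from deleting it together with the code; it only makes the deletion show up in the diff that a reviewer reads.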

Labels: core (Core DataFusion crate)

Successfully merging this pull request may close these issues:
Support compression in IPCWriter

4 participants