-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Arrow by flag + arrow docs #345
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @niclashedam, thanks for writing it up! I can confirm that it works (with slight adjustments, see below). The changes in build.sh
look good to me. However, I have a few comments on GettingStarted.md
:
- I had to install
libboost-dev
first (bysudo apt install libboost-dev
) to make it build. We should mentionlibboost-dev
as an optional dependency (if Arrow is desired) in the table inGettingStarted.md
. - I had to use
sudo make install
instead ofmake install
for Arrow. - Cloning the latest commit of the Arrow repo is a bit fragile, since any change they make could break our code. Could we use a particular release of theirs?
- Hitherto, we either (a) mention libs/tools as dependencies in the table in
GettingStarted.md
(we could mention Arrow as optional there, but I also see that the particular instructions need some space), or (b) download and build the dependencies in the build script.
We should make the instructions on installing Arrow more consistent with what we have so far. I'd say if we can go with a version of Arrow which can be installed via typical package repos, then we could mention Arrow as an optional dependency in the table. But as far as I understand, we need some newer features and therefore must build it from source. In that case, I would recommend integrating the download and build of Arrow intobuild.sh
(to be executed only if--arrow
is present). Ideally, we could omit thesudo make install
(see point 2) then, so that we don't need admin rights. This would make it much easier for users.
In general, some major changes to the build script are currently in development (see #328). So it's probably easier to wait for those to be merged and extend build.sh
afterwards.
Finally, I noticed that testing with Arrow support is not trivial at the moment, because test.sh
does not accept --arrow
. However, this is a more general problem which can be addressed later.
Hi @niclashedam, I also took a look into your changes. As Patrick already mentioned, there are upcomming changes to the script. Your current changes should fit in easily with my changes to the build script, but I can't speak for the upcomming changes of @corepointer of course. Nevertheless if you are going to add Arrow as dependency to the build script, you already can take a look into this documentation under "Adding a dependency" and the new version of the build script. Regards, |
Even though the changes in #328 and #359 are substantial, the changes from this PR can be adapted easily once the others are merged in. I held back separating fetch & build in #359, so the build script documentation form @EricMier should still be valid. |
Hi @niclashedam! |
@corepointer I'll give it a shot. I have a deadline next week, but I'll have time to fix this after that. I'll let you know if anything comes up! |
I have updated the branch to follow the new build system. Please let me know if something needs to be modified. I looked into the documentation. I think it is not so pedagogical for developers outside of the consortium. We know how it works because we work with it often, but it is quite different from systems I've worked with before. I also think it needs to specify some edge cases, such as handling optional dependencies. Furthermore, the build script must install some of its artefacts in thirdparty/installed. |
Have you had the time to look at this, @pdamme @corepointer ? 😄 |
Hi @niclashedam, sorry for the delay in handling this PR. In fact, PRs have been piling up a bit recently, since I was quite busy with a few other things. Please let us know if this is blocking you. Otherwise, I could probably find some time at the end of next week. In case someone else would like to have a look and test and merge it earlier, that would be fine by me, too. |
Hi @pdamme. It is not blocking me so no rush. I am just making sure it is not forgotten. |
Hey @niclashedam, I took a look and it LGTM! Is there any particular reason to export BUILD_ARROW ((414)? I see no subscript using this variable. Besides that I think we can merge your changes in. |
Hi @EricMier. I assume you are talking about the export on line 414, yes? I don't think it is necessary as long as we forward the variable to CMake, which I believe we do anyways. I will remove it. Thanks! By the way, do you know how I can invoke the Arrow/Parquet tests when Arrow is enabled? I don't think they are automatically invoked anymore due to the flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @niclashedam and @EricMier for continuing the discussion. I've just applied some final adjustments. Most importantly, following @corepointer's advice, Arrow is now built with ninja
instead of make
, which reduces the build time. LGTM now.
- Added deletion of Arrow download token with --cleanAll. - Removed superfluous cd command. - Changed Arrow build from make to ninja. - Should be faster (and is, on my system). - More consistent with the other dependencies. - Mentioned libboost-dev as dependency in GettingStarted.md.
Regarding the problem of telling |
Sounds good! Thanks for your time @pdamme, @corepointer and @EricMier! 💯 |
As promised in PR #331: Some docs for setting up Arrow and compiling it.
Is there anything else I should add or is this sufficient?