feat(hive): add apache hive package #29412
71958ea to 625509d
- name: Prepare HDFS directories
  runs: |
    echo "test:x:$(id -u):$(id -g):test user:/:/bin/sh:" >> /etc/passwd
This tweak is there because Hive uses the user name from the passwd DB to set up the HDFS ownership.
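As a rough illustration of why the extra entry matters: a numeric UID only resolves to a user name if the passwd database has a matching line. The sketch below works on a scratch file rather than the real /etc/passwd; the helper name `lookup_user_by_uid` and the UID 1234 are hypothetical and not part of the PR.

```shell
#!/bin/sh
# Hypothetical helper: print the user name whose UID (third field) matches $1
# in the passwd-format file $2. Prints nothing if there is no match.
lookup_user_by_uid() {
    awk -F: -v uid="$1" '$3 == uid { print $1; exit }' "$2"
}

# Work on a scratch file so the real /etc/passwd is never touched.
passwd_file=$(mktemp)
echo "root:x:0:0:root:/root:/bin/sh" > "$passwd_file"

# Before the tweak: UID 1234 has no entry, so the lookup comes back empty.
before=$(lookup_user_by_uid 1234 "$passwd_file")

# Same shape as the line appended in the pipeline step, with fixed IDs.
echo "test:x:1234:1234:test user:/:/bin/sh:" >> "$passwd_file"

# After the tweak: the same UID resolves to the name "test".
after=$(lookup_user_by_uid 1234 "$passwd_file")

echo "before='$before' after='$after'"
rm -f "$passwd_file"
```

In the actual test step the IDs come from `$(id -u)` and `$(id -g)`, so the entry matches whichever user the test runs as.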
hive.yaml
Outdated
LANG: en_US.UTF-8
JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk
HIVE_VERSION: 4.0.0
HADOOP_VERSION: 3.3.6
Hadoop 3.4.0 is not yet supported. Work to support it is in progress (apache/hive#5187).
Signed-off-by: Massimiliano Giovagnoli <[email protected]>
…ugin Signed-off-by: Massimiliano Giovagnoli <[email protected]>
This reverts commit c759e38. Signed-off-by: Massimiliano Giovagnoli <[email protected]>
hive.yaml
Outdated
LANG: en_US.UTF-8
JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk
HIVE_VERSION: 4.0.0
HADOOP_VERSION: 3.3.6
Since we need to set the version explicitly here, could we make sure that, when we package 3.3, we're pinned to this version so that there isn't drift?
I.e.,
dependencies:
  runtime:
    - hadoop=3.3.6
Same for Tez.
Totally makes sense, thanks @EyeCantCU.
Some minor things.
hive.yaml
Outdated
LANG: en_US.UTF-8
JAVA_HOME: /usr/lib/jvm/java-1.8-openjdk
HIVE_VERSION: 4.0.0
HADOOP_VERSION: 3.3.6
You can replace this environment variable, HADOOP_VERSION, with a melange variable:
vars:
  hadoop-version: 3.3.6
Then you use ${{vars.hadoop-version}}, and you don't have to repeat yourself in the runtime environment below (if they are intended to be the same thing).
hive.yaml
Outdated
pipeline:
  - name: Download Hadoop
    runs: |
      gpg --keyserver hkps://keyserver.ubuntu.com --recv-key CD32D773FF41C3F9E74BDB7FB362E1C021854B9D
I think that using the fetch pipeline would be preferable here.
The expected-sha256 will need updating any time HADOOP_VERSION changes (versus the gpg path, which would accept a new tarball as long as it was signed), but fetch is used in more places, so any improvement we make to that pipeline benefits everywhere.
Then you can drop gpg, gpg-agent, gnupg-dirmngr and curl from above.
test:
  pipeline:
    - uses: fetch
      with:
        uri: https://dlcdn.apache.org/hadoop/common/hadoop-${{vars.hadoop-version}}/hadoop-${{vars.hadoop-version}}.tar.gz
        expected-sha256: f5195059c0d4102adaa7fff17f7b2a85df906bcb6e19948716319f9978641a04
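For anyone bumping the version later, the digest for expected-sha256 can be computed and checked locally before editing the YAML. This is a minimal sketch, assuming `sha256sum` from coreutils is available; `verify_sha256` is a hypothetical helper, not part of melange, and the demo uses a scratch file in place of the Hadoop tarball.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if file $1 has the SHA-256 digest $2.
verify_sha256() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Stand-in for the downloaded tarball.
tmp=$(mktemp)
printf 'hello\n' > "$tmp"

# Compute the digest the way one would for the real artifact.
digest=$(sha256sum "$tmp" | cut -d' ' -f1)

# A matching digest passes the check.
if verify_sha256 "$tmp" "$digest"; then result_ok=match; else result_ok=mismatch; fi

# A digest that differs (here, all zeros) fails it.
bad=0000000000000000000000000000000000000000000000000000000000000000
if verify_sha256 "$tmp" "$bad"; then result_bad=match; else result_bad=mismatch; fi

echo "good digest: $result_ok, bad digest: $result_bad"
rm -f "$tmp"
```

For the real artifact, the first field printed by `sha256sum hadoop-3.3.6.tar.gz` is the value that would go into expected-sha256.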
Love it! Thank you @smoser.
This PR introduces the package for Apache Hive.
Related: #29167
Pre-review Checklist
For new package PRs only
endoflife.date: There's no support policy available.