# HBase
## Requirements
- At least 4 EC2 Instances (see AWS Intro)
- Hadoop installed and running (see Hadoop Intro)
- Zookeeper installed and running (see Zookeeper Dev)
## Install HBase on all Nodes
Download HBase and unpack it into /usr/local on every node:

```
all-nodes:~$ wget http://apache.mirrors.pair.com/hbase/stable/hbase-1.1.2-bin.tar.gz -P ~/Downloads
all-nodes:~$ sudo tar zxvf ~/Downloads/hbase*.gz -C /usr/local
all-nodes:~$ sudo mv /usr/local/hbase* /usr/local/hbase
```
Add the following environment variables to the ~/.profile:

```
export HBASE_HOME=/usr/local/hbase
export PATH=$PATH:$HBASE_HOME/bin
```
Be sure to source the profile:

```
all-nodes:~$ . ~/.profile
```
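As a quick sanity check (an addition to the original steps), the `hbase` command should now resolve from the updated PATH and print the release banner:

```
all-nodes:~$ hbase version
```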
## Configure HBase on all Nodes
Edit the /usr/local/hbase/conf/hbase-site.xml file on all the nodes:

```
all-nodes:~$ sudo nano /usr/local/hbase/conf/hbase-site.xml
```
Change the placeholder values (the master's public DNS in hbase.rootdir and the Zookeeper nodes' private DNS names in hbase.zookeeper.quorum) to match your setup:

```
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master-public-dns:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zookeeper1-private-DNS,zookeeper2-private-DNS,zookeeper3-private-DNS,zookeeper4-private-DNS</value>
  </property>
</configuration>
```

The host and port in hbase.rootdir must match the fs.defaultFS setting in your Hadoop core-site.xml.
For example, this last property might read:

```
<value>ip-172-31-17-115.ec2.internal,ip-172-31-17-114.ec2.internal,ip-172-31-17-113.ec2.internal,ip-172-31-17-112.ec2.internal</value>
```
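One hedged sanity check worth doing here (the hostnames below are the example values above, not anything from the original guide): confirm that every quorum hostname resolves from each node, since an unresolvable quorum entry is a common reason HBase fails to start.

```
all-nodes:~$ for h in ip-172-31-17-115.ec2.internal ip-172-31-17-114.ec2.internal ip-172-31-17-113.ec2.internal ip-172-31-17-112.ec2.internal; do getent hosts $h > /dev/null || echo "cannot resolve $h"; done
```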
Edit the /usr/local/hbase/conf/hbase-env.sh file on all the nodes:

```
all-nodes:~$ sudo nano /usr/local/hbase/conf/hbase-env.sh
```
Find the line below and remove the # to uncomment the export, changing the value to /usr:

```
# The java implementation to use. Java 1.7+ required.
export JAVA_HOME=/usr
```
Find the line below and remove the # to uncomment the export. Change the value to false so that HBase doesn’t manage Zookeeper (otherwise Zookeeper will stop when HBase stops):

```
# Tell HBase whether it should manage it's own instance of Zookeeper or not.
export HBASE_MANAGES_ZK=false
```
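If you would rather script these two edits than open nano on every node, here is a minimal sketch, assuming the stock hbase-env.sh still carries both settings (commented out or not) at the start of a line:

```
all-nodes:~$ sudo sed -i 's|^#* *export JAVA_HOME=.*|export JAVA_HOME=/usr|' /usr/local/hbase/conf/hbase-env.sh
all-nodes:~$ sudo sed -i 's|^#* *export HBASE_MANAGES_ZK=.*|export HBASE_MANAGES_ZK=false|' /usr/local/hbase/conf/hbase-env.sh
```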
Edit the /usr/local/hbase/conf/regionservers file on all the nodes:

```
all-nodes:~$ sudo nano /usr/local/hbase/conf/regionservers
```
Remove the localhost entry and replace it with the private DNS of ONLY the worker nodes. For example:

```
ip-172-31-17-114.ec2.internal
ip-172-31-17-113.ec2.internal
ip-172-31-17-112.ec2.internal
```
Edit the /usr/local/hbase/conf/backup-masters file on all the nodes:

```
all-nodes:~$ sudo nano /usr/local/hbase/conf/backup-masters
```
Remove the localhost entry and replace it with the private DNS of any worker nodes that should take over if the master becomes unavailable. For example:

```
ip-172-31-17-114.ec2.internal
```
Change ownership of the HBase directory on all nodes:

```
all-nodes:~$ sudo chown -R ubuntu $HBASE_HOME
```
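Now that the ubuntu user owns the HBase tree, one optional shortcut (an assumption on our part: it requires that the master can SSH to each worker as ubuntu, as set up for Hadoop) is to finish the four config files once on the master and push them out, rather than editing every node by hand. The worker hostnames below are the example values:

```
master-node:~$ for h in ip-172-31-17-114.ec2.internal ip-172-31-17-113.ec2.internal ip-172-31-17-112.ec2.internal; do scp /usr/local/hbase/conf/{hbase-site.xml,hbase-env.sh,regionservers,backup-masters} ubuntu@$h:/usr/local/hbase/conf/; done
```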
## Start HBase on the Cluster

### Starting Zookeeper on all nodes
Prior to starting HBase, make sure that Zookeeper is running correctly, as described in the Zookeeper Dev page. You can check with:

```
all-nodes:~$ echo srvr | nc localhost 2181
```

A healthy server replies with its version, latency statistics, and a Mode line (leader, follower, or standalone).
If you don’t see an output, start Zookeeper with:

```
all-nodes:~$ sudo /usr/local/zookeeper/bin/zkServer.sh start
```
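Once it is up, each node's role can be confirmed with the same script's status subcommand; one node should report leader and the rest follower:

```
all-nodes:~$ sudo /usr/local/zookeeper/bin/zkServer.sh status
```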
### Starting HDFS on all nodes, from the Master Node
You’ll also need HDFS started, which you can check from the Master node:

```
master-node:~$ hdfs dfs -ls /
```
If the command fails to connect (note that an empty but healthy HDFS also prints nothing), start HDFS with:

```
master-node:~$ sudo $HADOOP_HOME/sbin/start-dfs.sh
```
### Starting HBase

```
master-node:~$ sudo $HBASE_HOME/bin/start-hbase.sh
```
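As a quick check (an addition to the original steps), the JDK's `jps` tool should now list an HMaster process on the master and backup master, and an HRegionServer process on each worker:

```
master-node:~$ jps
```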
You can go to namenode-public-dns:16010 in your browser to check that it's working; the HBase Master web UI lists the active master, any backup masters, and the region servers.
## Using HBase
Begin using the shell by going through the examples [here](http://hbase.apache.org/book.html#shell_exercises).
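For a first taste, here is a minimal session along the lines of the book's exercises (the `test` table and `cf` column family are just example names):

```
master-node:~$ hbase shell
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> get 'test', 'row1'
hbase(main):005:0> disable 'test'
hbase(main):006:0> drop 'test'
```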