Bigdata Charmers Data Analytics With SQL Like

  • By Big Data Charmers
Channel         Version   Revision   Published
latest/stable   6         6          15 Mar 2021
latest/edge     6         6          15 Mar 2021
juju deploy bigdata-charmers-data-analytics-with-sql-like
You will need Juju 2.9 to run this command.
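
As a quick check before continuing (not part of the original instructions), you can watch the deployment until every machine and unit has come up:

juju status

Re-run the command periodically until all units report that they have started.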

Hortonworks HDP 2.1 Hive, MySQL, and Hadoop Cluster

This bundle is a 4-node Hadoop cluster designed to scale out. It contains the following units:

  • one Hadoop master (YARN & HDFS) node
  • one Hadoop compute node
  • one Hive node
  • one MySQL node

Usage

Deploy the bundle. Once the cluster is running, SSH to the Hadoop master node:

juju ssh yarn-hdfs-master/0
Smoke test HDFS admin functionality

As the HDFS user, create a /user/$CLIENT_USER directory on the Hadoop file system. The steps below verify and demonstrate HDFS functionality:

sudo su $HDFS_USER
hdfs dfs -mkdir -p /user/ubuntu
hdfs dfs -chown ubuntu:ubuntu /user/ubuntu
hdfs dfs -chmod -R 755 /user/ubuntu
exit
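
As an optional verification step (not in the original instructions), list /user from the ubuntu account to confirm the new directory exists with the expected owner and permissions:

hdfs dfs -ls /user

The listing should show a /user/ubuntu entry owned by ubuntu:ubuntu with drwxr-xr-x (755) permissions.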
Smoke test YARN and MapReduce

Run the test as the $CLIENT_USER: generate sample data with TeraGen, then sort it with TeraSort:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar teragen 10000 /user/ubuntu/teragenout
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar terasort /user/ubuntu/teragenout /user/ubuntu/terasortout
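
If you want to confirm that the TeraSort output is correctly ordered, the same examples jar also ships a TeraValidate job. This extra step is not in the original instructions, and the /user/ubuntu/teravalidateout report directory below is just a name chosen for illustration:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-*.jar teravalidate /user/ubuntu/terasortout /user/ubuntu/teravalidateout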
Smoke test HDFS functionality from ubuntu user space

Delete the MapReduce output from HDFS:

hdfs dfs -rm -r /user/ubuntu/teragenout
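
If you also produced TeraSort or TeraValidate output in the previous steps, remove those directories too (adjust the paths to whatever you actually created), then list the user directory to confirm the cleanup. This is an optional extra check, not part of the original steps:

hdfs dfs -rm -r /user/ubuntu/terasortout /user/ubuntu/teravalidateout
hdfs dfs -ls /user/ubuntu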
Hive + HDFS Usage

Create an ssh session with the Hive server, switch to the Hive user, and start the Hive console:

juju ssh hdphive/0
sudo su $HIVE_USER
hive

From the Hive console, create a table:

show databases;
create table test(col1 int, col2 string);
show tables;
exit;
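
As an optional check (not in the original steps), you can run a query non-interactively while still logged in as the Hive user. Counting the rows of the empty test table launches a MapReduce job, so it also exercises YARN from Hive:

hive -e 'select count(*) from test;'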

Exit from the Hive user session:

exit

Change to the HDFS user, verify the connection to the HDFS cluster, and confirm that a directory for the test table has been created in the Hive warehouse on HDFS:

sudo su $HDFS_USER
hdfs dfsadmin -report
hdfs dfs -ls /apps/hive/warehouse

Scale Out Usage

This bundle was designed to scale out. To increase the number of compute slaves, add more units. To add a single unit:

juju add-unit compute-node

Or you can add multiple units at once:

juju add-unit -n4 compute-node
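
Once the new units are active, you can verify (as an extra check, not in the original instructions) that the additional NodeManagers and DataNodes have registered with the cluster. From the master node:

juju ssh yarn-hdfs-master/0
sudo su $HDFS_USER
yarn node -list
hdfs dfsadmin -report
exit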

Contact Information

Amir Sanjar amir.sanjar@canonical.com

Upstream Project Name