Monday, August 19, 2013
Mapping between EMR APIs and Yarn APIs
Before we get started, we need to remember that there are some equivalent concepts between Yarn and EMR: one application in Yarn corresponds to a job flow in EMR, and one job in Yarn corresponds to a job step in EMR.
- RunJobFlow → ApplicationId submitApplication(ApplicationSubmissionContext appContext). Note that the RunJobFlow API also includes the cluster instantiation process, while submitApplication assumes that a Yarn cluster is already running.
- TerminateJobFlows → void killApplication(ApplicationId applicationId)
- DescribeJobFlows → ApplicationReport getApplicationReport(ApplicationId appId)
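The mapping above is between Java APIs, but Hadoop 2.x also exposes rough command-line analogues through the `yarn application` subcommands. The `emr_to_yarn` function below is purely illustrative (it is not part of EMR or Yarn); it just records the correspondence in a checkable form. Note that RunJobFlow has no single Yarn equivalent, since Yarn never provisions the cluster itself.

```shell
# Illustrative sketch only: emr_to_yarn is a made-up helper that records the
# rough CLI analogue of each EMR call. The `yarn application` subcommands are
# real Hadoop 2.x CLI commands; RunJobFlow maps only partially, because YARN
# assumes the cluster already exists.
emr_to_yarn() {
  case "$1" in
    TerminateJobFlows) echo "yarn application -kill <app-id>" ;;
    DescribeJobFlows)  echo "yarn application -status <app-id>" ;;
    RunJobFlow)        echo "yarn jar <app.jar> (cluster must already be up)" ;;
    *)                 return 1 ;;
  esac
}

emr_to_yarn TerminateJobFlows   # → yarn application -kill <app-id>
```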
Thursday, August 8, 2013
One-Click Hadoop Cluster Launching and Expansion on Nimbus
This tutorial will guide you through the steps of launching and expanding a Hadoop cluster on Nimbus in one click. Being able to launch a one-click Hadoop cluster allows researchers and developers to analyze data in an easy and time-effective manner. In addition, one-click Hadoop cluster expansion makes it possible for data analysts to flexibly add nodes to their cluster whenever more capacity is needed. The steps are as follows.
- Get the source files from github.
git clone https://github.com/kyrameng/OneClickHadoopClusterOnNimbus.git
- Copy the commands to bin directory and the cluster definition files to sample directory.
cp launch-hadoop-cluster.sh your_nimbus_client/bin/
cp expand-hadoop-cluster.sh your_nimbus_client/bin/
cp hadoop-cluster-template.xml your_nimbus_client/samples/
cp hadoop-add-nodes.xml your_nimbus_client/samples/
- Launch a cluster using the following command.
bin/launch-hadoop-cluster.sh --cluster samples/hadoop-cluster-template.xml --nodes 1 --conf conf/hotel.conf --hours 1
- --nodes specifies how many slave nodes you want in this cluster. This command will also launch a standalone master node for you.
- --hours specifies how long this cluster will run.
- --cluster specifies which cluster definition file to use.
- --conf specifies the site where the cluster will be launched.
SSH known_hosts contained tilde:
- '~/.ssh/known_hosts' --> '/home/meng/.ssh/known_hosts'
Requesting cluster.
- master-node: image 'hadoop-50GB-scheduler.gz', 1 instance
- slave-nodes: image 'hadoop-50GB-scheduler.gz', 1 instance
Context Broker: https://svc.uc.futuregrid.org:8443/wsrf/services/NimbusContextBroker
Created new context with broker.
Workspace Factory Service: https://svc.uc.futuregrid.org:8443/wsrf/services/WorkspaceFactoryService
Creating workspace "master-node"... done.
- 149.165.148.157 [ vm-148-157.uc.futuregrid.org ]
Creating workspace "slave-nodes"... done.
- 149.165.148.158 [ vm-148-158.uc.futuregrid.org ]
Launching cluster-042... done.
Waiting for launch updates.
- cluster-042: all members are Running
- wrote reports to '/home/meng/futuregrid/history/cluster-042/reports-vm'
Waiting for context broker updates.
- cluster-042: contextualized
- wrote ctx summary to '/home/meng/futuregrid/history/cluster-042/reports-ctx/CTX-OK.txt'
- wrote reports to '/home/meng/futuregrid/history/cluster-042/reports-ctx'
SSH trusts new key for vm-148-157.uc.futuregrid.org [[ master-node ]]
SSH trusts new key for vm-148-158.uc.futuregrid.org [[ slave-nodes ]]
cluster-042
Hadoop-Cluster-Handle cluster99
Go to the Hadoop Web UI to check your cluster status, e.g. 149.165.148.157:50030. This command also creates a unique directory for every launched Hadoop cluster to store its cluster definition files. Check your_nimbus_client/Hadoop-Cluster to explore your clusters.
- The last line of output is the Hadoop cluster handle of the newly launched cluster. We will use this information to expand the cluster later.
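Since that last line is exactly what the --handle option expects later, it can also be captured programmatically. The `cluster_handle` function below is a small sketch of ours, not part of the Nimbus client:

```shell
# Sketch (not part of the Nimbus client): extract the cluster handle from the
# launcher's output by printing the last field of the Hadoop-Cluster-Handle line.
cluster_handle() {
  awk '/Hadoop-Cluster-Handle/ { print $NF }'
}

# Typical use would pipe the real launcher through it, e.g.:
#   handle=$(bin/launch-hadoop-cluster.sh --cluster samples/hadoop-cluster-template.xml \
#       --nodes 1 --conf conf/hotel.conf --hours 1 | cluster_handle)
```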
- Use the following command to expand this cluster. Specify which cluster you want to add nodes to using the --handle option. The --nodes option specifies how many slave nodes you want to add to that cluster.
bin/expand-hadoop-cluster.sh --conf conf/hotel.conf --nodes 1 --hours 1 --cluster samples/hadoop-add-nodes.xml --handle cluster99
Its output is as follows.
SSH known_hosts contained tilde:
- '~/.ssh/known_hosts' --> '/home/meng/.ssh/known_hosts'
Requesting cluster.
- newly-added-slave-nodes: image 'ubuntujaunty-hadoop-ctx-pub_v8.gz', 1 instance
Context Broker: https://svc.uc.futuregrid.org:8443/wsrf/services/NimbusContextBroker
Created new context with broker.
Workspace Factory Service: https://svc.uc.futuregrid.org:8443/wsrf/services/WorkspaceFactoryService
Creating workspace "newly-added-slave-nodes"... done.
- 149.165.148.159 [ vm-148-159.uc.futuregrid.org ]
Launching cluster-043... done.
Waiting for launch updates.
- cluster-043: all members are Running
- wrote reports to '/home/meng/futuregrid/history/cluster-043/reports-vm'
Waiting for context broker updates.
- cluster-043: contextualized
- wrote ctx summary to '/home/meng/futuregrid/history/cluster-043/reports-ctx/CTX-OK.txt'
- wrote reports to '/home/meng/futuregrid/history/cluster-043/reports-ctx'
SSH trusts new key for vm-148-159.uc.futuregrid.org [[ newly-added-slave-nodes ]]
Wednesday, August 7, 2013
CentOS Commands Notes
- Check routing information
route
- Configure a service to start automatically on boot.
chkconfig service_name on
- Check the status of a particular port
netstat -an | grep port_num
- nfs related commands.
Export all directories listed in /etc/exports.
exportfs -a
Unexport a directory.
exportfs -u directory
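For context, exportfs reads its directory list from /etc/exports. A minimal entry looks like the following (the path and network here are hypothetical examples, not from any real configuration):

```
# /etc/exports — hypothetical example: share /data read-only with one subnet
/data    192.168.1.0/24(ro,sync)
```

After editing /etc/exports, re-run exportfs -a to apply the changes.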
- Check routing information.
ip route
- Shut down a bridge
ifconfig bridge_name down
Delete a bridge.
brctl delbr bridge_name