
Mehdi Bada

Deploy a Cloudera cluster with Terraform and Ansible in Azure – part 3

July 10, 2018 | Big Data, Cloud

After the deployment step with Terraform and the configuration/installation with Ansible, we will continue the installation of our Cloudera cluster with Cloudera Manager.

The steps below show how to install CDH on our hosts using Cloudera Manager.

Connection

First, log in to the Cloudera Manager URL.
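Before opening the browser, it can help to confirm that the admin console answers at all. The snippet below is a minimal sketch, not part of the original setup: the function name `console_is_up` is hypothetical, and the URL you pass in is an assumption (Cloudera Manager listens on port 7180 by default, but your deployment may differ).

```python
import urllib.error
import urllib.request


def console_is_up(url, timeout=5.0):
    """Return True if an HTTP GET on url succeeds (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts and HTTP errors.
        return False
```

For example, `console_is_up("http://cm-server.example.com:7180/")` (hypothetical hostname) should return `True` once the Cloudera Manager server has finished starting.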

Cloudera-Manager

When you connect to Cloudera Manager for the first time, you need to accept the Cloudera Terms and Conditions.

Cloudera-Manager-Conditions

Then choose your desired edition of Cloudera. For this blog post, we will use the Data Hub trial edition (60-day trial).

C.M-Edition

 

Hosts discovery

In this step, provide the IP address or hostname of every cluster machine you want to use. Before completing this step, check that the /etc/hosts file on each cluster host is properly defined.
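The /etc/hosts check can be scripted. The following is a minimal sketch under stated assumptions: `parse_hosts` and `missing_hosts` are hypothetical helper names, and the hostnames in the usage note are placeholders, not the ones used in this series.

```python
def parse_hosts(text):
    """Map every hostname and alias found in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry for a name
    return mapping


def missing_hosts(text, expected):
    """Return the expected hostnames that have no entry in the hosts-file text."""
    known = parse_hosts(text)
    return [host for host in expected if host not in known]
```

On each host you could run something like `missing_hosts(open("/etc/hosts").read(), ["master", "worker1", "worker2"])` and expect an empty list before continuing.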

Cloudera-Manager-Hosts

When all hosts are reachable by the Cloudera Manager server, continue to the next step.

CDH Installation

In this step, select the version of CDH to install on your cluster.

C.M-Installation_Methods

Use the parcel installation method.

By default, the parcel directory is /opt/cloudera/parcels. A best practice is to put /opt on a separate filesystem (at least 15 GB), so that the Cloudera installation is kept apart from the root filesystem (/).

If /opt is not on its own filesystem, the parcels share space with the root filesystem, which can affect the performance of your server.
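To verify this recommendation on a host, a small check like the one below can be used. It is only a sketch: the function name `check_parcel_filesystem` is hypothetical, and it interprets "at least 15GB" as the total size of the filesystem holding /opt.

```python
import os
import shutil

GIB = 1024 ** 3


def check_parcel_filesystem(path="/opt", min_gib=15):
    """Report whether path is its own mount point and how large its filesystem is."""
    usage = shutil.disk_usage(path)
    return {
        "separate_mount": os.path.ismount(path),   # True if path is a mount point
        "total_gib": usage.total / GIB,            # size of the filesystem holding path
        "large_enough": usage.total >= min_gib * GIB,
    }
```

Calling `check_parcel_filesystem()` on each cluster host should ideally report `separate_mount` and `large_enough` as `True`.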

Java automatic installation

Since we already installed the Java JDK with Ansible, we don’t need to check the box for Java installation. Skip this step.

CM-Java_install

Account details

In this step, we will provide the user account information that Cloudera Manager needs in order to install all required components on all cluster hosts.

Giving Cloudera Manager root access is not recommended; use a dedicated user with sudo access instead. For our example we will use the user created during the installation part, dbi, with its associated password.

CM-Users1

Cloudera Installation – Install Agents

In this step, Cloudera Manager will install and configure cloudera-scm-agent on all cluster hosts.
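Once the agents are installed, you may want to confirm which server each agent reports to. The sketch below assumes the agent configuration lives in /etc/cloudera-scm-agent/config.ini with a [General] section containing server_host and server_port (7182 being the usual default); verify this layout on your own hosts. The function name `agent_server_target` is hypothetical.

```python
import configparser


def agent_server_target(config_text):
    """Return (server_host, server_port) read from agent config.ini text."""
    # Interpolation is disabled so that '%' characters in values are read as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    general = parser["General"]
    return general.get("server_host"), general.getint("server_port", fallback=7182)
```

Reading the file on a host and passing its content to `agent_server_target` should return the hostname of your Cloudera Manager server.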

CM-Install-Agents

 

Cloudera Installation – Parcels installation

After installing and configuring cloudera-scm-agent, Cloudera Manager will install the CDH parcel and additional parcels on all cluster hosts.

CM-Parcels2

Cloudera Installation – Hosts Inspector

In this step, the Host Inspector checks the requirements on all cluster hosts and notifies you of any problems.

Note that you can go through every validation section to understand the prerequisites for a Cloudera installation. This shows you the complete checklist Cloudera uses to validate the hosts of your cluster.

CM-Hosts-Inspector

You can ignore the warnings for now and resolve them after the installation. Click the Finish button and go to the next step.

Cluster Setup – Select Services

In this step, choose the services to install. For our sandbox environment we will start with Core Hadoop only.

CM-Services

 

Cluster Setup – Customize Role Assignments

Assign roles to hosts and click Continue.

CM-Roles

 

Cluster Setup – Setup databases

In this step, set up the remote databases for the Hive Metastore, Hue, Cloudera Reports Manager and Oozie Server.

CM-Databases

Test the connection and click Continue.
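If the connection test fails, a quick way to rule out network problems is to check that each database port is reachable from the Cloudera Manager host. This is a sketch only: `unreachable_endpoints` is a hypothetical helper, and the hosts and ports you pass in depend on your database engine (for instance 3306 for MySQL/MariaDB or 5432 for PostgreSQL), which this post does not specify.

```python
import socket


def unreachable_endpoints(endpoints, timeout=3.0):
    """Return the names of endpoints whose TCP port cannot be reached.

    endpoints maps a label to a (host, port) tuple.
    """
    failed = []
    for name, (host, port) in endpoints.items():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                pass  # connection succeeded, nothing else to do
        except OSError:
            failed.append(name)
    return failed
```

An empty list means every database port accepted a TCP connection; it does not validate credentials, which is what the Test Connection button in Cloudera Manager does.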

Cluster Setup – Review changes

Ensure that you use the /data directory previously created with Terraform and Ansible.
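When many directory settings are listed on the review page, a small helper can flag the ones that do not live under /data. The sketch below is illustrative: `outside_data_dir` is a hypothetical name, and the property names and paths in the usage note are example values, not the ones from this cluster.

```python
from pathlib import PurePosixPath


def outside_data_dir(settings, root="/data"):
    """Return {property: [paths]} for every path that is not located under root.

    Values may hold several comma-separated directories.
    """
    root_path = PurePosixPath(root)
    bad = {}
    for name, value in settings.items():
        for raw in value.split(","):
            path = PurePosixPath(raw.strip())
            if path != root_path and root_path not in path.parents:
                bad.setdefault(name, []).append(str(path))
    return bad
```

For instance, `outside_data_dir({"dfs.datanode.data.dir": "/data/dfs/dn", "dfs.namenode.name.dir": "/dfs/nn"})` reports only the NameNode directory, since /dfs/nn is not under /data.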

CM-ReviewsUsers

 

Cluster Setup – Start services

CM-StartServices

 

Congratulations, your Cloudera cluster is now installed and configured!

CM-End

CM-Dashboard

 
