For the last few months I have been building a prototype on top of an Apache Hadoop 1.0.4 cluster that I built from scratch out of six virtual machines running Ubuntu Server 12.04.2 LTS. It has been an interesting experience. Simply put, this is the learning process that every hacker goes through on every new project, whether it's a programming language, platform or technology. Now that I have a handle on the basics, I can take an earnest look at other people's packaging.
Today I am checking out the current offering from Cloudera. I found the download named Cloudera Manager 4.5 Free Edition and proceeded with the installation. Of course I need to install it on a few nodes, so I am back to setting up some more servers.
This time I decided to use my Mac Pro server configured with VirtualBox. I planned on running a three-server cluster (cloud1, cloud2, cloud3), so I set it up and ran into a few networking problems. I got my ops department to fix my network port to allow multiple MAC addresses. Here are some of the issues and solutions I encountered when setting up the environment:
For each cloned virtual server I needed to change (persistently) its host name and MAC address. The tools (VirtualBox in this case) should have handled this properly. They did NOT. So I fixed each machine by hand.
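Roughly, the kind of per-clone change involved looks like the sketch below. It is only an illustration under a few assumptions: that the host name lives in /etc/hostname and /etc/hosts (as it does on a stock Ubuntu Server 12.04 install), and that a fresh MAC address in VirtualBox's usual 08:00:27 range is acceptable. The function names are made up for this example, and the code only builds the new values; it does not touch any files or VM settings.

```python
import random
import re


def rename_in_hosts(hosts_text, old_name, new_name):
    """Return the text of /etc/hosts with old_name replaced by new_name.

    Only whole host names are replaced, so renaming 'cloud1' leaves
    'cloud10' alone but still rewrites 'cloud1.ibi.com'.
    """
    pattern = re.compile(r"(?<![\w.-])" + re.escape(old_name) + r"(?![\w-])")
    return pattern.sub(new_name, hosts_text)


def random_vbox_mac(rng=random):
    """Return 12 hex digits starting with the 08:00:27 prefix (an assumption)."""
    suffix = "".join(f"{rng.randrange(256):02X}" for _ in range(3))
    return "080027" + suffix


if __name__ == "__main__":
    hosts = "127.0.0.1 localhost\n127.0.1.1 cloud1.ibi.com cloud1\n"
    print(rename_in_hosts(hosts, "cloud1", "cloud2"))
    print(random_vbox_mac())
```

On Ubuntu 12.04 the old MAC address is also remembered in /etc/udev/rules.d/70-persistent-net.rules, so a clone with a new MAC tends to come up with eth1 instead of eth0 until that file is cleaned up.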
My first installation was from my remote Linux desktop to my cluster, and it failed. I then decided to allocate another local instance (cloud0) and try again. The installer runs OK and I point my web browser at http://cloud0:7180, log in as admin/admin, and away we go:
This installer will deploy the following services on your cluster:
You are using Cloudera Manager (Free Edition) to install and configure your system.
I specify cloud[1-3] and get the following results:
|Expanded Query|Hostname (FQDN)|IP Address|Currently Managed|Result|
|cloud1|cloud1.ibi.com|172.30.240.110|No|Host ready: 9 ms response time.|
|cloud2|cloud2.ibi.com|172.30.240.111|No|Host ready: 7 ms response time.|
|cloud3|cloud3.ibi.com|172.30.240.112|No|Host ready: 16 ms response time.|
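The cloud[1-3] pattern is shorthand for the three hosts listed in the Expanded Query column. As a minimal sketch of that expansion (this is my own illustration, not Cloudera Manager's code, and expand_hosts is a hypothetical name):

```python
import re


def expand_hosts(pattern):
    """Expand a pattern such as 'cloud[1-3]' into a list of host names.

    A pattern without a [start-end] range is returned unchanged as a
    one-element list.
    """
    match = re.fullmatch(r"(.*?)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, start, end, suffix = match.groups()
    # Keep zero padding if the range was written as e.g. [01-10].
    width = len(start) if start.startswith("0") else 0
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(int(start), int(end) + 1)
    ]


if __name__ == "__main__":
    print(expand_hosts("cloud[1-3]"))
```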
While it took a few tries, I finally got the following:
So now it asks me to decide which CDH4 services to install. I pick Core Hadoop for my first attempt, with an embedded PostgreSQL database setup:
|Database Host Name|Database Type|Database Name|Username|Password|
and all defaults for the rest. Thirteen steps later and voilà: