
Installing and Configuring Single-Node Hadoop 2.2.0 on Oracle Linux

Hello All,

In the previous article we used the Hortonworks sandbox to work with Hadoop. Now let's create our own single-node Hadoop setup on Linux. Here we install and configure Apache Hadoop on a UI-based Oracle Linux installation.

I assume you have VMware installed on your system, so as a first step, download Oracle Linux and follow the steps to install it on a VM.

Once you have Linux installed on VMware Workstation, it is time to install and configure Hadoop on it.

Before installing Hadoop, you need to install some prerequisites.
 
  • Installing Java
 
Download the Java JDK RPM file.
What does this installer do? It installs the required binaries to a specific location and sets the Java home path as well, so there is no need to do it manually. To install it, follow the installer's steps.
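If you prefer the terminal over the graphical installer, here is a minimal sketch; the RPM file name below is illustrative, so use the one you actually downloaded:

# Install the downloaded JDK RPM (file name is illustrative)
sudo rpm -ivh jdk-7u45-linux-x64.rpm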
If you wish to check the version of the installed Java, run the "java -version" command in a terminal.

 
  • Adding a dedicated Hadoop system user.
 
In Linux, the KUser application lets you create users and groups and map one to the other. You can also do this from the terminal, as shown below.
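A minimal sketch for the terminal route; the group name "hadoop" and user name "dedicatedhadoopuser" match the ones used throughout this article:

# Create the hadoop group and a dedicated user that belongs to it
sudo groupadd hadoop
sudo useradd -m -g hadoop dedicatedhadoopuser

# Set a password for the new user
sudo passwd dedicatedhadoopuser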
  • Configuring SSH access.

Log in as the dedicated hadoop user and follow these steps:

SSH is required for Hadoop nodes to communicate with each other (even a single-node setup connects to localhost over SSH).
To set it up, run the command ssh-keygen -t rsa -P "" in a terminal as the newly created user and follow the prompts.

It will ask for the file name in which to save the key; just press Enter and the key will be generated at the default path '/home/dedicatedhadoopuser/.ssh'.

Enable SSH access to your local machine with the newly created key by running this command in a terminal:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

The final step is to test the SSH setup by connecting to your local machine as the dedicated hadoop user. Run this command in a terminal:
ssh dedicatedhadoopuser@localhost

This will add localhost permanently to the list of known hosts.
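If the test still prompts for a password, SSH may be rejecting the key because its file permissions are too loose; a commonly needed fix:

# SSH requires restrictive permissions on the key directory and file
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys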
 
  • Disabling IPv6.

Open "/" root folder and goto path "/etc/" and open "sysctl.conf" file in gedit

Add the lines below at the end of the file, then save and close it.
 
#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

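To apply the new settings without rebooting, you can reload the kernel parameters and verify the result:

# Reload kernel parameters from /etc/sysctl.conf
sudo sysctl -p

# Prints 1 if IPv6 is now disabled
cat /proc/sys/net/ipv6/conf/all/disable_ipv6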

Install Hadoop

Now we are ready to install Hadoop. Download the hadoop-2.2.0.tar.gz release from the Apache Hadoop site, then extract it with the "Extract Here" option.

Create a new folder "hadoop" under /usr/local and copy the contents of the extracted folder to that location, as in the sketch below.
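If you prefer the terminal over the GUI, a minimal sketch, assuming the archive was downloaded to your Downloads folder:

# Extract the archive and copy its contents to /usr/local/hadoop
cd ~/Downloads
tar -xzf hadoop-2.2.0.tar.gz
sudo mkdir -p /usr/local/hadoop
sudo cp -r hadoop-2.2.0/* /usr/local/hadoop/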

Log in to a terminal as the root user and run the command below to give the dedicated hadoop user ownership of the hadoop folder.
 
sudo chown -R dedicatedhadoopuser:hadoop /usr/local/hadoop



Congrats! We have completed the installation task.

Configure Hadoop

As configuration steps, we have to create or update the files below:
Files 1 to 4 are in the /usr/local/hadoop/etc/hadoop folder (open them with gedit); the last one, .bashrc, is in the user's home folder.
Note: If a file does not exist, create it.
 
1. yarn-site.xml
Copy the lines below into the file:
 
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>



2. core-site.xml
Copy the lines below into the file:
 
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>



3. mapred-site.xml
Copy the lines below into the file (see the note after the snippet if the file does not exist yet):
 
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

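Note: Hadoop 2.2.0 ships a mapred-site.xml.template rather than the file itself, so you can create the file from the template before editing it:

# Create mapred-site.xml from the bundled template
cd /usr/local/hadoop/etc/hadoop
cp mapred-site.xml.template mapred-site.xml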


4. hdfs-site.xml
Copy the lines below into the file:
 
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/yarn_data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/yarn_data/hdfs/datanode</value>
  </property>
</configuration>



Make sure the namenode and datanode directories listed above exist.
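You can create them from a terminal as the dedicated hadoop user:

# Create the HDFS namenode and datanode directories referenced above
mkdir -p /usr/local/hadoop/yarn_data/hdfs/namenode
mkdir -p /usr/local/hadoop/yarn_data/hdfs/datanode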

5. .bashrc

Open the .bashrc file in an editor, for example with gedit ~/.bashrc.

Configure the paths below, then save and close the file.
 
# Set Hadoop-related environment variables

export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

# Native Path
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"

#Java path
export JAVA_HOME='/usr/local/Java'

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin




Now restart the VM so the new environment variables take effect (or run source ~/.bashrc to reload them in the current shell).
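A quick way to confirm the environment is wired up correctly:

# Both commands should print meaningful output if .bashrc is correct
echo $HADOOP_HOME
hadoop version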
 


We have configured Hadoop. Now we have to format the NameNode and start the services.
To format it, run the hadoop namenode -format command in a terminal; make sure you are in "/usr/local/hadoop/bin".

Now we are ready to start all services. Run the start-all.sh command in a terminal; it will start all of Hadoop's daemons.
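Note: start-all.sh is deprecated in Hadoop 2.x; you can also start HDFS and YARN separately, and use jps (bundled with the JDK) to confirm the daemons are up:

# Start HDFS and YARN daemons individually (same effect as start-all.sh)
start-dfs.sh
start-yarn.sh

# List running Java processes; expect NameNode, DataNode,
# SecondaryNameNode, ResourceManager and NodeManager
jps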

You may be wondering how to check the running services. Open a browser and visit the URLs below:

1. http://localhost:50070 (the NameNode web UI)
2. http://localhost:50090 (the Secondary NameNode web UI)
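You can also check the YARN ResourceManager UI at http://localhost:8088. As an extra sanity check, try a simple HDFS operation from the terminal (the directory name is illustrative):

# Create a directory in HDFS and list the root to confirm HDFS works
hadoop fs -mkdir /mytest
hadoop fs -ls /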
Congrats! You have now installed and configured Hadoop on Linux.

Note: To download and install the JDBC driver for SQL Server, follow step 3 in the previous article.
 

 
Author Comment by: Alpesh Patel
Problem: I have heard that some of you have an issue starting the NameNode and get the error "Unable to determine address of host - falling back to "localhost"".


Resolution:

You have to add an entry to the hosts file at /etc/hosts:
[IP Address]   [hostname]  

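For example (the values here are purely illustrative):

192.168.1.50   myhadoopbox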



Now check your hostname (whether you get an IP address or not) with the command below.

$ hostname -i



If you get "hostname: Unknown host", you have to do some more configuration in the network file.

Add a hostname entry to the network file at /etc/sysconfig/network:

HOSTNAME=[yourhostname]



Now reboot the system.

Congrats! Problem resolved.
