Install Hadoop on Ubuntu 10.04 LTS

Install Hadoop on an Ubuntu 10.04 LTS server installation.

Install Ubuntu 10.04 LTS.

Update, upgrade, and install SSH

#apt-get update

#apt-get upgrade

#apt-get install ssh

Install the Sun Java 6 JDK

# See https://launchpad.net/~ferramroberto/

$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:ferramroberto/java

# Update the source list
$ sudo apt-get update

# Install Sun Java 6 JDK
$ sudo apt-get install sun-java6-jdk

$ sudo update-java-alternatives -s java-6-sun

Create the hadoop group and the hduser user

$ sudo addgroup hadoop

$ sudo adduser --ingroup hadoop hduser

Generate an SSH key so that hduser can log in to localhost without a password

#su - hduser

#ssh-keygen -t rsa -P ""

#cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
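
Before relying on the passwordless login, it can help to confirm that the public key really landed in authorized_keys. The following is a minimal sketch, not part of the original steps; the function name key_is_authorized is hypothetical, and the default paths assume the key was generated as shown above.

```shell
# Succeed if the public key in $1 appears as a whole line in $2.
# Defaults assume the id_rsa key pair created above for the current user.
key_is_authorized() (
    pub="${1:-$HOME/.ssh/id_rsa.pub}"
    auth="${2:-$HOME/.ssh/authorized_keys}"
    # Both files must exist and the public key must not be empty.
    [ -s "$pub" ] && [ -f "$auth" ] || exit 1
    # -F: fixed string, -x: match the whole line, -q: no output
    grep -qxF -- "$(cat "$pub")" "$auth"
)

# Example:
# key_is_authorized && echo "key installed" || echo "key missing"
```

The check only looks at the file contents; it does not open an SSH connection.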

Disable IPv6 and reboot

#vim /etc/modprobe.d/blacklist

Add a new line to it:

blacklist ipv6

Then add the following lines to /etc/sysctl.conf:

#disable ipv6

net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

You have to reboot your machine for the changes to take effect.

After the reboot, you can check whether IPv6 is enabled on your machine:

$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6

A value of 0 means IPv6 is enabled; a value of 1 means it is disabled (that’s what we want).
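
The same check can be wrapped in a small helper that turns the 0/1 value into a readable answer and an exit status. This is only a sketch; the function name ipv6_status is hypothetical, and the optional file argument exists purely so the helper can be tried against a test file.

```shell
# Print "disabled", "enabled", or "unknown" based on the disable_ipv6 flag.
# Exit status: 0 = disabled (what we want), 1 = enabled, 2 = unknown.
ipv6_status() (
    f="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$f" ]; then
        echo "unknown"
        exit 2
    fi
    case "$(cat "$f")" in
        1) echo "disabled" ;;
        0) echo "enabled"; exit 1 ;;
        *) echo "unknown"; exit 2 ;;
    esac
)

# Example:
# ipv6_status || echo "IPv6 is still on; check the settings and reboot"
```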

Hadoop Distributed File System (HDFS)

Image source: javacodegeeks.com

Download Hadoop from a mirror site, place hadoop-1.0.3.tar.gz in /usr/local, and unpack it

$ cd /usr/local

$ sudo tar xzf hadoop-1.0.3.tar.gz

$ sudo mv hadoop-1.0.3 hadoop

$ sudo chown -R hduser:hadoop hadoop

Confirm the Java home folder

#readlink -f `which javac`

Modify hadoop-env.sh in the Hadoop conf folder

#vim hadoop/conf/hadoop-env.sh

Uncomment the export JAVA_HOME line and point it at the Sun JDK installed above (no spaces around the = sign)

export JAVA_HOME=/usr/lib/jvm/java-6-sun
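
Rather than typing the JDK path by hand, the value for JAVA_HOME can be derived from where javac actually lives. The sketch below assumes the usual JDK layout in which javac sits in a bin directory directly under the JDK root; the function name java_home_from_javac is hypothetical.

```shell
# Print the JDK directory that owns the given javac
# (default: whichever javac is first on PATH).
java_home_from_javac() (
    javac_path="${1:-$(command -v javac)}"
    [ -n "$javac_path" ] || exit 1
    # Follow the alternatives symlinks to the real binary.
    real="$(readlink -f "$javac_path")" || exit 1
    # The real path is <JAVA_HOME>/bin/javac, so strip two components.
    dirname "$(dirname "$real")"
)

# Example:
# echo "export JAVA_HOME=$(java_home_from_javac)"
```

The printed directory may carry a version suffix rather than the shorter java-6-sun name; either form should work as long as it points at the same JDK.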

Configure core-site.xml

#vim hadoop/conf/core-site.xml

Add these lines inside the <configuration> element:

<property>

<name>hadoop.tmp.dir</name>

<value>/home/hadoop/hadoop/tmp/dir/hadoop-hadoop</value>

</property>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost</value>

</property>

Configure hdfs-site.xml

#vim hadoop/conf/hdfs-site.xml

Add these lines inside the <configuration> element (a replication factor of 1 suits a single-node setup):

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

Configure mapred-site.xml

#vim hadoop/conf/mapred-site.xml

Add these lines inside the <configuration> element:

<property>

<name>mapred.job.tracker</name>

<value>localhost:54311</value>

</property>
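
Because all three site files follow the same property/name/value pattern, they can also be generated instead of edited by hand. The following is a rough sketch under stated assumptions: the function name hadoop_site_xml is hypothetical, values are written verbatim without XML escaping, and the dfs.replication value of 1 in the example is an assumed setting for a single-node setup.

```shell
# Print a complete *-site.xml file built from name=value arguments.
# Values are inserted as-is, so they must not contain <, >, or &.
hadoop_site_xml() (
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    for pair in "$@"; do
        name="${pair%%=*}"   # text before the first =
        value="${pair#*=}"   # text after the first =
        printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' \
            "$name" "$value"
    done
    echo '</configuration>'
)

# Example (run from /usr/local; overwrites the existing files):
# hadoop_site_xml "mapred.job.tracker=localhost:54311" > hadoop/conf/mapred-site.xml
# hadoop_site_xml "dfs.replication=1" > hadoop/conf/hdfs-site.xml
```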

Format the NameNode

#hadoop/bin/hadoop namenode -format

Start the cluster

#hadoop/bin/start-all.sh

Check the Hadoop processes (jps ships with the JDK, not with Hadoop)

#jps
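
On a single-node Hadoop 1.x setup, start-all.sh normally brings up five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. As a sketch, the jps output can be piped through a small filter that lists whichever of them is missing; the function name missing_daemons is hypothetical.

```shell
# Read `jps` output ("<pid> <name>" per line) on stdin and print each
# expected Hadoop 1.x daemon that is absent. Exit status is 1 if any is missing.
missing_daemons() {
    awk '
        BEGIN { n = split("NameNode DataNode SecondaryNameNode JobTracker TaskTracker", want, " ") }
        { seen[$2] = 1 }   # exact name match, so SecondaryNameNode does not count as NameNode
        END {
            miss = 0
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print want[i]; miss = 1 }
            exit miss
        }
    '
}

# Example:
# jps | missing_daemons && echo "all daemons are running"
```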

Use netstat to check that all the services are listening

#netstat -plten | grep java

Stop the cluster

#hadoop/bin/stop-all.sh

Start the cluster again

#hadoop/bin/start-all.sh

Create a folder for the Gutenberg texts and create three text files with some content

#mkdir /tmp/gutenberg

#cd /tmp/gutenberg

#vim 1.txt

#vim 2.txt

#vim 3.txt
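
If typing text into vim is inconvenient, the three files can be created non-interactively. This is a minimal sketch; the function name make_sample_texts is hypothetical and the sentences are placeholder content, not real Project Gutenberg texts.

```shell
# Create 1.txt, 2.txt, and 3.txt with placeholder content in the given
# directory (default: /tmp/gutenberg, the folder used above).
make_sample_texts() (
    dir="${1:-/tmp/gutenberg}"
    mkdir -p "$dir" || exit 1
    printf 'the quick brown fox\n'     > "$dir/1.txt"
    printf 'jumps over the lazy dog\n' > "$dir/2.txt"
    printf 'the dog barks\n'           > "$dir/3.txt"
)

# Example:
# make_sample_texts /tmp/gutenberg
```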

Use hadoop fs -copyFromLocal to copy the files to an HDFS folder

#hadoop/bin/hadoop fs -copyFromLocal /tmp/gutenberg gutenberg

Check the HDFS folder contents

#hadoop/bin/hadoop fs -ls

#hadoop/bin/hadoop fs -ls gutenberg

Use the Java wordcount example to count the words (the examples jar ships in the Hadoop folder and its name matches the Hadoop version)

#hadoop/bin/hadoop jar hadoop/hadoop-examples-1.0.3.jar wordcount gutenberg gutenberg-output
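
To sanity-check the job's result, the same count can be approximated locally with standard tools. The sketch below assumes that words are split on whitespace only, which is how the bundled wordcount example tokenizes its input; the function name local_wordcount is hypothetical, and the ordering may differ from Hadoop's for non-ASCII words.

```shell
# Print "<word><TAB><count>" for every whitespace-separated word
# in the given files, sorted by word.
local_wordcount() {
    cat "$@" |
        tr -s '[:space:]' '\n' |   # one word per line
        grep -v '^$' |             # drop a possible leading empty line
        LC_ALL=C sort |
        uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Example:
# local_wordcount /tmp/gutenberg/*.txt
```

Comparing this output with the contents of gutenberg-output in HDFS gives a quick correctness check on small inputs.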

Hadoop Web Interfaces

http://localhost:50070/ – web UI of the NameNode daemon

http://localhost:50030/ – web UI of the JobTracker daemon

http://localhost:50060/ – web UI of the TaskTracker daemon
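
If one of these pages does not load, a first step is to see whether its port is listening at all. As a sketch, the output of the netstat command used earlier can be filtered for the three web UI ports; the function name missing_web_ports is hypothetical, and the column positions assume the Linux net-tools netstat layout.

```shell
# Read `netstat -plten` output on stdin and print each web UI port
# (50070, 50030, 50060) that has no listening socket.
# Exit status is 1 if any port is missing.
missing_web_ports() {
    awk '
        BEGIN { n = split("50070 50030 50060", want, " ") }
        $6 == "LISTEN" {
            # $4 is the local address, e.g. 0.0.0.0:50070 or :::50070
            k = split($4, parts, ":")
            seen[parts[k]] = 1
        }
        END {
            miss = 0
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print want[i]; miss = 1 }
            exit miss
        }
    '
}

# Example:
# netstat -plten | missing_web_ports && echo "all web UIs are listening"
```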

Vishal Vyas
