Install Hadoop on an Ubuntu 10.04 LTS server installation.
Install Ubuntu 10.04 LTS.
Update, upgrade, and install ssh
#apt-get update
#apt-get upgrade
#apt-get install ssh
Install the Sun Java 6 JDK
# See https://launchpad.net/~ferramroberto/
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:ferramroberto/java
# Update the source list
$ sudo apt-get update
# Install Sun Java 6 JDK
$ sudo apt-get install sun-java6-jdk
$ sudo update-java-alternatives -s java-6-sun
Create the hadoop group and a new user hduser in it
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
Generate an SSH key so that hduser can log in without a password (Hadoop's start scripts use ssh to reach each node, here localhost)
#su - hduser
#ssh-keygen -t rsa -P ""
#cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
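If ssh localhost still asks for a password after this step, overly permissive file modes are a common cause, since sshd by default ignores keys when the .ssh directory or authorized_keys is writable by other users. The following is a minimal sketch of a fix; the function name fix_ssh_perms is hypothetical and not part of OpenSSH.

```shell
# Tighten permissions on an SSH directory so sshd will accept the key.
# fix_ssh_perms is a hypothetical helper name, not part of OpenSSH.
fix_ssh_perms() {
  dir="${1:-$HOME/.ssh}"
  # Only the owner may read, write, or enter the directory.
  chmod 700 "$dir" || return 1
  # authorized_keys may not exist yet if the cat step was skipped.
  if [ -f "$dir/authorized_keys" ]; then
    chmod 600 "$dir/authorized_keys"
  fi
}
```

Run fix_ssh_perms as hduser, then try ssh localhost again.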
Disable IPv6 and reboot
#vim /etc/modprobe.d/blacklist.conf
Add a new line to it:
blacklist ipv6
Or add the following lines to /etc/sysctl.conf:
#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
You have to reboot your machine in order to make the changes take effect.
You can check whether IPv6 is enabled on your machine with:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
A value of 0 means IPv6 is enabled; a value of 1 means it is disabled (that's what we want).
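The check above can be wrapped in a small script that prints the meaning of the flag instead of the raw number. This is only a sketch; ipv6_status is a hypothetical helper name.

```shell
# Translate the kernel's disable_ipv6 flag into a readable status.
# ipv6_status is a hypothetical helper name.
ipv6_status() {
  case "$1" in
    0) echo "enabled" ;;
    1) echo "disabled" ;;
    *) echo "unknown" ;;
  esac
}

# Report the current state if the kernel exposes the flag.
flag=/proc/sys/net/ipv6/conf/all/disable_ipv6
if [ -r "$flag" ]; then
  echo "IPv6 is $(ipv6_status "$(cat "$flag")")"
fi
```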
[Figure: Hadoop Distributed File System (HDFS). Image from javacodegeeks.com]
Download Hadoop 1.0.3 from a mirror site into /usr/local, then unpack it
$ cd /usr/local
$ sudo tar xzf hadoop-1.0.3.tar.gz
$ sudo mv hadoop-1.0.3 hadoop
$ sudo chown -R hduser:hadoop hadoop
Confirm the Java home folder (the path printed ends in /bin/javac; the part before that is the Java home)
#readlink -f $(which javac)
Set JAVA_HOME in Hadoop's hadoop-env.sh
#vim hadoop/conf/hadoop-env.sh
Uncomment the export JAVA_HOME line and point it at the Sun JDK installed earlier (no spaces around the = sign):
export JAVA_HOME=/usr/lib/jvm/java-6-sun
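If you are unsure which directory to use, JAVA_HOME can be derived from the resolved location of javac by dropping the trailing /bin/javac. A minimal sketch follows; java_home_from_javac is a hypothetical helper name, and the JDK path in the usage comment is only an example of what the path may look like on your machine.

```shell
# Derive a JAVA_HOME candidate from a fully resolved javac path.
# java_home_from_javac is a hypothetical helper name.
java_home_from_javac() {
  # Strip the trailing /bin/javac; what remains is the JDK directory.
  printf '%s\n' "${1%/bin/javac}"
}

# Example with an assumed path:
#   java_home_from_javac /usr/lib/jvm/java-6-sun-1.6.0.26/bin/javac
# On a real machine, resolve the symlinks first:
#   java_home_from_javac "$(readlink -f "$(command -v javac)")"
```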
Edit the Hadoop configuration file core-site.xml
#vim hadoop/conf/core-site.xml
Add these lines between the <configuration> and </configuration> tags:
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop/tmp/dir/hadoop-hadoop</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost</value>
</property>
Make sure the directory named in hadoop.tmp.dir exists and is writable by hduser before formatting the namenode.
Edit hdfs-site.xml
#vim hadoop/conf/hdfs-site.xml
Add these lines between the <configuration> and </configuration> tags:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
Edit mapred-site.xml
#vim hadoop/conf/mapred-site.xml
Add these lines between the <configuration> and </configuration> tags:
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
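Each of the three files above has the same shape: a <configuration> element wrapping one or more <property> entries. As a sketch of that layout, the hypothetical helper below writes a complete site file from name/value pairs. It overwrites the target file and does no XML escaping, so treat it as an illustration and not a replacement for editing the files by hand.

```shell
# Write a Hadoop site file containing the given properties.
# Usage: write_site_xml FILE NAME VALUE [NAME VALUE ...]
# write_site_xml is a hypothetical helper; it OVERWRITES the file
# and does not escape XML special characters in names or values.
write_site_xml() {
  file="$1"
  shift
  {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    # Consume the remaining arguments two at a time.
    while [ "$#" -ge 2 ]; do
      printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
      shift 2
    done
    echo '</configuration>'
  } > "$file"
}
```

For example, write_site_xml hadoop/conf/mapred-site.xml mapred.job.tracker localhost:54311 would produce a file equivalent to the mapred-site.xml settings shown above.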
Format the namenode
#hadoop/bin/hadoop namenode -format
Start cluster
#hadoop/bin/start-all.sh
Check the Hadoop processes (jps ships with the JDK, not with Hadoop)
#jps
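On a single-node Hadoop 1.x setup, jps should list five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. The sketch below reads jps output on standard input and prints any of those that are absent; missing_daemons is a hypothetical helper name.

```shell
# Print the Hadoop 1.x daemons that do NOT appear in jps output.
# Reads lines of the form "<pid> <ClassName>" on standard input.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
  out="$(cat)"
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # Compare the second field exactly, so NameNode does not
    # accidentally match SecondaryNameNode.
    printf '%s\n' "$out" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$d"
  done
}
```

Run it as jps | missing_daemons; no output means all five daemons are up.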
Use netstat to check that the Hadoop services are listening on their ports
#netstat -plten | grep java
Stop cluster
#hadoop/bin/stop-all.sh
Start the cluster again
#hadoop/bin/start-all.sh
Create a folder for the Gutenberg texts and create three text files with some content
#mkdir /tmp/gutenberg
#cd /tmp/gutenberg
#vim 1.txt
#vim 2.txt
#vim 3.txt
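If you would rather not type the files in an editor, they can be generated from the shell. This is only a sketch: make_sample_files is a hypothetical helper name and the file contents are made-up sample text, so substitute any text you like.

```shell
# Create three small text files for the wordcount example.
# make_sample_files is a hypothetical helper; the contents are
# made-up sample text, not taken from Project Gutenberg.
make_sample_files() {
  dir="${1:-/tmp/gutenberg}"
  mkdir -p "$dir" || return 1
  printf 'hello hadoop\n' > "$dir/1.txt"
  printf 'hello hdfs\n' > "$dir/2.txt"
  printf 'hello mapreduce hello\n' > "$dir/3.txt"
}
```

Calling make_sample_files with no argument writes the files to /tmp/gutenberg.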
Go back to /usr/local and use hadoop fs -copyFromLocal to copy the files to an HDFS folder
#cd /usr/local
#hadoop/bin/hadoop fs -copyFromLocal /tmp/gutenberg gutenberg
Check the HDFS folder contents
#hadoop/bin/hadoop fs -ls
#hadoop/bin/hadoop fs -ls gutenberg
Run the Java wordcount example to count the words
#hadoop/bin/hadoop jar hadoop/hadoop-examples-1.0.3.jar wordcount gutenberg gutenberg-output
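For small inputs you can sanity-check the job's result against a plain shell pipeline run on the local copies of the files. The sketch below approximates what the wordcount example computes by splitting on whitespace and counting each distinct token; local_wordcount is a hypothetical helper name, and its sort order may differ from Hadoop's.

```shell
# Count whitespace-separated words in all .txt files of a directory.
# Prints "word<TAB>count" lines, one per distinct word.
# local_wordcount is a hypothetical helper name.
local_wordcount() {
  cat "$1"/*.txt |
    # Put every word on its own line.
    tr -s '[:space:]' '\n' |
    # Skip empty lines, then tally each word.
    awk 'NF { count[$0]++ } END { for (w in count) printf "%s\t%d\n", w, count[w] }' |
    sort
}
```

Compare the output of local_wordcount /tmp/gutenberg with the job's result, which can be printed with something like hadoop/bin/hadoop fs -cat 'gutenberg-output/part-*' (the exact part file names are an assumption here).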
Hadoop Web Interfaces
http://localhost:50070/ – web UI of the NameNode daemon
http://localhost:50030/ – web UI of the JobTracker daemon
http://localhost:50060/ – web UI of the TaskTracker daemon
Vishal Vyas