[Hadoop] Building a Hadoop environment (a single-node cluster)

To study Hadoop, I needed to build a test environment. I found some resources that are good enough for building a single-node Hadoop MapReduce cluster. My environment needed a few additional changes, so I am adding some comments here for my own reference.


Log in as the user "hadoop"

$ sudo su - hadoop

Go to the Hadoop installation directory

$ cd /usr/local/hadoop

Add the following variables to ~/.bashrc

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
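
After editing ~/.bashrc, the variables only take effect in a new shell or after reloading the file with "source ~/.bashrc". A quick way to confirm they are set might look like the sketch below. The function name check_hadoop_env is my own helper, not part of Hadoop, and it only checks a few of the variables listed above.

```shell
# check_hadoop_env: hypothetical helper (not part of Hadoop) that prints
# each expected variable, or "missing: NAME" when it is unset or empty.
# Returns 0 when all are set and 1 otherwise.
check_hadoop_env() {
  missing=0
  for name in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR HADOOP_INSTALL; do
    # Indirect lookup of the variable whose name is stored in $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    else
      echo "$name=$value"
    fi
  done
  return $missing
}
```

To use it, paste the function into the shell, run "source ~/.bashrc", and then call check_hadoop_env.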

Set JAVA_HOME in etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
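
The path above is the one from my machine. On another system, one way to find the right value is to resolve the real location of the java binary and strip the trailing "bin/java" (and "jre", if present). The helper name java_home_from_bin below is hypothetical, and the sketch assumes a JDK layout like the OpenJDK packages on Debian or Ubuntu.

```shell
# java_home_from_bin: hypothetical helper that derives a JAVA_HOME
# candidate from the resolved path of the java binary ($1).
java_home_from_bin() {
  home=${1%/bin/java}   # drop the trailing /bin/java
  home=${home%/jre}     # drop a trailing /jre, if the JDK has one
  printf '%s\n' "$home"
}

# Typical call on a machine with java on the PATH (GNU readlink assumed):
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```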

Start HDFS and YARN

$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
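
To check that the daemons actually came up, the JDK tool jps lists the running Java processes. On a single-node setup I would expect to see NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager. The sketch below reads jps output and prints whichever of these are missing. The function name missing_daemons is my own. If NameNode is missing on a fresh install, a likely cause is that HDFS has not been formatted yet with "bin/hdfs namenode -format".

```shell
# missing_daemons: hypothetical helper that reads `jps` output on stdin
# and prints the expected Hadoop daemons that are not listed.
missing_daemons() {
  listing=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    printf '%s\n' "$listing" | grep -qw "$daemon" || echo "$daemon"
  done
}

# Typical call:
#   jps | missing_daemons
```

An empty result means all five daemons are running.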

Finally, we can try the Hadoop MapReduce example as follows:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+'
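
The grep example reads from an "input" directory in HDFS, so that directory has to exist before the job runs. The sketch below follows the usual single-node sequence: create the HDFS home directory, copy the local etc/hadoop files in as input, run the job, and print the result. It assumes the HDFS user is "hadoop" (so relative paths resolve under /user/hadoop) and that the commands are run from the Hadoop directory. The run wrapper and the DRY_RUN variable are my own additions, so the steps can be previewed without touching the cluster.

```shell
# run: hypothetical wrapper that prints the command when DRY_RUN=1,
# and executes it otherwise.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# run_grep_example: prepares the input in HDFS, runs the grep job,
# and prints the output. Stops at the first failing step.
run_grep_example() {
  run bin/hdfs dfs -mkdir -p /user/hadoop &&
  run bin/hdfs dfs -put etc/hadoop input &&
  run bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar grep input output 'dfs[a-z.]+' &&
  run bin/hdfs dfs -cat 'output/*'
}
```

Set DRY_RUN=1 before calling run_grep_example to see the commands first. Note that the job fails if the "output" directory already exists in HDFS, so it has to be removed before a second run.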


