Tuesday 7 April 2015

Running Hadoop examples on Cloudera Quickstart

Cloudera provides the Quickstart VM, a downloadable image for VMware, KVM or VirtualBox that contains everything needed to run a single-node Hadoop environment. It also includes additional components such as Hive and ZooKeeper.

1. Open a terminal (right-click on the desktop, or click the Terminal icon in the top toolbar).

2. Navigate to the Hadoop library directory:
cd /usr/lib/hadoop-mapreduce/


3. Execute the Hadoop jar command to run the WordCount example:
hadoop jar hadoop-mapreduce-examples.jar wordcount

4. Run without arguments, the wordcount example only prints a usage message, because it needs an input path and an output path:
Usage: wordcount <in> <out>

5. Create one or more text files with a few words in them for testing, or use a log file:
echo "count these words for me hadoop" > /home/cloudera/file1
echo "hadoop counts words for me" > /home/cloudera/file2

6. Create an input directory in HDFS:
hdfs dfs -mkdir /user/cloudera/input

7. Copy the files from the local filesystem to the HDFS input directory:
hdfs dfs -put /home/cloudera/file1 /user/cloudera/input
hdfs dfs -put /home/cloudera/file2 /user/cloudera/input

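8. Run the Hadoop WordCount example with the input and output directories specified. The output directory must not exist yet; Hadoop refuses to overwrite it, so delete it with hdfs dfs -rm -r before rerunning the job: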
hadoop jar hadoop-mapreduce-examples.jar wordcount /user/cloudera/input /user/cloudera/output

9. Hadoop prints a lot of logging information while the job runs. After it completes, list the output directory:
hdfs dfs -ls /user/cloudera/output

10. Check the output file to see the results:
hdfs dfs -cat /user/cloudera/output/part-r-00000
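
To sanity-check the job's result, the same count can be reproduced locally with standard Unix tools. The sketch below is not part of Hadoop: the function name wordcount_local and the temporary directory are assumptions made for illustration, and it splits on single spaces only, which is enough for the two sample files from step 5.

```shell
#!/bin/sh
# Local cross-check for the WordCount job, using only standard Unix tools.
# Assumption of this sketch: the sample files are recreated in a temporary
# directory so that nothing under /home/cloudera is touched.
workdir=$(mktemp -d)
echo "count these words for me hadoop" > "$workdir/file1"
echo "hadoop counts words for me" > "$workdir/file2"

# Hypothetical helper: put one word on each line, sort byte-wise,
# count the duplicates, then print "word<TAB>count".
wordcount_local() {
    cat "$@" | tr -s ' ' '\n' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

wordcount_local "$workdir/file1" "$workdir/file2"
rm -r "$workdir"
```

For the two sample files this prints count 1, counts 1, for 2, hadoop 2, me 2, these 1 and words 2, one tab-separated pair per line, which is what part-r-00000 should contain as well.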
