Cloudera provides a very complete Quickstart VM, a downloadable image for VMware, KVM or VirtualBox that contains everything needed to run a single-node Hadoop environment. It also includes additional components such as Hive and ZooKeeper.
1. Open a Terminal (right-click the desktop or click the Terminal icon in the top toolbar)
2. Navigate to the Hadoop library directory:
cd /usr/lib/hadoop-mapreduce/
3. Execute the Hadoop jar command to run the WordCount example:
hadoop jar hadoop-mapreduce-examples.jar wordcount
4. Run without arguments, the wordcount example complains that it needs input and output parameters:
Usage: wordcount <in> <out>
5. Create one or more text files with a few words in them for testing, or use a log file:
echo "count these words for me hadoop" > /home/cloudera/file1
echo "hadoop counts words for me" > /home/cloudera/file2
6. Create an input directory in HDFS:
hdfs dfs -mkdir /user/cloudera/input
7. Copy the files from the local filesystem to HDFS:
hdfs dfs -put /home/cloudera/file1 /user/cloudera/input
hdfs dfs -put /home/cloudera/file2 /user/cloudera/input
8. Run the Hadoop WordCount example with the input and output specified:
hadoop jar hadoop-mapreduce-examples.jar wordcount /user/cloudera/input /user/cloudera/output
9. Hadoop prints a lot of logging information. After the job completes, list the output directory:
hdfs dfs -ls /user/cloudera/output
10. Check the output file to see the results:
hdfs dfs -cat /user/cloudera/output/part-r-00000
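To sanity-check what the job should print, the same counts can be approximated locally with standard Unix tools. This is only a sketch, not how Hadoop does it: it assumes the stock WordCount example splits on whitespace and writes one tab-separated word and count per line, sorted by word. The function name local_wordcount and the temporary directory are made up for illustration.

```shell
# Local approximation of the WordCount job (assumption: words are split on
# whitespace and results are emitted as "word<TAB>count", sorted by word).
# local_wordcount is a hypothetical helper, not part of Hadoop.
local_wordcount() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$@" |
        LC_ALL=C sort
}

# Recreate the two sample files from step 5 in a scratch directory.
workdir=$(mktemp -d)
echo "count these words for me hadoop" > "$workdir/file1"
echo "hadoop counts words for me" > "$workdir/file2"

# Count the words across both files, then clean up.
local_wordcount "$workdir/file1" "$workdir/file2"
rm -rf "$workdir"
```

For the two sample files this prints seven lines: count, counts, these with a count of 1, and for, hadoop, me, words with a count of 2. If the contents of part-r-00000 differ, check that both files were actually copied into /user/cloudera/input.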