What are the differences and similarities between a normal MapReduce job and a sequential MapReduce job?

When we execute a normal wordcount job, we use a MapReduce program to do so. It is not sequential. But for programs like shortest-path analysis of large graphs, we have to design a "sequential" MapReduce job. What are the basic differences or similarities between these two approaches to MapReduce programming?

Answers


Since you mention that your wordcount job is not sequential, I assume you are using the sample wordcount job, where the map phase splits the input into keys (words) and the reduce phase does the processing (counting). Consequently, the tasks can be split across different nodes and executed simultaneously.

I suggest you read the tutorial at https://developer.yahoo.com/hadoop/tutorial/module4.html so that you can see that, depending on the number of available nodes, tasks are distributed even in the map phase!
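To make the wordcount flow described above concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and the shuffle step only imitates the grouping that the framework does between the two phases. The point is that every call to map_phase and every call to reduce_phase is independent of the others, which is what lets the framework spread them over nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, imitating what the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    # Each line could be mapped on a different node; here we just loop.
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    # Each key could be reduced on a different node as well.
    return dict(reduce_phase(word, counts) for word, counts in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "the lazy dog", "The end"]))
```

On a real cluster the input lines would come from file splits and the grouped keys would be partitioned across reducers, but the shape of the computation is the same.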

Regarding your "sequential" MapReduce job: I assume you mean that there is no way to divide the processing and still get the desired result. If that is the case, I suspect you won't get the best out of the Hadoop MapReduce framework, since your reduce phase will have to run on a single node. However, a quick search should turn up graph-processing algorithms, such as variants of Dijkstra's shortest path, that have been designed for MapReduce.
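As a rough illustration of how such graph algorithms are often structured, here is a plain-Python sketch under one assumption: that "sequential" means a chain of MapReduce jobs, where the output of one job is the input of the next. Note that this is a round-based relaxation (closer to parallel breadth-first search or Bellman-Ford than to Dijkstra proper), and none of it is Hadoop API; map_node, reduce_node, run_round and shortest_paths are hypothetical names. Inside each round the map and reduce calls are still independent per node, and only the rounds themselves have to run one after another.

```python
import math
from collections import defaultdict


def map_node(node, record):
    """Pass the node's own record along and propose distances to its neighbours."""
    dist, edges = record
    yield node, ("NODE", dist, edges)
    if dist != math.inf:
        for neighbour, weight in edges.items():
            yield neighbour, ("DIST", dist + weight, None)


def reduce_node(node, values):
    """Keep the smallest distance seen for this node and restore its edge list."""
    best, edges = math.inf, {}
    for kind, dist, adjacency in values:
        if kind == "NODE":
            edges = adjacency
        best = min(best, dist)
    return node, (best, edges)


def run_round(graph):
    """One simulated MapReduce job: map every node, group by key, reduce every key."""
    groups = defaultdict(list)
    for node, record in graph.items():
        for key, value in map_node(node, record):
            groups[key].append(value)
    return dict(reduce_node(key, values) for key, values in groups.items())


def shortest_paths(adjacency, source):
    """Driver: chain rounds until a round changes nothing (assumes non-negative weights)."""
    graph = {
        node: (0 if node == source else math.inf, dict(edges))
        for node, edges in adjacency.items()
    }
    while True:
        new_graph = run_round(graph)
        if new_graph == graph:
            return {node: dist for node, (dist, _) in graph.items()}
        graph = new_graph


if __name__ == "__main__":
    # Hypothetical example graph; every node appears as a key.
    example = {"A": {"B": 1, "C": 4}, "B": {"C": 2, "D": 5}, "C": {"D": 1}, "D": {}}
    print(shortest_paths(example, "A"))
```

So the difference from wordcount in this reading is the driver loop: wordcount finishes in one job, while the graph search needs several jobs in sequence, each of which is still parallel internally.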

Cheers Marco

