What is the method of Map and Reduce – How it works? (Part 2)


Read the 1st part of the article 

The Reduce Phase

Once the input data has been split into multiple datasets and each one has been analyzed by its own map task, the reduce phase begins. The outputs of the map tasks are given as input to multiple reduce tasks, which also run in parallel. Finally, the processed outputs from the different map tasks are aggregated into a single consolidated result. The results are stored in HDFS by default.
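The flow described above can be illustrated with a small word-count sketch in plain Python. This is not Hadoop code: the function names (`partition`, `reduce_task`, `reduce_phase`), the use of a thread pool, and the sample data are all assumptions made for illustration, and the result is returned in memory rather than written to HDFS.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(map_outputs, num_reducers):
    """Route each (key, value) pair to one reducer and group values by key."""
    partitions = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in map_outputs:
        for key, value in pairs:
            # The same key always lands in the same partition within one run.
            partitions[hash(key) % num_reducers][key].append(value)
    return partitions


def reduce_task(grouped):
    """Hypothetical reduce task: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def reduce_phase(map_outputs, num_reducers=2):
    """Run the reduce tasks in parallel and merge their outputs."""
    partitions = partition(map_outputs, num_reducers)
    with ThreadPoolExecutor(max_workers=num_reducers) as pool:
        partial_results = list(pool.map(reduce_task, partitions))
    final = {}
    for part in partial_results:
        final.update(part)  # keys never overlap across partitions
    return final


# Pretend these lists are the outputs of two finished map tasks.
map_outputs = [
    [("hadoop", 1), ("map", 1), ("hadoop", 1)],
    [("reduce", 1), ("map", 1)],
]
result = reduce_phase(map_outputs)
```

Here `result` holds the total count for each word across both map outputs. Partitioning by key is what lets the reduce tasks run independently: every value for a given key ends up with exactly one reducer.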

Map and Reduce Phases Run in Parallel

We saw that the output of the map phase is fed as input to the reduce phase. In practice, however, these two phases are not strictly sequential. As soon as any map task completes, the reduce tasks can begin collecting its output, so the map and reduce phases overlap instead of running one after the other.
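A rough way to picture this overlap is the sketch below, again in plain Python and not in Hadoop itself. The names `map_task` and `run_overlapped` and the sample chunks are hypothetical. The sketch folds each map output into a running total as soon as that map task finishes, which is a simplification that works here only because adding counts can be done in any order.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


def map_task(chunk):
    """Hypothetical map task: count the words in one chunk of input."""
    return Counter(chunk.split())


def run_overlapped(chunks):
    """Consume each map output as soon as it is ready, not after all maps end."""
    totals = Counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(map_task, chunk) for chunk in chunks]
        # as_completed yields futures in completion order, so the reduce-side
        # work starts while other map tasks may still be running.
        for future in as_completed(futures):
            totals.update(future.result())
    return dict(totals)


chunks = ["map reduce map", "reduce hadoop", "hadoop map"]
overlapped = run_overlapped(chunks)
```

The final totals do not depend on which map task finishes first, which is why the reduce side can safely start early.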