Vision

In this project, we plan to implement the MapReduce paradigm and demonstrate it with a couple of sample applications. The system consists of a client, a master, and multiple workers (mappers and reducers). The client submits a task to the master, which partitions the dataset and assigns the pieces to the mappers and reducers to produce the final result. The user provides the input dataset as a file on NFS (Network File System). We also plan to provide a GUI that depicts the state of the MapReduce system, and to provide fault tolerance so that the system can handle failures. To validate our MapReduce API, we plan to test it with a couple of applications, namely word count and inverted index.
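
To make the data flow concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce steps applied to the two planned applications. It is an illustration only, not the project's API: the names run_mapreduce, wc_map, wc_reduce, ii_map, and ii_reduce, the round-robin partitioning, and the (doc_id, text) record shape are all assumptions made for this example. The client, the networked workers, NFS input, the GUI, and fault tolerance are left out.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn, num_mappers=2):
    """Run map and reduce over (key, value) records in a single process.

    Hypothetical stand-in for the master: it splits the input, feeds each
    split to a "mapper", groups intermediate pairs by key, and reduces them.
    """
    # Partition the dataset round-robin, one slice per mapper (assumed scheme).
    partitions = [records[i::num_mappers] for i in range(num_mappers)]

    # Map phase plus shuffle: collect every emitted value under its key.
    intermediate = defaultdict(list)
    for partition in partitions:
        for key, value in partition:
            for out_key, out_value in map_fn(key, value):
                intermediate[out_key].append(out_value)

    # Reduce phase: one reduce call per distinct intermediate key.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}


def wc_map(doc_id, text):
    """Word count: emit (word, 1) for every word in the document."""
    for word in text.lower().split():
        yield word, 1


def wc_reduce(word, counts):
    """Word count: sum the partial counts for one word."""
    return sum(counts)


def ii_map(doc_id, text):
    """Inverted index: emit (word, doc_id) once per distinct word."""
    for word in set(text.lower().split()):
        yield word, doc_id


def ii_reduce(word, doc_ids):
    """Inverted index: return the sorted list of documents containing the word."""
    return sorted(set(doc_ids))


if __name__ == "__main__":
    docs = [("d1", "the quick fox"), ("d2", "the lazy dog")]
    print(run_mapreduce(docs, wc_map, wc_reduce))
    print(run_mapreduce(docs, ii_map, ii_reduce))
```

In this sketch an application supplies only a map function and a reduce function, and the partitioning and grouping logic is shared between word count and inverted index.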

Goals

  • To design and implement a MapReduce API that enables building MapReduce applications.
  • To demonstrate the feasibility of our system using a couple of sample applications.