Author(s): D. Saravanan, K. Girija, B. Kumaragurubaran, T. Senthilkumar
Abstract: As computer systems create and collect growing amounts of data, analyzing it becomes a basic part of improving the services provided by Internet companies. A key property of the workloads processed by Map Reduce applications is that they are often incremental by nature; i.e., Map Reduce jobs often run repeatedly with small changes in their input. In this paper, we explain the architecture, implementation, and evaluation of a Map Reduce framework for incremental computations, named the I2 Map Reduce framework. I2 Map Reduce detects changes to the inputs and enables the automatic update of the outputs by employing an efficient, fine-grained result reuse mechanism. To attain efficiency without giving up transparency, we adopt recent advances in the area of programming languages to identify methodically the shortcomings of task-level memoization approaches, and we address them using several novel techniques: a storage system to store the input of consecutive runs, a reduction phase that makes the incremental computation of the reduce tasks more efficient, and a scheduling algorithm for Hadoop that is aware of the location of previously computed results.
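
The result reuse idea in the abstract can be illustrated with a minimal sketch of task-level memoization for a word-count job. This is not the authors' implementation: the names `split_key`, `map_word_count`, and `run_job` are hypothetical, the in-memory dictionary stands in for persistent storage, and the storage system, reduction phase, and location-aware scheduler are not modeled. The sketch assumes each input split is identified by a hash of its content, so a map task is re-executed only when its split has changed since an earlier run.

```python
import hashlib
from collections import Counter


def split_key(split: str) -> str:
    """Identify an input split by a hash of its content (an assumption of this sketch)."""
    return hashlib.sha256(split.encode("utf-8")).hexdigest()


def map_word_count(split: str) -> Counter:
    """Map task: count the words in one input split."""
    return Counter(split.split())


def run_job(splits, memo):
    """Run a word-count job, reusing memoized map outputs for unchanged splits.

    Returns the combined word counts and the number of map tasks
    that actually had to be executed in this run.
    """
    executed = 0
    total = Counter()
    for split in splits:
        key = split_key(split)
        if key not in memo:
            # Split is new or changed: run the map task and remember its output.
            memo[key] = map_word_count(split)
            executed += 1
        # Reduce step: merge the (possibly reused) map output into the result.
        total.update(memo[key])
    return total, executed


memo = {}  # stands in for storage that persists across runs

first_run = ["a b a", "c d", "e e e"]
counts1, executed1 = run_job(first_run, memo)

# Second run: only the middle split has changed.
second_run = ["a b a", "c d x", "e e e"]
counts2, executed2 = run_job(second_run, memo)

print(executed1, executed2)
```

In the second run only the changed split is mapped again, while the other two map outputs are reused. The reduce step here still merges every map output from scratch, which is the kind of cost that a more efficient incremental reduction phase would aim to lower.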