MapReduce Projects

Technology

MapReduce

MapReduce is a distributed programming model that splits massive datasets across commodity hardware for parallel processing, so jobs scale out by adding machines.

MapReduce is a programming model and framework for distributed computing, introduced at Google by Jeffrey Dean and Sanjay Ghemawat in 2004 (Dean & Ghemawat, OSDI '04). It is designed for very large datasets and requires the user to define two functions: Map and Reduce. The Map function processes input key/value pairs to generate a set of intermediate key/value pairs. The framework then groups the intermediate values by key, and the Reduce function merges all values associated with the same key into a summary result (the classic example is a word count). The framework handles the rest: partitioning the input, scheduling work across thousands of commodity machines, inter-machine communication, and fault tolerance. The most popular open-source implementation is part of the Apache Hadoop ecosystem.
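The word-count example mentioned above can be sketched in a few lines of Python. This is a minimal single-process illustration of the model, not Google's or Hadoop's implementation: the names map_words, reduce_counts, and run_mapreduce are hypothetical, and the in-memory grouping step stands in for the partitioning, shuffling, and scheduling a real framework would perform across machines.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map: emit an intermediate (word, 1) pair for every word in the document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce: sum all intermediate values that share the same key."""
    return word, sum(counts)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process driver (an assumption for illustration).

    A real framework would run map and reduce tasks on many machines;
    here the grouping by key happens in one in-memory dictionary.
    """
    groups = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    # Process keys in sorted order, mirroring the sorted grouping a framework provides.
    return dict(reduce_fn(k, vs) for k, vs in sorted(groups.items()))


documents = [
    ("doc1", "the quick brown fox"),
    ("doc2", "the lazy dog and the fox"),
]
word_counts = run_mapreduce(documents, map_words, reduce_counts)
print(word_counts)
```

Because the user-supplied functions only see one input pair or one key at a time, the same map_words and reduce_counts could in principle be run by a distributed driver without modification, which is the main appeal of the model.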

https://research.google.com/archive/mapreduce.html
1 project · 1 city

Related technologies

Recent Talks & Demos
