Map Reduce - Data (Stream) - <key, value> pairs


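The MapReduce framework operates exclusively on <key, value> pairs: it views the input of a job as a set of <key, value> pairs and produces a set of <key, value> pairs as output, conceivably of different types.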

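The key and value classes have to be serializable by the framework:

  • key and value classes implement the Writable interface (to be serializable)
  • key classes additionally implement the WritableComparable interface (to facilitate sorting by the framework)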
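
As an illustration of the two contracts listed above, here is a minimal sketch of a custom key type. To keep it compilable without Hadoop on the classpath, the two interfaces are local stand-ins that mirror the shape of org.apache.hadoop.io.Writable and org.apache.hadoop.io.WritableComparable; in a real job you would import the Hadoop ones instead. The YearStationKey class, its fields, and the roundTrip helper are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Local stand-in mirroring org.apache.hadoop.io.Writable (assumption: used
// here only so the sketch compiles without the Hadoop libraries).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Local stand-in mirroring org.apache.hadoop.io.WritableComparable.
interface WritableComparable<T> extends Writable, Comparable<T> {
}

// Hypothetical composite key: (year, stationId).
final class YearStationKey implements WritableComparable<YearStationKey> {
    private int year;
    private String stationId = "";

    // No-argument constructor: an empty instance is created first,
    // then filled in by readFields.
    YearStationKey() {
    }

    YearStationKey(int year, String stationId) {
        this.year = year;
        this.stationId = stationId;
    }

    // Serialization: write the fields in a fixed order.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(year);
        out.writeUTF(stationId);
    }

    // Deserialization: read the fields back in the same order.
    @Override
    public void readFields(DataInput in) throws IOException {
        year = in.readInt();
        stationId = in.readUTF();
    }

    // Sort order: by year first, then by station id.
    @Override
    public int compareTo(YearStationKey other) {
        int byYear = Integer.compare(year, other.year);
        return byYear != 0 ? byYear : stationId.compareTo(other.stationId);
    }

    // equals and hashCode are kept consistent with compareTo.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof YearStationKey)) {
            return false;
        }
        YearStationKey k = (YearStationKey) o;
        return year == k.year && stationId.equals(k.stationId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(year, stationId);
    }

    // Helper for the example: serialize a key to bytes and read it back.
    static YearStationKey roundTrip(YearStationKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            YearStationKey copy = new YearStationKey();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A type used only as a value would need just the write and readFields pair; compareTo matters for keys because the framework sorts them.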

See also MapReduce - InputFormat

Example of pipeline

The input and output types of a MapReduce job form a pipeline:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
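
To make the type flow of the pipeline above concrete, here is a plain-Java word-count sketch that simulates it in memory. It does not use the Hadoop API; the class and method names (PipelineSketch, map, combine, reduce, run) are hypothetical, and each input line is treated as the input of its own map task so that the combine step has something to pre-aggregate. Here <k1, v1> is <line offset, line text>, <k2, v2> is <word, partial count>, and <k3, v3> is <word, total count>.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PipelineSketch {

    // map: <k1 = line offset, v1 = line text> -> list of <k2 = word, v2 = 1>
    static List<Map.Entry<String, Integer>> map(long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // combine: local pre-aggregation of one map output, <k2, v2> -> <k2, v2>
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> partial = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            partial.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return partial;
    }

    // reduce: <k2, all v2 for that key> -> <k3 = word, v3 = total count>
    static int reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    // Runs the whole pipeline over the given lines.
    static Map<String, Integer> run(List<String> lines) {
        // Stand-in for the shuffle: group values by key, keys kept sorted.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        long offset = 0;
        for (String line : lines) {
            Map<String, Integer> partial = combine(map(offset, line));
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                       .add(e.getValue());
            }
            offset += line.length() + 1; // +1 for the line separator
        }
        Map<String, Integer> output = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            output.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return output;
    }
}
```

Calling PipelineSketch.run on the two lines "a b a" and "b c" yields the counts a=2, b=2, c=1. The combine step keeps the same key and value types as the map output, which is why it can sit between map and reduce without changing the rest of the pipeline.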

