Map Reduce - Data (Stream) - <key, value> pairs

1 - About

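The MapReduce framework operates exclusively on <key, value> pairs: it views the input of a job as a set of <key, value> pairs and produces a set of <key, value> pairs as the output, conceivably of different types.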

The key and value classes have to implement the following interfaces:

  • the Writable interface (keys and values), so that the framework can serialize them
  • the WritableComparable interface (keys only), so that the framework can sort them
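As an illustration, a custom key class could look roughly like the sketch below. The class name WordYearKey, its fields, and the roundTrip helper are hypothetical. To stay compilable without Hadoop on the classpath, the class only implements Comparable; in a real job it would be declared as implementing org.apache.hadoop.io.WritableComparable<WordYearKey>, whose write and readFields methods have the signatures used here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical composite key (word, year).
// Assumption: with Hadoop available, this class would declare
// "implements WritableComparable<WordYearKey>" instead of Comparable.
class WordYearKey implements Comparable<WordYearKey> {
    private String word = "";
    private int year;

    // No-argument constructor, so an empty instance can be created
    // and then filled by readFields.
    WordYearKey() { }

    WordYearKey(String word, int year) {
        this.word = word;
        this.year = year;
    }

    // Serialization: write the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(word);
        out.writeInt(year);
    }

    // Deserialization: read the fields back in the same order.
    public void readFields(DataInput in) throws IOException {
        word = in.readUTF();
        year = in.readInt();
    }

    // Sort order of the keys: by word, then by year.
    @Override
    public int compareTo(WordYearKey other) {
        int byWord = word.compareTo(other.word);
        return byWord != 0 ? byWord : Integer.compare(year, other.year);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof WordYearKey)) return false;
        WordYearKey k = (WordYearKey) o;
        return year == k.year && word.equals(k.word);
    }

    @Override
    public int hashCode() {
        return Objects.hash(word, year);
    }

    // Helper for this sketch only: serialize a key to bytes and read it back.
    static WordYearKey roundTrip(WordYearKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            WordYearKey copy = new WordYearKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A value class would only need the write and readFields pair; compareTo matters for keys because the framework sorts by key.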

See also MapReduce - InputFormat

3 - Example of pipeline

The input and output types of a MapReduce job form a pipeline:

(input) <k1, v1> -> map -> <k2, v2> -> combine -> <k2, v2> -> reduce -> <k3, v3> (output)
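To make the type flow concrete, here is a minimal word-count sketch in plain Java that simulates the three stages in memory. It does not use the Hadoop API: the class PipelineSketch and its methods are hypothetical, and plain Long, String and Integer stand in for the Writable types (a Hadoop word count would typically use LongWritable, Text and IntWritable). Here k1 is a line offset, v1 a line of text, k2 and k3 a word, v2 and v3 a count.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory simulation of the pipeline, not a real MapReduce job.
class PipelineSketch {

    // map: <k1 = line offset, v1 = line> -> list of <k2 = word, v2 = 1>
    static List<Map.Entry<String, Integer>> map(Long offset, String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new SimpleEntry<>(word, 1));
            }
        }
        return out;
    }

    // combine and reduce use the same logic in this sketch: sum the values per key.
    // A TreeMap keeps the keys sorted, mimicking the sort by key.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }

    // Runs map -> combine -> reduce over the given lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> combined = new ArrayList<>();
        long offset = 0;
        for (String line : lines) {
            // Each line is treated as the input of one map call;
            // its output is combined locally: <k2, v2> -> <k2, v2>.
            Map<String, Integer> local = sumByKey(map(offset, line));
            combined.addAll(local.entrySet());
            offset += line.length() + 1;
        }
        // reduce: <k2, v2> -> <k3 = word, v3 = total count>
        return sumByKey(combined);
    }
}
```

Calling PipelineSketch.run on the lines "a b a" and "b c" yields the counts a=2, b=2, c=1. The combine step keeps the <k2, v2> types unchanged, which is why it can sit between map and reduce without altering the pipeline.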

