Spark - Key-Value RDD

About

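In Python, Spark represents a key-value (pair) RDD as an RDD of two-element tuples: the first element is the key and the second is the value.

Counting an RDD of tuples returns the number of tuples, not the number of individual elements. Each tuple can be seen as a row.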

Construction

Spark RDD - (Creation|Construction|Initialization)

rdd = sc.parallelize([(1, 2), (3, 4)])

RDD: [(1, 2), (3, 4)]

Transformation

Some Key-Value Transformations

Action

Spark - Collect

Documentation / Reference

Java PairRDDFunctions Class