About
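This page lists the transformation functions of a Spark RDD. A transformation returns a new RDD instead of modifying the source, and it is evaluated lazily: nothing is computed until an action asks for a result.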
List
Transformation | Description |
---|---|
filter | returns a new data set that's formed by selecting those elements of the source on which a function returns true. |
distinct([numTasks]) | returns a new data set that contains the distinct elements of the source data set. |
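map and flatMap | map returns a new distributed data set that's formed by passing each element of the source through a function. flatMap is similar, but the function may return zero or more output elements for each input element, and the results are flattened into one data set. |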
zip (optionally with index or id: zipWithIndex, zipWithUniqueId) | returns key-value pairs made of the i-th element of each RDD: <math>\forall i\in \{0, \dots, N\}, (rdd1_i,rdd2_i)</math> |
split | splits a data set into several data sets (for example, randomSplit with a list of weights). |
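pipe | pipes each partition of the data set through a shell command: the elements are written to the process's stdin, and the lines it writes to stdout are returned as a data set of strings. |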
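
The semantics of several transformations in the table can be sketched without a cluster. The block below is only a local analogy in plain Python: it uses lists in place of RDDs, and the variable names (`source`, `lines`, `left`, `right`) are made up for illustration. The comments name the PySpark RDD method that each line imitates. It does not reproduce Spark's partitioning or lazy evaluation.

```python
# Plain-Python stand-ins for a few RDD transformations (no Spark required).
from itertools import chain

source = [1, 2, 2, 3, 4, 4]

# rdd.filter(f): keep the elements for which f returns True.
evens = [x for x in source if x % 2 == 0]

# rdd.distinct(): drop duplicates. Spark does not guarantee the order
# of the result, so it is sorted here only to get a stable value.
unique = sorted(set(source))

lines = ["a b", "c"]

# rdd.map(f): exactly one output element per input element.
mapped = [line.split(" ") for line in lines]

# rdd.flatMap(f): f returns a sequence and the sequences are flattened.
flat = list(chain.from_iterable(line.split(" ") for line in lines))

left = ["x", "y", "z"]
right = [10, 20, 30]

# rdd1.zip(rdd2): pair the i-th element of each data set.
zipped = list(zip(left, right))

# rdd.zipWithIndex(): pair each element with its position.
indexed = [(value, i) for i, value in enumerate(left)]

print(evens, unique, mapped, flat, zipped, indexed, sep="\n")
```

The difference between map and flatMap shows in `mapped` and `flat`: map keeps one list per input line, while flatMap yields the individual words.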