# Spark - (RDD) Transformation

## About

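A transformation is an RDD function that returns a new RDD built from an existing one. Transformations are lazy: nothing is computed until an action asks for a result.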

## List

Transformation | Description |
---|---|
filter | returns a new data set formed by selecting the elements of the source on which a function returns true. |
distinct([numTasks]) | returns a new data set that contains the distinct elements of the source data set. |
map and flatMap | return a new distributed data set formed by passing each element of the source through a function. With map, each input element yields exactly one output element; with flatMap, it yields zero or more. |
zip (optionally with index or id) | returns key-value pairs made of the i-th element of each RDD: <math>\forall i\in \{0, \dots, N\} (rdd1_i,rdd2_i)</math> |
split | splits the data set into several data sets (for example with `randomSplit`). |
pipe | pipes each partition through an external shell command and returns the command's output lines as a new data set of strings. |
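
The semantics of the transformations listed above can be illustrated without a cluster. The following is a minimal sketch that models them on plain Python lists; it is not the Spark API. The helper names (`rdd_filter`, `rdd_distinct`, and so on) are hypothetical, and the comments name the RDD method each one is assumed to mirror.

```python
from itertools import chain

# Plain-Python stand-ins for RDD transformations (hypothetical helper names).
# Each function returns a new list and leaves its input untouched, mirroring
# the fact that a transformation yields a new data set.

def rdd_filter(data, predicate):
    # Models rdd.filter(f): keep the elements on which the predicate is true.
    return [x for x in data if predicate(x)]

def rdd_distinct(data):
    # Models rdd.distinct(): drop duplicates. Spark gives no ordering
    # guarantee; this local model happens to keep first-seen order.
    return list(dict.fromkeys(data))

def rdd_map(data, f):
    # Models rdd.map(f): exactly one output element per input element.
    return [f(x) for x in data]

def rdd_flat_map(data, f):
    # Models rdd.flatMap(f): f returns an iterable, and the results are flattened.
    return list(chain.from_iterable(f(x) for x in data))

def rdd_zip(left, right):
    # Models rdd1.zip(rdd2): pair the i-th elements of both data sets.
    if len(left) != len(right):
        raise ValueError("zip needs data sets with the same number of elements")
    return list(zip(left, right))

def rdd_zip_with_index(data):
    # Models rdd.zipWithIndex(): pair each element with its position.
    return [(x, i) for i, x in enumerate(data)]

numbers = [1, 2, 2, 3, 4]
lines = ["a b", "c"]

evens = rdd_filter(numbers, lambda x: x % 2 == 0)
unique = rdd_distinct(numbers)
doubled = rdd_map(numbers, lambda x: x * 2)
words = rdd_flat_map(lines, str.split)
pairs = rdd_zip(unique, ["w", "x", "y", "z"])
indexed = rdd_zip_with_index(lines)
```

In this model `map` keeps the element count (`doubled` has five elements, like `numbers`), while `flatMap` changes it (`words` has three elements for two input lines). A real RDD additionally evaluates these steps lazily and in parallel across partitions, which a list-based sketch does not capture.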