filter(func) returns a new dataset (RDD) formed by selecting those elements of the source for which func returns true.
rdd.filter(lambda x: x % 2 == 0)
[1,2,3,4] → [2,4]
lines = sc.textFile("...", 4)
comments = lines.filter(isComment)
# where isComment is a function that returns a boolean
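
The notes do not show isComment itself, so here is one possible sketch of such a predicate. It assumes a "comment" is any line whose first non-blank character is '#'; that definition and the sample lines are illustrative assumptions, not part of the original. To stay runnable without a Spark cluster, the predicate is tried out with Python's built-in filter on a plain list; the same function could be passed unchanged to lines.filter(...).

```python
def isComment(line):
    """Hypothetical predicate: True if the line, ignoring leading
    whitespace, starts with '#'. The real definition may differ."""
    return line.lstrip().startswith("#")


# Stand-in for the contents of the text file (made-up sample data).
sample = ["# setup", "x = 1", "   # indented comment", "print(x)"]

# Built-in filter keeps the elements for which the predicate is true,
# which mirrors what rdd.filter(isComment) selects.
comments = list(filter(isComment, sample))
print(comments)  # ['# setup', '   # indented comment']

# With an RDD the equivalent would be:
#   comments = lines.filter(isComment)   # lazy transformation
#   comments.collect()                   # action that triggers the work
```

Note that on an RDD, filter is a transformation and is evaluated lazily: nothing is computed until an action such as collect() or count() is called on the result.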