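Dataset is an interface to the Spark engine, added in Spark 1.6, that provides the benefits of RDDs (strong typing and the ability to use lambda functions) together with the benefits of Spark SQL's optimized execution engine.
When a SQL query is run against the Spark Thrift Server, the Dataset interface is used in the background.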
A Dataset is a strongly typed collection of domain-specific objects.
Each Dataset also has an untyped view called a DataFrame, which is a Dataset of Row. A DataFrame is therefore just a Dataset whose element type is Row.
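The relationship between the typed and untyped views can be sketched as follows. This is a minimal illustration, not a reference implementation: the Person case class, its fields, the object name, and the local master setting are all assumptions made for the example.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical domain class; the fields are assumptions made for this sketch.
case class Person(name: String, age: Long)

object DataFrameViewSketch {
  // Returns the untyped view of a small, hard-coded Dataset[Person].
  def untypedView(spark: SparkSession): DataFrame = {
    import spark.implicits._ // encoders for Person
    val people: Dataset[Person] = Seq(Person("Ann", 34L), Person("Bob", 19L)).toDS()
    people.toDF() // DataFrame is a type alias for Dataset[Row]
  }

  // Converts the untyped view back to the typed one; column names must match the fields.
  def typedView(spark: SparkSession, df: DataFrame): Dataset[Person] = {
    import spark.implicits._
    df.as[Person]
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder().master("local[*]").appName("dataframe-view-sketch").getOrCreate()
    val df: DataFrame = untypedView(spark)
    val rows: Dataset[Row] = df // compiles because DataFrame = Dataset[Row]
    rows.select("name").show()  // untyped: the column is resolved by name at run time
    typedView(spark, df).filter(_.age > 21).show() // typed: the field is checked at compile time
    spark.stop()
  }
}
```

In the untyped view a misspelled column name only fails when the query is analyzed at run time, while in the typed view a misspelled field fails to compile.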
A Dataset can be constructed from JVM objects and then manipulated using functional transformations (map, flatMap, filter, etc.).
val people = spark.read.parquet("...").as[Person]
Dataset<Person> people = spark.read().parquet("...").as(Encoders.bean(Person.class));
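Besides reading from storage, a Dataset can be built directly from in-memory JVM objects and then transformed with filter, map, and flatMap. The sketch below illustrates this under stated assumptions: the Person case class, the helper name adultNameParts, the sample data, and the local master setting are hypothetical and not taken from the original text.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical domain class; the fields are assumptions made for this sketch.
case class Person(name: String, age: Long)

object DatasetTransformSketch {
  // Builds a Dataset from JVM objects and applies typed functional transformations.
  def adultNameParts(spark: SparkSession, people: Seq[Person]): Dataset[String] = {
    import spark.implicits._ // brings the encoders for Person and String into scope
    val ds: Dataset[Person] = spark.createDataset(people)
    ds.filter(_.age >= 18)      // keep adults only
      .map(_.name.toUpperCase)  // Dataset[Person] => Dataset[String]
      .flatMap(_.split(" "))    // one element per name part
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder().master("local[*]").appName("dataset-transform-sketch").getOrCreate()
    val input = Seq(Person("Ann Lee", 34L), Person("Bob Ray", 12L))
    adultNameParts(spark, input).show()
    spark.stop()
  }
}
```

The transformations are lazy: nothing is computed until an action such as show() or collect() is called on the result.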