Spark - HDFS
If you plan to read and write from HDFS using Spark, there are two Hadoop configuration files that should be included on Spark's classpath:
- hdfs-site.xml, which provides default behaviors for the HDFS client.
- core-site.xml, which sets the default filesystem name.
To make these files visible to Spark, set HADOOP_CONF_DIR in $SPARK_HOME/conf/spark-env.sh to a directory that contains both files. Spark reads spark-env.sh on startup, so the setting applies to subsequent spark-submit and spark-shell invocations.
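A minimal sketch of what this looks like in practice; the path /etc/hadoop/conf is an assumption (it is a common default on Hadoop distributions, but your cluster may place the files elsewhere):

```shell
# $SPARK_HOME/conf/spark-env.sh
# Point Spark at the directory holding hdfs-site.xml and core-site.xml.
# /etc/hadoop/conf is a hypothetical location -- adjust for your cluster.
export HADOOP_CONF_DIR=/etc/hadoop/conf
```

With this in place, Spark picks up fs.defaultFS from core-site.xml, so unqualified paths (e.g. in spark.read calls) resolve against HDFS rather than the local filesystem.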