ML - Sparkling Water (H2O inside Spark)

About

Sparkling Water provides H2O's fast, scalable machine learning engine inside a Spark cluster.

Sparkling Water is distributed as a Spark application library which can be used by any Spark application.

Demo

Management

Installation

Docker

See https://github.com/h2oai/sparkling-water/tree/master/docker

Local Installation

unzip sparkling-water-2.2.17.zip
setx PATH "C:\sparkling-water-2.2.17\bin;%PATH%"
setx SPARK_HOME /pathToSpark
setx MASTER "local[*]"

Artifacts (Jar File)

For more, see https://github.com/h2oai/sparkling-water#maven
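
As an illustration of pulling the library into a Spark application build, the fragment below is a minimal sbt sketch. The coordinates (group `ai.h2o`, artifact `sparkling-water-core`) and the Scala 2.11 build are assumptions based on how the 2.2.x line was published, and the version simply mirrors the zip used above; check the linked page for the exact values before relying on them.

```scala
// build.sbt (sketch; coordinates and versions are assumptions, verify against the linked page)
scalaVersion := "2.11.12"

// %% appends the Scala binary version, giving sparkling-water-core_2.11
libraryDependencies += "ai.h2o" %% "sparkling-water-core" % "2.2.17"
```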

Shell

The Sparkling Shell wraps a regular Spark shell and appends the Sparkling Water library to the classpath via the --jars option. It supports creating an H2O cloud and running H2O algorithms.

sparkling-shell --conf "spark.executor.memory=1g"
import org.apache.spark.h2o._
val hc = H2OContext.getOrCreate(spark)
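
Once the H2O context exists, a Spark DataFrame can be handed over to H2O. The following is a minimal sketch meant to be pasted into the Sparkling Shell, where `spark` is the session the shell provides. The toy data and the names `df` and `frame` are hypothetical, and `asH2OFrame` is assumed to be available on the context as in the 2.x releases.

```scala
import org.apache.spark.h2o._

// `spark` is the SparkSession created by the shell; getOrCreate reuses
// an already running H2O context if there is one.
val hc = H2OContext.getOrCreate(spark)

// Hypothetical toy data, for illustration only.
import spark.implicits._
val df = Seq((1.0, 2.0), (2.0, 4.1), (3.0, 6.2)).toDF("x", "y")

// Publish the Spark DataFrame as an H2O frame so H2O algorithms can use it.
val frame = hc.asH2OFrame(df)

// Inspect the shape of the resulting frame.
println(s"rows=${frame.numRows()}, cols=${frame.numCols()}")
```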

Documentation / Reference