About
The Spark SQL server (Spark Thrift server) corresponds to the HiveServer2 of Hive 1.2.1. It is a Thrift JDBC/ODBC server.
Version
- beeline from Spark or Hive 1.2.1
- Hive 1.2.1
Configuration
High availability
There is no service discovery yet (SPARK-19541).
Therefore, a load balancer must be placed in front of the two Thrift servers.
Management
Start
Linux
To start the JDBC/ODBC server, run the following in the Spark directory:
./sbin/start-thriftserver.sh
# From Hortonworks
./sbin/start-thriftserver.sh --master yarn-client --executor-memory 512m --hiveconf hive.server2.thrift.port=10015
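The start command and the port option can be wrapped in a small function. This is only a sketch: the function name `start_thrift` and the `DRY_RUN` switch are hypothetical helpers, and `SPARK_HOME` is assumed to point to the Spark directory.

```shell
# Hypothetical wrapper around start-thriftserver.sh.
# $1       : Thrift port (falls back to the default 10000)
# DRY_RUN  : when set, only print the command instead of running it
start_thrift() {
  port="${1:-10000}"
  script="${SPARK_HOME:-.}/sbin/start-thriftserver.sh"
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s --hiveconf hive.server2.thrift.port=%s\n' "$script" "$port"
  else
    "$script" --hiveconf "hive.server2.thrift.port=${port}"
  fi
}
```

With `DRY_RUN=1`, calling `start_thrift 10015` prints the command line, which may help to check the port before really starting the server.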
Windows
cd %SPARK_HOME%\bin
spark-class2 org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal
Connection
Port
The port can be configured with the following conf parameter: --hiveconf hive.server2.thrift.port=10001
The startup output also shows the port (default: 10000):
18/07/18 16:36:05 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
18/07/18 16:36:05 INFO ObjectStore: Initialized ObjectStore
18/07/18 16:36:05 INFO HiveMetaStore: 0: get_databases: default
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=get_databases: default
18/07/18 16:36:05 INFO HiveMetaStore: 0: Shutting down the object store...
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=Shutting down the object store...
18/07/18 16:36:05 INFO HiveMetaStore: 0: Metastore shutdown complete.
18/07/18 16:36:05 INFO audit: ugi=gerard ip=unknown-ip-addr cmd=Metastore shutdown complete.
18/07/18 16:36:05 INFO AbstractService: Service:ThriftBinaryCLIService is started.
18/07/18 16:36:05 INFO AbstractService: Service:HiveServer2 is started.
18/07/18 16:36:05 INFO HiveThriftServer2: HiveThriftServer2 started
18/07/18 16:36:05 INFO ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
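The port can also be extracted from this output with a script. The sketch below assumes that the log line keeps the wording shown above ("Starting ThriftBinaryCLIService on port ..."); the function name `thrift_port_from_log` is hypothetical, and the log file location depends on your installation.

```shell
# Hypothetical helper: print the Thrift port found in the startup log.
# Reads the file(s) given as arguments, or stdin when none is given.
# If the server was started several times, the last match wins.
thrift_port_from_log() {
  sed -n 's/.*Starting ThriftBinaryCLIService on port \([0-9][0-9]*\).*/\1/p' "$@" | tail -n 1
}
```

It can be fed with a log file (`thrift_port_from_log /path/to/thriftserver.out`) or through a pipe. Nothing is printed when no matching line is found.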
Driver UI
http://172.23.0.1:4040/jobs/ (default)
On HDInsight, the driver UI is reached through the YARN UI.
Headnode
The service for connecting to Spark SQL (Thrift/JDBC) is a Spark Thrift server running on the head nodes (example on Azure: port 10002, protocol Thrift).
Azure HDInsight
The connection string is the same as for Hive, but instead of httpPath=/hive2 it contains httpPath=/sparkhive2
- Gateway: jdbc:hive2://clustername.azurehdinsight.net:443/;ssl=true;transportMode=http;httpPath=/sparkhive2
- HeadNode: jdbc:hive2://headnodehost:10002/;transportMode=http
Example with beeline
beeline -u 'jdbc:hive2://headnodehost:10002/;transportMode=http'
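The two connection strings differ only in host, port, and HTTP path, so they can be built from a cluster name. A minimal sketch, where the function names `hdi_gateway_url` and `hdi_headnode_url` are hypothetical and the URL patterns are the ones listed above:

```shell
# Hypothetical helpers that build the HDInsight JDBC URLs shown above.

# $1: cluster name (the part before .azurehdinsight.net)
hdi_gateway_url() {
  printf 'jdbc:hive2://%s.azurehdinsight.net:443/;ssl=true;transportMode=http;httpPath=/sparkhive2\n' "$1"
}

# $1: head node host name (defaults to headnodehost)
hdi_headnode_url() {
  printf 'jdbc:hive2://%s:10002/;transportMode=http\n' "${1:-headnodehost}"
}
```

The result can be passed to beeline, for instance `beeline -u "$(hdi_headnode_url)"`. The gateway URL will most likely also require the cluster login, which is not covered here.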
Beeline
beeline
!connect jdbc:hive2://localhost:10000 nico ""
SET;
SHOW TABLES;
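The same session can be run without the interactive prompt by using the beeline options `-u` (URL), `-n` (user name), and `-e` (statement). The wrapper below is a sketch: the function name `run_sql` is hypothetical, and it assumes, as in the session above, that an empty password is accepted.

```shell
# Hypothetical helper: run one SQL statement through beeline and exit.
# $1: JDBC URL, $2: user name, $3: SQL statement
run_sql() {
  beeline -u "$1" -n "$2" -e "$3"
}
```

For instance, `run_sql jdbc:hive2://localhost:10000 nico 'SHOW TABLES;'` should give the same result as the `SHOW TABLES;` statement above.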