Steps

# All Hadoop technologies are installed under the /hadoop folder
cd /hadoop

# untar the archive
tar -xvf apache-hive-2.3.2-bin.tar.gz

# Rename the folder so it can be selected by typing a single letter when navigating the file system tree
mv apache-hive-2.3.2-bin/ hive-2.3.2
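
A common follow-up, sketched here under the assumption that the folder layout is exactly the one produced by the steps above, is to point HIVE_HOME at the renamed folder and put its bin directory on the PATH. The check at the end only reports whether the launcher script is where it is expected; adjust the path if your layout differs.

```shell
# Assumption: Hive was unpacked and renamed to /hadoop/hive-2.3.2 as in the steps above
export HIVE_HOME=/hadoop/hive-2.3.2
export PATH="$PATH:$HIVE_HOME/bin"

# Sanity check: the hive launcher script should sit in the bin directory
if [ -x "$HIVE_HOME/bin/hive" ]; then
  echo "Hive found in $HIVE_HOME"
else
  echo "Hive not found in $HIVE_HOME" >&2
fi
```

These exports only last for the current shell session; adding them to a shell profile would make them permanent.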

Sequence of API calls involved in making the first query

In HiveServer2:

  • The client creates a HiveConnection by initiating a transport connection (e.g., a TCP connection) followed by an OpenSession API call to get a SessionHandle. The session is created on the server side.
  • The HiveStatement is executed (following JDBC standards) and an ExecuteStatement API call is made from the Thrift client. In the API call, SessionHandle information is passed to the server along with the query information.
  • The HS2 server receives the request and asks the driver (which is a CommandProcessor) to parse and compile the query. The driver kicks off a background job that will talk to Hadoop and then immediately returns a response to the client. This makes the ExecuteStatement API asynchronous. The response contains an OperationHandle created on the server side.
  • The client then uses the OperationHandle to poll HS2 for the status of the query execution.
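
The submit-then-poll pattern described above can be illustrated with a toy shell simulation. This is not Hive code: the functions open_session, execute_statement and get_operation_status are hypothetical stand-ins named after the Thrift calls, a temporary directory plays the role of server-side state, and a one-second sleep stands in for the Hadoop job.

```shell
#!/usr/bin/env bash
# Toy simulation of the asynchronous flow; none of this calls Hive.
# All function names are hypothetical stand-ins for the Thrift calls.

state_dir=$(mktemp -d)   # plays the role of server-side state

# OpenSession: the "server" creates the session and returns a handle.
open_session() {
  local session="session-$$"
  mkdir -p "$state_dir/$session"
  echo "$session"
}

# ExecuteStatement: start the work in the background and return an
# operation handle immediately, without waiting for the result.
execute_statement() {
  local session=$1 query=$2
  local op="$state_dir/$session/op-1"
  echo RUNNING > "$op.status"
  (
    sleep 1                                  # stands in for the Hadoop job
    echo "result of: $query" > "$op.result"  # write the result first,
    echo FINISHED > "$op.status"             # then flip the status
  ) >/dev/null 2>&1 &
  echo "$op"
}

# GetOperationStatus: the client polls using the operation handle.
get_operation_status() {
  cat "$1.status"
}

session=$(open_session)
op=$(execute_statement "$session" "SELECT 1")
first_status=$(get_operation_status "$op")   # read right after submitting

# Poll until the background job reports completion
while [ "$(get_operation_status "$op")" != "FINISHED" ]; do
  sleep 1
done

result=$(cat "$op.result")
echo "first status: $first_status"
echo "$result"
rm -rf "$state_dir"
```

The point of the sketch is that execute_statement returns before the work is done, so the first status the client sees is RUNNING, and the result only becomes available after polling.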