Hive - WebHCat (REST API for HCatalog)

1 - About

WebHCat (or Templeton) is a REST API for HCatalog.

WebHCat provides a service that you can use to run Hadoop MapReduce (or YARN), Pig, and Hive jobs, or to perform Hive metadata operations, through an HTTP (REST-style) interface.

WebHCat is a REST interface for remote job execution, such as MapReduce, Pig, and Hive jobs.

WebHCat translates the job submission requests into YARN applications, and returns a status derived from the YARN application status.
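As a rough illustration of the submit-then-poll flow described above, the sketch below builds (but does not send) a Hive job submission request and the URL used to poll that job's status. The `templeton/v1/hive` and `templeton/v1/jobs/<id>` paths and the `execute`, `statusdir` and `user.name` parameters are taken from the Apache WebHCat reference. The host name, user, query, status directory and job id are hypothetical placeholders, not values from this page.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_hive_job_request(base_url: str, user: str, query: str, status_dir: str) -> Request:
    """Build a POST request for WebHCat's Hive job endpoint without sending it."""
    # Job parameters travel as form data; the user is passed in the query string.
    form = {"execute": query, "statusdir": status_dir}
    url = f"{base_url.rstrip('/')}/templeton/v1/hive?" + urlencode({"user.name": user})
    return Request(url=url, data=urlencode(form).encode("utf-8"), method="POST")


def job_status_url(base_url: str, job_id: str, user: str) -> str:
    """Return the URL to poll for the status of a previously submitted job."""
    return f"{base_url.rstrip('/')}/templeton/v1/jobs/{job_id}?" + urlencode({"user.name": user})


# Hypothetical values, for illustration only.
BASE = "http://headnode.example.com:30111"
request = build_hive_job_request(BASE, "admin", "SHOW TABLES;", "/tmp/webhcat-out")
poll_url = job_status_url(BASE, "job_123", "admin")
```

Sending `request` with `urllib.request.urlopen` would, under these assumptions, return a JSON body containing the job id, which is then used with `job_status_url`.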

3 - Management

3.1 - DDL

3.2 - Check

  • On an Azure HDInsight cluster:

curl -u admin:{HTTP PASSWD} https://{CLUSTERNAME}
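A health check like this can also be scripted. The sketch below assumes the check targets WebHCat's status endpoint (`templeton/v1/status` in the Apache WebHCat reference), which is expected to answer with a small JSON document such as `{"status":"ok","version":"v1"}`. The base URL is a hypothetical placeholder and the helper names are illustrative. Only the parsing step is shown, so the example runs without a cluster.

```python
import json
from urllib.parse import urlencode


def status_url(base_url: str, user: str) -> str:
    """Return the URL of the WebHCat status endpoint (path assumed from the Apache reference)."""
    return f"{base_url.rstrip('/')}/templeton/v1/status?" + urlencode({"user.name": user})


def is_webhcat_ok(body: str) -> bool:
    """Return True only if the response body is JSON reporting status 'ok'."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A gateway error page (for example an HTML 502 page) is not valid JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


# Hypothetical base URL, for illustration only.
url = status_url("https://cluster.example.net", "admin")
healthy = is_webhcat_ok('{"status":"ok","version":"v1"}')
unhealthy = is_webhcat_ok("<html><body>502 Bad Gateway</body></html>")
```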

3.3 - Log

The logs are in the /var/log/webhcat directory:

  • webhcat.log is the log4j log to which the server writes. It is rolled over daily, generating files named webhcat.log.YYYY-MM-DD.
  • webhcat-console.log is the stdout of the server process, captured when it is started
  • webhcat-console-error.log is the stderr of the server process
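To find the rolled-over log for a given day, the naming pattern above can be applied directly. This is a minimal sketch: `rolled_log_path` is an illustrative helper, not part of WebHCat, and it only computes the expected path without checking that the file exists.

```python
from datetime import date
from pathlib import PurePosixPath

# Directory and file-name pattern as described in the list above.
LOG_DIR = PurePosixPath("/var/log/webhcat")


def rolled_log_path(day: date) -> PurePosixPath:
    """Return the expected path of the rolled-over webhcat.log for the given day."""
    # date.isoformat() yields YYYY-MM-DD, matching the rollover suffix.
    return LOG_DIR / f"webhcat.log.{day.isoformat()}"


example = rolled_log_path(date(2024, 3, 5))
```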

3.4 - Connection

To list the network connections to and from WebHCat:

netstat | grep 30111

30111 is the port WebHCat listens on. The number of open sockets should be less than 10.
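The socket count can be checked programmatically rather than by eye. The sketch below parses text shaped like Linux `netstat` output and counts TCP lines whose local or foreign address ends in the WebHCat port. The sample output is fabricated for illustration, and the threshold of 10 comes from the guideline above.

```python
MAX_EXPECTED_SOCKETS = 10  # guideline from the text above


def count_port_sockets(netstat_output: str, port: int = 30111) -> int:
    """Count TCP sockets whose local or foreign address uses the given port."""
    suffix = f":{port}"
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP lines have the form: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) >= 5 and fields[0].startswith("tcp"):
            if fields[3].endswith(suffix) or fields[4].endswith(suffix):
                count += 1
    return count


# Fabricated sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:30111          10.0.0.9:51234          ESTABLISHED
tcp        0      0 10.0.0.5:22             10.0.0.9:30111          ESTABLISHED
tcp        0      0 10.0.0.5:8080           10.0.0.9:40000          TIME_WAIT
"""

open_sockets = count_port_sockets(SAMPLE)
within_guideline = open_sockets < MAX_EXPECTED_SOCKETS
```

On a real node, the input could come from `subprocess.run(["netstat", "-n"], capture_output=True, text=True).stdout`. Matching on the address suffix avoids false positives from lines that merely contain the digits 30111 elsewhere, which a plain `grep` would also match.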

4 - Support

4.1 - BadGateway (502 status code)

This is a generic message from gateway nodes.

5 - Documentation / Reference
