Hive - WebHCat (REST API for HCatalog)

About

WebHCat (formerly Templeton) is a REST API for HCatalog.

WebHCat provides a service for remote job execution: through an HTTP (REST-style) interface you can run Hadoop MapReduce (or YARN), Pig, and Hive jobs, or perform Hive metadata operations.

WebHCat translates the job submission requests into YARN applications, and returns a status derived from the YARN application status.
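
As a minimal sketch of what a job submission could look like, the Python snippet below builds (without sending) a POST request to the hive resource of the WebHCat API. The host name, user, query and status directory are hypothetical values chosen for illustration; the execute and statusdir form fields and the user.name query parameter are the ones described in the WebHCat reference.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_hive_job_request(base_url, user, query, status_dir):
    """Build (but do not send) a POST request that submits a Hive query."""
    # user.name travels in the query string, as in the curl status check.
    query_string = urlencode({"user.name": user})
    # Form fields of the WebHCat "hive" resource.
    form = urlencode({"execute": query, "statusdir": status_dir}).encode("utf-8")
    url = f"{base_url}/templeton/v1/hive?{query_string}"
    return Request(url, data=form, method="POST")


# Hypothetical host and values, for illustration only.
req = build_hive_job_request(
    "http://headnode.example.com:30111", "admin", "show tables;", "/tmp/webhcat-out"
)
print(req.get_method(), req.full_url)
```

Sending the request (for example with urllib.request.urlopen) should return a JSON document containing the job id, which can then be polled through the jobs resource.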

Management

DDL

WebHCat DDL Resources
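
The DDL resources live under the ddl/database path. The sketch below only builds the GET URLs for listing databases, describing a database, or describing a table; the base URL and the names passed in are hypothetical, and ddl_url is a helper invented for this example, not part of WebHCat.

```python
from urllib.parse import quote, urlencode


def ddl_url(base_url, user, database=None, table=None):
    """Return the GET URL for a WebHCat DDL resource.

    No database: list databases. Database only: describe the database.
    Database and table: describe the table.
    """
    path = "/templeton/v1/ddl/database"
    if database is not None:
        path += "/" + quote(database, safe="")
        if table is not None:
            path += "/table/" + quote(table, safe="")
    query_string = urlencode({"user.name": user})
    return f"{base_url}{path}?{query_string}"


# Hypothetical host, database and table names.
base = "http://headnode.example.com:30111"
print(ddl_url(base, "admin"))
print(ddl_url(base, "admin", "default", "my_table"))
```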

Check

  • On an Azure HDInsight cluster, query the status resource (the URL is quoted so that the shell does not interpret the "?"):
curl -u "admin:{HTTP PASSWD}" "https://{CLUSTERNAME}.azurehdinsight.net/templeton/v1/status?user.name=admin"
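
The same check can be scripted. The sketch below builds the status URL for a cluster and interprets a response body; the cluster name is a placeholder, and the sample body is the one the WebHCat reference shows for a healthy server. The helper names status_url and is_healthy are invented for this example.

```python
import json
from urllib.parse import urlencode


def status_url(cluster_name, user="admin"):
    """Build the WebHCat status URL of an HDInsight cluster."""
    query_string = urlencode({"user.name": user})
    return f"https://{cluster_name}.azurehdinsight.net/templeton/v1/status?{query_string}"


def is_healthy(body):
    """Return True when the status response reports "ok"."""
    return json.loads(body).get("status") == "ok"


# "mycluster" is a placeholder cluster name.
print(status_url("mycluster"))
print(is_healthy('{"status": "ok", "version": "v1"}'))
```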

Log

The log files are in the /var/log/webhcat directory:

  • webhcat.log is the log4j log that the server writes to. It is rolled over daily, generating files named webhcat.log.YYYY-MM-DD.
  • webhcat-console.log is the stdout of the server process, captured when it is started.
  • webhcat-console-error.log is the stderr of the server process.
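
When looking for the log of a given day, the date can be read from the rolled-over file name. This is a small sketch that assumes only the webhcat.log.YYYY-MM-DD naming pattern described above; rolled_log_date is a helper invented for this example.

```python
from datetime import datetime


def rolled_log_date(filename):
    """Return the date of a rolled-over webhcat.log file, or None."""
    prefix = "webhcat.log."
    if not filename.startswith(prefix):
        return None
    try:
        # The suffix after the prefix is expected to be YYYY-MM-DD.
        return datetime.strptime(filename[len(prefix):], "%Y-%m-%d").date()
    except ValueError:
        return None


# Example file names (the dates are made up).
names = ["webhcat.log", "webhcat.log.2017-03-05", "webhcat-console.log"]
rolled = [name for name in names if rolled_log_date(name) is not None]
print(rolled)
```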

Connection

To list the network connections to and from WebHCat:

netstat | grep 30111

30111 is the port WebHCat listens on. The number of open sockets should be less than 10.
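
To automate this check, the netstat output can be counted in a script. The sketch below works on a captured output string; the sample lines and host names are made up for illustration, and the threshold of 10 is the one stated above.

```python
def count_port_lines(netstat_output, port=30111):
    """Count netstat lines where an address column ends with ":<port>"."""
    needle = f":{port}"
    count = 0
    for line in netstat_output.splitlines():
        # Address columns look like "host:port"; match either side.
        if any(field.endswith(needle) for field in line.split()):
            count += 1
    return count


# Made-up sample of netstat output.
sample = """\
tcp        0      0 hn0:30111    10.0.0.5:51234   ESTABLISHED
tcp        0      0 hn0:30111    10.0.0.6:40022   TIME_WAIT
tcp        0      0 hn0:ssh      10.0.0.7:55110   ESTABLISHED
"""
open_sockets = count_port_lines(sample)
print(open_sockets, "OK" if open_sockets < 10 else "too many open sockets")
```

On a live node, the string could come from subprocess.run(["netstat"], capture_output=True, text=True).stdout instead of the sample.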

Support

BadGateway (502 status code)

This is a generic message from gateway nodes.
