Hive - Metastore

Card Puncher Data Processing

About

All the metadata for Hive tables and partitions are accessed through the Hive Metastore.

Derby is the default database for the Hive metastore.

Management

schematool

See Hive - Schematool

Import / Export

Query

The metadata are stored in a apart database

use metastore; 
show tables; 
-- tables
select * from TBLS;

ER diagram

HiveMetaStore

Hive Metastore Er Diagram

Database

Metadata is persisted using JPOX ORM solution (Data Nucleus) so any database that is supported by it can be used by Hive.

Supported database

Derby

Apache Derby (JavaDB)

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://{file_location}/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  <!-- See also: org.apache.derby.jdbc.ClientDriver -->
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>hive.metastore.warehouse.dir</name>
  <!-- in local mode -->
  <value>file://${user.dir}/../build/ql/test/data/warehouse</value>
  <description>Data goes in here on your local filesystem.</description>
</property>

where: Hive - Warehouse

Derby is generally used as an local metastore but it can be use as a remote one. See HiveDerbyServerMode

MySQL

  • Create a hive user
create database hive;
create user 'hiveuser'@'%' IDENTIFIED BY 'hivepass';
GRANT all on *.* to 'hiveuser'@localhost identified by 'hivepass';
flush privileges;
  • hive-site.xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://{mysql_host}:${mysql_port}/{metastore_db}?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>{username}</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>{password}</value>
  <description>password to use against metastore database</description>
</property>

SQL Server

  • Create a hive user
CREATE DATABASE Hive;

CREATE LOGIN metastore WITH
		PASSWORD = 'pwd',
		DEFAULT_DATABASE = Hive;
 
 
USE Hive;
DROP USER metastore;
DROP schema metastore;
CREATE USER metastore FOR LOGIN metastore;
CREATE SCHEMA metastore AUTHORIZATION metastore;
GRANT CONNECT TO metastore;
GRANT CREATE TABLE TO metastore;
GRANT CREATE VIEW TO metastore;
ALTER USER metastore WITH DEFAULT_SCHEMA = metastore;
  • hive-site.xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:sqlserver://{host}[:{port|1433}][;databaseName={database}]</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.microsoft.sqlserver.jdbc.SQLServerDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>{username}</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>{password}</value>
  <description>password to use against metastore database</description>
</property>

Documentation / Reference





Discover More
Hive System Architecture
Hive - Architecture

Example of System architecture with Map/Reduce as Engine. where: (for UI) Job Tracker: ...
Card Puncher Data Processing
Hive - Configuration (Variable)

Configuration variable in Hive Hive follows the Hadoop configuration guideline. metastore configuration. See Metastore...
Card Puncher Data Processing
Hive - High Availability

The Hive metastore is stateless and thus there can be multiple instances to achieve High Availability. Using hive.metastore.uris it is possible to specify multiple remote metastores. Hive will use the...
Card Puncher Data Processing
Hive - Mode (Type of Metastore)

The mode of Hive refers to the type of metastore database: is an embedded database and is a remote database Running in local mode, means that Hive will use an embedded database (ie Derby)....
Card Puncher Data Processing
Hive - Schematool

Schematool is a offline command line tool to manage the metastore. This tool can be used to initialize the metastore schema for the current Hive version. Location: HOME/bin Hortonworks (/usr/hdp/current/hive-client/bin/hive...
Card Puncher Data Processing
Hive - Table-Level Statistics (Table/Partition/Column)

Statistics on the data of a table. Statistics are stored in the Hive Metastore Table-level Statistics (HIVE-1361)...
Card Puncher Data Processing
Hive - Warehouse

Hive metastore warehouse is the root location of the internal storage of data for Hive. The data of tables: created without file system location (ie internal table) loaded are saved in this directory....
Card Puncher Data Processing
Hive - hive-site.xml

The hive-site.xml is the global hive configuration file The file hive-default.xml.template contains the default values. with Hadoop: %HADOOP_CONF_DIR%\hive-site.xml or %HADOOP_HOME%\conf\hive-site.xml...
Idea Classpath Spark
Spark - Local Installation

A local installation is a spark installation on a single machine (generally a dev machine). The local master connection will start for you a local standalone spark installation on your machine. This...
Card Puncher Data Processing
Spark - Metastore

The Spark Metastore is based generally on Conf key Value Desc spark.sql.hive.caseSensitiveInferenceMode INFER_AND_SAVE Sets the action to take when a case-sensitive schema cannot be read...



Share this page:
Follow us:
Task Runner