Hive - Schematool

Card Puncher Data Processing

About

Schematool is a offline command line tool to manage the metastore.

This tool can be used to initialize the metastore schema for the current Hive version.

Syntax

Location:

  • HOME/bin
  • Hortonworks (/usr/hdp/current/hive-client/bin/hive or /usr/hdp/2.6.2.3-1/hive/bin)
usage: schemaTool
 -dbOpts <databaseOpts>             Backend DB specific options
 -dbType <databaseType>             Metastore database type
 -dryRun                            list SQL scripts (no execute)
 -help                              print this message
 -info                              Show config and schema details
 -initSchema                        Schema initialization
 -initSchemaTo <initTo>             Schema initialization to a version
 -passWord <password>               Override config file password
 -servers <serverList>              a comma-separated list of servers used
                                    in location validation
 -upgradeSchema                     Schema upgrade
 -upgradeSchemaFrom <upgradeFrom>   Schema upgrade from a version
 -userName <user>                   Override config file user name
 -validate                          Validate the database
 -verbose                           only print SQL statements

Management

Configuration

<property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:sqlserver://hisqldb.database.windows.net:1433;database=clus01</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
</property>
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.microsoft.sqlserver.jdbc.SQLServerDriver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mine</value>
    <description>password to use against metastore database</description>
</property>
 <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hi_hive</value>
    <description>Username to use against metastore database</description>
</property>

info

./schematool -dbType mssql -info 
Metastore connection URL:        jdbc:sqlserver://hisqldb.database.secure.windows.net;database=clus01;encrypt=true;trustServerCertificate=true;create=false;loginTimeout=300
Metastore Connection Driver :    com.microsoft.sqlserver.jdbc.SQLServerDriver
Metastore connection User:       hi_hive
Hive distribution version:       1.2.1000
Metastore schema version:        2.3.0
schemaTool completed

init

./schematool -dbType mssql -initSchema
Metastore connection URL:        jdbc:sqlserver://mydb.database.windows.net:1433;database=clus01
Metastore Connection Driver :    com.microsoft.sqlserver.jdbc.SQLServerDriver
Metastore connection User:       userName
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mssql.sql
Initialization script completed
schemaTool completed

Table list

for SQL Server:

SELECT
  tbl.name AS tbl_name,
  sch.name AS sch_name
FROM sys.tables tbl,
  sys.schemas sch
WHERE
  tbl.schema_id = sch.schema_id AND
  sch.name = 'hi_hive'
ORDER BY
  tbl.name;

Local Installation on Windows with Docker

Console used: cmder.

  • Download and unzip Hadoop distribution
  • Download and unzip Hive Distribution
  • File Structure Example
/d/hadoop/hadoop-2.8.1
d/hadoop/apache-hive-2.3.2-bin

  • Start a docker image with Java
SOURCE_MOUNT_POINT=$(cygpath -u $(readlink -f .))
docker run \
  -it \
  --rm \
  --mount type=bind,source=/$SOURCE_MOUNT_POINT,target=/wd \
  openjdk
  • Then
export HADOOP_HOME=/wd/hadoop-2.8.1/
cd /wd/apache-hive-2.3.2-bin/bin
./schematool

Support

Failed to load driver

org.apache.hadoop.hive.metastore.HiveMetaException: Failed to load driver
Underlying cause: java.lang.ClassNotFoundException : com.microsoft.jdbc.sqlserver.SQLServerDriver

It may be a connection problem. Verify your connection informations.

Documentation / Reference





Discover More
Hive Metastore Er Diagram
Hive - Metastore

All the metadata for Hive tables and partitions are accessed through the Hive Metastore. Derby is the default database for the Hive metastore. See ...



Share this page:
Follow us:
Task Runner