Hive - Dynamic Partition (DP)

About

DynamicPartitions (DP) is a partition type where the partitioning columns are only known at EXECUTION TIME (ie within a SQL statement)

Hive will create the partitions automatically.

The only difference is that DP columns do not have values, while SP columns do

Conf

Enable

SET hive.exec.dynamic.partition = true -- Needs to be set to true to enable dynamic partition inserts

Mode

  • set hive.exec.dynamic.partition.mode=strict - the user must specify at least one static partition in case the user accidentally overwrites all partitions
  • set hive.exec.dynamic.partition.mode=nonstrict - in nonstrict mode all partitions are allowed to be dynamic - you don't need any static partition column.

Example

Full Dynamic

INSERT OVERWRITE TABLE T PARTITION (ds, hr)
SELECT key, value, ds, hr FROM srcpart WHERE ds is not null and hr>10;

where:

  • the dynamic partition columns (ie ds, hr) must be specified:
    • last (among the columns in the SELECT statement)
    • in the same order (than in the PARTITION clause)

Static and Dynamic

Replace the partition 2010-03-03 (as the partition column order determines directory hierarchy, you cannot set the static value on the second column partition)

INSERT OVERWRITE TABLE T PARTITION (ds='2010-03-03', hr)
SELECT key, value, /*ds,*/ hr FROM srcpart WHERE ds is not null and hr>10;

Create table as (CTAS)

CREATE TABLE T (key int, value string) PARTITIONED BY (ds string, hr int) AS
SELECT key, value, ds, hr+1 hr1 FROM srcpart WHERE ds is not null and hr>10;

Documentation / Reference


Powered by ComboStrap