Hive - Column

About

This page is about the column (see Relation - Column) in the context of Hive.

Statistic

Built-in

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VirtualColumns

As of version 0.8.0, Hive supports two virtual columns:

  • INPUT__FILE__NAME is the input file's name for a mapper task.
select 
  INPUT__FILE__NAME, 
  key, 
  BLOCK__OFFSET__INSIDE__FILE 
from 
  src;
 
select 
  key, 
  count(INPUT__FILE__NAME) 
from 
  src 
group by key 
order by key;
  • BLOCK__OFFSET__INSIDE__FILE is the current global file position. For a block-compressed file, it is the file offset of the current block, that is, the offset of the block's first byte.
select
  * 
from 
  src 
where
  BLOCK__OFFSET__INSIDE__FILE > 12000
order by key;
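
The two virtual columns can also be combined to see how the rows of a table are spread over its underlying files. The queries below are a minimal sketch rather than part of the Hive manual: they reuse the src table from the examples above, and the aliases cnt, first_offset and last_offset are illustrative names only.

```sql
-- Count the rows that each input file contributes to the table.
select
  INPUT__FILE__NAME,
  count(*) as cnt
from
  src
group by INPUT__FILE__NAME;

-- For each input file, show the smallest and largest offset that was read.
select
  INPUT__FILE__NAME,
  min(BLOCK__OFFSET__INSIDE__FILE) as first_offset,
  max(BLOCK__OFFSET__INSIDE__FILE) as last_offset
from
  src
group by INPUT__FILE__NAME;
```

A query of this kind is typically useful for tracing a suspicious row back to the file it came from. What the offsets mean depends on the storage format: for a block-compressed file they identify the block, not the individual row.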
