Hive - Column

1 - About

This page covers the column (see Relation - Column) in the context of Hive.

3 - Statistic

4 - Built-in

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VirtualColumns

Hive 0.8.0 provides support for two virtual columns:

  • INPUT__FILE__NAME is the input file's name for a mapper task.

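-- Show the source file and the block offset next to each row's key.
select 
  INPUT__FILE__NAME, 
  key, 
  BLOCK__OFFSET__INSIDE__FILE 
from 
  src;

-- Virtual columns can also be passed to aggregate functions.
select 
  key, 
  count(INPUT__FILE__NAME) 
from 
  src 
group by key 
order by key;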

  • BLOCK__OFFSET__INSIDE__FILE is the current global file position. For a block-compressed file, it is the file offset of the current block, that is, the offset of the block's first byte.

select
  * 
from 
  src 
where
  BLOCK__OFFSET__INSIDE__FILE > 12000
order by key;
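
A virtual column can also serve as a grouping key, for instance to see how the rows of a table are spread across its underlying files. The query below is a minimal sketch that assumes the same src table as the queries above; the alias row_count is only illustrative.

```sql
-- Count the rows that each underlying file contributes to the table.
select
  INPUT__FILE__NAME,
  count(*) as row_count
from
  src
group by INPUT__FILE__NAME;
```

One result row per file is expected; a file with an unusually low or high count may point to skewed or partially loaded data.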

