About
lag and lead are window functions that provide access to a row at a given offset prior to/after the current row position.
- lag retrieves the value of a column in a previous row.
- lead retrieves the value of a column in a next row.
They provide access to a row at a given offset prior to/after the current row position.
Because the functions provide access to more than one row of a table at the same time without a self-join, they can enhance processing speed.
Syntax
The general syntax (oracle, sqlite, …) is.
lead(expr) over ( ..... )
lead(expr, offset) over ( ..... )
lead(expr, offset, default) over ( ..... )
-- or/and
lag(expr) over ( ..... )
lag(expr, offset) over ( ..... )
lag(expr, offset, default) over ( ..... )
where:
- value_expr is the sql expression that is applied on the row retrieved
- offset is the offset of the row retrieved (non-negative integer, default to 1). Note that offset may be 0 in this case, the expression is evaluated against the current row.
- default (default to NULL) is the default value returned if the row identified by offset does not exist. (ie when the current row is the last/first).
Example
How to retrieve the quantity sold from the previous month?
The example schema SH is used.
select CALENDAR_MONTH_DESC,
sum(quantity_sold),
lag(sum(quantity_sold),1) over (order by CALENDAR_MONTH_DESC) "Previous Month"
from
SH.sales,
SH.TIMES
where
SH.times.TIME_ID = SH.sales.TIME_ID and
SH.times.CALENDAR_YEAR = '2000'
group by CALENDAR_MONTH_DESC
order by CALENDAR_MONTH_DESC ASC
And you get :
CALENDAR_MONTH_DESC | SUM(QUANTITY_SOLD) | Previous Month |
---|---|---|
2000-01 | 44270 | |
2000-02 | 41218 | 44270 |
2000-03 | 38906 | 41218 |
2000-04 | 34962 | 38906 |
2000-05 | 40092 | 34962 |
2000-06 | 35976 | 40092 |
2000-07 | 37068 | 35976 |
2000-08 | 40738 | 37068 |
2000-09 | 40094 | 40738 |
2000-10 | 43084 | 40094 |
2000-11 | 39418 | 43084 |
2000-12 | 29466 | 39418 |
The little problem of this time solution against an ago function for a period-to-period comparison is that you don't have the previous value for the month 2000-01 because the value is not in the returned data set.
How to retrieve the end date from the previous row in a SCD2 table?
Sometimes, you need to update or control the structure of an SCD2 table. For this need, you need to retrieve the date of the previous rows. The lag function gives here a beautiful solution.
SELECT
BUSINESS_KEY AS BUSINESS_KEY
,
START_DATE AS START_DATE
,
END_DATE AS END_DATE
,
LAG(END_DATE,1) over (partition BY BUSINESS_KEY order by END_DATE ASC) + 1/(24*60*
60 ) AS THE_LAG
FROM
TABLE
ORDER BY
BUSINESS_KEY
,
END_DATE;
Result:
BUSINESS_KEY | START_DATE | END_DATE | THE_LAG |
---|---|---|---|
A101 | 01-01-1800 00:00:00 | 02-01-2011 23:59:59 | |
A101 | 03-01-2011 00:00:00 | 31-12-9999 00:00:00 | 03-01-2011 00:00:00 |
A204 | 01-01-1800 00:00:00 | 02-01-2011 23:59:59 | |
A204 | 03-01-2011 00:00:00 | 31-12-9999 00:00:00 | 03-01-2011 00:00:00 |
A308 | 01-01-1800 00:00:00 | 02-01-2011 23:59:59 | |
A308 | 03-01-2011 00:00:00 | 31-12-9999 00:00:00 | 03-01-2011 00:00:00 |
A401 | 01-01-1800 00:00:00 | 02-01-2011 23:59:59 | |
A401 | 03-01-2011 00:00:00 | 31-12-9999 00:00:00 | 03-01-2011 00:00:00 |
A404 | 01-01-1800 00:00:00 | 02-01-2011 23:59:59 | |
A404 | 03-01-2011 00:00:00 | 31-12-9999 00:00:00 | 03-01-2011 00:00:00 |