Step functions are another way of fitting non-linearities (they are especially popular in epidemiology and biostatistics).
Natural cut point
Piecewise constant functions are especially useful when there are natural cut points that you want to use or that are of interest in themselves.
For example: what is the average income for somebody below the age of 35? You can read it straight off the plot. This often makes good material for summaries (in newspapers, reports, …).
Useful way of creating interactions that are easy to interpret.
Example of interaction variable (between Year and Age) in a linear model:
X_1 = I(Year < 2005) · Age
X_2 = I(Year > 2005) · Age
where I(·) is the indicator function, written here as in R: it is 1 when the condition holds and 0 otherwise.
With these two variables, we fit a different linear model as a function of age for the people who worked before 2005 and for those who worked after 2005. We get two different linear functions of age, one for each period. It's an easy way of seeing the effect of an interaction.
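As a minimal sketch of this interaction, the Python snippet below builds the two variables and fits them by ordinary least squares. Everything in it is an assumption made for illustration: the records are invented (and generated without noise), the helper names `indicator`, `interaction_columns`, `solve` and `least_squares` are hypothetical, and the model includes a shared intercept, which the notes do not specify.

```python
def indicator(condition):
    """Indicator function: 1 if the condition is true, 0 otherwise."""
    return 1 if condition else 0

def interaction_columns(year, age):
    """Build X_1 and X_2 with the same strict inequalities as in the text.
    Note: an observation from exactly 2005 gets 0 in both columns."""
    x1 = indicator(year < 2005) * age
    x2 = indicator(year > 2005) * age
    return x1, x2

def solve(a, b):
    """Solve a small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of the column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def least_squares(rows, y):
    """Ordinary least squares through the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical (year, age, wage) records, invented for illustration:
# wage = 10 + 2 * age before 2005 and wage = 10 + 3 * age after 2005.
records = [(2003, 30, 70.0), (2004, 40, 90.0), (2004, 50, 110.0),
           (2006, 30, 100.0), (2007, 40, 130.0), (2008, 50, 160.0)]

rows = [[1, *interaction_columns(year, age)] for year, age, _ in records]
wages = [wage for _, _, wage in records]

# One shared intercept and one age slope per period.
intercept, slope_before, slope_after = least_squares(rows, wages)
```

Because the invented data contain no noise, the fit should recover the two slopes used to generate them, up to floating-point error. Each slope is read directly as the effect of age within its own period.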
With polynomials, a single function covers the whole range of the x variable, so changing a point on the left side can change the fit on the right side. With step functions, a point only affects the fit in its own partition and not in the others.
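The locality of step functions can be checked with a small sketch. It assumes a single cut point at 35 and a handful of invented points. The helper `bin_means` is hypothetical, and it uses the fact that the least-squares fit of a constant within a partition is the mean of the responses in that partition.

```python
from statistics import mean

def bin_means(points, cut):
    """Fitted step heights: the mean response below the cut and from the cut on."""
    left = mean([y for x, y in points if x < cut])
    right = mean([y for x, y in points if x >= cut])
    return left, right

# Hypothetical (x, y) points, invented for illustration.
points = [(20, 1.0), (30, 3.0), (40, 5.0), (50, 7.0)]

# Same data, except that the response of the leftmost point is changed.
perturbed = [(20, 11.0), (30, 3.0), (40, 5.0), (50, 7.0)]

left_before, right_before = bin_means(points, cut=35)
left_after, right_after = bin_means(perturbed, cut=35)

# Only the left step moves; the right step is exactly the same.
left_changed = left_before != left_after
right_unchanged = right_before == right_after
```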
Each sub-range must be seen as a binary variable.
Is x less than 35?
- If yes, you make it 1.
- If not, you make it 0.
You then create a series of dummy variables (zero-one variables), one for each group, and simply fit them with a linear model.
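The dummy-variable construction can be sketched as follows. The cut points (35 and 50), the data and the helper names `dummy_row` and `fit_step_function` are assumptions made for illustration. The sketch fits one dummy per group with no intercept. Since exactly one dummy is 1 for each observation, the least-squares equations decouple and each coefficient is simply the mean response of its group.

```python
from bisect import bisect_right

def dummy_row(x, cuts):
    """One zero-one variable per group; exactly one of them equals 1.
    With cuts [35, 50] the groups are x < 35, 35 <= x < 50 and x >= 50."""
    k = bisect_right(cuts, x)  # index of the group that contains x
    return [1 if j == k else 0 for j in range(len(cuts) + 1)]

def fit_step_function(xs, ys, cuts):
    """Least-squares coefficients of the dummies (no intercept).
    Assumes every group contains at least one observation."""
    rows = [dummy_row(x, cuts) for x in xs]
    n_groups = len(cuts) + 1
    # X'X is diagonal (group counts) and X'y holds the group sums,
    # so each coefficient is a group sum divided by a group count.
    counts = [sum(r[j] for r in rows) for j in range(n_groups)]
    sums = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(n_groups)]
    return [s / c for s, c in zip(sums, counts)]

# Hypothetical ages and incomes, invented for illustration.
ages = [25, 30, 40, 45, 55, 60]
incomes = [30.0, 40.0, 60.0, 70.0, 50.0, 54.0]

coefficients = fit_step_function(ages, incomes, cuts=[35, 50])
```

The first coefficient is the fitted average income for somebody below the age of 35, which is the kind of number read straight off the plot.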