Forecasting through decomposition
TL;DR: In this post I illustrate a very simple forecasting technique with a toy dataset, namely, forecasting through decomposition. I use the seasonal_decompose function in the statsmodels package to do the decomposition. Then I use a naive forecasting technique and calculate the prediction interval by hand.
First of all, why would we want to use decomposition?
A lot of times we are dealing with time series that have underlying patterns which can be interpreted independently of each other. We can tease out these patterns by breaking the series down into several components. Studying and treating the individual components separately can lead to better understanding as well as better forecasting results.
Additive and multiplicative decomposition
FPP2 Chapter 6.1 introduces two simple decomposition models, additive and multiplicative decomposition:
For additive decomposition, we can write
$$y_t = S_t + T_t + R_t,$$
where $y_t$ is the data, $S_t$ is the seasonal component, $T_t$ is the trend-cycle component, and $R_t$ is the remainder component, all at period $t$.
Alternatively, a multiplicative decomposition would be written as
$$y_t = S_t \times T_t \times R_t.$$
The additive decomposition is the most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series. When the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series, then a multiplicative decomposition is more appropriate. Multiplicative decompositions are common with economic time series.
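Note that a multiplicative decomposition is equivalent to an additive decomposition of the logged series, since

$$\log y_t = \log S_t + \log T_t + \log R_t.$$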
We have seen in the previous notebook how to estimate the trend-cycle component using moving averages. We can also estimate the seasonal component by hand (this is the classical decomposition method, as described here); a minimal sketch of that follows below.
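For illustration, here is roughly what that by-hand classical additive decomposition might look like. This is just a sketch, not the statsmodels implementation; the function name is made up for this post, and it assumes an odd seasonal period (such as the weekly period of 7 used below), since even periods would use a centered 2×m moving average instead:

import numpy as np
import pandas as pd

def classical_additive_decompose(series: pd.Series, period: int):
    # Trend-cycle: centered moving average over one full seasonal period
    trend = series.rolling(window=period, center=True).mean()
    # Detrend, then average each position within the cycle to get the
    # seasonal component, and center it around zero
    positions = np.arange(len(series)) % period
    seasonal_means = (series - trend).groupby(positions).mean()
    seasonal_means -= seasonal_means.mean()
    seasonal = pd.Series(seasonal_means.values[positions], index=series.index)
    # The remainder is whatever trend and seasonality don't explain
    remainder = series - trend - seasonal
    return trend, seasonal, remainder

Once we have loaded the daily series below, this could be called as classical_additive_decompose(daily['total'], 7).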
But we can also use a ready-implemented function to do a more sophisticated version of that. In Python, the seasonal_decompose function in statsmodels automates the decomposition and creates all three components. With the extrapolate_trend parameter, we can specify a strategy to ensure that there are no NaN values in the trend or remainder components (remember that, depending on the size of the MA window, data points at the edges of the series might not have a corresponding trend value, and thus no remainder value either).
Let’s run through an example with a toy dataset (cycling data from Seattle, the same that was used in the previous notebook).
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
sns.set()
Let’s use both the additive and the multiplicative decomposition on our dataset and see the results:
# Load the hourly Fremont Bridge bicycle counts from Seattle
hourly = pd.read_csv('https://raw.githubusercontent.com/jakevdp/SeattleBike/master/FremontHourly.csv',
                     index_col='Date', parse_dates=True)
hourly.columns = ['northbound', 'southbound']
hourly['total'] = hourly['northbound'] + hourly['southbound']
# Aggregate to daily totals
daily = hourly.resample('d').sum()

# Decompose with both models; extrapolate_trend='freq' avoids NaNs at the edges
result_add = seasonal_decompose(daily['total'], model='additive', extrapolate_trend='freq')
result_mul = seasonal_decompose(daily['total'], model='multiplicative', extrapolate_trend='freq')
Additive decomposition plots
plt.rcParams.update({'figure.figsize': (15,10)})
result_add.plot().suptitle('Additive decomposition', x=0.2, fontweight='bold')
plt.show()
Observations:
- The trend line seems to follow the original time series adequately.
- The seasonal component has captured the weekly fluctuations.
- The remainders are roughly centered around 0. However, the variance seems to increase towards the end of the time series.
Multiplicative decomposition plots
Let’s see what the multiplicative plot looks like:
result_mul.plot()
plt.show()
Observations:
- The remainders are centered around 1, as expected for the multiplicative model.
- The variance of the remainders actually seems more even (i.e., not dependent on the level of the series) than in the additive plot. This might be a better decomposition for our dataset than the additive one.
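One rough way to check this numerically (my own heuristic, not something from FPP2) is to correlate the magnitude of the remainders with the trend level; a correlation nearer zero suggests the remainder size does not depend on the level of the series:

# Additive remainders fluctuate around 0, multiplicative ones around 1,
# so we measure their magnitudes as |r| and |r - 1| respectively
print(np.corrcoef(result_add.trend, result_add.resid.abs())[0, 1])
print(np.corrcoef(result_mul.trend, (result_mul.resid - 1).abs())[0, 1])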
Forecasting
Using decomposition for forecasting (FPP2 Chapter 6.8):
Assuming an additive decomposition, the decomposed time series can be written as
$$y_t = \hat{S}_t + \hat{A}_t,$$
where $\hat{A}_t = \hat{T}_t + \hat{R}_t$ is the seasonally adjusted component [made up of the trend component and remainder component].
Or, if a multiplicative decomposition has been used, we can write
$$y_t = \hat{S}_t \times \hat{A}_t,$$
where $\hat{A}_t = \hat{T}_t \times \hat{R}_t$.
To forecast a decomposed time series, we forecast the seasonal component, $\hat{S}_t$, and the seasonally adjusted component $\hat{A}_t$, separately. It is usually assumed that the seasonal component is unchanging, or changing extremely slowly, so it is forecast by simply taking the last year of the estimated component. In other words, a seasonal naïve method is used for the seasonal component.
To forecast the seasonally adjusted component, any non-seasonal forecasting method may be used.
So for each new time period in our forecast horizon, we will forecast two values, one for the seasonally adjusted component, and one for the seasonal component. Combining these two forecast values (according to the decomposition model we chose), we’ll arrive at the final forecast value for the given time period.
Let’s go ahead and do this for our dataset. We will choose the multiplicative model for making our forecasts:
# Collect all components in one dataframe; for the multiplicative model,
# the seasonally adjusted series is trend * remainders
df_reconstructed = pd.concat(
    [result_mul.seasonal,
     result_mul.trend,
     result_mul.resid,
     result_mul.trend * result_mul.resid,
     result_mul.observed], axis=1)
df_reconstructed.columns = ['seasonal', 'trend', 'remainders', 'seasonal_adj', 'actual_values']
df_reconstructed.tail()
| Date | seasonal | trend | remainders | seasonal_adj | actual_values |
|---|---|---|---|---|---|
| 2014-05-27 | 1.238517 | 3453.428571 | 1.176490 | 4062.924130 | 5032.0 |
| 2014-05-28 | 1.231516 | 3558.000000 | 0.914707 | 3254.526163 | 4008.0 |
| 2014-05-29 | 1.169167 | 2943.693878 | 1.332783 | 3923.304594 | 4587.0 |
| 2014-05-30 | 1.086529 | 2841.387755 | 1.577132 | 4481.243770 | 4869.0 |
| 2014-05-31 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.0 |
Now, let’s do a forecast for 14 days using some simple methods. First we forecast the seasonal component $\hat{S}$ by simply reusing the corresponding values from the previous periods (the seasonal naive method):
# Take the last 14 days as a template and shift the index 14 days ahead;
# the seasonal values repeat every 7 days, so they carry over directly
df_forecast = df_reconstructed.iloc[-14:,:]
df_forecast = df_forecast.set_index(df_forecast.index.shift(14))
df_forecast = df_forecast.drop('actual_values', axis=1)
df_forecast[['trend', 'remainders', 'seasonal_adj']] = np.nan
df_forecast
| Date | seasonal | trend | remainders | seasonal_adj |
|---|---|---|---|---|
| 2014-06-01 | 0.527062 | NaN | NaN | NaN |
| 2014-06-02 | 1.173911 | NaN | NaN | NaN |
| 2014-06-03 | 1.238517 | NaN | NaN | NaN |
| 2014-06-04 | 1.231516 | NaN | NaN | NaN |
| 2014-06-05 | 1.169167 | NaN | NaN | NaN |
| 2014-06-06 | 1.086529 | NaN | NaN | NaN |
| 2014-06-07 | 0.573298 | NaN | NaN | NaN |
| 2014-06-08 | 0.527062 | NaN | NaN | NaN |
| 2014-06-09 | 1.173911 | NaN | NaN | NaN |
| 2014-06-10 | 1.238517 | NaN | NaN | NaN |
| 2014-06-11 | 1.231516 | NaN | NaN | NaN |
| 2014-06-12 | 1.169167 | NaN | NaN | NaN |
| 2014-06-13 | 1.086529 | NaN | NaN | NaN |
| 2014-06-14 | 0.573298 | NaN | NaN | NaN |
The seasonally adjusted values $\hat{A}$ are forecast simply by repeating the last trend and remainder values we saw in the time series (the naive method). Of course, we could use a more complex technique here as well (e.g. some exponential smoothing method or a non-seasonal ARIMA model; a sketch of one such alternative follows after the table below), but let’s keep it simple for now.
# Naive forecast: repeat the last observed trend and remainder values
df_forecast['trend'] = df_reconstructed.loc[df_reconstructed.index[-1], 'trend']
df_forecast['remainders'] = df_reconstructed.loc[df_reconstructed.index[-1], 'remainders']
df_forecast['seasonal_adj'] = df_forecast['trend'] * df_forecast['remainders']
# 'Reseason' to get the final forecast values
df_forecast['forecast'] = df_forecast['seasonal_adj'] * df_forecast['seasonal']
df_forecast
| Date | seasonal | trend | remainders | seasonal_adj | forecast |
|---|---|---|---|---|---|
| 2014-06-01 | 0.527062 | 2739.081633 | 1.838489 | 5035.772436 | 2654.162830 |
| 2014-06-02 | 1.173911 | 2739.081633 | 1.838489 | 5035.772436 | 5911.549989 |
| 2014-06-03 | 1.238517 | 2739.081633 | 1.838489 | 5035.772436 | 6236.889021 |
| 2014-06-04 | 1.231516 | 2739.081633 | 1.838489 | 5035.772436 | 6201.632715 |
| 2014-06-05 | 1.169167 | 2739.081633 | 1.838489 | 5035.772436 | 5887.661183 |
| 2014-06-06 | 1.086529 | 2739.081633 | 1.838489 | 5035.772436 | 5471.511315 |
| 2014-06-07 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.000000 |
| 2014-06-08 | 0.527062 | 2739.081633 | 1.838489 | 5035.772436 | 2654.162830 |
| 2014-06-09 | 1.173911 | 2739.081633 | 1.838489 | 5035.772436 | 5911.549989 |
| 2014-06-10 | 1.238517 | 2739.081633 | 1.838489 | 5035.772436 | 6236.889021 |
| 2014-06-11 | 1.231516 | 2739.081633 | 1.838489 | 5035.772436 | 6201.632715 |
| 2014-06-12 | 1.169167 | 2739.081633 | 1.838489 | 5035.772436 | 5887.661183 |
| 2014-06-13 | 1.086529 | 2739.081633 | 1.838489 | 5035.772436 | 5471.511315 |
| 2014-06-14 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.000000 |
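As an aside, here is a minimal sketch of what one of those more complex alternatives could look like, using Holt’s linear trend method (non-seasonal exponential smoothing) from statsmodels for the seasonally adjusted component. This is just an illustration under the same dataframes as above and is not used in the rest of the post:

from statsmodels.tsa.holtwinters import Holt

# Fit Holt's linear trend method to the seasonally adjusted series
# instead of the naive forecast above
holt_fit = Holt(df_reconstructed['seasonal_adj']).fit()
adj_fc = holt_fit.forecast(14)

# 'Reseason' with the seasonal naive forecasts, exactly as before
alt_forecast = adj_fc.values * df_forecast['seasonal'].values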
Note that the seasonally adjusted forecast values are the same for all 14 days, but since they are multiplied by the seasonal component’s forecast values (i.e. they are ‘reseasoned’), the final forecast values differ from day to day. Now, let’s put the two dataframes together in order to plot the 14-day forecast (the full dataset, and zoomed into the most recent couple of weeks).
df_plot = pd.concat([df_reconstructed, df_forecast], axis=0)
plt.rcParams.update({'figure.figsize': (20,7)})
# Full history with the forecast appended, then zoomed into the last 50 days
df_plot[['actual_values', 'forecast']].plot()
df_plot.iloc[-50:,:][['actual_values', 'forecast']].plot()
plt.show()
As the plots above show, with this very simple decomposition model and the naive forecasting method we achieved a reasonable forecast. Obviously the periodic pattern repeats itself every 7 days, since the only variation in our forecast comes from the weekly pattern in the seasonal component. It seems intuitive that this forecast will drift further and further off the longer the forecast horizon gets. Let’s see how this growing uncertainty can be captured with a prediction interval.
Prediction interval around point forecasts
From FPP2 Chapter 3.5:
A prediction interval gives an interval within which we expect $y_t$ to lie with a specified probability. For example, assuming that the forecast errors are normally distributed, a 95% prediction interval for the $h$-step forecast is
$$\hat{y}_{T+h|T} \pm 1.96 \hat{\sigma}_h,$$
where $\hat{\sigma}_h$ is an estimate of the standard deviation of the $h$-step forecast distribution.
Note that the prediction interval is different from a confidence interval in meaning. The confidence interval quantifies the uncertainty of an unknown population parameter based on observed data. The prediction interval quantifies the uncertainty of a single predicted, not-yet-observed data point; it doesn’t necessarily say anything about the parameter of the underlying population (read more here and here).
[For simple forecast methods,] it is possible to mathematically derive the forecast standard deviation under the assumption of uncorrelated residuals. If $\hat{\sigma}_h$ denotes the standard deviation of the $h$-step forecast distribution, and $\hat{\sigma}$ is the residual standard deviation, then we can use the following expressions.
[…]
Naïve forecasts:
$$\hat{\sigma}_h = \hat{\sigma} \sqrt{h}$$
Seasonal naïve forecasts:
$$\hat{\sigma}_h = \hat{\sigma} \sqrt{k+1},$$ where $k$ is the integer part of $(h−1)/m$ and $m$ is the seasonal period.
[…]
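To make the seasonal naïve formula concrete: with our daily data and weekly seasonality ($m = 7$), a forecast $h = 10$ steps ahead gives $k = \lfloor (10-1)/7 \rfloor = 1$, so $\hat{\sigma}_{10} = \hat{\sigma}\sqrt{2}$. In other words, the interval of a seasonal naive forecast only widens once per full seasonal period.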
We construct the upper and lower bounds of the prediction interval the same way as the point forecasts. I.e. we first compute the upper and lower bounds for the seasonally adjusted data $\hat{A}$ by following the formulas above (estimating $\hat{\sigma}_h$ for the naive method), and then we ‘reseason’ the bounds by multiplying them with the corresponding seasonal forecast values.
Following the example in FPP2 Chapter 6.8, we also ignore the uncertainty in the forecast of the seasonal component $\hat{S}$. The rationale is that the uncertainty in the seasonal component is usually much smaller than that for the seasonally adjusted data, so this is a reasonable approximation to make.
So first let’s compute the residual standard deviation $\hat{\sigma}$, which we will use to compute $\hat{\sigma}_h$:
# Residuals of the reseasonalized fit (seasonal * trend), in the original units
residuals = df_reconstructed['actual_values'] - df_reconstructed['seasonal'] * df_reconstructed['trend']
resid_std = residuals.std()
resid_std
473.47147746617435
# Naive method: sigma_h = sigma * sqrt(h)
df_forecast['h'] = range(1, 15)
df_forecast['std_h'] = resid_std * np.sqrt(df_forecast['h'])
We first compute the interval bounds for the seasonally adjusted data and then combine them with the forecast values of the seasonal component:
# 95% bounds for the seasonally adjusted component...
df_forecast['lower_adj'] = (df_forecast['seasonal_adj'] - 1.96 * df_forecast['std_h'])
df_forecast['upper_adj'] = (df_forecast['seasonal_adj'] + 1.96 * df_forecast['std_h'])
# ...then 'reseasoned' with the seasonal forecast values
df_forecast['lower'] = df_forecast['lower_adj'] * df_forecast['seasonal']
df_forecast['upper'] = df_forecast['upper_adj'] * df_forecast['seasonal']
df_forecast[['h', 'std_h', 'lower', 'upper']]
| Date | h | std_h | lower | upper |
|---|---|---|---|---|
| 2014-06-01 | 1 | 473.471477 | 2165.047404 | 3143.278255 |
| 2014-06-02 | 2 | 669.589785 | 4370.913567 | 7452.186412 |
| 2014-06-03 | 3 | 820.076655 | 4246.158659 | 8227.619384 |
| 2014-06-04 | 4 | 946.942955 | 3915.929525 | 8487.335905 |
| 2014-06-05 | 5 | 1058.714409 | 3461.544923 | 8313.777443 |
| 2014-06-06 | 6 | 1159.763528 | 3001.683237 | 7941.339393 |
| 2014-06-07 | 7 | 1252.687782 | 1479.398890 | 4294.601110 |
| 2014-06-08 | 8 | 1339.179570 | 1270.735494 | 4037.590166 |
| 2014-06-09 | 9 | 1420.414432 | 2643.366604 | 9179.733375 |
| 2014-06-10 | 10 | 1497.248276 | 2602.329270 | 9871.448773 |
| 2014-06-11 | 11 | 1570.327240 | 2411.222784 | 9992.042646 |
| 2014-06-12 | 12 | 1640.153310 | 2129.138034 | 9646.184332 |
| 2014-06-13 | 13 | 1707.125689 | 1836.022821 | 9106.999809 |
| 2014-06-14 | 14 | 1771.568051 | 896.351420 | 4877.648580 |
fig, ax = plt.subplots(1,1)
# Last 50 days of actuals, the dashed forecast line, and the shaded interval
ax.plot(df_plot[-50:].index, df_plot[-50:]['actual_values'])
ax.plot(df_forecast.index, df_forecast['forecast'], linestyle='--')
ax.fill_between(df_forecast.index, df_forecast['lower'], df_forecast['upper'], color='k', alpha=.2)
plt.show()
Notice how the prediction interval is wider in the second week compared to the first week. This exactly captures our intuition that the point forecasts will be more uncertain further in the future.
This notebook follows the book 'Forecasting: Principles and Practice', Chapter 6. You can view and download this post as a Jupyter notebook on GitHub Gist.