Forecasting through decomposition
TL;DR: In this post I illustrate a very simple forecasting technique with a toy dataset, namely, forecasting through decomposition. I use the seasonal_decompose function in the statsmodels package to do the decomposition. Then I use a naive forecasting technique and calculate the prediction interval by hand.
First of all, why would we want to use decomposition?
A lot of times we are dealing with time series that have underlying patterns which can be interpreted independently of each other. We can tease out these patterns by breaking the series down into several components. Studying and treating the individual components separately can lead to better understanding as well as better forecasting results.
Additive and multiplicative decomposition
FPP2 Chapter 6.1 introduces two simple decomposition models, additive and multiplicative decomposition:
For additive decomposition, we can write
$$y_t = S_t + T_t + R_t,$$
where $y_t$ is the data, $S_t$ is the seasonal component, $T_t$ is the trend-cycle component, and $R_t$ is the remainder component, all at period $t$.
Alternatively, a multiplicative decomposition would be written as
$$y_t = S_t \times T_t \times R_t.$$
The additive decomposition is the most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series. When the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series, then a multiplicative decomposition is more appropriate. Multiplicative decompositions are common with economic time series.
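Note that a multiplicative decomposition is equivalent to an additive decomposition of the logged series, since

$$\log y_t = \log S_t + \log T_t + \log R_t.$$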
We have seen in the previous notebook how to estimate the trend-cycle component using moving averages. We can also estimate the seasonal component by hand (this is the classical decomposition method, as described here); a minimal sketch of that follows below.
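For illustration, here is roughly what that by-hand classical additive decomposition might look like. This is just a sketch, not the statsmodels implementation; the function name is made up for this post, and it assumes an odd seasonal period (such as the weekly period of 7 used below), since even periods would use a centered 2×m moving average instead:

import numpy as np
import pandas as pd

def classical_additive_decompose(series: pd.Series, period: int):
    # Trend-cycle: centered moving average over one full seasonal period
    trend = series.rolling(window=period, center=True).mean()
    # Detrend, then average each position within the cycle to get the
    # seasonal component, and center it around zero
    positions = np.arange(len(series)) % period
    seasonal_means = (series - trend).groupby(positions).mean()
    seasonal_means -= seasonal_means.mean()
    seasonal = pd.Series(seasonal_means.values[positions], index=series.index)
    # The remainder is whatever trend and seasonality don't explain
    remainder = series - trend - seasonal
    return trend, seasonal, remainder

Once we have loaded the daily series below, this could be called as classical_additive_decompose(daily['total'], 7).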
But we can also use a ready-implemented function to do a more sophisticated version of that. In Python, the seasonal_decompose function in statsmodels automates the decomposition and creates all three components. With the extrapolate_trend parameter, we can specify a strategy to ensure that there are no NaN values in the trend or remainder components (remember that, depending on the size of the MA window, data points at the edges of the series might not have a corresponding trend value, and thus no remainder value either).
Let’s run through an example with a toy dataset (cycling data from Seattle, the same that was used in the previous notebook).
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
sns.set()
Let’s use both the additive and the multiplicative decomposition on our dataset and see the results:
# Load the hourly Fremont Bridge bicycle counts from Seattle
hourly = pd.read_csv('https://raw.githubusercontent.com/jakevdp/SeattleBike/master/FremontHourly.csv',
                     index_col='Date', parse_dates=True)
hourly.columns = ['northbound', 'southbound']
hourly['total'] = hourly['northbound'] + hourly['southbound']
# Aggregate to daily totals
daily = hourly.resample('d').sum()

# Decompose with both models; extrapolate_trend='freq' avoids NaNs at the edges
result_add = seasonal_decompose(daily['total'], model='additive', extrapolate_trend='freq')
result_mul = seasonal_decompose(daily['total'], model='multiplicative', extrapolate_trend='freq')
Additive decomposition plots
plt.rcParams.update({'figure.figsize': (15,10)})
result_add.plot().suptitle('Additive decomposition', x=0.2, fontweight='bold')
plt.show()
Observations:
- The trend line seems to follow the original time series adequately.
- The seasonal component has captured the weekly fluctuations.
- The remainders are roughly centered around 0. However, the variance seems to increase towards the end of the time series.
Multiplicative decomposition plots
Let’s see what the multiplicative plot looks like:
result_mul.plot()
plt.show()
Observations:
- The remainders are centered around 1, as expected for the multiplicative model.
- The variance of the remainders actually seems more even (i.e., not dependent on the level of the series) than in the additive plot. This might be a better decomposition for our dataset than the additive one.
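One rough way to check this numerically (my own heuristic, not something from FPP2) is to correlate the magnitude of the remainders with the trend level; a correlation nearer zero suggests the remainder size does not depend on the level of the series:

# Additive remainders fluctuate around 0, multiplicative ones around 1,
# so we measure their magnitudes as |r| and |r - 1| respectively
print(np.corrcoef(result_add.trend, result_add.resid.abs())[0, 1])
print(np.corrcoef(result_mul.trend, (result_mul.resid - 1).abs())[0, 1])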
Forecasting
Using decomposition for forecasting (FPP2 Chapter 6.8):
Assuming an additive decomposition, the decomposed time series can be written as
$$y_t = \hat{S}_t + \hat{A}_t,$$
where $\hat{A}_t = \hat{T}_t + \hat{R}_t$ is the seasonally adjusted component [made up of the trend component and remainder component].
Or, if a multiplicative decomposition has been used, we can write
$$y_t = \hat{S}_t \times \hat{A}_t,$$
where $\hat{A}_t = \hat{T}_t \times \hat{R}_t$.
To forecast a decomposed time series, we forecast the seasonal component, $\hat{S}_t$, and the seasonally adjusted component $\hat{A}_t$, separately. It is usually assumed that the seasonal component is unchanging, or changing extremely slowly, so it is forecast by simply taking the last year of the estimated component. In other words, a seasonal naïve method is used for the seasonal component.
To forecast the seasonally adjusted component, any non-seasonal forecasting method may be used.
So for each new time period in our forecast horizon, we will forecast two values, one for the seasonally adjusted component, and one for the seasonal component. Combining these two forecast values (according to the decomposition model we chose), we’ll arrive at the final forecast value for the given time period.
Let’s go ahead and do this for our dataset. We will choose the multiplicative model for making our forecasts:
# Collect all components in one dataframe; for the multiplicative model,
# the seasonally adjusted series is trend * remainders
df_reconstructed = pd.concat(
    [result_mul.seasonal,
     result_mul.trend,
     result_mul.resid,
     result_mul.trend * result_mul.resid,
     result_mul.observed], axis=1)
df_reconstructed.columns = ['seasonal', 'trend', 'remainders', 'seasonal_adj', 'actual_values']
df_reconstructed.tail()
| Date | seasonal | trend | remainders | seasonal_adj | actual_values |
|---|---|---|---|---|---|
| 2014-05-27 | 1.238517 | 3453.428571 | 1.176490 | 4062.924130 | 5032.0 |
| 2014-05-28 | 1.231516 | 3558.000000 | 0.914707 | 3254.526163 | 4008.0 |
| 2014-05-29 | 1.169167 | 2943.693878 | 1.332783 | 3923.304594 | 4587.0 |
| 2014-05-30 | 1.086529 | 2841.387755 | 1.577132 | 4481.243770 | 4869.0 |
| 2014-05-31 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.0 |
Now, let’s do a forecast for 14 days using some simple methods. First we forecast the seasonal component $\hat{S}$ by simply reusing the corresponding values from the previous periods (the seasonal naive method):
# Take the last 14 days as a template and shift the index 14 days ahead;
# the seasonal values repeat every 7 days, so they carry over directly
df_forecast = df_reconstructed.iloc[-14:,:]
df_forecast = df_forecast.set_index(df_forecast.index.shift(14))
df_forecast = df_forecast.drop('actual_values', axis=1)
df_forecast[['trend', 'remainders', 'seasonal_adj']] = np.nan
df_forecast
| Date | seasonal | trend | remainders | seasonal_adj |
|---|---|---|---|---|
| 2014-06-01 | 0.527062 | NaN | NaN | NaN |
| 2014-06-02 | 1.173911 | NaN | NaN | NaN |
| 2014-06-03 | 1.238517 | NaN | NaN | NaN |
| 2014-06-04 | 1.231516 | NaN | NaN | NaN |
| 2014-06-05 | 1.169167 | NaN | NaN | NaN |
| 2014-06-06 | 1.086529 | NaN | NaN | NaN |
| 2014-06-07 | 0.573298 | NaN | NaN | NaN |
| 2014-06-08 | 0.527062 | NaN | NaN | NaN |
| 2014-06-09 | 1.173911 | NaN | NaN | NaN |
| 2014-06-10 | 1.238517 | NaN | NaN | NaN |
| 2014-06-11 | 1.231516 | NaN | NaN | NaN |
| 2014-06-12 | 1.169167 | NaN | NaN | NaN |
| 2014-06-13 | 1.086529 | NaN | NaN | NaN |
| 2014-06-14 | 0.573298 | NaN | NaN | NaN |
The seasonally adjusted values $\hat{A}$ are forecast simply by repeating the last trend and remainder values we saw in the time series (the naive method). Of course, we could use a more complex technique here as well (e.g. some exponential smoothing method or a non-seasonal ARIMA model; a sketch of one such alternative follows after the table below), but let’s keep it simple for now.
# Naive forecast: repeat the last observed trend and remainder values
df_forecast['trend'] = df_reconstructed.loc[df_reconstructed.index[-1], 'trend']
df_forecast['remainders'] = df_reconstructed.loc[df_reconstructed.index[-1], 'remainders']
df_forecast['seasonal_adj'] = df_forecast['trend'] * df_forecast['remainders']
# 'Reseason' to get the final forecast values
df_forecast['forecast'] = df_forecast['seasonal_adj'] * df_forecast['seasonal']
df_forecast
| Date | seasonal | trend | remainders | seasonal_adj | forecast |
|---|---|---|---|---|---|
| 2014-06-01 | 0.527062 | 2739.081633 | 1.838489 | 5035.772436 | 2654.162830 |
| 2014-06-02 | 1.173911 | 2739.081633 | 1.838489 | 5035.772436 | 5911.549989 |
| 2014-06-03 | 1.238517 | 2739.081633 | 1.838489 | 5035.772436 | 6236.889021 |
| 2014-06-04 | 1.231516 | 2739.081633 | 1.838489 | 5035.772436 | 6201.632715 |
| 2014-06-05 | 1.169167 | 2739.081633 | 1.838489 | 5035.772436 | 5887.661183 |
| 2014-06-06 | 1.086529 | 2739.081633 | 1.838489 | 5035.772436 | 5471.511315 |
| 2014-06-07 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.000000 |
| 2014-06-08 | 0.527062 | 2739.081633 | 1.838489 | 5035.772436 | 2654.162830 |
| 2014-06-09 | 1.173911 | 2739.081633 | 1.838489 | 5035.772436 | 5911.549989 |
| 2014-06-10 | 1.238517 | 2739.081633 | 1.838489 | 5035.772436 | 6236.889021 |
| 2014-06-11 | 1.231516 | 2739.081633 | 1.838489 | 5035.772436 | 6201.632715 |
| 2014-06-12 | 1.169167 | 2739.081633 | 1.838489 | 5035.772436 | 5887.661183 |
| 2014-06-13 | 1.086529 | 2739.081633 | 1.838489 | 5035.772436 | 5471.511315 |
| 2014-06-14 | 0.573298 | 2739.081633 | 1.838489 | 5035.772436 | 2887.000000 |
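As an aside, here is a minimal sketch of what one of those more complex alternatives could look like, using Holt’s linear trend method (non-seasonal exponential smoothing) from statsmodels for the seasonally adjusted component. This is just an illustration under the same dataframes as above and is not used in the rest of the post:

from statsmodels.tsa.holtwinters import Holt

# Fit Holt's linear trend method to the seasonally adjusted series
# instead of the naive forecast above
holt_fit = Holt(df_reconstructed['seasonal_adj']).fit()
adj_fc = holt_fit.forecast(14)

# 'Reseason' with the seasonal naive forecasts, exactly as before
alt_forecast = adj_fc.values * df_forecast['seasonal'].values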
Note that the seasonally adjusted forecast values are the same for all 14 days, but since they are multiplied by the seasonal component’s forecast values (i.e. they are ‘reseasoned’), the final forecast values differ from day to day. Now, let’s put the two dataframes together in order to plot the 14-day forecast (the full dataset, and zoomed into the most recent couple of weeks).
df_plot = pd.concat([df_reconstructed, df_forecast], axis=0)
plt.rcParams.update({'figure.figsize': (20,7)})
# Full history with the forecast appended, then zoomed into the last 50 days
df_plot[['actual_values', 'forecast']].plot()
df_plot.iloc[-50:,:][['actual_values', 'forecast']].plot()
plt.show()
As the plots above show, with this very simple decomposition model and the naive forecasting method we achieved a reasonable forecast. Obviously the periodic pattern repeats itself every 7 days, since the only variation in our forecast comes from the weekly pattern in the seasonal component. It seems intuitive that this forecast will drift further and further off the longer the forecast horizon gets. Let’s see how this growing uncertainty can be captured with a prediction interval.
Prediction interval around point forecasts
From FPP2 Chapter 3.5:
A prediction interval gives an interval within which we expect $y_t$ to lie with a specified probability. For example, assuming that the forecast errors are normally distributed, a 95% prediction interval for the $h$-step forecast is
$$\hat{y}_{T+h|T} \pm 1.96 \hat{\sigma}_h,$$
where $\hat{\sigma}_h$ is an estimate of the standard deviation of the $h$-step forecast distribution.
Note that the prediction interval is different from a confidence interval in meaning. The confidence interval quantifies the uncertainty of an unknown population parameter based on observed data. The prediction interval quantifies the uncertainty of a single predicted, not-yet-observed data point; it doesn’t necessarily say anything about the parameter of the underlying population (read more here and here).
[For simple forecast methods,] it is possible to mathematically derive the forecast standard deviation under the assumption of uncorrelated residuals. If $\hat{\sigma}_h$ denotes the standard deviation of the $h$-step forecast distribution, and $\hat{\sigma}$ is the residual standard deviation, then we can use the following expressions.
[…]
Naïve forecasts:
$$\hat{\sigma}_h = \hat{\sigma} \sqrt{h}$$
Seasonal naïve forecasts:
$$\hat{\sigma}_h = \hat{\sigma} \sqrt{k+1},$$ where $k$ is the integer part of $(h−1)/m$ and $m$ is the seasonal period.
[…]
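To make the seasonal naïve formula concrete: with our daily data and weekly seasonality ($m = 7$), a forecast $h = 10$ steps ahead gives $k = \lfloor (10-1)/7 \rfloor = 1$, so $\hat{\sigma}_{10} = \hat{\sigma}\sqrt{2}$. In other words, the interval of a seasonal naive forecast only widens once per full seasonal period.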
We construct the upper and lower bounds of the prediction interval the same way as the point forecasts. I.e. we first compute the upper and lower bounds for the seasonally adjusted data $\hat{A}$ by following the formulas above (estimating $\hat{\sigma}_h$ for the naive method), and then we ‘reseason’ the bounds by multiplying them with the corresponding seasonal forecast values.
Following the example in FPP2 Chapter 6.8, we also ignore the uncertainty in the forecast of the seasonal component $\hat{S}$. The rationale is that the uncertainty in the seasonal component is usually much smaller than that for the seasonally adjusted data, so this is a reasonable approximation to make.
So first let’s compute the residual standard deviation $\hat{\sigma}$, which we will use to compute $\hat{\sigma}_h$:
# Residuals of the reseasonalized fit (seasonal * trend), in the original units
residuals = df_reconstructed['actual_values'] - df_reconstructed['seasonal'] * df_reconstructed['trend']
resid_std = residuals.std()
resid_std
473.47147746617435
# Naive method: sigma_h = sigma * sqrt(h)
df_forecast['h'] = range(1, 15)
df_forecast['std_h'] = resid_std * np.sqrt(df_forecast['h'])
We first compute the interval bounds for the seasonally adjusted data and then combine them with the forecast values of the seasonal component:
# 95% bounds for the seasonally adjusted component...
df_forecast['lower_adj'] = (df_forecast['seasonal_adj'] - 1.96 * df_forecast['std_h'])
df_forecast['upper_adj'] = (df_forecast['seasonal_adj'] + 1.96 * df_forecast['std_h'])
# ...then 'reseasoned' with the seasonal forecast values
df_forecast['lower'] = df_forecast['lower_adj'] * df_forecast['seasonal']
df_forecast['upper'] = df_forecast['upper_adj'] * df_forecast['seasonal']
df_forecast[['h', 'std_h', 'lower', 'upper']]
| Date | h | std_h | lower | upper |
|---|---|---|---|---|
| 2014-06-01 | 1 | 473.471477 | 2165.047404 | 3143.278255 |
| 2014-06-02 | 2 | 669.589785 | 4370.913567 | 7452.186412 |
| 2014-06-03 | 3 | 820.076655 | 4246.158659 | 8227.619384 |
| 2014-06-04 | 4 | 946.942955 | 3915.929525 | 8487.335905 |
| 2014-06-05 | 5 | 1058.714409 | 3461.544923 | 8313.777443 |
| 2014-06-06 | 6 | 1159.763528 | 3001.683237 | 7941.339393 |
| 2014-06-07 | 7 | 1252.687782 | 1479.398890 | 4294.601110 |
| 2014-06-08 | 8 | 1339.179570 | 1270.735494 | 4037.590166 |
| 2014-06-09 | 9 | 1420.414432 | 2643.366604 | 9179.733375 |
| 2014-06-10 | 10 | 1497.248276 | 2602.329270 | 9871.448773 |
| 2014-06-11 | 11 | 1570.327240 | 2411.222784 | 9992.042646 |
| 2014-06-12 | 12 | 1640.153310 | 2129.138034 | 9646.184332 |
| 2014-06-13 | 13 | 1707.125689 | 1836.022821 | 9106.999809 |
| 2014-06-14 | 14 | 1771.568051 | 896.351420 | 4877.648580 |
fig, ax = plt.subplots(1,1)
# Last 50 days of actuals, the dashed forecast line, and the shaded interval
ax.plot(df_plot[-50:].index, df_plot[-50:]['actual_values'])
ax.plot(df_forecast.index, df_forecast['forecast'], linestyle='--')
ax.fill_between(df_forecast.index, df_forecast['lower'], df_forecast['upper'], color='k', alpha=.2)
plt.show()
Notice how the prediction interval is wider in the second week compared to the first week. This exactly captures our intuition that the point forecasts will be more uncertain further in the future.
This notebook follows the book 'Forecasting: Principles and Practice', Chapter 6. You can view and download this post as a Jupyter notebook on GitHub Gist.