Forecasting through decomposition

TL;DR: In this post I illustrate a simple forecasting technique on a toy dataset: forecasting through decomposition. I use the seasonal_decompose function from the statsmodels package to do the decomposition, then apply a naive forecasting method and calculate the prediction interval by hand.

First of all, why would we want to use decomposition?

Often we are dealing with time series that contain several underlying patterns which can be interpreted independently of each other. We can tease out these patterns by breaking the series down into components. Studying and treating the individual components separately can lead to better understanding as well as better forecasting results.

Additive and multiplicative decomposition

FPP2 Chapter 6.1 introduces two simple decomposition models, additive and multiplicative:

For additive decomposition, we can write

$$y_t = S_t + T_t + R_t,$$

where $y_t$ is the data, $S_t$ is the seasonal component, $T_t$ is the trend-cycle component, and $R_t$ is the remainder component, all at period $t$.

Alternatively, a multiplicative decomposition would be written as

$$y_t = S_t \times T_t \times R_t$$

The additive decomposition is the most appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series. When the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series, then a multiplicative decomposition is more appropriate. Multiplicative decompositions are common with economic time series.
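A useful side note: taking logarithms turns a multiplicative decomposition into an additive one, since

$$\log y_t = \log S_t + \log T_t + \log R_t$$

So if a log transform stabilizes the seasonal variation of a series, that’s a hint that the multiplicative model is appropriate.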

We have seen in the previous notebook how to estimate the trend-cycle component using moving averages. We can also estimate the seasonal component by hand (the classical decomposition method, as described here). But we can also use a ready-made function to do a more sophisticated version of that.

In Python we can use the seasonal_decompose function in statsmodels to automate the decomposition and create all three components. With the extrapolate_trend parameter, we can specify a strategy that ensures there are no NaN values in the trend or remainder components (remember that, depending on the size of the MA window, data points at the edges of the series have no corresponding trend value, and thus no remainder value either).
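Here is a minimal sketch of that edge effect on synthetic data (assuming a recent statsmodels version where the seasonal period is passed as period; in older versions the argument was called freq):

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic daily series with a weekly pattern and a mild trend
idx = pd.date_range('2020-01-01', periods=70, freq='D')
y = pd.Series(100 + 0.5 * np.arange(70) + 10 * np.sin(np.arange(70) * 2 * np.pi / 7), index=idx)

# The centered moving average leaves NaNs at both ends of the trend...
res = seasonal_decompose(y, model='additive', period=7)
print(res.trend.isna().sum())   # 6 (3 at each end for a period of 7)

# ...which extrapolate_trend='freq' fills by extrapolation
res = seasonal_decompose(y, model='additive', period=7, extrapolate_trend='freq')
print(res.trend.isna().sum())   # 0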

Let’s run through an example with a toy dataset (cycling data from Seattle, the same one used in the previous notebook).

%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd

from statsmodels.tsa.seasonal import seasonal_decompose

sns.set()

Let’s use both the additive and the multiplicative decomposition on our dataset and see the results:

# Load hourly bicycle counts from the Fremont Bridge in Seattle
hourly = pd.read_csv('https://raw.githubusercontent.com/jakevdp/SeattleBike/master/FremontHourly.csv',
                     index_col='Date', parse_dates=True)
hourly.columns = ['northbound', 'southbound']
hourly['total'] = hourly['northbound'] + hourly['southbound']

# Aggregate to daily totals and decompose with both models
daily = hourly.resample('D').sum()
result_add = seasonal_decompose(daily['total'], model='additive', extrapolate_trend='freq')
result_mul = seasonal_decompose(daily['total'], model='multiplicative', extrapolate_trend='freq')
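A detail worth noting: when the index carries a frequency, seasonal_decompose infers the seasonal period from it, and for daily data it uses 7, i.e. weekly seasonality. That is why the seasonal panels below show a weekly pattern. If in doubt, the period can be spelled out explicitly; assuming a recent statsmodels version, the call below should be equivalent to the additive call above:

# Same decomposition, with the weekly period spelled out explicitly
result_add = seasonal_decompose(daily['total'], model='additive',
                                period=7, extrapolate_trend='freq')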

Additive decomposition plots

plt.rcParams.update({'figure.figsize': (15,10)})
result_add.plot().suptitle('Additive decomposition', x=0.2, fontweight='bold')
plt.show()

[Figure: additive decomposition with observed, trend, seasonal, and residual panels]

Observations:

  • The trend line seems to follow the original time series adequately.
  • The seasonal component has captured the weekly fluctuations.
  • The remainders are roughly centered around 0. However, the variance seems to increase towards the end of the time series.

Multiplicative decomposition plots

Let’s see what the multiplicative plot looks like:

result_mul.plot()
plt.show()

[Figure: multiplicative decomposition with observed, trend, seasonal, and residual panels]

Observations:

  • The remainders are centered around 1, as expected for the multiplicative model.
  • The variance of the remainders seems more even (i.e. not dependent on the level of the series) in the multiplicative plot than in the additive one. The multiplicative decomposition might therefore be a better fit for our dataset (a quick check follows below).
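One rough way to check this impression (a sketch, not a formal test): correlate the size of the additive remainders with the trend level.

# If the absolute additive remainders grow with the trend level, the variance
# is level-dependent and the multiplicative model is the better fit
print(result_add.resid.abs().corr(result_add.trend))

A clearly positive correlation supports the multiplicative choice.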

Forecasting

Using decomposition for forecasting (FPP2 Chapter 6.8):

Assuming an additive decomposition, the decomposed time series can be written as

$$y_t = \hat{S}_t + \hat{A}_t$$

where $\hat{A}_t = \hat{T}_t + \hat{R}_t$ is the seasonally adjusted component [made up of the trend component and remainder component].

Or, if a multiplicative decomposition has been used, we can write

$$y_t = \hat{S}_t \times \hat{A}_t$$

where $\hat{A}_t = \hat{T}_t \times \hat{R}_t$.

To forecast a decomposed time series, we forecast the seasonal component, $\hat{S}_t$, and the seasonally adjusted component $\hat{A}_t$, separately. It is usually assumed that the seasonal component is unchanging, or changing extremely slowly, so it is forecast by simply taking the last year of the estimated component. In other words, a seasonal naïve method is used for the seasonal component.

To forecast the seasonally adjusted component, any non-seasonal forecasting method may be used.

So for each new time period in our forecast horizon, we will forecast two values, one for the seasonally adjusted component, and one for the seasonal component. Combining these two forecast values (according to the decomposition model we chose), we’ll arrive at the final forecast value for the given time period.
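In other words, under the multiplicative model the $h$-step-ahead forecast is the product of the two component forecasts:

$$\hat{y}_{T+h|T} = \hat{S}_{T+h|T} \times \hat{A}_{T+h|T}$$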

Let’s go ahead and do this for our dataset. We will choose the multiplicative model for making our forecasts:

# Collect the components in one dataframe; seasonal_adj = trend * remainder
df_reconstructed = pd.concat(
    [result_mul.seasonal,
     result_mul.trend,
     result_mul.resid,
     result_mul.trend * result_mul.resid,
     result_mul.observed], axis=1)
df_reconstructed.columns = ['seasonal', 'trend', 'remainders', 'seasonal_adj', 'actual_values']
df_reconstructed.tail()
            seasonal        trend  remainders  seasonal_adj  actual_values
Date
2014-05-27  1.238517  3453.428571    1.176490   4062.924130         5032.0
2014-05-28  1.231516  3558.000000    0.914707   3254.526163         4008.0
2014-05-29  1.169167  2943.693878    1.332783   3923.304594         4587.0
2014-05-30  1.086529  2841.387755    1.577132   4481.243770         4869.0
2014-05-31  0.573298  2739.081633    1.838489   5035.772436         2887.0

Now, let’s make a 14-day forecast using some simple methods. First we forecast the seasonal component $\hat{S}$ by simply reusing its most recent values (the seasonal naive method):

# Use the last 14 days as a template and shift the dates 14 days ahead
df_forecast = df_reconstructed.iloc[-14:, :].copy()
df_forecast = df_forecast.set_index(df_forecast.index.shift(14))
df_forecast = df_forecast.drop('actual_values', axis=1)
# Keep the seasonal column (its values repeat weekly); blank out the rest
df_forecast[['trend', 'remainders', 'seasonal_adj']] = np.nan
df_forecast
            seasonal  trend  remainders  seasonal_adj
Date
2014-06-01  0.527062    NaN         NaN           NaN
2014-06-02  1.173911    NaN         NaN           NaN
2014-06-03  1.238517    NaN         NaN           NaN
2014-06-04  1.231516    NaN         NaN           NaN
2014-06-05  1.169167    NaN         NaN           NaN
2014-06-06  1.086529    NaN         NaN           NaN
2014-06-07  0.573298    NaN         NaN           NaN
2014-06-08  0.527062    NaN         NaN           NaN
2014-06-09  1.173911    NaN         NaN           NaN
2014-06-10  1.238517    NaN         NaN           NaN
2014-06-11  1.231516    NaN         NaN           NaN
2014-06-12  1.169167    NaN         NaN           NaN
2014-06-13  1.086529    NaN         NaN           NaN
2014-06-14  0.573298    NaN         NaN           NaN

The seasonally adjusted values $\hat{A}$ are forecast by simply repeating the last trend and remainder values observed in the time series (the naive method). Of course, we could use a more complex technique here as well (e.g. some exponential smoothing method, or a non-seasonal ARIMA model), but let’s keep it simple for now.

# Naive forecast: repeat the last observed trend and remainder values
df_forecast['trend'] = df_reconstructed.loc[df_reconstructed.index[-1], 'trend']
df_forecast['remainders'] = df_reconstructed.loc[df_reconstructed.index[-1], 'remainders']
df_forecast['seasonal_adj'] = df_forecast['trend'] * df_forecast['remainders']
# 'Reseason' by multiplying with the seasonal forecast (multiplicative model)
df_forecast['forecast'] = df_forecast['seasonal_adj'] * df_forecast['seasonal']
df_forecast
            seasonal        trend  remainders  seasonal_adj     forecast
Date
2014-06-01  0.527062  2739.081633    1.838489   5035.772436  2654.162830
2014-06-02  1.173911  2739.081633    1.838489   5035.772436  5911.549989
2014-06-03  1.238517  2739.081633    1.838489   5035.772436  6236.889021
2014-06-04  1.231516  2739.081633    1.838489   5035.772436  6201.632715
2014-06-05  1.169167  2739.081633    1.838489   5035.772436  5887.661183
2014-06-06  1.086529  2739.081633    1.838489   5035.772436  5471.511315
2014-06-07  0.573298  2739.081633    1.838489   5035.772436  2887.000000
2014-06-08  0.527062  2739.081633    1.838489   5035.772436  2654.162830
2014-06-09  1.173911  2739.081633    1.838489   5035.772436  5911.549989
2014-06-10  1.238517  2739.081633    1.838489   5035.772436  6236.889021
2014-06-11  1.231516  2739.081633    1.838489   5035.772436  6201.632715
2014-06-12  1.169167  2739.081633    1.838489   5035.772436  5887.661183
2014-06-13  1.086529  2739.081633    1.838489   5035.772436  5471.511315
2014-06-14  0.573298  2739.081633    1.838489   5035.772436  2887.000000

Note that the seasonally adjusted forecast values are the same for all 14 days, but since they are multiplied by the seasonal component’s forecast values (i.e. they are ‘reseasoned’), the final forecast values differ from day to day. Now, let’s put the two dataframes together to plot the 14-day forecast (the full dataset and a view zoomed into the most recent weeks).

# Concatenate history and forecast for plotting
df_plot = pd.concat([df_reconstructed, df_forecast], axis=0)
plt.rcParams.update({'figure.figsize': (20,7)})
df_plot[['actual_values', 'forecast']].plot()
df_plot.iloc[-50:,:][['actual_values', 'forecast']].plot()
plt.show()

[Figure: actual values and 14-day forecast, full dataset]

[Figure: actual values and 14-day forecast, last 50 days]

As the plots above show, with this very simple decomposition model and the naive forecasting method we achieved a reasonable forecast. Obviously the periodic pattern repeats every 7 days, since the only variation in our forecast comes from the weekly pattern in the seasonal component. It seems intuitive that this forecast will become increasingly inaccurate the further into the future we go. Let’s see how this growing uncertainty can be captured with a prediction interval.

Prediction interval around point forecasts

From FPP2 Chapter 3.5:

A prediction interval gives an interval within which we expect $y_t$ to lie with a specified probability. For example, assuming that the forecast errors are normally distributed, a 95% prediction interval for the $h$-step forecast is

$$\hat{y}_{T+h|T} \pm 1.96 \, \hat{\sigma}_h,$$

where $\hat{\sigma}_h$ is an estimate of the standard deviation of the $h$-step forecast distribution.
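As a side note, 1.96 is the 0.975 quantile of the standard normal distribution; multipliers for other coverage levels can be looked up the same way, for example with scipy:

from scipy.stats import norm

print(norm.ppf(0.975))  # ~1.96, multiplier for a 95% interval
print(norm.ppf(0.90))   # ~1.28, multiplier for an 80% interval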

Note that the prediction interval is different from a confidence interval in meaning. The confidence interval quantifies the uncertainty of an unknown population parameter based on observed data. The prediction interval quantifies the uncertainty of a single predicted, not-yet-observed data point; it doesn’t necessarily say anything about the parameter of the underlying population (read more here and here).

[For simple forecast methods,] it is possible to mathematically derive the forecast standard deviation under the assumption of uncorrelated residuals. If $\hat{\sigma}_h$ denotes the standard deviation of the $h$-step forecast distribution, and $\hat{\sigma}$ is the residual standard deviation, then we can use the following expressions.

[…]

Naïve forecasts:

$$\hat{\sigma}_h = \hat{\sigma} \sqrt{h}$$

Seasonal naïve forecasts:

$$\hat{\sigma}_h = \hat{\sigma} \sqrt{k+1},$$ where $k$ is the integer part of $(h−1)/m$ and $m$ is the seasonal period.

[…]
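As a quick worked check of the seasonal naive formula for our setting (weekly seasonality, so m = 7, over a 14-day horizon):

import numpy as np

h = np.arange(1, 15)   # forecast horizons 1..14
k = (h - 1) // 7       # integer part of (h - 1) / m, with m = 7
print(list(k))         # [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
# sigma_h = sigma * sqrt(k + 1) is constant within each week and steps up at h = 8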

We construct the upper and lower bounds of the prediction interval the same way as the point forecasts: we first compute the bounds for the seasonally adjusted data $\hat{A}$ using the formulas above (estimating $\hat{\sigma}_h$ with the naive method), and then we ‘reseason’ the bounds by multiplying them with the corresponding seasonal forecast values.

Following the example in FPP2 Chapter 6.8, we also ignore the uncertainty in the forecast of the seasonal component $\hat{S}$. The rationale is that the uncertainty in the seasonal component is usually much smaller than that of the seasonally adjusted data, so this is a reasonable approximation.

So first let’s compute the residual standard deviation $\hat{\sigma}$, which we will use to compute $\hat{\sigma}_h$:

# Residuals of the decomposition fit, in the original units of the data
residuals = df_reconstructed['actual_values'] - df_reconstructed['seasonal'] * df_reconstructed['trend']
resid_std = residuals.std()
resid_std

473.47147746617435

# Naive method: sigma_h = sigma * sqrt(h)
df_forecast['h'] = range(1, 15)
df_forecast['std_h'] = resid_std * np.sqrt(df_forecast['h'])

We first compute the interval bounds for the seasonally adjusted data and then combine them with the forecast values of the seasonal component:

# Interval bounds for the seasonally adjusted forecast (95% coverage -> 1.96)
df_forecast['lower_adj'] = df_forecast['seasonal_adj'] - 1.96 * df_forecast['std_h']
df_forecast['upper_adj'] = df_forecast['seasonal_adj'] + 1.96 * df_forecast['std_h']
# 'Reseason' the bounds with the seasonal forecast values
df_forecast['lower'] = df_forecast['lower_adj'] * df_forecast['seasonal']
df_forecast['upper'] = df_forecast['upper_adj'] * df_forecast['seasonal']
df_forecast[['h', 'std_h', 'lower', 'upper']]
             h        std_h        lower        upper
Date
2014-06-01   1   473.471477  2165.047404  3143.278255
2014-06-02   2   669.589785  4370.913567  7452.186412
2014-06-03   3   820.076655  4246.158659  8227.619384
2014-06-04   4   946.942955  3915.929525  8487.335905
2014-06-05   5  1058.714409  3461.544923  8313.777443
2014-06-06   6  1159.763528  3001.683237  7941.339393
2014-06-07   7  1252.687782  1479.398890  4294.601110
2014-06-08   8  1339.179570  1270.735494  4037.590166
2014-06-09   9  1420.414432  2643.366604  9179.733375
2014-06-10  10  1497.248276  2602.329270  9871.448773
2014-06-11  11  1570.327240  2411.222784  9992.042646
2014-06-12  12  1640.153310  2129.138034  9646.184332
2014-06-13  13  1707.125689  1836.022821  9106.999809
2014-06-14  14  1771.568051   896.351420  4877.648580
fig, ax = plt.subplots(1, 1)
ax.plot(df_plot[-50:].index, df_plot[-50:]['actual_values'])
ax.plot(df_forecast.index, df_forecast['forecast'], linestyle='--')
# Shade the 95% prediction interval
ax.fill_between(df_forecast.index, df_forecast['lower'], df_forecast['upper'], color='k', alpha=.2)
plt.show()

[Figure: actual values, forecast, and 95% prediction interval, last 50 days]

Notice how the prediction interval is wider in the second week than in the first. This captures exactly our intuition that the point forecasts become more uncertain further into the future. (With the naive formula, the half-width of the interval around the seasonally adjusted forecast grows like $\sqrt{h}$, so at $h = 14$ it is $\sqrt{14} \approx 3.7$ times as wide as at $h = 1$.)

This notebook follows the book 'Forecasting: Principles and Practice', Chapter 6. You can view and download this post as a Jupyter notebook on Github Gist.