Hierarchical Time Series Reconciliation¶
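Reconciliation is a post-processing method that adjusts forecasts to make them coherent. Given base forecasts $\hat{\mathbf y}(t)$ for all levels, reconciliation maps them to coherent forecasts $\tilde{\mathbf y}(t)$.
In the previous section, we discussed the summing matrix $\mathbf S$, which maps bottom-level values to the values on all levels of the hierarchy.
If we forecast different levels independently, the forecasts we get
are not necessarily coherent. However, if we can choose a proper mapping matrix $\mathbf P$ that converts the base forecasts into bottom-level forecasts, applying $\mathbf S$ afterwards gives coherent forecasts, $\tilde{\mathbf y}(t) = \mathbf S \mathbf P \hat{\mathbf y}(t)$.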
From the usage,
It is clear that
However, this is not the only
To generate the coherent forecasts
Walmart Sales in Stores
We reuse the example of the Walmart sales data. The base forecasts for all levels are
The simplest mapping to the bottom-level forecasts is $\tilde{\mathbf b}(t) = \mathbf P \hat{\mathbf y}(t)$, where $\tilde{\mathbf b}(t)$ are the bottom-level forecasts to be transformed into coherent forecasts.
In this simple method, our mapping matrix
Using this
The last step is to apply the summing matrix
so that
In summary, our coherent forecasts for each level are
The
Results like
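The two-step procedure in this example (map the base forecasts to the bottom level, then apply the summing matrix) can be sketched numerically. The sketch below assumes the simple mapping is the bottom-up one, which keeps only the bottom-level base forecasts. The three-series hierarchy and the forecast values are made up for illustration and are not the Walmart numbers.

```python
import numpy as np

# Hypothetical hierarchy: one total series and two store-level series.
# Row order of all-level vectors: [total, store_1, store_2].
S = np.array([
    [1, 1],  # total = store_1 + store_2
    [1, 0],  # store_1
    [0, 1],  # store_2
])

# Bottom-up mapping (assumption): drop the total, keep the two stores.
P = np.array([
    [0, 1, 0],
    [0, 0, 1],
])

# Made-up base forecasts that are not coherent: 60 + 45 != 100.
y_hat = np.array([100.0, 60.0, 45.0])

b_tilde = P @ y_hat    # bottom-level forecasts
y_tilde = S @ b_tilde  # coherent forecasts for all levels

print(y_tilde)  # [105.  60.  45.]
```

The reconciled total is now the sum of the two store forecasts, at the cost of ignoring the base forecast of the total entirely.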
Component Form
We rewrite
using the component form
There is more than one
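Apart from these intuitive methods, Wickramasuriya et al. (2017) proposed the MinT method to find the optimal $\mathbf P$, with

$$
\hat{\mathbf P} = \left( \mathbf S^T \mathbf W^{-1} \mathbf S \right)^{-1} \mathbf S^T \mathbf W^{-1},
$$

where $\mathbf W$ is the covariance matrix of the base forecast errors.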
Note that
MinT is easy to calculate, but it assumes that the forecasts are unbiased. To relax this assumption, Van Erven & Cugliari (2013) proposed a game-theoretic method called GTOP [4]. In deep learning, Rangapuram et al. (2021) developed an end-to-end model for coherent probabilistic hierarchical forecasts [2]. For these advanced topics, we refer readers to the original papers.
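The MinT mapping, in the form $(\mathbf S^T \mathbf W^{-1} \mathbf S)^{-1} \mathbf S^T \mathbf W^{-1}$ that the `MinTMatrices` class in the next section implements symbolically, can also be sketched numerically. The helper name `mint_mapping`, the three-series hierarchy, the identity $\mathbf W$, and the forecast values below are all illustrative assumptions.

```python
import numpy as np


def mint_mapping(S: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Return P = (S^T W^-1 S)^-1 S^T W^-1 for a given S and W."""
    W_inv = np.linalg.inv(W)
    # Solve (S^T W^-1 S) P = S^T W^-1 instead of inverting explicitly.
    return np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)


# Hypothetical hierarchy: [total, bottom_1, bottom_2].
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
W = np.eye(3)  # assumption: equal, uncorrelated base forecast errors

P = mint_mapping(S, W)

# Made-up, incoherent base forecasts: 60 + 45 != 100.
y_hat = np.array([100.0, 60.0, 45.0])
y_tilde = S @ P @ y_hat

# After reconciliation the total equals the sum of the bottom level.
print(np.isclose(y_tilde[0], y_tilde[1] + y_tilde[2]))
```

Unlike the bottom-up mapping, this choice of $\mathbf P$ uses the base forecast of the total as well, so all three values are adjusted.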
MinT Examples¶
Theories¶
To see how the MinT method works, we calculate a few examples based on equation
In the following examples, we observe that the lower variance of the reconciled forecast error
For a 2-level hierarchical forecasting problem, the shape of the
We visualize the matrix $\mathbf S \mathbf P$ for different diagonal elements of $\mathbf W$.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import sympy as sp


class MinTMatrices:
    """Symbolic S, W, P and SP for a two-level hierarchy.

    `levels` is the total number of series: one top-level series
    plus `levels - 1` bottom-level series.
    """

    def __init__(self, levels: int):
        self.levels = levels

    @property
    def s(self):
        # Summing matrix: a row of ones for the total, then an identity block.
        s_ident_diag = np.diag([1] * (self.levels - 1)).tolist()
        return sp.Matrix(
            [
                [1] * (self.levels - 1),
            ] + s_ident_diag
        )

    @property
    def w_diag_elements(self):
        # One symbol per series: W_1 for the total, W_2, ... for the bottom level.
        return tuple(
            sp.Symbol(f"W_{i}")
            for i in range(1, self.levels + 1)
        )

    @property
    def w(self):
        # This class restricts W to a diagonal matrix.
        return sp.Matrix(np.diag(self.w_diag_elements).tolist())

    @property
    def p_left(self):
        # (S^T W^-1 S)^-1
        return sp.Inverse(
            sp.MatMul(sp.Transpose(self.s), sp.Inverse(self.w), self.s)
        )

    @property
    def p_right(self):
        # S^T W^-1
        return (
            sp.MatMul(sp.Transpose(self.s), sp.Inverse(self.w))
        )

    @property
    def p(self):
        return sp.MatMul(self.p_left, self.p_right)

    @property
    def s_p(self):
        return sp.MatMul(self.s, self.p)

    @property
    def s_p_numerical(self):
        # Numerical function of the diagonal elements of W.
        return sp.lambdify(
            self.w_diag_elements,
            self.s_p
        )

    def visualize_s_p(self, w_elements, ax):
        sns.heatmap(self.s_p_numerical(*w_elements), annot=True, cbar=False, ax=ax)
        ax.grid(False)
        ax.set(xticklabels=[], yticklabels=[])
        ax.tick_params(bottom=False, left=False)
        ax.set_title(f"$W_{{diag}} = {w_elements}$")
        return ax


# 2 bottom series, in total three series
mtm_3 = MinTMatrices(levels=3)
print(
    f"s: {sp.latex(mtm_3.s)}\n"
    f"p: {sp.latex(mtm_3.p.as_explicit())}\n"
    f"s_p: {sp.latex(mtm_3.s_p.as_explicit())}\n"
)

# In a notebook, each of these expressions displays the matrix.
mtm_3.s
mtm_3.p
mtm_3.s_p.as_explicit()

# Heatmaps of SP for two choices of the diagonal of W.
w_elements = [
    (1, 1, 1),
    (2, 1, 1),
]
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(4 * 2, 4))
for idx, w in enumerate(w_elements):
    mtm_3.visualize_s_p(w, axes[idx])
fig.show()
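For the three-series case in the code above (one total and two bottom-level series) with $\mathbf W = \mathrm{diag}(W_1, W_2, W_3)$, the matrices can also be worked out by hand. The result below is a hand derivation from $\mathbf P = (\mathbf S^T \mathbf W^{-1} \mathbf S)^{-1} \mathbf S^T \mathbf W^{-1}$, meant as a cross-check for the printed output rather than a replacement for it.

$$
\mathbf P = \frac{1}{W_1 + W_2 + W_3}
\begin{pmatrix}
W_2 & W_1 + W_3 & -W_2 \\
W_3 & -W_3 & W_1 + W_2
\end{pmatrix},
\qquad
\mathbf S \mathbf P = \frac{1}{W_1 + W_2 + W_3}
\begin{pmatrix}
W_2 + W_3 & W_1 & W_1 \\
W_2 & W_1 + W_3 & -W_2 \\
W_3 & -W_3 & W_1 + W_2
\end{pmatrix}
$$

With $(W_1, W_2, W_3) = (1, 1, 1)$ the first row of $\mathbf S \mathbf P$ is $(2/3, 1/3, 1/3)$, and with $(2, 1, 1)$ it becomes $(1/2, 1/2, 1/2)$. A larger $W_1$ therefore moves weight away from the base forecast of the total and towards the bottom-level base forecasts.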
Implementations
There are different methods to get the covariance matrix $\mathbf W$.
Method | Note |
---|---|
OLS | More weight on the higher levels in the hierarchy |
Structural Scaling | Less weight on higher levels compared to OLS |
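The two rows of the table can be compared numerically. The sketch below assumes the definitions in Hyndman & Athanasopoulos (2021): OLS takes $\mathbf W$ proportional to the identity, and structural scaling takes $\mathbf W$ proportional to $\mathrm{diag}(\mathbf S \mathbf 1)$, so that each series is weighted by the number of bottom-level series it aggregates. The function names and the three-series hierarchy are illustrative.

```python
import numpy as np


def w_ols(S: np.ndarray) -> np.ndarray:
    """OLS: W is the identity (up to a constant factor)."""
    return np.eye(S.shape[0])


def w_struct(S: np.ndarray) -> np.ndarray:
    """Structural scaling: W = diag(S 1), the row sums of S."""
    return np.diag(S.sum(axis=1))


def s_p(S: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Return S P with P = (S^T W^-1 S)^-1 S^T W^-1."""
    W_inv = np.linalg.inv(W)
    P = np.linalg.solve(S.T @ W_inv @ S, S.T @ W_inv)
    return S @ P


# Hypothetical hierarchy: [total, bottom_1, bottom_2].
S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])

sp_ols = s_p(S, w_ols(S))
sp_struct = s_p(S, w_struct(S))

# Weight that the reconciled total puts on the base forecast of the total.
print(sp_ols[0, 0], sp_struct[0, 0])
```

For this hierarchy the structural-scaling $\mathbf W$ is $\mathrm{diag}(2, 1, 1)$, and the weight on the base total drops from $2/3$ under OLS to $1/2$.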
Real-world Data¶
Code
The code for this subsection can be found in this notebook (also available on Google Colab).
We use a small subset of the M5 competition data to show that MinT works by shifting the values on the different levels of the hierarchy.
date | CA | TX | WI | CA_1 | CA_2 | CA_3 | CA_4 | TX_1 | TX_2 | TX_3 | WI_1 | WI_2 | WI_3 | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2011-01-29 00:00:00 | 14195 | 9438 | 8998 | 4337 | 3494 | 4739 | 1625 | 2556 | 3852 | 3030 | 2704 | 2256 | 4038 | 32631 |
2011-01-30 00:00:00 | 13805 | 9630 | 8314 | 4155 | 3046 | 4827 | 1777 | 2687 | 3937 | 3006 | 2194 | 1922 | 4198 | 31749 |
2011-01-31 00:00:00 | 10108 | 6778 | 6897 | 2816 | 2121 | 3785 | 1386 | 1822 | 2731 | 2225 | 1562 | 2018 | 3317 | 23783 |
2011-02-01 00:00:00 | 11047 | 7381 | 6984 | 3051 | 2324 | 4232 | 1440 | 2258 | 2954 | 2169 | 1251 | 2522 | 3211 | 25412 |
2011-02-02 00:00:00 | 9925 | 5912 | 3309 | 2630 | 1942 | 3817 | 1536 | 1694 | 2492 | 1726 | 2 | 1175 | 2132 | 19146 |
We apply a simple LightGBM model using Darts. The forecasts are not coherent.
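Whether a set of values is coherent can be checked directly against the summing matrix. The sketch below builds $\mathbf S$ for the Total, state, and store hierarchy of the table and tests the first row of the table (2011-01-29). The helper `is_coherent` and the perturbed vector are illustrative assumptions and do not come from the notebook.

```python
import numpy as np

states = ["CA", "TX", "WI"]
stores = ["CA_1", "CA_2", "CA_3", "CA_4", "TX_1", "TX_2", "TX_3",
          "WI_1", "WI_2", "WI_3"]

# Summing matrix with rows ordered as: Total, states, stores.
S = np.vstack([
    np.ones((1, len(stores))),
    np.array([[float(st.startswith(s + "_")) for st in stores] for s in states]),
    np.eye(len(stores)),
])


def is_coherent(y: np.ndarray, S: np.ndarray) -> bool:
    """True if y equals S applied to its bottom-level (last) entries."""
    n_bottom = S.shape[1]
    return bool(np.allclose(y, S @ y[-n_bottom:]))


# First row of the table, reordered as Total, states, stores.
observed = np.array([
    32631,
    14195, 9438, 8998,
    4337, 3494, 4739, 1625, 2556, 3852, 3030, 2704, 2256, 4038,
], dtype=float)
print(is_coherent(observed, S))  # True

# Hypothetical incoherent vector: shift only the Total by 100.
perturbed = observed.copy()
perturbed[0] += 100
print(is_coherent(perturbed, S))  # False
```

The same check can be applied to each forecast step to see which steps violate the hierarchy before reconciliation.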
Applying the MinT method, we obtain coherent forecasts for all levels. The following charts show the example for the top two levels.
Each step is adjusted differently since the forecasted values are different. To see exactly how the forecasts are adjusted to reach coherence, we plot the difference between the reconciled results and the original forecasts.
Tools and Packages¶
Darts and hierarchicalforecast from Nixtla provide good support for reconciliation.
1. Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on 2022-11-27.
2. Rangapuram SS, Werner LD, Benidis K, Mercado P, Gasthaus J, Januschowski T. [End-to-End
3. Wickramasuriya SL, Athanasopoulos G, Hyndman RJ. Optimal forecast reconciliation for hierarchical and grouped time series through trace minimization. Journal of the American Statistical Association 2019; 114: 804–819.
4. Erven T van, Cugliari J. Game-Theoretically optimal reconciliation of contemporaneous hierarchical time series forecasts. In: Modeling and stochastic learning for forecasting in high dimensions. Springer International Publishing, 2015, pp 297–317.