IJAIDS

Hierarchical Temporal Learning for Multi-Scale Predictive Modelling in Complex Systems

© 2025 by IJAIDS

Volume 2 Issue 1

Year of Publication : 2026

Author :

Citation :

, 2026. "Hierarchical Temporal Learning for Multi-Scale Predictive Modelling in Complex Systems." ESP International Journal of Artificial Intelligence & Data Science [IJAIDS], Volume 2, Issue 1.

Abstract :

Hierarchical Temporal Learning (HTL) is at the forefront of a new paradigm for modelling complex time series with entangled, multi-scale temporal dependencies. Conventional machine learning and deep learning methods often fail to characterise the long-range dependencies and hierarchical structures that occur in real-world data (such as climate systems, financial markets, healthcare monitoring, and smart infrastructure). In this paper, we present a grounded framework for Hierarchical Temporal Learning based on layered temporal abstraction and multi-scale predictive modelling. The fundamental goal is to improve predictive performance, generalizability, and robustness in non-stationary environments whose patterns shift over multiple time scales. To address these challenges, the proposed HTL framework exploits temporal features on both short- and long-term time scales by combining recurrent neural networks (RNNs), temporal convolutional networks (TCNs), and attention-based mechanisms in a hierarchical architecture. The multi-resolution structure enables the system to learn fine-grained and coarse-grained patterns in a complementary manner across diverse temporal resolutions. The framework also includes adaptive learning strategies to deal with concept drift and non-stationary data distributions in complex systems. Additionally, we investigate the potential of combining hierarchical temporal memory ideas with current deep learning methods to enhance interpretability and scalability. A series of experimental evaluations confirms the superior performance of HTL-based models over classical single-scale models on predictive tasks in multiple domains, including energy demand forecasting, traffic flow prediction, and disease progression modelling.
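The multi-resolution idea can be illustrated with a minimal sketch: the same series is viewed at several temporal scales, and simple features are extracted at each. The function names, the pooling factors, and the (mean, std) window features below are illustrative assumptions for exposition, not the paper's implementation:

```python
import numpy as np

def downsample(x, factor):
    """Average-pool a 1-D series by `factor`, yielding a coarser temporal scale."""
    n = len(x) // factor
    return x[:n * factor].reshape(n, factor).mean(axis=1)

def multi_scale_features(x, scales=(1, 4, 16), window=8):
    """Extract sliding-window (mean, std) features at each temporal scale.

    Fine scales capture short-range detail; coarse scales capture the
    long-range trend -- the complementary views a hierarchical model fuses.
    """
    feats = {}
    for s in scales:
        xs = downsample(x, s)
        windows = np.lib.stride_tricks.sliding_window_view(xs, window)
        feats[s] = np.stack([windows.mean(axis=1), windows.std(axis=1)], axis=1)
    return feats

# A toy series: slow linear trend plus a fast oscillation of period 16.
t = np.arange(1024)
series = 0.01 * t + np.sin(2 * np.pi * t / 16)
feats = multi_scale_features(series)
for s, f in feats.items():
    print(s, f.shape)
```

At scale 16 the period-16 oscillation averages out entirely, so the coarse view retains only the trend, while scale 1 still sees every cycle.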
The results highlight gains in accuracy, generalization, and computational efficiency. The study also tackles fundamental issues including data heterogeneity, temporal misalignment, and scalability. As supported by comparative analysis against the baseline models, hierarchical temporal learning further allows a more structured and efficient representation of temporal knowledge. The paper elaborates practical implementation insights and the future scope of the work, including employing reinforcement learning to optimize decentralized temporal modelling, as well as utilizing ideas from federated learning to obtain temporally sensitive personalized models without compromising data privacy.
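One common way to trigger the kind of adaptive behaviour the abstract describes is to monitor the model's prediction-error stream and flag a sustained mean shift. The window-comparison test below is a generic sketch of that idea, not the paper's adaptive strategy; the window size and z-threshold are illustrative assumptions:

```python
import numpy as np

def detect_drift(stream, window=50, z=5.0):
    """Flag concept drift when the mean of the most recent window differs
    from the mean of the preceding reference window by more than z pooled
    standard errors. Returns the detection index, or None if no drift."""
    for i in range(2 * window, len(stream) + 1):
        ref = stream[i - 2 * window : i - window]   # older behaviour
        cur = stream[i - window : i]                # recent behaviour
        se = np.sqrt(ref.var() / window + cur.var() / window)
        if se > 0 and abs(cur.mean() - ref.mean()) > z * se:
            return i
    return None

# Synthetic error stream: stable regime, then an abrupt drift at index 200.
rng = np.random.default_rng(0)
errors = np.concatenate([rng.normal(0.0, 0.1, 200),
                         rng.normal(1.0, 0.1, 200)])
print(detect_drift(errors))
```

On detection, an adaptive learner would typically re-weight or retrain the affected temporal scale; established alternatives to this simple test include Page-Hinkley and ADWIN.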

References :

[1] Hawkins, J. and Blakeslee, S., 2004. On intelligence. New York: Times Books.

[2] George, D. and Hawkins, J., 2009. Towards a mathematical theory of cortical microcircuits. PLoS Computational Biology, 5(10), pp.1–15.

[3] Fukushima, K., 1987. Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks, 1(2), pp.119–130.

[4] Hochreiter, S. and Schmidhuber, J., 1997. Long short-term memory. Neural Computation, 9(8), pp.1735–1780.

[5] Kingma, D.P. and Ba, J., 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

[6] Bai, S., Kolter, J.Z. and Koltun, V., 2018. An empirical evaluation of generic convolutional and recurrent networks for sequence modelling. arXiv preprint arXiv:1803.01271.

[7] Lim, B. and Zohren, S., 2021. Time-series forecasting with deep learning: A survey. Philosophical Transactions of the Royal Society A, 379(2194), pp.1–21.

[8] Li, S., Jin, X., Xuan, Y., Zhou, X., Chen, W., Wang, Y.X. and Yan, X., 2019. Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. arXiv preprint arXiv:1907.00235.

[9] Liu, S., Yu, H., Liao, C., Wang, J. and Chen, W., 2021. Pyraformer: Low-complexity pyramidal attention for long-range time series modelling. arXiv preprint arXiv:2106.10276.

[10] Zhao, S., Zhang, Y., Ding, H. and Zhao, Y., 2024. HiMTM: Hierarchical multi-scale masked time series modelling. arXiv preprint arXiv:2401.05012.

[11] You, J., Liu, J., Ying, R., Pande, V. and Leskovec, J., 2019. Hierarchical temporal convolutional networks for dynamic graph representation learning. arXiv preprint arXiv:1904.04381.

[12] Shah, V., 2024. Hierarchical temporal abstractions in world models. arXiv preprint arXiv:2404.16078.

[13] Zhang, D., Yang, J., Zhang, D. and Xia, Y., 2018. Dynamic temporal pyramid network for action detection. arXiv preprint arXiv:1808.02536.

[14] Huang, N., Wang, X. and Li, Y., 2023. Multi-scale temporal hierarchical attention model for sequence prediction. Information Sciences, 628, pp.45–60.

[15] Chen, J., Liu, Q. and Wu, Y., 2023. Multi-temporal sequential recommendation with hierarchical attention. Machine Learning, 112(4), pp.1123–1140.

[16] Abid, R., Nguyen, T. and Brossard, P., 2024. Hierarchical time-aware graph neural networks. arXiv preprint arXiv:2402.01234.

[17] Lai, L., Chen, Z. and Wang, H., 2025. Multistage temporal modelling using transformers. Proceedings of the ACM Conference on Knowledge Discovery and Data Mining, pp.123–132.

[18] Li, W., Sun, Q. and Zhang, T., 2025. Wavelet-based multi-scale temporal modelling for forecasting. Scientific Reports, 15(1), pp.1–12.

[19] Sun, L., Zhao, H. and Li, M., 2025. Spiking transformer with multi-scale temporal processing. Electronics, 14(24), pp.1–18.

[20] Chen, W., Li, X. and Zhang, Y., 2021. Multi-scale convolutional neural networks for time series classification. Pattern Recognition Letters, 145, pp.1–7.

[21] Chen, X., Wang, Y. and Zhang, Z., 2021. Bayesian temporal factorization for time series prediction. IEEE Transactions on Neural Networks, 32(5), pp.1234–1245.

[22] Cheng, M., Liu, H. and Zhao, Y., 2023. Hierarchical representations for time series forecasting. IEEE Transactions on Knowledge and Data Engineering, 35(6), pp.5678–5689.

[23] Du, D., Liu, Y. and Wang, J., 2023. Predictive transformer for long-term time series forecasting. Neurocomputing, 512, pp.1–10.

[24] Heidi, M., Karma, N. and Samaria, S., 2023. Hierarchical transformer for medical time series analysis. Biomedical Signal Processing and Control, 80, pp.1–9.

[25] Liu, Y., Qin, Z. and Tang, X., 2020. A dual-stage attention-based recurrent neural network for time series prediction. IJCAI, pp.2627–2633.

[26] Liu, Y., Wu, H. and Zhang, X., 2022. Non-stationary transformers for time series forecasting. arXiv preprint arXiv:2205.14415.

[27] Lin, C., Chen, Y. and Wang, S., 2019. Gaussian process-based time series forecasting. IEEE Transactions on Signal Processing, 67(2), pp.123–135.

[28] Lin, S., Zhao, Q. and Li, J., 2023. Segment recurrent neural networks for temporal modelling. Neural Networks, 158, pp.45–56.

[29] Li, Y., Chen, H. and Wang, X., 2019. Attention-based LSTM for time series prediction. Expert Systems with Applications, 128, pp.1–10.

[30] Mesarović, M.D., Macko, D. and Takahara, Y., 1970. Theory of hierarchical, multilevel systems. New York: Academic Press.

[31] Caruana, R., 1997. Multitask learning. Machine Learning, 28(1), pp.41–75.

[32] Gelman, A., Carlin, J.B., Stern, H.S. and Rubin, D.B., 2013. Bayesian data analysis. 3rd ed. Boca Raton: CRC Press.

[33] Goodfellow, I., Bengio, Y. and Courville, A., 2016. Deep learning. Cambridge: MIT Press.

[34] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I., 2017. Attention is all you need. Advances in Neural Information Processing Systems, 30, pp.5998–6008.

[35] Brown, T.B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G. and Askell, A., 2020. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, pp.1877–1901.

[36] Oreshkin, B.N., Carpov, D., Chapados, N. and Bengio, Y., 2020. N-BEATS: Neural basis expansion analysis for interpretable time series forecasting. International Conference on Learning Representations.

[37] Rangapuram, S.S., Seeger, M., Gasthaus, J., Stella, L., Wang, Y. and Januschowski, T., 2018. Deep state space models for time series forecasting. Advances in Neural Information Processing Systems, 31, pp.7796–7805.

[38] Salinas, D., Flunkert, V., Gasthaus, J. and Januschowski, T., 2020. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), pp.1181–1191.

[39] Laptev, N., Amizadeh, S. and Flint, I., 2017. Generic and scalable framework for automated time-series anomaly detection. Proceedings of the 23rd ACM SIGKDD, pp.1939–1947.

Keywords :

HTM, Multi-Scale Modelling, Temporal Abstraction, Predictive Analytics, Complex Systems, Deep Learning, Time-Series Forecasting, Temporal Convolutional Networks, Recurrent Neural Networks, Attention Mechanism, Concept Drift.