
Multi-horizon time series forecasting often involves heterogeneous inputs, including static covariates, known future inputs, and past observed time-varying variables. This paper introduces the Temporal Fusion Transformer (TFT), an attention-based deep learning architecture designed to improve forecasting performance while providing interpretable insights into temporal dynamics. TFT combines recurrent layers for local temporal processing with interpretable self-attention layers for capturing long-term dependencies. It also uses static covariate encoders, variable selection networks, gating mechanisms, and quantile outputs to handle different input types and produce prediction intervals. The authors evaluate TFT on multiple real-world datasets, including electricity, traffic, retail, and volatility forecasting tasks, and show that it outperforms several existing multi-horizon forecasting benchmarks. The paper also demonstrates interpretability use cases, including identifying important variables, visualizing persistent temporal patterns, and detecting significant regime changes.
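
As an illustration of how quantile outputs can yield prediction intervals, the following is a minimal sketch that assumes the quantile forecasts are scored with the standard quantile (pinball) loss. The function names `pinball_loss` and `mean_pinball_loss`, the quantile levels, and the toy numbers are hypothetical choices for this example and are not taken from the paper's implementation.

```python
from typing import Sequence


def pinball_loss(y_true: float, y_pred: float, q: float) -> float:
    """Quantile (pinball) loss for one forecast at quantile level q.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    error = y_true - y_pred
    # Under-prediction (error > 0) costs q * error;
    # over-prediction (error < 0) costs (1 - q) * |error|.
    return max(q * error, (q - 1.0) * error)


def mean_pinball_loss(
    y_true: Sequence[float], y_pred: Sequence[float], q: float
) -> float:
    """Average pinball loss over a forecast horizon."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(pinball_loss(t, p, q) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


if __name__ == "__main__":
    # Toy three-step horizon (made-up numbers, for illustration only).
    actual = [10.0, 12.0, 11.0]
    forecast_p10 = [8.0, 9.5, 9.0]    # assumed 10th-percentile forecasts
    forecast_p90 = [12.0, 13.0, 13.5]  # assumed 90th-percentile forecasts

    print("P10 loss:", mean_pinball_loss(actual, forecast_p10, 0.1))
    print("P90 loss:", mean_pinball_loss(actual, forecast_p90, 0.9))

    # The two quantile forecasts bound a nominal 80% prediction interval.
    covered = [lo <= y <= hi for lo, y, hi in zip(forecast_p10, actual, forecast_p90)]
    print("Interval covers actual:", covered)
```

Because the loss is asymmetric, a forecast at a high quantile level is penalized more for falling below the realized value than for exceeding it, and the reverse holds at low levels. A pair of forecasts at levels such as 0.1 and 0.9 can therefore be read as the bounds of a nominal 80% prediction interval.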