Exploring the limitations of transformer models for metocean forecasting

Abstract

Transformer models have been widely applied across various domains, often treating spatio-temporal data as video-like sequences due to the success of generative video prediction. However, this paper argues that transformers are not always optimal for spatio-temporal data with long forecast horizons and strong periodicity. Focusing on metocean forecasting, specifically sea ice, ocean, and atmospheric data, the study evaluates transformer-based models against convolutional neural networks (CNNs). For long-term sea ice forecasting in the Arctic, transformers such as TimeSformer and SwinLSTM failed to capture annual dynamics, including summer melt. In contrast, a lightweight CNN baseline outperformed existing state-of-the-art numerical and data-driven forecasts, improving error metrics by up to 30%. Similarly, in atmospheric bias correction, CNNs proved superior, reducing errors in Global Forecast System fields by 20% relative to transformers. The narrative shifts with ocean forecasting, where transformer models enhanced by contrastive pre-training achieved comprehensive superiority. They significantly reduced errors across all ocean variables, including a 40% reduction for mixed layer depth. These three case studies demonstrate that transformer limitations exist but are conditional rather than absolute, while CNNs remain the appropriate choice when data is limited or fine spatial structure is critical.

Publication
Journal of Computational Science, 99:102917
Mikhail Borisov
Researcher, PhD student

TBA

Stanislava Vostrikova
Junior researcher, MSc student

TBA

Viktor Golikov
Researcher, PhD student

TBA

Mikhail Krinitskiy
Head of lab

Current research interests are machine learning and deep learning of various flavours applied in Earth Sciences, starting with observational applications and now shifted to generic data mining and modeling of natural processes. The main applications are in atmospheric sciences, including remote sensing, and in ocean sciences, with some applications in geochemistry and paleoreconstruction. Lectures the master's courses “Machine learning for Earth Sciences” and “Deep learning for Earth Sciences”, a.k.a. ML4ES and DL4ES (in Russian), at the Moscow Institute of Physics and Technology and at Lomonosov Moscow State University.