A lightweight transformer for multivariate long-term time series forecasting that combines sharpness-aware minimization (SAM) with channel-wise attention. The paper identifies the attention mechanism as the cause of poor generalization in forecasting transformers, steering training toward sharp minima, and addresses this with SAM optimization. SAMformer surpasses TSMixer by 14.33% on average while using roughly 4x fewer parameters. Presented as an oral at ICML 2024 by Huawei Noah's Ark Lab, Paris.
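The core of SAM is a two-step update: first ascend to an adversarially perturbed point within a small neighborhood of the current weights, then descend using the gradient computed there, which biases training toward flat minima. The sketch below illustrates this generic SAM step on a toy quadratic loss with NumPy; it is not SAMformer's actual implementation, and the loss, learning rate, and `rho` are illustrative choices.

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T A w with gradient A w.
# A is ill-conditioned, so the landscape is sharp along the first axis.
A = np.diag([10.0, 1.0])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

def sam_step(w, lr=0.05, rho=0.05):
    """One sharpness-aware minimization step:
    1) move to the (first-order) worst-case point within an L2 ball
       of radius rho around w,
    2) take a gradient descent step at that perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction
    g_sharp = grad(w + eps)                      # gradient at perturbed weights
    return w - lr * g_sharp

w = np.array([1.0, 1.0])
for _ in range(100):
    w = sam_step(w)
```

In practice (e.g. in PyTorch), step 1 and step 2 each require a separate forward/backward pass per iteration, which is the main cost SAM adds over standard training.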


SAMformer

model

SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention

paper

arXiv: 2402.10198

time-series, forecasting, transformer, efficiency, open-source