AI & Cognition | arXiv | 2026-07-01 | Skeptical (25)

Research Paper

TiRex-2: Generalizing TiRex to Multivariate Data and Streaming

Patrick Podest, Marco Pichler, Elias Bürger, Levente Zólyomi, Bernhard Voggenberger, Wilhelm Berghammer, Daniel Klotz, Sebastian Böck, Günter Klambauer, Sepp Hochreiter

We introduce TiRex-2, a recurrent xLSTM-based time series foundation model that generalizes the univariate TiRex to multivariate forecasting with both past and future covariates. Real-world forecasting is inherently sequential: observations arrive continuously, variables evolve jointly, and a subset of covariates is known ahead of time. Existing Transformer-based time series foundation models capture cross-variate dependencies but incur quadratic complexity in context length and require full-history recomputation as new observations arrive. TiRex-2 addresses these limitations through a memory-centric recurrent design that operates at constant per-patch cost under streaming. The model combines a bidirectional time mixer with an asymmetric grouped-attention variate mixer, enabling the integration of future-known covariates while preserving strict causality over target variables. To our knowledge, this is the first time series foundation model that achieves this combination of properties. To support scalable multivariate pretraining, we propose a synthetic coupling pipeline that composes diverse multivariate samples on the fly from large univariate corpora. Empirically, TiRex-2 achieves state-of-the-art zero-shot performance on GIFT-Eval and fev-bench, remains stable when streamed to arbitrary context lengths, and maintains constant inference cost per patch. The model uses 38.4M active parameters in univariate mode, with an additional 44.1M parameters activated for multivariate forecasting.
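The contrast between constant per-patch cost and full-history recomputation can be illustrated with a toy operation count. The sketch below is not the TiRex-2 architecture: `recurrent_step`, `stream`, and `recompute_ops` are hypothetical names, the update rule is a plain tanh recurrence rather than an xLSTM cell, and the weights are arbitrary constants. It only shows that a fixed-size state yields the same work for every incoming patch, while re-attending over all patches seen so far grows quadratically.

```python
import math

def recurrent_step(state, patch, w_in, w_state):
    """One toy recurrent update; returns the new state and the multiply-add count."""
    new_state, ops = [], 0
    for i in range(len(state)):
        acc = 0.0
        for j, x in enumerate(patch):      # read the current patch only
            acc += w_in[i][j] * x
            ops += 1
        for j, s in enumerate(state):      # read the fixed-size memory
            acc += w_state[i][j] * s
            ops += 1
        new_state.append(math.tanh(acc))
    return new_state, ops

def stream(patches, state_dim=4):
    """Feed patches one at a time, recording the work done for each patch."""
    patch_len = len(patches[0])
    w_in = [[0.1] * patch_len for _ in range(state_dim)]       # arbitrary toy weights
    w_state = [[0.05] * state_dim for _ in range(state_dim)]   # arbitrary toy weights
    state = [0.0] * state_dim
    per_patch_ops = []
    for patch in patches:
        state, ops = recurrent_step(state, patch, w_in, w_state)
        per_patch_ops.append(ops)
    return state, per_patch_ops

def recompute_ops(num_patches):
    """Pairwise interactions if all t patches seen so far are re-attended at step t."""
    return [t * t for t in range(1, num_patches + 1)]

patches = [[float(t + k) for k in range(8)] for t in range(6)]
final_state, rec_ops = stream(patches)
full_ops = recompute_ops(len(patches))
print(rec_ops)   # identical count for every patch
print(full_ops)  # grows with the number of patches seen
```

In this toy setting the recurrent count depends only on the patch length and the state size, which is the property the abstract claims for streaming inference.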

Open Source

Research Brief

TiRex-2 introduces an efficient, recurrent xLSTM-based time series foundation model for streaming multivariate forecasting, capable of integrating past and future covariates while maintaining strict causality and constant inference cost.

This paper presents TiRex-2, a time series foundation model designed to overcome limitations of current Transformer-based approaches in handling multivariate and streaming data. Transformer-based models incur cost that grows quadratically with context length and must recompute over the full history whenever new observations arrive; TiRex-2 instead uses a memory-centric recurrent architecture that processes data patch by patch at constant cost per patch. It combines a bidirectional time mixer with an asymmetric grouped-attention variate mixer, which lets it use future-known covariates without breaking strict causality over the target variables. To support pretraining on diverse multivariate data, the authors propose a synthetic coupling pipeline that composes multivariate samples on the fly from large univariate corpora. Empirically, the abstract reports state-of-the-art zero-shot performance on GIFT-Eval and fev-bench, stability when streamed to arbitrary context lengths, and constant inference cost per patch.
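The causality constraint can be pictured as a visibility rule over inputs. The following is a minimal sketch under stated assumptions, not the paper's grouped-attention mixer, whose internals the abstract does not specify. The function `visible_inputs`, the labels `"target"`, `"past_covariate"`, and `"future_known"`, and the `origin` parameter (the first step to be forecast) are all hypothetical names chosen for illustration.

```python
def visible_inputs(kinds, num_steps, origin):
    """Return allowed[v][t]: may the forecaster read variate v at time step t?

    Steps before `origin` are observed history and always readable.
    From `origin` onward, only future-known covariates stay readable;
    targets and past-only covariates are hidden, so no future target leaks in.
    """
    allowed = []
    for kind in kinds:
        row = []
        for t in range(num_steps):
            observed = t < origin
            row.append(observed or kind == "future_known")
        allowed.append(row)
    return allowed

kinds = ["target", "past_covariate", "future_known"]
mask = visible_inputs(kinds, num_steps=5, origin=3)
for kind, row in zip(kinds, mask):
    print(kind, row)
```

The asymmetry in this sketch is that the future-known row stays fully visible across the forecast horizon while the target row is cut off at the forecast origin.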

Potential Applications

Real-time financial market prediction (stock prices, forex) where multiple indicators evolve jointly and some future economic announcements are known.
Predictive maintenance for industrial machinery, integrating sensor data streams from various components along with scheduled maintenance events.
Energy grid load forecasting, considering historical consumption, real-time generation, and future weather forecasts or scheduled events.
Supply chain optimization, forecasting demand and logistics with multivariate data streams and known future marketing campaigns or holidays.

25/100

Paper Trustworthiness Index

High Skepticism / Self-Published

This document should be treated with critical skepticism. It contains unverified scientific claims or was self-published.

Verified AI Assessment: This credibility analysis was generated by Gemini 2.5 Flash analyzing the full paper text, references, and metadata.

Core Pillars Breakdown

Author & Institutional Track Record

0 / 25

The abstract does not provide any information about the authors, their affiliations, or their previous work, making it impossible to assess their track record from the provided text.

Technical Rigor & Methodology

25 / 30

The abstract describes a specific architecture (xLSTM-based, recurrent, bidirectional time mixer, asymmetric grouped-attention variate mixer) designed to address known complexities (quadratic cost of Transformers, causality, streaming). It outlines a novel pretraining pipeline and claims state-of-the-art zero-shot performance on specific benchmarks (GIFT-Eval, fev-bench) with stability and constant inference cost, suggesting a well-structured technical approach and evaluation.

Reproducibility & Openness

0 / 25

The abstract does not mention the availability of code, datasets, model weights, or any other resources that would allow for independent reproduction of the results. No URLs or repository links are provided.

Community Vetting & Peer Review

0 / 20

The abstract does not specify if the paper has been peer-reviewed, accepted at a conference, or published in a journal. It could be a preprint, thus lacking formal community vetting at this stage.
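As a quick check, the four pillar scores above can be summed and compared with the headline index. The sketch assumes the index is a plain sum of the pillars, which the page does not state explicitly but which matches the reported 25/100.

```python
# (score, maximum) for each pillar, copied from the breakdown above
pillars = {
    "Author & Institutional Track Record": (0, 25),
    "Technical Rigor & Methodology": (25, 30),
    "Reproducibility & Openness": (0, 25),
    "Community Vetting & Peer Review": (0, 20),
}

total = sum(score for score, _ in pillars.values())
maximum = sum(cap for _, cap in pillars.values())
print(f"{total}/{maximum}")  # prints "25/100"
```

Under that assumption, the entire index comes from the Technical Rigor pillar; the other three pillars contribute nothing.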

Detailed Evidence Assessment

Verified Evidence & Citations

TiRex-2 is a recurrent xLSTM-based time series foundation model.

“Abstract: "We introduce TiRex-2, a recurrent xLSTM-based time series foundation model..."”

It generalizes the univariate TiRex to multivariate forecasting with both past and future covariates.

“Abstract: "...generalizes the univariate TiRex to multivariate forecasting with both past and future covariates."”

TiRex-2 operates at constant per-patch cost under streaming conditions.

“Abstract: "TiRex-2 addresses these limitations through a memory-centric recurrent design that operates at constant per-patch cost under streaming."”

The model combines a bidirectional time mixer with an asymmetric grouped-attention variate mixer.

“Abstract: "The model combines a bidirectional time mixer with an asymmetric grouped-attention variate mixer..."”

It enables the integration of future-known covariates while preserving strict causality over target variables.

“Abstract: "...enabling the integration of future-known covariates while preserving strict causality over target variables."”

A synthetic coupling pipeline is proposed for scalable multivariate pretraining.

“Abstract: "To support scalable multivariate pretraining, we propose a synthetic coupling pipeline that composes diverse multivariate samples on the fly from large univariate corpora."”

TiRex-2 achieves state-of-the-art zero-shot performance on GIFT-Eval and fev-bench.

“Abstract: "Empirically, TiRex-2 achieves state-of-the-art zero-shot performance on GIFT-Eval and fev-bench..."”

The model remains stable when streamed to arbitrary context lengths and maintains constant inference cost per patch.

“Abstract: "...remains stable when streamed to arbitrary context lengths, and maintains constant inference cost per patch."”

It uses 38.4M active parameters in univariate mode and an additional 44.1M for multivariate forecasting.

“Abstract: "The model uses 38.4M active parameters in univariate mode, with an additional 44.1M parameters activated for multivariate forecasting."”
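The two parameter figures in the last claim can be combined into a rough total. This sketch assumes the counts are additive, meaning the 38.4M univariate parameters stay active when the additional 44.1M are switched on; the abstract's wording suggests this but does not state a total.

```python
from decimal import Decimal

# Figures quoted in the abstract, in millions of parameters
univariate_active = Decimal("38.4")
multivariate_extra = Decimal("44.1")

# Assumption: multivariate mode uses both sets of parameters together
multivariate_total = univariate_active + multivariate_extra
extra_share = multivariate_extra / multivariate_total

print(f"multivariate total: {multivariate_total}M")
print(f"share added for multivariate mode: {extra_share:.1%}")
```

Decimal arithmetic is used here only to keep the one-decimal figures exact.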

Uncertainties & Omissions

• Omission: No specific details on the datasets used for evaluation beyond naming 'GIFT-Eval' and 'fev-bench'.

• Omission: No quantitative results (e.g., specific error metrics) for 'state-of-the-art zero-shot performance'.

• Omission: No direct comparisons to specific baseline models are detailed, only general reference to 'existing Transformer-based time series foundation models'.

• Omission: No author information, institutional affiliations, or funding sources.

• Omission: No indication of peer-review status or publication venue.

• Omission: No codebase or data repository links provided for reproducibility.

• Uncertainty: The robustness and generalizability of the proposed 'synthetic coupling pipeline' for generating diverse multivariate samples.

• Uncertainty: The practical limits of 'arbitrary context lengths' in real-world scenarios, considering factors like memory usage for model parameters and internal states, despite constant per-patch cost.

• Uncertainty: The performance implications and trade-offs of the 'asymmetric grouped-attention variate mixer' compared to other variate mixing strategies.