Google TimesFM 2.5: The Open-Source Time Series Foundation Model Explained

Google Research's TimesFM 2.5 is a 200M-parameter, decoder-only foundation model for zero-shot time series forecasting. 16k context, quantile heads, LoRA fine-tuning, and agent skills—here's everything you need to know.

8 min read · Yash Thakker
Tags: Google Research, Time Series, Foundation Models, Machine Learning, Forecasting, Open Source
Zero-shot time series forecasting has been an open problem for years: how do you build a model that generalizes across domains—retail demand, energy load, financial prices, climate data—without retraining on each new dataset? Google Research's TimesFM is one of the most serious answers yet, and the 2.5 release makes it smaller, faster, and more capable all at once.

TimesFM was first presented at ICML 2024 (as TimesFM 1.0) and has since been iterated to the current 2.5 release, which ships with a 16k context window, continuous quantile forecasting, LoRA fine-tuning support, and an AI agent skill. The project has 23,000+ GitHub stars and is available through PyPI, Hugging Face, BigQuery ML, Google Sheets, and Vertex AI.

Here's everything you need to know.


What TimesFM Actually Is

TimesFM is a decoder-only transformer applied to time series. The decoder-only architecture, the same paradigm used by GPT-class language models, turns out to be well suited to forecasting: you feed in a context window of historical values and autoregressively generate future values, with each prediction appended to the context for the next step. TimesFM does this in patches of consecutive time points rather than one value at a time.
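The autoregressive loop can be illustrated with a toy stand-in for the network. In this sketch, `toy_next_patch` is a hypothetical placeholder that just extends the most recent linear trend, and the patch length of 4 is arbitrary; only the decode loop (predict, append to context, repeat) reflects the idea described above.

```python
import numpy as np

PATCH_LEN = 4  # arbitrary patch size for this toy example


def toy_next_patch(context: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the transformer: continue the recent linear trend."""
    slope = context[-1] - context[-2]
    return context[-1] + slope * np.arange(1, PATCH_LEN + 1)


def autoregressive_forecast(history: np.ndarray, horizon: int) -> np.ndarray:
    """Generate `horizon` values by repeatedly appending predicted patches."""
    context = np.asarray(history, dtype=float)
    generated = 0
    while generated < horizon:
        patch = toy_next_patch(context)
        context = np.concatenate([context, patch])  # prediction becomes new context
        generated += PATCH_LEN
    # Drop any surplus values produced by the final patch.
    return context[len(history):len(history) + horizon]


forecast = autoregressive_forecast(np.array([1.0, 2.0, 3.0, 4.0]), horizon=6)
print(forecast)
```

A real model replaces `toy_next_patch` with a forward pass through the transformer; the surrounding loop stays the same.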

The key insight from the original ICML 2024 paper is that this approach enables zero-shot generalization: a model trained on diverse time series data can forecast on new time series it has never seen, without any fine-tuning. Just like a language model can answer questions about topics it was never explicitly trained on, TimesFM can produce reasonable forecasts for new domains by drawing on patterns learned across its training distribution.

This makes TimesFM fundamentally different from classical forecasting models (ARIMA, Prophet, ETS) which are fit from scratch on each new series, and from most deep learning forecasting models (N-BEATS, TFT, PatchTST) which are typically trained on a specific dataset before evaluation.


TimesFM 2.5: What Changed

TimesFM 2.5 (released September 2025) is a significant architectural evolution from 2.0.

| Feature | TimesFM 2.0 | TimesFM 2.5 |
| --- | --- | --- |
| Parameters | 500M | 200M |
| Context length | 2,048 | 16,384 |
| Quantile forecasting | Limited | Continuous quantile head (30M params) |
| Frequency indicator | Required | Removed |
| Inference speed | Baseline | Flax version for faster inference |
| Covariate support | No | XReg (added Oct 2025) |
| Fine-tuning | No | LoRA via HuggingFace PEFT (added Apr 2026) |
| Agent skills | No | SKILL.md (added Mar 2026) |

The parameter reduction (500M → 200M) while simultaneously extending context (2,048 → 16,384) is the headline architectural achievement. Google Research achieved this by improving the efficiency of the attention mechanism and the representation of time series patches.

The removal of the frequency indicator, an input in 2.0 that required users to specify whether their data was daily, weekly, or monthly, simplifies the interface and removes a common source of user error.


Core Capabilities

Zero-Shot Forecasting

The primary use case. Feed in historical data, get a point forecast:

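import torch
import numpy as np
import timesfm

torch.set_float32_matmul_precision("high")

# Download the pretrained 200M-parameter checkpoint from Hugging Face.
model = timesfm.TimesFM_2p5_200M_torch.from_pretrained(
    "google/timesfm-2.5-200m-pytorch"
)

# compile() must run before forecast(); it fixes the context and horizon limits.
model.compile(
    timesfm.ForecastConfig(max_context=1024, max_horizon=256, normalize_inputs=True)
)

# Each input is a 1-D array; series in one batch may have different lengths.
point_forecast, quantile_forecast = model.forecast(
    horizon=12,
    inputs=[
        np.linspace(0, 1, 100),          # first time series
        np.sin(np.linspace(0, 20, 67)),  # second time series
    ],
)

point_forecast.shape     # (2, 12)
quantile_forecast.shape  # (2, 12, 10): mean, then 10th to 90th quantiles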

No training required. No frequency specification required. Feed in raw values, get back a 12-step-ahead forecast with full quantile coverage.

Continuous Quantile Forecasting

The optional 30M quantile head gives you calibrated probabilistic forecasts—not just a point estimate, but the full distribution of plausible future values. This is critical for decision-making under uncertainty: you need to know not just the expected demand but the 90th percentile to size safety stock, or the 10th percentile to stress-test a financial model.
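As a rough illustration of the safety-stock use case, the sketch below pulls named percentiles out of a quantile array and sizes a buffer from them. It assumes the last-axis layout noted in the earlier example (index 0 is the mean, indices 1 to 9 are the 10th to 90th percentiles). The array is synthetic, and the helper `select_quantile` and the safety-stock rule are illustrative assumptions rather than part of the library.

```python
import numpy as np

# Assumed last-axis layout: index 0 is the mean, indices 1..9 are the
# 10th..90th percentiles.
QUANTILE_INDEX = {q: i for i, q in enumerate(range(10, 100, 10), start=1)}


def select_quantile(quantile_forecast: np.ndarray, percentile: int) -> np.ndarray:
    """Return the (series, horizon) slice for one percentile."""
    return quantile_forecast[:, :, QUANTILE_INDEX[percentile]]


# Synthetic stand-in for a model output: 1 series, 3 steps, 10 columns.
steps = np.array([100.0, 110.0, 120.0])
offsets = np.array([0.0, -20, -15, -10, -5, 0, 5, 10, 15, 20])  # mean, then q10..q90
quantile_forecast = (steps[:, None] + offsets[None, :])[None, :, :]

median = select_quantile(quantile_forecast, 50)
p90 = select_quantile(quantile_forecast, 90)

# Safety stock here is the gap between the 90th percentile and the median,
# summed over the horizon (a simplification that ignores lead-time effects).
safety_stock = float((p90 - median).sum())
print(safety_stock)
```

Summing per-step quantile gaps overstates the buffer when errors across steps are not perfectly correlated, so treat the result as a conservative starting point.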

TimesFM 2.5's quantile head produces continuous quantile estimates up to a 1,000-step horizon. Enable it in the ForecastConfig:

model.compile(
    timesfm.ForecastConfig(
        max_context=1024,
        max_horizon=256,
        normalize_inputs=True,
        use_continuous_quantile_head=True,
        force_flip_invariance=True,
        infer_is_positive=True,
        fix_quantile_crossing=True,  # ensures quantile monotonicity
    )
)

The fix_quantile_crossing=True flag is important: it prevents the common problem where estimated quantiles cross each other (e.g., the 90th percentile comes out lower than the 80th), which is logically inconsistent because quantiles must be non-decreasing in the probability level.
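To make the crossing problem concrete, here is a small sketch that detects a crossing and repairs it with a running maximum across quantile levels. This is one common repair, shown for illustration only; it is not necessarily what fix_quantile_crossing does internally, and both helper names are hypothetical.

```python
import numpy as np


def has_crossing(quantiles: np.ndarray) -> bool:
    """True if any quantile is lower than the one before it along the last axis."""
    return bool(np.any(np.diff(quantiles, axis=-1) < 0))


def enforce_monotonic(quantiles: np.ndarray) -> np.ndarray:
    """Repair crossings with a running maximum across quantile levels."""
    return np.maximum.accumulate(quantiles, axis=-1)


# One forecast step, quantiles q10..q90; the 90th (14.5) dips below the 80th (15.0).
raw = np.array([[10.0, 11.0, 12.0, 12.5, 13.0, 13.5, 14.0, 15.0, 14.5]])
fixed = enforce_monotonic(raw)
print(has_crossing(raw), has_crossing(fixed))
```

Sorting each row is an alternative repair; the running maximum leaves every value that was already in order untouched.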

Covariate Support via XReg

Time series rarely move in isolation. Demand for a product responds to promotions. Energy load responds to temperature. Web traffic responds to marketing campaigns.

TimesFM 2.5 added covariate support via XReg (October 2025), allowing you to incorporate known future values of related variables to improve forecast accuracy. This closes a gap that made TimesFM's zero-shot approach less practical for production use cases with rich contextual signals.
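One way to combine known-future covariates with a pretrained forecaster is to fit a small linear model on the covariates, forecast the leftover residuals separately, and add the two parts back together. The sketch below illustrates that idea with NumPy only. It is a conceptual sketch, not the timesfm XReg API: `naive_residual_forecast` is a hypothetical placeholder for the foundation model, and the data are synthetic.

```python
import numpy as np


def ridge_fit(X: np.ndarray, y: np.ndarray, ridge: float = 1e-3) -> np.ndarray:
    """Closed-form ridge regression with an intercept column appended."""
    Xb = np.column_stack([X, np.ones(len(X))])
    return np.linalg.solve(Xb.T @ Xb + ridge * np.eye(Xb.shape[1]), Xb.T @ y)


def ridge_predict(X: np.ndarray, coef: np.ndarray) -> np.ndarray:
    """Apply the fitted coefficients (last entry is the intercept)."""
    return np.column_stack([X, np.ones(len(X))]) @ coef


def naive_residual_forecast(residuals: np.ndarray, horizon: int) -> np.ndarray:
    """Hypothetical placeholder for the foundation model: repeat the mean residual."""
    return np.full(horizon, residuals.mean())


rng = np.random.default_rng(0)
temperature = rng.uniform(10, 30, size=60)               # observed covariate
load = 50 + 2.0 * temperature + rng.normal(0, 0.5, 60)   # target driven by it
future_temperature = np.array([15.0, 25.0, 35.0])        # known future covariate

coef = ridge_fit(temperature[:, None], load)
residuals = load - ridge_predict(temperature[:, None], coef)
forecast = (
    ridge_predict(future_temperature[:, None], coef)
    + naive_residual_forecast(residuals, horizon=3)
)
print(forecast.round(1))
```

The covariate part carries the effect of the known future temperatures, while the residual forecaster only has to model what the covariates cannot explain.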

Fine-Tuning with LoRA

Zero-shot forecasting is impressive, but domain adaptation matters. TimesFM 2.5 now supports fine-tuning via LoRA (Low-Rank Adaptation) through HuggingFace Transformers and PEFT (April 2026).

LoRA lets you fine-tune a large model by training only a small number of additional parameters, keeping the original weights frozen. For TimesFM, this means you can adapt the base model to your specific domain (say, retail demand forecasting for your specific product categories) without retraining 200M parameters from scratch.
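The mechanics of LoRA fit in a few lines of NumPy. The sketch below is a generic illustration of the technique on a single linear layer, not TimesFM's fine-tuning code; the layer size, rank, and scaling factor are arbitrary assumptions chosen to keep the numbers small.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, alpha = 64, 64, 4, 8.0  # illustrative sizes only

W = rng.normal(size=(d_in, d_out))             # frozen pretrained weight
A = rng.normal(scale=0.01, size=(d_in, rank))  # trainable, small random init
B = np.zeros((rank, d_out))                    # trainable, zero init


def lora_forward(x: np.ndarray) -> np.ndarray:
    """Frozen path plus the scaled low-rank update: x W + (alpha / r) x A B."""
    return x @ W + (alpha / rank) * (x @ A @ B)


x = rng.normal(size=(2, d_in))
# With B initialised to zero, the adapted layer starts identical to the frozen one.
starts_identical = np.allclose(lora_forward(x), x @ W)

frozen_params = W.size
trainable_params = A.size + B.size
print(starts_identical, frozen_params, trainable_params)
```

During fine-tuning only `A` and `B` receive gradient updates, which is why the trainable parameter count is a small fraction of the frozen one.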

Fine-tuning examples are in timesfm-forecasting/examples/finetuning/ in the GitHub repo.


Where to Run TimesFM

Python Package (Recommended for Developers)

# PyTorch backend (quotes keep shells such as zsh from expanding the brackets)
pip install "timesfm[torch]"

# Flax backend (faster inference)
pip install "timesfm[flax]"

# With XReg covariate support
pip install "timesfm[xreg]"

The Flax version was added after the September 2025 launch specifically for faster inference—if you're running high-throughput forecasting, Flax is worth testing against the PyTorch version.
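A backend comparison only needs a small timing harness. The sketch below measures median wall-clock latency for any zero-argument callable; the FFT workload is a placeholder so the snippet runs without a model installed, and `median_latency` is a hypothetical helper name.

```python
import time
from typing import Callable

import numpy as np


def median_latency(fn: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Median wall-clock seconds per call, after warm-up runs that absorb compilation."""
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return float(np.median(timings))


# Placeholder workload; swap in e.g. `lambda: model.forecast(horizon=12, inputs=batch)`
# for each backend you want to compare.
batch = [np.sin(np.linspace(0, 20, 512)) for _ in range(32)]
latency = median_latency(lambda: [np.fft.rfft(series) for series in batch])
print(f"{latency * 1e3:.3f} ms per call")
```

Run the same batch and horizon through both backends, and keep the warm-up calls: the first call on either backend typically includes one-time compilation cost.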

Hugging Face

Weights are hosted at google/timesfm-2.5-200m-pytorch (and the Flax equivalent). Load via from_pretrained as shown in the code examples above.

BigQuery ML

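For enterprise data teams that live in SQL, TimesFM is available in BigQuery ML through the AI.FORECAST table-valued function, which runs the built-in model directly against a table without a separate model-creation step (the table and column names below are placeholders):

SELECT
  *
FROM
  AI.FORECAST(
    TABLE `myproject.mydataset.daily_sales`,  -- placeholder input table
    data_col => 'sales',                      -- placeholder value column
    timestamp_col => 'date',                  -- placeholder timestamp column
    horizon => 30,
    confidence_level => 0.8)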

This gives BigQuery users access to a frontier foundation model without leaving their existing data infrastructure. No Python environment management, no model serving—just SQL queries against your existing BigQuery tables.

Google Sheets

TimesFM is embedded in Google Sheets via a function-based interface, making it accessible to non-engineers who work with time series data in spreadsheets. This is the most accessible deployment path but also the most constrained in terms of configuration options.

Vertex AI Model Garden

A Dockerized TimesFM endpoint is available in Vertex AI Model Garden for production deployment with managed scaling. This is the recommended path for teams that need to serve TimesFM at production load without managing their own infrastructure.


TimesFM as an AI Agent Tool

Since March 2026, TimesFM has a SKILL.md—a machine-readable description of the model's capabilities and how to call it, compatible with Claude Code and other MCP-enabled agent systems.

This matters because time series forecasting often needs to happen as part of a larger agentic workflow:

  • An agent analyzing a business problem identifies a trend to forecast
  • The agent calls TimesFM as a tool to generate the forecast
  • The agent incorporates the forecast result into its broader analysis

The SKILL.md enables exactly this. An AI agent with access to TimesFM can treat forecasting as a native capability—passing in time series data, getting back point and quantile forecasts, and using those forecasts as inputs to further reasoning.
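The tool-call pattern can be sketched as a thin JSON wrapper around a forecaster. Everything here is hypothetical: the request schema, the `forecast_tool` name, and the `last_value_forecaster` stand-in are illustrative and do not describe the contents of SKILL.md.

```python
import json
from typing import Callable

import numpy as np

Forecaster = Callable[[list, int], np.ndarray]


def last_value_forecaster(inputs: list, horizon: int) -> np.ndarray:
    """Hypothetical stand-in for the model: repeat each series' last value."""
    return np.stack([np.full(horizon, series[-1]) for series in inputs])


def forecast_tool(request_json: str, forecaster: Forecaster = last_value_forecaster) -> str:
    """Parse a JSON tool call, run the forecaster, and return a JSON result."""
    request = json.loads(request_json)
    series = [np.asarray(s, dtype=float) for s in request["series"]]
    horizon = int(request["horizon"])
    if horizon <= 0 or any(len(s) == 0 for s in series):
        return json.dumps({"error": "need a positive horizon and non-empty series"})
    point = forecaster(series, horizon)
    return json.dumps({"point_forecast": point.tolist()})


response = forecast_tool('{"series": [[3, 4, 5]], "horizon": 2}')
print(response)
```

Passing the forecaster in as an argument keeps the wrapper testable without model weights; in a real deployment it would delegate to the model's forecast call.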

This is part of a broader trend: foundation models from specialized domains (vision, speech, genomics, time series) becoming tools in the repertoire of general-purpose AI agents, rather than standalone products.


How TimesFM Compares

vs. Classical Methods (ARIMA, Prophet, ETS)

Classical methods require fitting from scratch on each series. For a retail company with 100,000 SKUs, that's 100,000 separate model fits. TimesFM applies the same pretrained weights to every series and can batch many series into one inference call, with no per-series fitting.
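For a large catalogue, most of the work is reshaping the data into per-series arrays and chunking them into batches. The sketch below does that for long-format rows; the helper names, the SKU identifiers, and the batch size are illustrative assumptions, and the model call is left as a comment.

```python
from collections import defaultdict
from typing import Iterator

import numpy as np


def group_series(rows: list[tuple[str, float]]) -> dict[str, np.ndarray]:
    """Collect (series_id, value) rows, already in time order, into one array per id."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for series_id, value in rows:
        grouped[series_id].append(value)
    return {sid: np.asarray(vals, dtype=float) for sid, vals in grouped.items()}


def batches(
    series: dict[str, np.ndarray], batch_size: int
) -> Iterator[tuple[list[str], list[np.ndarray]]]:
    """Yield (ids, arrays) chunks that can each go to one forecast call."""
    ids = list(series)
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        yield chunk, [series[sid] for sid in chunk]


rows = [("sku-1", 10.0), ("sku-2", 7.0), ("sku-1", 12.0), ("sku-3", 1.0), ("sku-2", 8.0)]
all_batches = list(batches(group_series(rows), batch_size=2))
for ids, arrays in all_batches:
    # In practice: point, quantiles = model.forecast(horizon=..., inputs=arrays)
    print(ids, [a.tolist() for a in arrays])
```

Keeping the ids alongside each batch makes it straightforward to join forecasts back to their SKUs afterwards.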

Trade-off: Classical methods can be more accurate for series with strong, domain-specific patterns when sufficient history is available. TimesFM's zero-shot advantage is most valuable when you have limited history, many series, or frequently changing data.
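Whether a zero-shot model or a classical method wins on your data is an empirical question, and a holdout backtest against a simple baseline is a reasonable way to check. The sketch below scores a seasonal-naive baseline with MASE on a synthetic series; the data, the season length of 12, and the helper names are assumptions for illustration.

```python
import numpy as np


def seasonal_naive(history: np.ndarray, horizon: int, season: int) -> np.ndarray:
    """Baseline: repeat the last full season of observations."""
    last_season = history[-season:]
    return np.resize(last_season, horizon)


def mase(actual: np.ndarray, forecast: np.ndarray, history: np.ndarray, season: int) -> float:
    """Mean absolute scaled error: forecast MAE over in-sample seasonal-naive MAE."""
    scale = np.mean(np.abs(history[season:] - history[:-season]))
    return float(np.mean(np.abs(actual - forecast)) / scale)


# Synthetic series: period-12 seasonality plus a mild trend.
t = np.arange(120)
series = 10 + 0.05 * t + 3 * np.sin(2 * np.pi * t / 12)
history, actual = series[:-12], series[-12:]

baseline = seasonal_naive(history, horizon=12, season=12)
baseline_mase = mase(actual, baseline, history, season=12)
print(round(baseline_mase, 3))
# Score any other forecast (for example a TimesFM point forecast) with the same
# `mase` call and compare the two numbers on identical holdout windows.
```

A MASE below 1 means the candidate forecast beats the in-sample seasonal-naive error, which gives the classical-versus-foundation-model comparison a common scale across series.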

vs. Deep Learning Forecasting (TFT, N-BEATS, PatchTST, iTransformer)

These models are trained on specific datasets and often outperform zero-shot models on their target distributions. But they require labeled training data and regular retraining as distributions shift.

TimesFM's advantage is generalization speed: you can get a reasonable forecast on a new series immediately. The deep learning advantage is in-domain accuracy when you have rich training data.

vs. Lag-Llama, Moirai, MOMENT (Other Foundation Forecasting Models)

TimesFM competes directly with other time series foundation models. The differentiation:

  • Lag-Llama: Open-source, research-focused, univariate probabilistic forecasting
  • Moirai (Salesforce): Broad multivariate support, any-variate approach
  • MOMENT (CMU): Encoder-only architecture, strong on anomaly detection and imputation as well as forecasting
  • TimesFM: Decoder-only architecture (matches LLM paradigm), Google production support (BigQuery, Sheets, Vertex), 16k context, explicit agent skills

Google's production deployment (BigQuery ML, Google Sheets) is a meaningful differentiator—most other foundation forecasting models remain purely research artifacts. TimesFM is one of the few that has been deployed at scale in commercial products.


Getting Started

Quick start (5 minutes):

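pip install "timesfm[torch]"

Then, in Python:

import numpy as np
import timesfm

model = timesfm.TimesFM_2p5_200M_torch.from_pretrained(
    "google/timesfm-2.5-200m-pytorch"
)

# compile() must run before forecast(); it fixes the context and horizon limits.
model.compile(
    timesfm.ForecastConfig(max_context=1024, max_horizon=256, normalize_inputs=True)
)

# Replace with your actual time series
my_series = np.array([10, 12, 11, 14, 13, 15, 14, 16, 15, 17])

point_forecast, quantile_forecast = model.forecast(
    horizon=6,
    inputs=[my_series],
)

print("6-step forecast:", point_forecast[0])
# The last column of the quantile output is the 90th percentile.
print("90th percentile:", quantile_forecast[0, :, -1])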

For production use: Start with Vertex AI Model Garden if you need managed scaling, or BigQuery ML if your data lives in BigQuery. PyPI install is the right choice for data science workflows in Python notebooks or pipelines.

For fine-tuning: Check timesfm-forecasting/examples/finetuning/ for the LoRA example. You'll need HuggingFace Transformers and PEFT installed alongside timesfm.

For agentic workflows: The SKILL.md makes TimesFM callable from Claude Code and compatible agent systems. Load it as a tool and pass time series data directly to the forecasting endpoint.


Key Links

  • GitHub: google-research/timesfm
  • Hugging Face: google/timesfm-2.5-200m-pytorch
  • Paper (ICML 2024): A decoder-only foundation model for time-series forecasting
  • PyPI: pip install timesfm[torch]
  • BigQuery ML: Available in all BigQuery regions

TimesFM 2.5 is one of the more production-ready foundation models in the time series space, combining solid zero-shot benchmarks with genuine enterprise deployment paths. The transition from 500M to 200M parameters while extending context eightfold is the right direction—smaller, faster, more capable. The LoRA fine-tuning and agent skill additions mean it fits into modern ML workflows without requiring specialized integration work.
