
The Guardian

by jbenton

Unlock The Guardian's articles with 17 tools for keyword analysis, SEO competition, and qualitative data research.

Integrates with The Guardian's Open Platform API to search articles, retrieve full content, and browse sections. Its 17 specialized tools also cover analytical operations such as author profiling, topic trend analysis, Long Read discovery, and content timeline analysis.

github stars

19

17 specialized tools

1.9+ million articles since 1999

Requires Guardian API key

best for

  • / Journalism researchers and media analysts
  • / Historical research on news events
  • / Content creators studying Guardian coverage
  • / Academic research on media trends

capabilities

  • / Search Guardian articles by keywords, dates, and sections
  • / Retrieve full article text and metadata
  • / Browse Guardian's 50,000+ editorial tags
  • / Find related articles through shared tags
  • / Analyze author profiles and publication patterns
  • / Track topic trends over time

what it does

Provides access to The Guardian's complete archive of 1.9+ million articles since 1999 through their Open Platform API. Enables searching, content retrieval, and analytical operations on Guardian journalism data.

about

The Guardian is a community-built MCP server published by jbenton that gives AI assistants tools for searching and analyzing The Guardian's archive via the Model Context Protocol. It is categorized under "other" and "analytics-data".

how to install

You can install The Guardian in any MCP-compatible client, such as Cursor, Claude Desktop, or VS Code. The server runs locally on your machine via the stdio transport.

license

MIT

The Guardian is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

Guardian MCP Server

An MCP server that connects an LLM to the archives of The Guardian (since 1999), including the full text of more than 1.9 million articles. Useful for real-time headlines, journalism analysis, and historical research.

Installation

A Guardian Open Platform API key is required. You can get one here: https://open-platform.theguardian.com/access/

The Guardian offers generous API access for non-commercial use of the archives, including up to 1 call/second and 500 calls/day. (See the full Terms & Conditions. Commercial use requires a different license.)
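If you script many tool calls, it may help to enforce those limits on the client side. The following is a minimal Python sketch, not part of the server: the class name QuotaThrottle is hypothetical, and it assumes the daily quota behaves like a rolling 24-hour window, which the Terms & Conditions may define differently.

```python
import time


class QuotaThrottle:
    """Client-side guard for a per-second and a per-day call budget.

    Hypothetical helper; the rolling 24-hour window is an assumption.
    """

    def __init__(self, min_interval=1.0, daily_limit=500,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.daily_limit = daily_limit
        self._clock = clock
        self._sleep = sleep
        self._last_call = None
        self._day_start = None
        self._calls_today = 0

    def acquire(self):
        """Wait until a call is allowed; raise if the daily budget is spent."""
        now = self._clock()
        # Start a fresh window on first use or once 24 hours have passed.
        if self._day_start is None or now - self._day_start >= 86400:
            self._day_start = now
            self._calls_today = 0
        if self._calls_today >= self.daily_limit:
            raise RuntimeError("daily call budget exhausted")
        # Space calls at least min_interval seconds apart.
        if self._last_call is not None:
            wait = self.min_interval - (now - self._last_call)
            if wait > 0:
                self._sleep(wait)
                now = self._clock()
        self._last_call = now
        self._calls_today += 1
```

Call acquire() immediately before each request. The clock and sleep arguments are injectable so the behavior can be checked without real waiting.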

To install and run the server with npx:

npx guardian-mcp-server

Sample MCP client configuration:

{
  "mcpServers": {
    "guardian": {
      "command": "npx",
      "args": ["guardian-mcp-server"],
      "env": {
        "GUARDIAN_API_KEY": "your-key-here"
      }
    }
  }
}
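If your client already has a configuration file with other servers, you may prefer to merge this entry in rather than paste it by hand. A small Python sketch under that assumption follows; the function names are hypothetical, and where your client stores its configuration is not covered here.

```python
import json
import os


def guardian_server_entry(api_key):
    """Return the 'guardian' server entry from the sample config."""
    if not api_key:
        raise ValueError("a Guardian API key is required")
    return {
        "command": "npx",
        "args": ["guardian-mcp-server"],
        "env": {"GUARDIAN_API_KEY": api_key},
    }


def merge_into_config(config, api_key):
    """Add or replace the 'guardian' server, leaving other servers as they are."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["guardian"] = guardian_server_entry(api_key)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Falls back to the placeholder from the sample config when no key is set.
    key = os.environ.get("GUARDIAN_API_KEY", "your-key-here")
    print(json.dumps(merge_into_config({}, key), indent=2))
```

Run as a script, this prints the same structure as the sample configuration, with the key taken from the GUARDIAN_API_KEY environment variable when it is set.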

Tool reference

guardian_search: search the archive for articles

Use the detail_level parameter to control the size of the API response and optimize performance: minimal (headlines only), standard (headlines, summaries, and metadata), or full (all content, including full article text).

{
  "query": "climate change",
  "section": "environment", 
  "detail_level": "minimal",
  "from_date": "2024-01-01",
  "order_by": "newest"
}
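When these arguments are assembled in code, it can be useful to check them before sending. The sketch below is a hypothetical Python helper, not something the server provides. It validates only what the README states (the three detail_level values) plus the assumption that from_date is an ISO date. The section and order_by values are passed through unchecked because their allowed values are not listed here.

```python
from datetime import date

# The three levels described above.
DETAIL_LEVELS = ("minimal", "standard", "full")


def build_search_args(query, detail_level, section=None,
                      from_date=None, order_by=None):
    """Build an argument object for guardian_search (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if detail_level not in DETAIL_LEVELS:
        raise ValueError(f"detail_level must be one of {DETAIL_LEVELS}")
    args = {"query": query, "detail_level": detail_level}
    if section is not None:
        args["section"] = section
    if from_date is not None:
        # Raises ValueError for strings that are not valid ISO dates.
        date.fromisoformat(from_date)
        args["from_date"] = from_date
    if order_by is not None:
        args["order_by"] = order_by
    return args
```

Starting with minimal and requesting full only for articles you intend to read keeps responses small.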

guardian_get_article: retrieve individual articles. Full content is returned by default; truncate is shown here set to false.

{
  "article_id": "https://www.theguardian.com/politics/2024/dec/01/example",
  "truncate": false
}

guardian_search_tags: search through The Guardian's 50,000-plus hand-assigned tags

guardian_find_related: find articles similar to an article (via shared tags)

guardian_get_article_tags: returns tags assigned to any article

{
  "article_id": "politics/2024/example"
}
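The examples show article_id both as a full URL and as a bare path. Whether every tool accepts both forms is not stated, so a client may want to normalize to one. The Python sketch below is a hypothetical helper that assumes the path form is simply the URL path without its leading slash.

```python
from urllib.parse import urlparse

GUARDIAN_HOSTS = ("www.theguardian.com", "theguardian.com")


def article_path(article_id):
    """Return the path-style ID for a Guardian URL, or the input if already a path."""
    parsed = urlparse(article_id)
    if parsed.scheme in ("http", "https"):
        if parsed.netloc not in GUARDIAN_HOSTS:
            raise ValueError(f"not a Guardian URL: {article_id}")
        return parsed.path.strip("/")
    # No scheme: treat the value as a path and trim stray slashes.
    return article_id.strip("/")
```

For example, article_path("https://www.theguardian.com/politics/2024/dec/01/example") gives "politics/2024/dec/01/example".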

guardian_lookback: historical search by date

guardian_content_timeline: analyze Guardian content on a particular topic over a defined period

{
  "query": "artificial intelligence",
  "from_date": "2024-01-01",
  "to_date": "2024-06-30", 
  "interval": "month"
}
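To give a sense of what a timeline at a given interval means, here is a rough Python sketch that groups publication dates into month or quarter buckets. It illustrates the idea only. The server's actual implementation and output format are not documented here, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date


def bucket_key(day, interval):
    """Label a date with its month or quarter bucket."""
    if interval == "month":
        return f"{day.year}-{day.month:02d}"
    if interval == "quarter":
        return f"{day.year}-Q{(day.month - 1) // 3 + 1}"
    raise ValueError(f"unsupported interval: {interval}")


def timeline(publication_dates, interval="month"):
    """Count ISO date strings per bucket, returned in chronological order."""
    counts = Counter(
        bucket_key(date.fromisoformat(d), interval) for d in publication_dates
    )
    return dict(sorted(counts.items()))
```

Buckets with no articles are omitted in this sketch. A fuller version would fill them with zeros so that gaps in coverage stay visible.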

guardian_top_stories_by_date: estimates editorial importance. The Guardian's API doesn't natively return data that distinguishes Page 1 stories from inside briefs, so this tool approximates a ranking. The example below uses 24 June 2016, the day the Brexit referendum result was announced.

{
  "date": "2016-06-24",
  "story_count": 5
}

guardian_topic_trends: compare multiple topics over time with correlation analysis and competitive rankings

{
  "topics": ["artificial intelligence", "climate change", "brexit"],
  "from_date": "2023-01-01",
  "to_date": "2024-12-31",
  "interval": "quarter"
}
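The README does not say which correlation measure this tool uses. As one plausible reading, the Python sketch below computes a Pearson correlation between two per-interval article counts. The function is hypothetical, and any numbers you pass to it here are your own; none come from the Guardian archive.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 means two topics rise and fall together across the intervals, and a value near -1 means one rises as the other falls. With only a handful of quarterly points, treat any result as suggestive rather than conclusive.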

guardian_author_profile: generate profiles of Guardian journalists and what they cover

{
  "author": "George Monbiot",
  "analysis_period": "2024"
}

guardian_longread: search The Long Read series, the paper's home for longform features

guardian_browse_section: browse recent articles from specific sections

guardian_get_sections: fetch all available Guardian sections

guardian_search_by_length: filter articles by word count

guardian_search_by_author: search articles by byline

guardian_recommend_longreads: get personalized Long Read recommendations based on your interests

{
  "count": 3,
  "context": "I'm researching technology, especially AI",
  "topic_preference": "digital culture"
}
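Each JSON object in this reference is the arguments payload for its tool. An MCP client normally wraps it for you, but for debugging it can help to see the envelope. The Python sketch below builds a JSON-RPC 2.0 tools/call request as defined by the Model Context Protocol. The helper name is hypothetical, and a real session also needs the initialization handshake, which is omitted here.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Wrap tool arguments in an MCP tools/call JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = tools_call_request(
        1,
        "guardian_recommend_longreads",
        {
            "count": 3,
            "context": "I'm researching technology, especially AI",
            "topic_preference": "digital culture",
        },
    )
    # Over the stdio transport, each message is sent as one line of JSON.
    print(json.dumps(request))
```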

License

MIT license.