browser-automation, search-web

LSD Web Data Extraction

by lsd-so

LSD Web Data Extraction lets you scrape any website with ease. Perform web page scraping and manipulate data using commu

Provides web data extraction and manipulation capabilities through the LSD programming language, enabling structured data retrieval from websites, web searches, and community-created extraction patterns without complex scraping code.

github stars

3

  • / No complex scraping code required
  • / Community-shared extraction patterns
  • / PostgreSQL-compatible database access

best for

  • / Data analysts collecting web data for research
  • / Developers building web scraping workflows
  • / Market researchers gathering competitive intelligence

capabilities

  • / Extract structured data from websites
  • / Perform web searches and retrieve results
  • / Access community-created extraction patterns
  • / Create custom extraction workflows with LSD language
  • / Query extracted data through PostgreSQL-compatible interface

what it does

Extracts structured data from websites using the LSD programming language, letting you scrape web pages and search results without writing complex scraping code.

about

LSD Web Data Extraction is an official MCP server published by lsd-so that provides AI assistants with tools and capabilities via the Model Context Protocol. It is categorized under browser automation and search web.

how to install

You can install LSD Web Data Extraction in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

license

MIT

LSD Web Data Extraction is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

LSD MCP

This is the updated MCP server for LSD. The update lets the server make effective use of dynamic tools, which are defined as trips using our SDK.

Getting started

Authenticating

Authenticating connects the running MCP server to your account through our SDK.

The credentials are called user and password because what you're connecting to is our Postgres-compatible database.

Configuration file

In your home directory, create a file named .lsd containing a JSON object with two properties: user, set to your email, and password, set to an API key from your profile.

{
  "user": "<you@email.domain>",
  "password": "<api_key>"
}
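Before starting the server, it can help to check that the file has the shape described above. The following is a minimal sketch of such a check; `LsdConfig` and `parseLsdConfig` are hypothetical names chosen for illustration and are not part of the LSD SDK.

```typescript
// Shape of the ~/.lsd file described above.
interface LsdConfig {
  user: string; // the email on your LSD account
  password: string; // an API key from your profile
}

// Hypothetical helper (not part of any LSD SDK): parse the file's text and
// check that both properties are present as non-empty strings.
function parseLsdConfig(text: string): LsdConfig {
  const raw: unknown = JSON.parse(text);
  if (typeof raw !== "object" || raw === null) {
    throw new Error(".lsd must contain a JSON object");
  }
  const { user, password } = raw as Record<string, unknown>;
  if (typeof user !== "string" || user.length === 0) {
    throw new Error('.lsd is missing a non-empty "user" string');
  }
  if (typeof password !== "string" || password.length === 0) {
    throw new Error('.lsd is missing a non-empty "password" string');
  }
  return { user, password };
}
```

In a Node script you would pass it the file's contents, for example the result of reading `.lsd` from `os.homedir()` with `fs.readFileSync(..., "utf8")`.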

Environment variables

Alternatively, you can set the environment variables LSD_USER and LSD_PASSWORD.

$ export LSD_USER='you@email.domain'
$ export LSD_PASSWORD='<api_key>'

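Important: If you run into errors when taking this approach, check that the variables are actually set in the environment of the process the MCP client launches the server from. A client started from a desktop launcher, for example, may not inherit variables exported in a shell session.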
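One way to debug the situation described in the note is to report which of the two variables the launched process cannot see. This is a sketch only: `missingLsdVars` is a hypothetical helper, and it takes the environment as an argument so it can be run against any object.

```typescript
// A plain map of environment variable names to values.
type Env = Record<string, string | undefined>;

// The two variables named in this section.
const REQUIRED_VARS = ["LSD_USER", "LSD_PASSWORD"] as const;

// Hypothetical helper: return the names of required variables that are
// unset or contain only whitespace in the given environment.
function missingLsdVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Calling `missingLsdVars(process.env)` from the same process that starts the server shows whether the exported values reached it. An empty array means both are visible.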

Using an MCP registry

Pulse

https://www.pulsemcp.com/servers/lsd-so-internetdata

More coming soon.

From source

  1. Clone this repository
$ git clone https://github.com/lsd-so/mcp.git
  2. If you're using Claude Desktop, update your claude_desktop_config.json file (here's a guide for creating it), adding an lsd entry under mcpServers next to any other MCP servers you already have configured.
{
  "mcpServers": {
    "lsd": {
      "command": "node",
      "args": [
        "/<path>/<to>/mcp/build/index.js"
      ]
    }
  }
}
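If the config file already lists other servers, the lsd entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge on the parsed JSON; the type and function names are hypothetical, and the path passed in stands for wherever you cloned the repository.

```typescript
// One entry under "mcpServers", as in the snippet above.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Only the part of claude_desktop_config.json this sketch touches;
// any other top-level keys are passed through unchanged.
interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: return a copy of the config with an "lsd" entry
// added (or replaced), leaving the input object untouched.
function withLsdServer(
  config: ClaudeDesktopConfig,
  indexJsPath: string,
): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      lsd: { command: "node", args: [indexJsPath] },
    },
  };
}
```

The result can be written back with `JSON.stringify(updated, null, 2)`. Returning a new object rather than mutating the input keeps the original config available if the write fails.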

Example of usage

Interaction

Screen recording of using the lsd_research prompt

Extraction

Screen recording of using the lsd_research prompt

Extending capabilities with LSD

When you'd like to teach the MCP client a "skill", you can do so with an LSD trip (explained under "What is a trip?").

What is a trip?

A "trip" is a published module consisting of an LSD program. The program can either be derived by interacting with our local browser or be written and published directly.

From the Bicycle browser

From the Bicycle browser, you can derive LSD by using our "click language". Activate it by clicking the transcriber icon in the top right:

Screen recording of clicking on the transcriber icon

Or by pressing Command+k (or Ctrl+k for Linux/Windows). Once you've done so, you can interactively "pluck" repeating containers as well as fields of interest:

Screen recording of the transcriber flow

With the generated LSD, you can edit the aliases like so:

Screen recording of editing LSD code

After that, you can publish the trip as described under "Using the language".

Using the language

From the workbench, edit your LSD program and fill out the trip details to publish it as a trip.

A screen recording of filling out trip details

Extending capabilities with TypeScript

Check out the internetdata SDK that's used under the hood to bridge with the web. Alternatively, get started using the create-your-internet shorthand.

$ yarn create your-internet

Or, if you prefer npm:

$ npm create your-internet