Selenium

by angiejones

Automate web browser actions efficiently using Selenium WebDriver for robust testing with Selenium on Python and seamles

Automates web browser actions with Selenium WebDriver.

GitHub stars: ★ 373

/ Works with major browsers including Safari
/ No manual scripting required - just tell the AI what to do
/ 10+ browser interaction tools

best for

/ AI agents performing web-based tasks and workflows
/ Automated testing of web applications
/ Web scraping and data extraction from interactive sites
/ Browser-based automation without manual scripting

capabilities

/ Launch Chrome, Firefox, Edge, or Safari browsers
/ Navigate to URLs and click elements on web pages
/ Fill forms and type text into input fields
/ Extract text content from web page elements
/ Perform drag-and-drop and hover interactions
/ Execute right-clicks and double-clicks on elements

what it does

Automates web browsers through Selenium WebDriver, allowing AI agents to click buttons, fill forms, navigate pages, and interact with websites programmatically.

about

Selenium is a community-built MCP server published by angiejones that provides AI assistants with browser automation tools via the Model Context Protocol. It is categorized under browser automation. This server exposes 14 tools that AI clients can invoke during conversations and coding sessions.

how to install

You can install Selenium in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

license

MIT

Selenium is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

MCP Selenium Server

A Model Context Protocol (MCP) server for Selenium WebDriver — browser automation for AI agents.

<a href="https://glama.ai/mcp/servers/@angiejones/mcp-selenium"> <img width="380" height="200" src="https://glama.ai/mcp/servers/@angiejones/mcp-selenium/badge" alt="Selenium MCP server" /> </a>

Setup

<details open> <summary><strong>Goose (Desktop)</strong></summary>

Paste into your browser address bar:

goose://extension?cmd=npx&arg=-y&arg=%40angiejones%2Fmcp-selenium%40latest&id=selenium-mcp&name=Selenium%20MCP&description=automates%20browser%20interactions

</details> <details> <summary><strong>Goose (CLI)</strong></summary>

goose session --with-extension "npx -y @angiejones/mcp-selenium@latest"

</details> <details> <summary><strong>Claude Code</strong></summary>

claude mcp add selenium -- npx -y @angiejones/mcp-selenium@latest

</details> <details> <summary><strong>Cursor / Windsurf / other MCP clients</strong></summary>

{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@angiejones/mcp-selenium@latest"]
    }
  }
}

</details>
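
For clients that read a JSON config file, the entry above can also be generated and merged programmatically. The following is a minimal Python sketch, not part of the project: the helper names `mcp_server_entry` and `merge_into_config` are hypothetical, and where your client stores its config file is left to you.

```python
import json


def mcp_server_entry(package: str = "@angiejones/mcp-selenium@latest") -> dict:
    """Build the same mcpServers block shown in the setup snippet."""
    return {
        "mcpServers": {
            "selenium": {
                "command": "npx",
                "args": ["-y", package],
            }
        }
    }


def merge_into_config(existing: dict, entry: dict) -> dict:
    """Return a copy of `existing` with the new server added.

    Servers that are already configured are kept as they are.
    """
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(entry["mcpServers"])
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical pre-existing config with one other server.
    current = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(merge_into_config(current, mcp_server_entry()), indent=2))
```

Merging into a copy instead of overwriting the file contents keeps any servers you have already configured.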

Example Usage

Tell the AI agent of your choice:

Open Chrome, go to github.com/angiejones, and take a screenshot.

The agent will call the server's start_browser, navigate, and take_screenshot tools. No manual scripting or step-by-step directions are needed.
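
To make the sequence concrete, here is a rough Python sketch of the three requests an MCP client might send for that prompt. It assumes the standard MCP JSON-RPC 2.0 `tools/call` method. The `tool_call` helper and the request ids are illustrative, and the real client handles transport and session setup for you.

```python
import json
from itertools import count

_request_ids = count(1)  # illustrative id generator, not part of the server


def tool_call(name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# The three calls implied by the example prompt, in order.
calls = [
    tool_call("start_browser", {"browser": "chrome"}),
    tool_call("navigate", {"url": "https://github.com/angiejones"}),
    tool_call("take_screenshot", {}),  # no outputPath: image data is returned
]

if __name__ == "__main__":
    for call in calls:
        print(json.dumps(call))
```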

Supported Browsers

Chrome, Firefox, Edge, and Safari.

Safari note: Requires macOS. Run sudo safaridriver --enable once and enable "Allow Remote Automation" in Safari → Settings → Developer. No headless mode.

<details> <summary><strong>Tools</strong></summary>

start_browser

Launches a browser session.

Parameter	Type	Required	Description
browser	string	Yes	`chrome`, `firefox`, `edge`, or `safari`
options	object	No	`{ headless: boolean, arguments: string[] }`

navigate

Navigates to a URL.

Parameter	Type	Required	Description
url	string	Yes	URL to navigate to

interact

Performs a mouse action on an element.

Parameter	Type	Required	Description
action	string	Yes	`click`, `doubleclick`, `rightclick`, or `hover`
by	string	Yes	Locator strategy: `id`, `css`, `xpath`, `name`, `tag`, `class`
value	string	Yes	Value for the locator strategy
timeout	number	No	Max wait in ms (default: 10000)
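
A small client-side check can catch a mistyped action or locator strategy before a call reaches the browser. This Python sketch only mirrors the parameter table above. The `interact_args` helper is hypothetical, and the server performs its own validation regardless.

```python
from typing import Optional

# Allowed values copied from the parameter table for `interact`.
ACTIONS = {"click", "doubleclick", "rightclick", "hover"}
STRATEGIES = {"id", "css", "xpath", "name", "tag", "class"}


def interact_args(action: str, by: str, value: str,
                  timeout: Optional[int] = None) -> dict:
    """Build the arguments object for the `interact` tool."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if by not in STRATEGIES:
        raise ValueError(f"unknown locator strategy: {by!r}")
    args = {"action": action, "by": by, "value": value}
    if timeout is not None:
        args["timeout"] = int(timeout)  # milliseconds; server default is 10000
    return args


if __name__ == "__main__":
    print(interact_args("click", "css", "#submit"))
    print(interact_args("hover", "xpath", "//nav//a[1]", timeout=3000))
```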

send_keys

Types text into an element. Clears the field first.

Parameter	Type	Required	Description
by	string	Yes	Locator strategy
value	string	Yes	Locator value
text	string	Yes	Text to enter
timeout	number	No	Max wait in ms (default: 10000)

get_element_text

Gets the text content of an element.

Parameter	Type	Required	Description
by	string	Yes	Locator strategy
value	string	Yes	Locator value
timeout	number	No	Max wait in ms (default: 10000)

get_element_attribute

Gets an attribute value from an element.

Parameter	Type	Required	Description
by	string	Yes	Locator strategy
value	string	Yes	Locator value
attribute	string	Yes	Attribute name (e.g., `href`, `value`, `class`)
timeout	number	No	Max wait in ms (default: 10000)

press_key

Presses a keyboard key.

Parameter	Type	Required	Description
key	string	Yes	Key to press (e.g., `Enter`, `Tab`, `a`)

upload_file

Uploads a file via a file input element.

Parameter	Type	Required	Description
by	string	Yes	Locator strategy
value	string	Yes	Locator value
filePath	string	Yes	Absolute path to the file
timeout	number	No	Max wait in ms (default: 10000)

take_screenshot

Captures a screenshot of the current page.

Parameter	Type	Required	Description
outputPath	string	No	Save path. If omitted, returns base64 image data.
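
When `outputPath` is omitted, the client receives base64 image data and has to decode it. The Python sketch below assumes you have already pulled the base64 string out of the tool result (the exact result shape is not documented here) and that the image is a PNG, which is what WebDriver screenshots normally are. `save_screenshot` is a hypothetical helper.

```python
import base64
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_screenshot(b64_data: str, path: str) -> int:
    """Decode base64 image data, write it to `path`, return the byte count."""
    raw = base64.b64decode(b64_data, validate=True)
    Path(path).write_bytes(raw)
    return len(raw)


def looks_like_png(raw: bytes) -> bool:
    """Cheap sanity check on decoded data before trusting the file."""
    return raw.startswith(PNG_SIGNATURE)
```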

close_session

Closes the current browser session. No parameters.

execute_script

Executes JavaScript in the browser. Use for advanced interactions not covered by other tools (e.g., drag and drop, scrolling, reading computed styles, DOM manipulation).

Parameter	Type	Required	Description
script	string	Yes	JavaScript code to execute
args	array	No	Arguments accessible via `arguments[0]`, etc.
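
As an illustration of how `script` and `args` fit together, this Python sketch builds argument objects for two of the cases mentioned above (scrolling and reading a computed style). The helper name and the JavaScript snippets are examples only. Each value in `args` is read inside the script as `arguments[0]`, `arguments[1]`, and so on.

```python
def execute_script_args(script: str, *args) -> dict:
    """Build the arguments object for the `execute_script` tool."""
    return {"script": script, "args": list(args)}


# Scroll the page down by a number of pixels passed as arguments[0].
scroll_down = execute_script_args("window.scrollBy(0, arguments[0]);", 600)

# Read one computed style property: selector in arguments[0], property in arguments[1].
read_style = execute_script_args(
    "return getComputedStyle(document.querySelector(arguments[0]))"
    ".getPropertyValue(arguments[1]);",
    "#header",
    "background-color",
)

if __name__ == "__main__":
    print(scroll_down)
    print(read_style)
```

Passing values through `args` instead of formatting them into the script string avoids quoting problems with selectors and user-supplied text.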

window

Manages browser windows and tabs.

Parameter	Type	Required	Description
action	string	Yes	`list`, `switch`, `switch_latest`, or `close`
handle	string	No	Window handle (required for `switch`)

frame

Switches focus to a frame or back to the main page.

Parameter	Type	Required	Description
action	string	Yes	`switch` or `default`
by	string	No	Locator strategy (for `switch`)
value	string	No	Locator value (for `switch`)
index	number	No	Frame index, 0-based (for `switch`)
timeout	number	No	Max wait in ms (default: 10000)

alert

Handles browser alert, confirm, or prompt dialogs.

Parameter	Type	Required	Description
action	string	Yes	`accept`, `dismiss`, `get_text`, or `send_text`
text	string	No	Text to send (required for `send_text`)
timeout	number	No	Max wait in ms (default: 5000)

add_cookie

Adds a cookie. Browser must be on a page from the cookie's domain.

Parameter	Type	Required	Description
name	string	Yes	Cookie name
value	string	Yes	Cookie value
domain	string	No	Cookie domain
path	string	No	Cookie path
secure	boolean	No	Secure flag
httpOnly	boolean	No	HTTP-only flag
expiry	number	No	Unix timestamp
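
The `expiry` field takes a Unix timestamp, which is easy to get wrong by hand. The Python sketch below computes it from a lifetime. It assumes the timestamp is in seconds, the usual WebDriver convention, and `add_cookie_args` is a hypothetical helper that follows the table above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

OPTIONAL_FIELDS = {"domain", "path", "secure", "httpOnly"}


def expiry_timestamp(max_age: timedelta, now: Optional[datetime] = None) -> int:
    """Return the Unix timestamp (seconds) at which the cookie expires."""
    now = now or datetime.now(timezone.utc)
    return int((now + max_age).timestamp())


def add_cookie_args(name: str, value: str,
                    max_age: Optional[timedelta] = None,
                    now: Optional[datetime] = None,
                    **optional) -> dict:
    """Build the arguments object for the `add_cookie` tool."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unsupported cookie fields: {sorted(unknown)}")
    args = {"name": name, "value": value, **optional}
    if max_age is not None:
        args["expiry"] = expiry_timestamp(max_age, now)
    return args


if __name__ == "__main__":
    print(add_cookie_args("session", "abc123", max_age=timedelta(hours=1),
                          path="/", secure=True))
```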

get_cookies

Gets cookies. Returns all or a specific one by name.

Parameter	Type	Required	Description
name	string	No	Cookie name. Omit for all cookies.

delete_cookie

Deletes cookies. Deletes all or a specific one by name.

Parameter	Type	Required	Description
name	string	No	Cookie name. Omit to delete all.

diagnostics

Gets browser diagnostics captured via WebDriver BiDi (auto-enabled when supported).

Parameter	Type	Required	Description
type	string	Yes	`console`, `errors`, or `network`
clear	boolean	No	Clear buffer after returning (default: false)

</details> <details> <summary><strong>Resources</strong></summary>

MCP resources provide read-only data that clients can access without calling a tool.

browser-status://current

Returns the current browser session status (active session ID or "no active session").

Property	Value
MIME type	`text/plain`
Requires browser	No

accessibility://current

Returns an accessibility tree snapshot of the current page — a compact, structured JSON representation of interactive elements and text content. Much smaller than full HTML. Useful for understanding page layout and finding elements to interact with.

Property	Value
MIME type	`application/json`
Requires browser	Yes

</details>
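
Resources are fetched with the MCP `resources/read` method instead of a tool call. The Python sketch below builds such a request and pulls the text out of a result. The helper names are hypothetical, and `SAMPLE_RESULT` is a hand-written illustration of the general MCP result shape, not captured server output.

```python
import json


def read_resource(uri: str, request_id: int = 1) -> dict:
    """Build an MCP JSON-RPC 2.0 `resources/read` request for `uri`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


def first_text(result: dict) -> str:
    """Return the text of the first content item in a resources/read result."""
    return result["contents"][0]["text"]


# Illustrative result only; the real wording comes from the server.
SAMPLE_RESULT = {
    "contents": [
        {
            "uri": "browser-status://current",
            "mimeType": "text/plain",
            "text": "no active session",
        }
    ]
}

if __name__ == "__main__":
    print(json.dumps(read_resource("accessibility://current", request_id=2)))
    print(first_text(SAMPLE_RESULT))
```

For `accessibility://current`, the text is JSON, so a client would typically pass it through `json.loads` before searching for elements.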

<details> <summary><strong>Development</strong></summary>

Setup

git clone https://github.com/angiejones/mcp-selenium.git
cd mcp-selenium
npm install

Run Tests

npm test

Requires Chrome + chromedriver on PATH. Tests run headless.

Install via Smithery

npx -y @smithery/cli install @angiejones/mcp-selenium --client claude

Install globally

npm install -g @angiejones/mcp-selenium
mcp-selenium

</details>

License

MIT

FAQ

What is the Selenium MCP server? Selenium is a Model Context Protocol (MCP) server profile on explainx.ai. MCP lets AI hosts (e.g. Claude Desktop, Cursor) call tools and resources through a standard interface; this page summarizes categories, install hints, and community ratings.
How do MCP servers relate to agent skills? Skills are reusable instruction packages (often SKILL.md); MCP servers expose live capabilities. Teams frequently combine both: skills for workflows, MCP for APIs and data. See explainx.ai/skills and explainx.ai/mcp-servers for parallel directories.
How are reviews shown for Selenium? This profile displays 52 aggregated ratings (sample rows for discoverability plus signed-in user reviews). The average score is about 4.4 out of 5; verify behavior in your own environment before production use.

Use Cases

Web Research & Information Gathering

Fetch and extract information from websites automatically

Example

Research competitor pricing, scrape product reviews, monitor news mentions

✓ Automate 5-10 hours/week of manual web research

Content Monitoring & Alerts

Track website changes, new content, price updates

Example

Monitor competitor blog for new posts, track stock availability, watch for pricing changes

✓ Stay informed without manual checking, never miss important updates

Data Extraction & Aggregation

Extract structured data from multiple websites

Example

Compile product listings from 10 e-commerce sites, aggregate job postings, collect real estate data

✓ Build datasets 100x faster than manual copying

API-less Integration

Interact with services that don't offer APIs

Example

Check form submissions, validate website functionality, test user flows

✓ Automate interactions with any website, even without an API

Implementation Guide

Prerequisites

› Claude Desktop or Cursor with MCP support
› Understanding of web scraping ethics and robots.txt
› Rate limiting awareness to avoid overwhelming target sites
› Knowledge of legal restrictions on data collection

Time Estimate

20-40 minutes including configuration and testing

Installation Steps

1. Install the web automation MCP server via npm or pip (this server is distributed on npm and runs with npx)
2. Configure allowed domains and rate limits in MCP config
3. Test with simple fetch: 'Get content from example.com'
4. Progress to extraction: 'Extract all product prices from this page'
5. Set up monitoring: 'Check this URL daily for changes'
6. Parse structured data: 'Create CSV from this table'
7. Always respect robots.txt and rate limits
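
The robots.txt check in the last step can be automated with Python's standard library. This is a minimal sketch: the robots.txt content and the `my-research-bot` user agent are made-up examples, and in practice you would first download the file from the target site.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/products",
                "https://example.com/private/data"):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "my-research-bot", url))
```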

Troubleshooting

⚠ 403 Forbidden: The website blocks bots. Respect that and use the official API instead
⚠ Rate limit errors: Slow down requests and add delays between fetches
⚠ Stale data: The target site changed its HTML structure. Update your selectors
⚠ Timeout errors: The site is slow or blocking requests. Increase the timeout or try a different user agent
⚠ JavaScript-rendered content: Use headless browser MCP servers for dynamic sites

Best Practices

✓ Do

+ Check robots.txt and respect crawl rules
+ Rate limit requests: 1-2 requests/second maximum
+ Use official APIs when available instead of scraping
+ Identify your bot with a descriptive user agent
+ Cache results to minimize repeated requests
+ Handle errors gracefully with retries and fallbacks
+ Validate extracted data for accuracy
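
One simple way to stay under the 1-2 requests/second ceiling is to enforce a minimum interval between calls. The Python sketch below is one possible approach, not something the server provides. The `MinIntervalLimiter` class is hypothetical, and its clock and sleep functions are injectable so it can be tested without waiting.

```python
import time


class MinIntervalLimiter:
    """Allow at most `max_per_second` calls by spacing them evenly."""

    def __init__(self, max_per_second: float,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was released

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.min_interval - now)
            if delay > 0:
                self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    limiter = MinIntervalLimiter(max_per_second=2)
    for url in ("https://example.com/a", "https://example.com/b"):
        limiter.wait()  # call this before each navigate request
        print("would navigate to", url)
```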

✗ Don't

− Don't scrape sites that explicitly forbid it (robots.txt, ToS)
− Don't overwhelm servers with rapid requests; use rate limiting
− Don't scrape personal data without consent and legal basis
− Don't ignore copyright on extracted content
− Don't assume HTML structure is stable; handle changes
− Don't use scraped data for commercial purposes without permission

💡 Pro Tips

★ Use CSS selectors or XPath for robust data extraction
★ Set up monitoring alerts for extraction failures (structure changed)
★ Implement exponential backoff for retries on failures
★ Store raw HTML for reprocessing if extraction logic changes
★ Combine with data analysis tools for insights from extracted data
★ Consider using official APIs or RSS feeds as more stable alternatives
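
The exponential backoff tip can be sketched in a few lines of Python. This is a minimal version under simple assumptions: it retries on any exception, doubles the wait each time up to a cap, and leaves out the random jitter a production setup would usually add. The `retry` helper is hypothetical.

```python
import time


def retry(operation, retries: int = 3, base: float = 1.0,
          cap: float = 30.0, sleep=time.sleep):
    """Call `operation`, retrying with exponentially growing waits.

    Waits base * 2**attempt seconds (at most `cap`) after each failure and
    re-raises the last error once `retries` retries have been used.
    """
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == retries:
                raise
            sleep(min(cap, base * 2 ** attempt))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky():
        # Stand-in for a tool call that fails twice before succeeding.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise TimeoutError("page not ready")
        return "ok"

    print(retry(flaky, base=0.1))
```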

Technical Details

Architecture

MCP server handles HTTP requests, HTML parsing, JavaScript rendering (if headless browser), and returns structured data to Claude.

Protocols

HTTP/HTTPS
WebSocket (for real-time sites)
Puppeteer/Playwright (for JavaScript sites)

Compatibility

Static HTML sites
JavaScript-rendered SPAs (with headless browser)
REST APIs
GraphQL endpoints

When to Use This

✓ Use When

Use for research automation, content monitoring, data aggregation from multiple sources, and when official APIs don't exist. Best for read-only information gathering.

✗ Avoid When

Avoid for sites with APIs (use API instead), sites that explicitly forbid scraping, when data is copyrighted, or for login-required content without proper authorization.

Integration

→ Scheduled monitoring with change detection
→ Multi-source data aggregation pipelines
→ Fallback to web scraping when API rate limits hit
→ Headless browser for JavaScript-heavy sites
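
Change detection for scheduled monitoring can be as simple as comparing content fingerprints between runs. The Python sketch below assumes you have already extracted the page text (for example with get_element_text) and stored the previous fingerprint somewhere. Both helper names are hypothetical.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace, so layout noise is ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """Return True when the current text differs from the stored fingerprint."""
    return previous_fingerprint != fingerprint(current_text)


if __name__ == "__main__":
    stored = fingerprint("Price: $10")
    print(has_changed(stored, "Price:   $10\n"))  # whitespace-only difference
    print(has_changed(stored, "Price: $12"))      # real content change
```

Storing only the hash keeps the monitoring state small. Keep the raw text as well if you need to show what changed.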

MCP server reviews

Ratings

4.4 ★★★★★ (52 reviews)

★★★★★ Ganesh Mohane · Dec 20, 2024
I recommend Selenium for teams standardizing on MCP; the explainx.ai page compares cleanly with sibling servers.
★★★★★ Isabella Agarwal · Dec 16, 2024
We evaluated Selenium against two servers with overlapping tools; this profile had the clearer scope statement.
★★★★★ Layla Rao · Dec 16, 2024
I recommend Selenium for teams standardizing on MCP; the explainx.ai page compares cleanly with sibling servers.
★★★★★ Advait Okafor · Dec 12, 2024
Strong directory entry: Selenium surfaces stars and publisher context so we could sanity-check maintenance before adopting.
★★★★★ Sophia Okafor · Dec 4, 2024
Useful MCP listing: Selenium is the kind of server we cite when onboarding engineers to host + tool permissions.
★★★★★ Isabella Wang · Nov 23, 2024
Selenium is a well-scoped MCP server in the explainx.ai directory; install snippets and categories matched our Claude Code setup.
★★★★★ Advait Gupta · Nov 19, 2024
Selenium reduced integration guesswork; categories and install configs on the listing matched the upstream repo.
★★★★★ Benjamin Sanchez · Nov 7, 2024
Selenium is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.
★★★★★ Dev Agarwal · Nov 3, 2024
We wired Selenium into a staging workspace; the listing’s GitHub and npm pointers saved time versus hunting across READMEs.
★★★★★ Anika Smith · Nov 3, 2024
Selenium has been reliable for tool-calling workflows; the MCP profile page is a good permalink for internal docs.
