Browser Automation
by hihuzhen
Automate browsers easily with a Chrome extension and WebSocket—no headless setup. Ideal for Selenium, WebDriver, and bro
Provides browser automation through a Chrome extension and WebSocket server, enabling navigation, element interaction, form filling, screenshot capture, content extraction, and console monitoring without headless browser setups.
best for
- / AI agents performing web automation tasks
- / Testing and monitoring web applications
- / Data extraction from dynamic websites
- / Form filling and web workflow automation
capabilities
- / Navigate web pages and control browser history
- / Click elements and fill forms using selectors
- / Take screenshots of pages or specific elements
- / Extract web page content and HTML
- / Simulate keyboard inputs and interactions
- / Monitor and manage browser tabs and windows
what it does
Automates your actual Chrome browser through a WebSocket connection and browser extension, enabling AI agents to navigate websites, fill forms, and take screenshots without requiring headless browser setups.
about
Browser Automation is a community-built MCP server published by hihuzhen that provides AI assistants with tools and capabilities via the Model Context Protocol. It is categorized under browser automation. This server exposes 10 tools that AI clients can invoke during conversations and coding sessions.
how to install
You can install Browser Automation in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.
license
MIT
Browser Automation is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.
readme
Browser MCP Server 🚀
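Browser MCP Server is a WebSocket-based implementation of a browser MCP (Model Context Protocol) server that lets AI assistants control your browser.
🚀 Features
- WebSocket communication: uses WebSocket in place of the previous communication method for more efficient two-way communication
- Python backend: the app server is rewritten entirely in Python on top of the FastMCP framework
- Browser automation: lets AI assistants perform a range of browser operations
- Runs locally: everything runs on your machine, which protects user privacy
- Multiple tools: supports screenshots, interactive operations, and other tools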
📁 Project Structure
├── packages/ # Project packages
│ ├── app/ # MCP server implemented in Python
│ │ ├── src/nep_browser_engine/ # Main source directory
│ │ ├── pyproject.toml # Python project configuration
│ │ └── .gitignore # gitignore file for the Python project
│ └── extension/ # Chrome browser extension
│ ├── common/ # Shared code and constants
│ ├── entrypoints/ # Entry points (background and popup)
│ └── inject-scripts/ # Scripts injected into web pages
├── .gitignore # Root gitignore file
└── LICENSE # License file
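App (Python implementation)
Main components:
- WebSocket service: implements the WebSocket server that communicates with the browser extension
- MCP service: implements the MCP protocol and provides the browser control tools
- Message handling: processes WebSocket messages and MCP tool calls
Extension (TypeScript implementation)
Main components:
- WebSocket client: communicates with the Python server
- Tool handlers: process tool call requests sent by the server
- Injected scripts: carry out the operations inside web pages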
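The division of labour above (the server sends a tool call over WebSocket, the extension replies later) implies some way of matching replies to requests. The following is a minimal sketch of one common approach, keyed on a request id. The class name `PendingCalls` and the message fields `id`, `tool`, `args`, and `result` are assumptions made for illustration; they are not taken from the project's source.

```python
import asyncio
import itertools
import json


class PendingCalls:
    """Match replies from the extension to the tool calls that caused them.

    Hypothetical helper: the message shape used here is an assumption.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self._futures = {}

    def create_request(self, tool: str, args: dict):
        """Return (message, future); the future resolves when the reply arrives."""
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._futures[request_id] = future
        message = json.dumps({"id": request_id, "tool": tool, "args": args})
        return message, future

    def handle_response(self, raw: str) -> None:
        """Resolve the future whose id matches the incoming message."""
        reply = json.loads(raw)
        future = self._futures.pop(reply["id"], None)
        if future is not None and not future.done():
            future.set_result(reply.get("result"))
```

In a real server, `message` would be written to the WebSocket connection and `handle_response` would be called from the receive loop; both are left out here to keep the sketch free of third-party dependencies.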
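🛠️ Core Features
Page interaction
- Element click: click page elements by CSS selector
- Form filling: fill in form fields or select options
- Keyboard actions: simulate keyboard input
- Get page content: extract page text and HTML
- Get elements: retrieve specific elements on the page
- Interactive element detection: automatically identify interactive elements on the page
Media and network
- Screenshot: capture the full page or a specific element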
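🚀 Quick Start
Prerequisites
- Python 3.9+ and pip/poetry/uv
- Chrome/Chromium browser
Installation
1. Install the Chrome extension
cd extension
pnpm install
pnpm run build
# Or download a specific version from the releases page
Then, in Chrome:
- Open chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked"
2. Run the service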
{
"mcpServers": {
"nep-browser-engine": {
"type": "stdio",
"command": "uvx",
"args": ["nep-browser-engine"]
}
}
}
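The entry above can also be merged into an existing client configuration file with a short script. This is a minimal sketch: the helper name `add_server_entry` is hypothetical, and the location of the configuration file depends on your MCP client, so the path must be supplied by the caller.

```python
import json
from pathlib import Path

# Server entry copied from the README's configuration snippet.
NEP_BROWSER_ENTRY = {
    "type": "stdio",
    "command": "uvx",
    "args": ["nep-browser-engine"],
}


def add_server_entry(config_path: Path, name: str = "nep-browser-engine") -> dict:
    """Merge the server entry into a client config file (hypothetical helper).

    Servers already listed in the file are preserved; the file is created
    if it does not exist yet.
    """
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})[name] = dict(NEP_BROWSER_ENTRY)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling `add_server_entry(Path("mcp.json"))` writes the entry to a file named `mcp.json` in the current directory; that filename is only an example.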
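3. Connect the extension to the service
Click the extension icon in the browser and connect to the WebSocket service.
📝 Usage
Using with MCP clients
This server can be used with any AI client that supports the MCP protocol, such as Claude or CherryStudio.
🛠️ Available Tools
The main tools are:
Browser management
- get_windows_and_tabs: list all open windows and tabs
- browser_navigate: navigate to a URL or refresh the current tab
- browser_close_tabs: close specific tabs or windows
- browser_go_back_or_forward: go back or forward in browser history
Page interaction
- browser_click_element: click a page element
- browser_fill_or_select: fill in a form field or select an option
- browser_get_elements: get page elements
- browser_keyboard: simulate keyboard input
- browser_get_web_content: get the content of a web page
- browser_screenshot: take a screenshot of the page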
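MCP clients invoke tools such as these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for illustration. The tool name comes from the list above, but the `url` argument name is an assumption; the real argument names are defined by each tool's input schema, which a client can read through the protocol's `tools/list` request.

```python
import json
from itertools import count

_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# The "url" argument name is an assumption; check the tool's input schema.
print(build_tool_call("browser_navigate", {"url": "https://example.com"}))
```

In practice an MCP client library builds and sends this message for you; the sketch only shows what travels over the transport.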
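🔧 Development Guide
Python server development
- Make sure all dependencies are installed
- Edit app/src/nep_browser_engine/config.py to configure the WebSocket port and other parameters
- The transport protocol can be set with a command-line argument at run time:
python -m nep_browser_engine.app --transport stdio
Chrome extension development
- After changing the code, run pnpm run build to rebuild the extension
- The extension reloads automatically (when in developer mode)
- The default WebSocket address is ws://localhost:18765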
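If the extension cannot connect, a quick first check is whether anything is listening on the WebSocket port. The sketch below uses only the standard library and tests TCP reachability; it does not perform a WebSocket handshake. The function name `websocket_port_open` is hypothetical, and the default URL is the address given above.

```python
import socket
from urllib.parse import urlparse


def websocket_port_open(ws_url: str = "ws://localhost:18765", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the WebSocket host and port succeeds."""
    parsed = urlparse(ws_url)
    host = parsed.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parsed.port or (443 if parsed.scheme == "wss" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server reachable:", websocket_port_open())
```

A `False` result usually means the Python server is not running or is configured with a different port.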
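📋 Notes
- This project is still under development and may contain bugs and rough edges
- Make sure you understand what each tool does and its potential risks before use
- Do not use this project for any illegal or unauthorized activity
🤝 Contributing
Issues and PRs that help improve the project are welcome!
Acknowledgements
This project draws on hangwin/mcp-chrome
📄 License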
FAQ
- What is the Browser Automation MCP server?
- Browser Automation is a Model Context Protocol (MCP) server profile on explainx.ai. MCP lets AI hosts (e.g. Claude Desktop, Cursor) call tools and resources through a standard interface; this page summarizes categories, install hints, and community ratings.
- How do MCP servers relate to agent skills?
- Skills are reusable instruction packages (often SKILL.md); MCP servers expose live capabilities. Teams frequently combine both—skills for workflows, MCP for APIs and data. See explainx.ai/skills and explainx.ai/mcp-servers for parallel directories.
- How are reviews shown for Browser Automation?
- This profile displays 59 aggregated ratings (sample rows for discoverability plus signed-in user reviews). Average score is about 4.4 out of 5—verify behavior in your own environment before production use.
Use Cases
Web Research & Information Gathering
Fetch and extract information from websites automatically
Example
Research competitor pricing, scrape product reviews, monitor news mentions
Automate 5-10 hours/week of manual web research
Content Monitoring & Alerts
Track website changes, new content, price updates
Example
Monitor competitor blog for new posts, track stock availability, watch for pricing changes
Stay informed without manual checking, never miss important updates
Data Extraction & Aggregation
Extract structured data from multiple websites
Example
Compile product listings from 10 e-commerce sites, aggregate job postings, collect real estate data
Build datasets 100x faster than manual copying
API-less Integration
Interact with services that don't offer APIs
Example
Check form submissions, validate website functionality, test user flows
Automate interactions with any website, even without API
Implementation Guide
Prerequisites
- ›Claude Desktop or Cursor with MCP support
- ›Understanding of web scraping ethics and robots.txt
- ›Rate limiting awareness to avoid overwhelming target sites
- ›Knowledge of legal restrictions on data collection
Time Estimate
20-40 minutes including configuration and testing
Installation Steps
- 1.Install the MCP server (this listing's README launches it with uvx; other web automation servers may use npm or pip)
- 2.Configure allowed domains and rate limits in MCP config
- 3.Test with simple fetch: 'Get content from example.com'
- 4.Progress to extraction: 'Extract all product prices from this page'
- 5.Set up monitoring: 'Check this URL daily for changes'
- 6.Parse structured data: 'Create CSV from this table'
- 7.Respect robots.txt and rate limits always
Troubleshooting
- ⚠403 Forbidden: Website blocks bots—respect their wishes, use official API instead
- ⚠Rate limit errors: Slow down requests, add delays between fetches
- ⚠Stale data: Target site changed HTML structure—update selectors
- ⚠Timeout errors: Site is slow or blocking—increase timeout, try different user agent
- ⚠JavaScript-rendered content: Use headless browser MCP servers for dynamic sites
Best Practices
✓ Do
- +Check robots.txt and respect crawl rules
- +Rate limit requests: 1-2 requests/second maximum
- +Use official APIs when available instead of scraping
- +Identify your bot with descriptive user agent
- +Cache results to minimize repeated requests
- +Handle errors gracefully with retries and fallbacks
- +Validate extracted data for accuracy
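Two of the practices above, checking robots.txt and limiting the request rate, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: `RateLimiter` and `allowed_by_robots` are hypothetical helper names, and the robots.txt text is assumed to have been downloaded already.

```python
import time
from urllib.robotparser import RobotFileParser


class RateLimiter:
    """Allow at most `max_per_second` calls per second by sleeping between calls."""

    def __init__(self, max_per_second: float = 1.0):
        self.min_interval = 1.0 / max_per_second
        self._last_call = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        slept = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (time.monotonic() - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last_call = time.monotonic()
        return slept


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

A caller would run `allowed_by_robots(...)` once per target URL and call `limiter.wait()` before each page load.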
✗ Don't
- −Don't scrape sites that explicitly forbid it (robots.txt, ToS)
- −Don't overwhelm servers with rapid requests—use rate limiting
- −Don't scrape personal data without consent and legal basis
- −Don't ignore copyright on extracted content
- −Don't assume HTML structure is stable—handle changes
- −Don't use scraped data for commercial purposes without permission
💡 Pro Tips
- ★Use CSS selectors or XPath for robust data extraction
- ★Set up monitoring alerts for extraction failures (structure changed)
- ★Implement exponential backoff for retries on failures
- ★Store raw HTML for reprocessing if extraction logic changes
- ★Combine with data analysis tools for insights from extracted data
- ★Consider using official APIs or RSS feeds as more stable alternatives
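The exponential backoff tip above can be sketched as a small retry wrapper. This is an illustrative helper, not part of the server: the name `retry_with_backoff` and its default delays are assumptions, and the jitter strategy (a random delay between zero and the current cap) is one common choice among several.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call `func` until it succeeds, doubling the delay cap after each failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists so that tests can substitute a fake and run without waiting. In real use, catch a narrower exception type than `Exception` so that programming errors are not retried.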
Technical Details
Architecture
MCP server handles HTTP requests, HTML parsing, JavaScript rendering (if headless browser), and returns structured data to Claude.
Protocols
- HTTP/HTTPS
- WebSocket (for real-time sites)
- Puppeteer/Playwright (for JavaScript sites)
Compatibility
- Static HTML sites
- JavaScript-rendered SPAs (with headless browser)
- REST APIs
- GraphQL endpoints
When to Use This
✓ Use When
Use for research automation, content monitoring, data aggregation from multiple sources, and when official APIs don't exist. Best for read-only information gathering.
✗ Avoid When
Avoid for sites with APIs (use API instead), sites that explicitly forbid scraping, when data is copyrighted, or for login-required content without proper authorization.
Integration
- →Scheduled monitoring with change detection
- →Multi-source data aggregation pipelines
- →Fallback to web scraping when API rate limits hit
- →Headless browser for JavaScript-heavy sites
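Change detection, the first integration pattern above, is often implemented by comparing a hash of the page text between runs. The sketch below shows that idea with the standard library. The helper names are hypothetical, and how the page text is obtained and where the previous fingerprint is stored are left to the caller.

```python
import hashlib


def content_fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace, so layout-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, current_text: str):
    """Return (changed, new_fingerprint) for the latest snapshot of a page.

    With no previous fingerprint (None), the page is reported as changed.
    """
    current = content_fingerprint(current_text)
    return previous_fingerprint != current, current
```

Hashing the extracted text rather than the raw HTML tends to reduce false alarms from markup that changes on every load, such as session tokens or ad slots.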
Ratings
4.4 ★★★★★ 59 reviews
- ★★★★★ Evelyn Bhatia · Dec 24, 2024
According to our notes, Browser Automation benefits from clear Model Context Protocol framing — fewer ambiguous “AI plugin” claims.
- ★★★★★ Chinedu Chen · Dec 24, 2024
Browser Automation reduced integration guesswork — categories and install configs on the listing matched the upstream repo.
- ★★★★★ Diya Torres · Dec 20, 2024
I recommend Browser Automation for teams standardizing on MCP; the explainx.ai page compares cleanly with sibling servers.
- ★★★★★ Dev Okafor · Dec 16, 2024
Browser Automation has been reliable for tool-calling workflows; the MCP profile page is a good permalink for internal docs.
- ★★★★★ Dhruvi Jain · Dec 12, 2024
Strong directory entry: Browser Automation surfaces stars and publisher context so we could sanity-check maintenance before adopting.
- ★★★★★ Xiao Mehta · Dec 8, 2024
Browser Automation is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.
- ★★★★★ Xiao Huang · Nov 27, 2024
Strong directory entry: Browser Automation surfaces stars and publisher context so we could sanity-check maintenance before adopting.
- ★★★★★ Michael Rao · Nov 27, 2024
Useful MCP listing: Browser Automation is the kind of server we cite when onboarding engineers to host + tool permissions.
- ★★★★★ Michael Singh · Nov 15, 2024
We wired Browser Automation into a staging workspace; the listing’s GitHub and npm pointers saved time versus hunting across READMEs.
- ★★★★★ Diya Rahman · Nov 11, 2024
We evaluated Browser Automation against two servers with overlapping tools; this profile had the clearer scope statement.