How do I install web-scraping-automation?

Run `npx skills add https://github.com/aaaaqwq/claude-code-skills --skill web-scraping-automation` in your terminal. You need to have run `npx skills init` once in your project first.

Which agent frameworks does web-scraping-automation support?

web-scraping-automation works with any agent framework supported by the Skills registry, including Claude Code, Cursor, GitHub Copilot, Cline, Codex, and Gemini CLI.

Is web-scraping-automation free to use?

Yes. web-scraping-automation is free to install and use. It is published by aaaaqwq and available from the open explainx.ai skill registry.

Where can I read ratings and reviews for web-scraping-automation?

Community ratings and review text appear on this explainx.ai skill page below the description. Reviews use a 1–5 scale and may include short written feedback from signed-in members.

Backend

web-scraping-automation

This skill is dedicated to automating website data scraping and API calls.

aaaaqwq/claude-code-skills | Updated Jul 8, 2026

Works with

Claude Code, Cursor, Cline, Windsurf, Codex, Goose

Documentation

Installation Guide

How to use web-scraping-automation on Cursor

AI-first code editor with Composer

Prerequisites

Before installing skills in Cursor, ensure your development environment meets these requirements:

› Cursor installed and configured on your machine
› Node.js 16+ with npm (verify with `node --version`)
› An active project directory where you want to add web-scraping-automation

Run the install command

Execute the skills CLI command in your project's root directory to begin installation:

$ npx skills add https://github.com/aaaaqwq/claude-code-skills --skill web-scraping-automation

Fetches web-scraping-automation from aaaaqwq/claude-code-skills and configures it for Cursor.

Select Cursor when prompted

The CLI shows a list of agents. Use arrow keys and space to select Cursor:

◆ Which agents do you want to install to?
│
│ ── Universal (.agents/skills) ────────────────
│ · Cline · Codex · Goose · Windsurf
│ ● Cursor (selected)
│ · Cursor · Aider · Continue

Verify installation

Confirm successful installation by checking the skill directory location:

.cursor/skills/web-scraping-automation

Restart Cursor to activate web-scraping-automation. Access via /web-scraping-automation in your agent's command palette.
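The directory check can also be scripted. The following is a minimal sketch, assuming the skill lands in `.cursor/skills/<skill name>` under the project root as shown in this step. The helper name `skill_installed` is hypothetical and is not part of the skills CLI.

```python
from pathlib import Path


def skill_installed(project_root, skill_name="web-scraping-automation"):
    """Return True if the skill directory exists under .cursor/skills."""
    # Path taken from the "Verify installation" step above;
    # other agents may install to a different folder.
    skill_dir = Path(project_root) / ".cursor" / "skills" / skill_name
    return skill_dir.is_dir()
```

Calling `skill_installed(".")` from the project root should return `True` once the install command has completed for Cursor.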

⚠ Security Notice

Automated surface-level scans (Gen AI Scanner, Socket, Snyk) run during installation. These checks detect common vulnerabilities but do not guarantee complete security. Skills execute code in your environment, so always review the skill's source code, verify the publisher's reputation, and test in isolation before production use.

Use Cases

Task Automation & Efficiency

Automate repetitive workflows and reduce manual effort. Example: generate reports, summarize documents, draft communications. Stated benefit: save 3-5 hours per week on routine tasks.

Knowledge Enhancement

Learn new skills, understand complex topics, and get expert guidance. Example: explain concepts, provide examples, suggest learning resources. Stated benefit: accelerate learning and skill development by 2x.

Quality Improvement

Enhance output quality through reviews, suggestions, and refinements. Example: review drafts, suggest improvements, catch errors. Stated benefit: improve work quality by 30-40% with less effort.

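Tech stack

⚠️ Resource cleanup principle (mandatory)

After any scraping task that involves a browser finishes, the Chrome/Selenium processes must be closed automatically!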

    # Playwright example
    import subprocess
    from playwright.sync_api import sync_playwright

    def scrape_website():
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            try:
                page = browser.new_page()
                # ... scraping logic ...
            finally:
                browser.close()  # runs even if the scraping logic raises
        # ⚠️ Force cleanup of leftover processes.
        # Note: the pattern matches every process whose command line contains "chrome".
        subprocess.run(['pkill', '-f', 'chrome'], capture_output=True)

    # Selenium example
    import subprocess
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        # ... scraping logic ...
        pass
    finally:
        driver.quit()  # ⚠️ ensure cleanup
        subprocess.run(['pkill', '-f', 'chrome'], capture_output=True)

Reason: this avoids memory leaks and lingering resource usage, and prevents the Gateway from being overloaded at 100% CPU.
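One way to make the cleanup rule hard to forget is to tie the browser's lifetime to a context manager. The sketch below is illustrative only: `managed_browser` and `FakeDriver` are hypothetical names, and `FakeDriver` stands in for a real driver so the example runs without a browser installed.

```python
from contextlib import contextmanager


@contextmanager
def managed_browser(factory):
    """Create a browser via factory() and always call its quit() on exit."""
    browser = factory()
    try:
        yield browser
    finally:
        # Runs on normal exit and when the scraping code raises.
        browser.quit()


class FakeDriver:
    """Stand-in for a real driver so the sketch runs without a browser."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


def demo():
    """Show that quit() is called even when the body fails."""
    driver = FakeDriver()
    try:
        with managed_browser(lambda: driver):
            raise RuntimeError("scrape failed")
    except RuntimeError:
        pass
    return driver.closed
```

With Selenium, passing `webdriver.Chrome` as the factory should work the same way, since the driver exposes `quit()`. A Playwright browser object uses `close()` instead, so it would need a small adapter.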

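Python scraping

requests: HTTP request library

BeautifulSoup4: HTML parsing

Scrapy: full-featured crawling framework

Selenium: browser automation

Playwright: modern browser automation

JavaScript scraping

axios: HTTP client

cheerio: server-side jQuery

puppeteer: Chrome automation

node-fetch: Fetch API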

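Workflow

Target analysis:

Examine the site structure and where the data lives
Analyze the API endpoints and authentication methods
Assess anti-scraping mechanisms

Solution design:

Choose a suitable tech stack
Design the data extraction strategy
Plan error handling and retry mechanisms

Script development:

Write the scraper code
Implement the data parsing logic
Add logging and monitoring

Testing and optimization:

Verify data accuracy
Optimize performance and stability
Handle edge cases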
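The workflow calls for retry handling and logging without showing either. The following is a minimal sketch of retrying with exponential backoff, assuming any exception is worth retrying. The helper name `retry`, its parameters, and the logger name are hypothetical choices for illustration, not part of the skill.

```python
import logging
import time

logger = logging.getLogger("scraper")


def retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds, waiting base_delay * 2**n between tries."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            logger.warning(
                "attempt %d failed (%s); retrying in %.1fs",
                attempt + 1, exc, delay,
            )
            sleep(delay)
```

The `sleep` parameter exists so tests can pass a stub instead of actually waiting. In real use, narrowing the caught exception to network errors would avoid retrying on programming mistakes.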

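Common scenario examples

1. Simple web page scraping

    import requests
    from bs4 import BeautifulSoup

    def scrape_website(url):
        headers = {'User-Agent': 'Mozilla/5.0'}
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # fail fast on 4xx/5xx responses
        soup = BeautifulSoup(response.text, 'html.parser')

        # Extract data
        data = []
        for item in soup.select('.product'):
            title = item.select_one('.title')
            price = item.select_one('.price')
            if title is None or price is None:
                continue  # skip items missing either field
            data.append({'title': title.text, 'price': price.text})
        return data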
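The scraper above returns titles and prices as raw text, which usually needs tidying before the accuracy checks in the workflow. The following is a sketch of one possible cleanup step, assuming prices look like `$1,299.00` with `.` as the decimal separator. `clean_record` is a hypothetical helper name.

```python
import re
from decimal import Decimal, InvalidOperation


def clean_record(record):
    """Trim the title and convert the price text to a Decimal (or None)."""
    title = record["title"].strip()
    # Keep only digits and the decimal point; assumes "." is the decimal separator.
    digits = re.sub(r"[^\d.]", "", record["price"])
    try:
        price = Decimal(digits) if digits else None
    except InvalidOperation:
        price = None  # text such as "1.2.3" is not a valid number
    return {"title": title, "price": price}
```

Returning `None` for unparseable prices keeps bad rows visible for later validation instead of silently dropping them.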

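2. API calls

    import requests

    def call_api(endpoint, params=None):
        headers = {
            'Authorization': 'Bearer YOUR_TOKEN',  # placeholder: substitute a real token
            'Content-Type': 'application/json'
        }
        response = requests.get(endpoint, headers=headers, params=params, timeout=10)
        response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
        return response.json()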
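The example hardcodes the `YOUR_TOKEN` placeholder. One common alternative is to read the token from the environment so it stays out of source control. This is a sketch under that assumption: the variable name `API_TOKEN` and the helper `build_headers` are hypothetical.

```python
import os


def build_headers(env_var="API_TOKEN"):
    """Build request headers with a bearer token read from the environment."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the `headers` argument of `requests.get` in place of the literal headers.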

3. Dynamic web page scraping

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def scrape_dynamic_page(url):
        driver = webdriver.Chrome()
        try:
            # Wait up to 10 seconds for elements to appear when they are looked up
            driver.implicitly_wait(10)
            driver.get(url)

            # Extract data
            elements = driver.find_elements(By.CLASS_NAME, 'item')
            return [elem.text for elem in elements]
        finally:
            driver.quit()  # always release the browser, even on errors
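Waiting for dynamic content comes down to polling a condition until it holds or a deadline passes. Selenium ships its own `WebDriverWait` for this. The plain-Python sketch below only illustrates the idea, and `wait_until` is a hypothetical helper, not a Selenium API.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)
```

For example, `wait_until(lambda: driver.find_elements(By.CLASS_NAME, 'item'))` would return the element list as soon as it is non-empty, assuming a `driver` like the one in the example above.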
