performing-ai-driven-osint-correlation

mukul975/Anthropic-Cybersecurity-Skills · updated May 25, 2026


$ npx skills install mukul975/Anthropic-Cybersecurity-Skills/performing-ai-driven-osint-correlation
summary

Use AI and LLM-based reasoning to correlate findings across multiple OSINT sources—username enumeration, email lookups, social media profiles, domain records, breach databases, and dark-web mentions—into unified intelligence profiles with confidence scoring and link analysis.

skill.md
name: performing-ai-driven-osint-correlation
description: Use AI and LLM-based reasoning to correlate findings across multiple OSINT sources—username enumeration, email lookups, social media profiles, domain records, breach databases, and dark-web mentions—into unified intelligence profiles with confidence scoring and link analysis.
domain: cybersecurity
subdomain: threat-intelligence
tags:
  - osint
  - ai-correlation
  - threat-intelligence
  - reconnaissance
  - link-analysis
  - target-profiling
  - sherlock
  - theharvester
  - spiderfoot
  - maltego
version: '1.0'
author: juliosuas
license: Apache-2.0
atlas_techniques:
  - AML.T0051
  - AML.T0054
  - AML.T0056
nist_ai_rmf:
  - MEASURE-2.7
  - MEASURE-2.5
  - GOVERN-6.1
  - MAP-5.1
d3fend_techniques:
  - Identifier Analysis
  - URL Analysis
  - Identifier Reputation Analysis
  - User Behavior Analysis
  - Content Validation
nist_csf:
  - ID.RA-01
  - ID.RA-05
  - DE.CM-01
  - DE.AE-02

Performing AI-Driven OSINT Correlation

When to Use

  • You have collected raw OSINT data from multiple tools and sources but need to identify connections, contradictions, and patterns across them.
  • You need to build a unified intelligence profile for a target entity (person, organization, or infrastructure) from fragmented data.
  • Traditional manual correlation is too slow or error-prone for the volume of data collected.
  • You want confidence-scored assessments of identity linkage across platforms rather than simple keyword matching.

Prerequisites

  • Python 3.10+ with the requests package and an LLM client package such as openai (json and csv are part of the standard library)
  • Sherlock installed (pip install sherlock-project)
  • theHarvester installed (pip install theHarvester)
  • SpiderFoot 4.0+ running on localhost:5001
  • Access to an LLM API (OpenAI, Anthropic, or local model via Ollama)
  • Optional: Maltego CE for graph visualization of correlation results
  • Optional: API keys for Shodan, VirusTotal, HaveIBeenPwned, Hunter.io

Workflow

Legal & Ethical Requirements

  • Obtain documented written authorization before any investigation
  • Establish lawful basis for data processing (law enforcement, corporate policy, etc.)
  • Define PII retention limits and data handling procedures
  • Comply with local privacy regulations (GDPR, CCPA, etc.)

Phase 1 — Multi-Source OSINT Collection

  1. Create the working directory for all OSINT outputs:

    mkdir -p /tmp/osint
    
  2. Enumerate usernames across platforms with Sherlock:

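    # --csv writes <username>.csv into the current directory (behavior of recent
    # Sherlock releases); copy it to the path that the normalization step reads
    (cd /tmp/osint && sherlock "targetusername" --csv)
    cp /tmp/osint/targetusername.csv /tmp/osint/sherlock-results.txt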
    
  3. Harvest emails, subdomains, and hosts with theHarvester:

    theHarvester -d targetdomain.com -b all -f /tmp/osint/harvester-results.json
    
  4. Run a SpiderFoot passive scan via REST API:

    curl -s http://localhost:5001/api/scan/start \
      -d "scanname=target-recon&scantarget=targetdomain.com&usecase=passive" \
      | jq '.scanid'
    
  5. Export SpiderFoot results when scan completes:

    SCAN_ID="<scanid_from_step_4>"
    curl -s "http://localhost:5001/api/scan/${SCAN_ID}/results?type=all" \
      -o /tmp/osint/spiderfoot-results.json
    
  6. Query breach databases for email exposure (example with HIBP API):

    curl -s -H "hibp-api-key: ${HIBP_KEY}" \
      -H "User-Agent: OSINT-Correlation-Skill" \
      "https://haveibeenpwned.com/api/v3/breachedaccount/[email protected]" \
      -o /tmp/osint/breach-results.json
    

Phase 2 — Data Normalization

  1. Normalize all collected data into a common schema. Create a unified JSON structure that tags each finding with its source, timestamp, and data type:

    cat > /tmp/osint/normalize.py << 'EOF'
    import csv, json, os
    from datetime import datetime, timezone
    
    findings = []
    # One timezone-aware UTC timestamp for the whole run
    # (datetime.utcnow() is deprecated since Python 3.12)
    collected_at = datetime.now(timezone.utc).isoformat()
    
    # Normalize Sherlock results; expects the CSV produced by `sherlock --csv`
    sherlock_path = "/tmp/osint/sherlock-results.txt"
    if os.path.exists(sherlock_path):
        with open(sherlock_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                findings.append({
                    "source": "sherlock",
                    "type": "social_profile",
                    "platform": row.get("name", ""),
                    "url": row.get("url_user", ""),
                    "username": row.get("username", ""),
                    # Sherlock's CSV reports account state in an "exists" column
                    "status": row.get("exists", row.get("status", "")),
                    "collected_at": collected_at
                })
    
    # Normalize theHarvester JSON results
    harvester_path = "/tmp/osint/harvester-results.json"
    if os.path.exists(harvester_path):
        with open(harvester_path, encoding="utf-8") as f:
            data = json.load(f)
        for email in data.get("emails", []):
            findings.append({
                "source": "theHarvester",
                "type": "email",
                "value": email,
                "collected_at": collected_at
            })
        for host in data.get("hosts", []):
            findings.append({
                "source": "theHarvester",
                "type": "hostname",
                "value": host,
                "collected_at": collected_at
            })
    
    # Normalize SpiderFoot results
    sf_path = "/tmp/osint/spiderfoot-results.json"
    if os.path.exists(sf_path):
        with open(sf_path, encoding="utf-8") as f:
            for item in json.load(f):
                findings.append({
                    "source": "spiderfoot",
                    "type": item.get("type", "unknown"),
                    "value": item.get("data", ""),
                    "module": item.get("module", ""),
                    "collected_at": collected_at
                })
    
    with open("/tmp/osint/normalized-findings.json", "w", encoding="utf-8") as f:
        json.dump(findings, f, indent=2)
    
    # Count distinct tools that contributed at least one finding
    source_count = len({item["source"] for item in findings})
    print(f"Normalized {len(findings)} findings from {source_count} sources")
    EOF
    python3 /tmp/osint/normalize.py
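
Several tools often report the same email or hostname, so one possible follow-up to normalization is to collapse duplicates while recording which sources reported each value. The snippet below is a minimal sketch of that idea, not part of the original scripts: the helper name `merge_duplicates` and the added `sources` field are hypothetical, and it assumes that type names have already been mapped to a shared vocabulary (each tool uses its own labels by default).

```python
from collections import OrderedDict


def merge_duplicates(findings):
    """Collapse findings that share (type, normalized value); keep every source."""
    merged = OrderedDict()
    for item in findings:
        # Sherlock rows carry a profile URL rather than a "value" field
        raw = item.get("value") or item.get("url") or ""
        key = (item.get("type", "unknown"), str(raw).strip().lower())
        if key not in merged:
            entry = dict(item)
            entry["sources"] = []  # hypothetical field, not in the original schema
            merged[key] = entry
        source = item.get("source", "unknown")
        if source not in merged[key]["sources"]:
            merged[key]["sources"].append(source)
    return list(merged.values())


sample = [
    {"source": "theHarvester", "type": "email", "value": "Admin@Example.com"},
    {"source": "spiderfoot", "type": "email", "value": "admin@example.com "},
    {"source": "theHarvester", "type": "hostname", "value": "mail.example.com"},
]
deduped = merge_duplicates(sample)
print(len(deduped))           # 2
print(deduped[0]["sources"])  # ['theHarvester', 'spiderfoot']
```

A value seen by two independent tools is also a useful corroboration signal for the later confidence scoring.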
    

Phase 3 — AI-Driven Correlation

  1. Send normalized findings to an LLM for cross-source correlation analysis:

    cat > /tmp/osint/correlate.py << 'PYEOF'
    import json, os
    from openai import OpenAI  # or anthropic, ollama, etc.
    
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    
    with open("/tmp/osint/normalized-findings.json") as f:
        findings = json.load(f)
    
    correlation_prompt = f"""You are an OSINT analyst. Analyze these findings collected
    from multiple sources and produce a correlation report.
    
    For each identity or entity you detect:
    1. List all linked accounts/profiles with the evidence connecting them.
    2. Assign a confidence score (0.0-1.0) for each linkage based on:
       - Exact username match across platforms (high)
       - Similar usernames with shared metadata (medium)
       - Same email in breach data and registration (high)
       - Co-occurring infrastructure (IP, domain) (medium)
       - Temporal correlation of account creation dates (low-medium)
    3. Identify contradictions or potential false positives.
    4. Flag high-risk exposures (breached credentials, PII leaks, infrastructure overlaps).
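    5. Produce a structured JSON report with the top-level keys meta, entities,
       contradictions, and recommendations. Each item in entities must contain
       identifier, confidence, linked_accounts, risk_level, and flags; each item
       in linked_accounts must contain source, platform, value, evidence, and confidence.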
    
    Raw findings:
    {json.dumps(findings[:500], indent=2)}
    """
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an expert OSINT analyst specializing in identity correlation and link analysis."},
            {"role": "user", "content": correlation_prompt}
        ],
        temperature=0.1,
        response_format={"type": "json_object"}
    )
    
    report = json.loads(response.choices[0].message.content)
    
    with open("/tmp/osint/correlation-report.json", "w") as f:
        json.dump(report, f, indent=2)
    
    print(json.dumps(report, indent=2))
    PYEOF
    python3 /tmp/osint/correlate.py
    
  2. Perform entity resolution — deduplicate and merge related identities:

    cat > /tmp/osint/resolve.py << 'PYEOF'
    import json
    
    with open("/tmp/osint/correlation-report.json") as f:
        report = json.load(f)
    
    # Extract entities and build a link graph
    entities = report.get("entities", [])
    print(f"Identified {len(entities)} distinct entities")
    for entity in entities:
        name = entity.get("identifier", "unknown")
        confidence = entity.get("confidence", 0)
        links = entity.get("linked_accounts", [])
        risk = entity.get("risk_level", "unknown")
        print(f"  [{confidence:.0%}] {name} — {len(links)} linked accounts — risk: {risk}")
    PYEOF
    python3 /tmp/osint/resolve.py
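
The script above only prints what the LLM returned; it does not merge anything itself. A minimal sketch of the merge step, assuming entities should be grouped whenever they share a linked-account value, can use a union-find structure. The function name `merge_entities` is hypothetical, and exact-match merging is a deliberately simple rule: common usernames can merge unrelated people, so treat the groups as candidates for review.

```python
def merge_entities(entities):
    """Group entity identifiers that share at least one linked-account value."""
    parent = list(range(len(entities)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # account value -> index of the first entity that used it
    for idx, entity in enumerate(entities):
        for link in entity.get("linked_accounts", []):
            value = str(link.get("value", "")).strip().lower()
            if not value:
                continue
            if value in first_seen:
                union(idx, first_seen[value])
            else:
                first_seen[value] = idx

    groups = {}
    for idx, entity in enumerate(entities):
        groups.setdefault(find(idx), []).append(entity.get("identifier", "unknown"))
    return list(groups.values())


entities = [
    {"identifier": "john.target", "linked_accounts": [{"value": "john.target"}]},
    {"identifier": "jtarget", "linked_accounts": [{"value": "John.Target"},
                                                  {"value": "jt@example.com"}]},
    {"identifier": "other.user", "linked_accounts": [{"value": "other.user"}]},
]
groups = merge_entities(entities)
print(groups)  # [['john.target', 'jtarget'], ['other.user']]
```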
    

Phase 4 — Reporting and Visualization

  1. Generate a final intelligence profile in Markdown:

    cat > /tmp/osint/report.py << 'PYEOF'
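    import json
    from datetime import datetime, timezone
    
    with open("/tmp/osint/correlation-report.json") as f:
        report = json.load(f)
    
    md = "# OSINT Correlation Report\n\n"
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12
    md += f"**Generated:** {datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')}\n\n"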
    md += "## Entity Profiles\n\n"
    
    for entity in report.get("entities", []):
        eid = entity.get("identifier", "Unknown")
        conf = entity.get("confidence", 0)
        md += f"### {eid} (Confidence: {conf:.0%})\n\n"
        md += "| Source | Platform | Evidence |\n|--------|----------|----------|\n"
        for link in entity.get("linked_accounts", []):
            md += f"| {link.get('source','')} | {link.get('platform','')} | {link.get('evidence','')} |\n"
        md += f"\n**Risk Level:** {entity.get('risk_level', 'N/A')}\n\n"
        for flag in entity.get("flags", []):
            md += f"- ⚠️ {flag}\n"
        md += "\n"
    
    with open("/tmp/osint/intelligence-profile.md", "w") as f:
        f.write(md)
    
    print("Report written to /tmp/osint/intelligence-profile.md")
    PYEOF
    python3 /tmp/osint/report.py
    
  2. Optional — Import correlation graph into Maltego for visualization:

    # Export entities as Maltego-compatible CSV for manual import
    cat > /tmp/osint/maltego_export.py << 'PYEOF'
    import json, csv
    
    with open("/tmp/osint/correlation-report.json") as f:
        report = json.load(f)
    
    with open("/tmp/osint/maltego-import.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Entity Type", "Value", "Linked To", "Link Label", "Confidence"])
        for entity in report.get("entities", []):
            for link in entity.get("linked_accounts", []):
                writer.writerow([
                    link.get("type", "Alias"),
                    link.get("value", ""),
                    entity.get("identifier", ""),
                    link.get("evidence", ""),
                    link.get("confidence", "")
                ])
    
    print("Maltego CSV exported to /tmp/osint/maltego-import.csv")
    PYEOF
    python3 /tmp/osint/maltego_export.py
    

Key Concepts

| Concept | Description |
|---------|-------------|
| Cross-Source Correlation | Matching identifiers (usernames, emails, IPs) across independent OSINT sources to establish entity linkage |
| Confidence Scoring | Assigning probabilistic confidence (0.0–1.0) to each linkage based on evidence strength and corroboration |
| Entity Resolution | Deduplicating and merging records that refer to the same real-world entity across fragmented datasets |
| False Positive Detection | Using AI reasoning to identify coincidental matches versus genuine identity links |
| Multi-Vector Intelligence | Combining findings from social media, DNS, breach data, and infrastructure into a single threat picture |
| Link Analysis | Graph-based examination of relationships between entities, accounts, and infrastructure |
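
To make the confidence-scoring concept concrete, here is one possible way to combine several evidence signals into a single 0.0–1.0 score: a noisy-OR, which treats each signal as an independent chance that the linkage is genuine. This is a sketch under stated assumptions, not the skill's own method. The skill describes signals only as high, medium, or low, so the numeric weights below are illustrative, and real signals are rarely fully independent.

```python
def combine_confidence(scores):
    """Noisy-OR: probability that at least one independent signal is correct."""
    remaining_doubt = 1.0
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        remaining_doubt *= 1.0 - score
    return 1.0 - remaining_doubt


# Illustrative weights only (assumptions, not values from the skill)
exact_username_match = 0.6
same_email_in_breach_and_registration = 0.8

combined = combine_confidence([exact_username_match,
                               same_email_in_breach_and_registration])
print(f"{combined:.2f}")  # 0.92
```

With this rule a single weak signal never yields a high score on its own, while corroborating signals raise the score without ever exceeding 1.0.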

Tools & Systems

| Tool | Role in Workflow |
|------|------------------|
| Sherlock | Username enumeration across 400+ social platforms |
| theHarvester | Email, subdomain, and host discovery from public sources |
| SpiderFoot | Automated OSINT collection across 200+ modules |
| Maltego | Graph-based visualization of entity relationships |
| LLM API (GPT-4, Claude, Ollama) | Cross-source reasoning, pattern detection, and confidence scoring |
| HaveIBeenPwned | Breach exposure and credential leak detection |

Common Scenarios

  • Threat Actor Attribution: Correlate a suspicious username found in a phishing campaign with social media profiles, domain registrations, and breach data to build an attribution profile.
  • Attack Surface Mapping: Link discovered subdomains, emails, and employee social accounts to understand an organization's full external exposure.
  • Insider Threat Investigation: Cross-reference an employee's known accounts with dark web marketplace activity and breach databases.
  • Brand Impersonation Detection: Identify accounts across platforms mimicking a target brand by correlating registration patterns, naming conventions, and temporal signals.

Output Format

The final output is a structured JSON correlation report and a Markdown intelligence profile containing:

{
  "meta": {
    "target": "targetdomain.com",
    "sources_used": ["sherlock", "theHarvester", "spiderfoot", "hibp"],
    "total_findings": 247,
    "generated_at": "2025-01-15T14:30:00Z"
  },
  "entities": [
    {
      "identifier": "john.target",
      "confidence": 0.92,
      "linked_accounts": [
        {
          "source": "sherlock",
          "platform": "GitHub",
          "value": "john.target",
          "evidence": "Exact username match, bio references targetdomain.com",
          "confidence": 0.95
        }
      ],
      "risk_level": "high",
      "flags": [
        "Credentials exposed in 2 breaches (2022, 2023)",
        "Admin email for targetdomain.com found in public WHOIS"
      ]
    }
  ],
  "contradictions": [],
  "recommendations": []
}
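
Because the report is generated by an LLM, its shape is not guaranteed to match the example above. A small structural check before the reporting scripts run can catch missing keys and malformed confidence values. The validator below is a minimal sketch; the function name `validate_report` is hypothetical and the checks cover only the fields shown in the example.

```python
def validate_report(report):
    """Return a list of human-readable problems found in a correlation report."""
    if not isinstance(report, dict):
        return ["report is not a JSON object"]
    problems = []
    for key in ("meta", "entities", "contradictions", "recommendations"):
        if key not in report:
            problems.append(f"missing top-level key: {key}")
    entities = report.get("entities", [])
    if not isinstance(entities, list):
        problems.append("entities is not a list")
        return problems
    for i, entity in enumerate(entities):
        if not isinstance(entity, dict):
            problems.append(f"entity[{i}] is not an object")
            continue
        label = entity.get("identifier", f"entity[{i}]")
        scores = [("entity", entity.get("confidence"))]
        links = entity.get("linked_accounts", [])
        if not isinstance(links, list):
            problems.append(f"{label}: linked_accounts is not a list")
            links = []
        for j, link in enumerate(links):
            score = link.get("confidence") if isinstance(link, dict) else None
            scores.append((f"linked_accounts[{j}]", score))
        for where, score in scores:
            is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
            if not is_number or not 0.0 <= score <= 1.0:
                problems.append(
                    f"{label}: {where} confidence {score!r} is not a number in [0, 1]"
                )
    return problems


good = {
    "meta": {}, "contradictions": [], "recommendations": [],
    "entities": [{"identifier": "john.target", "confidence": 0.92,
                  "linked_accounts": [{"confidence": 0.95}]}],
}
bad = {"entities": [{"identifier": "john.target", "confidence": "92%",
                     "linked_accounts": []}]}

print(validate_report(good))  # []
for problem in validate_report(bad):
    print(problem)
```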

Verification

  • Confirm that each linked account has been independently verified against at least two sources before assigning confidence > 0.8.
  • Cross-check AI-generated correlations manually for a random sample (10–20%) to validate accuracy.
  • Verify that no false positives from common usernames (e.g., "admin", "test") inflated entity profiles.
  • Ensure breach data timestamps are current and from reputable aggregators.
  • Validate that the final report does not include stale or retracted OSINT data.
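
Some of the checks above can be partly automated. The sketch below builds a manual-review queue from a correlation report: it flags entities whose identifier is a common username, flags entities scored above 0.8 whose linked accounts come from fewer than two distinct sources (one interpretation of the two-source rule), and draws a random sample of linked accounts for manual cross-checking. The function name, the username list, and the 15% default sample rate are assumptions chosen for illustration.

```python
import random

# Illustrative list; extend it for your own environment
COMMON_USERNAMES = {"admin", "test", "user", "info", "root"}


def build_review_queue(report, sample_rate=0.15, seed=0):
    """Return (flagged entities, random sample of links) for manual review."""
    flagged, links = [], []
    for entity in report.get("entities", []):
        ident = str(entity.get("identifier", "unknown"))
        accounts = entity.get("linked_accounts", [])
        links.extend((ident, link) for link in accounts)
        if ident.lower() in COMMON_USERNAMES:
            flagged.append((ident, "common username, check for coincidental matches"))
        sources = {link.get("source") for link in accounts if link.get("source")}
        confidence = entity.get("confidence", 0)
        if (isinstance(confidence, (int, float)) and confidence > 0.8
                and len(sources) < 2):
            flagged.append((ident, "confidence > 0.8 backed by fewer than two sources"))
    # Sample at least one link whenever any exist
    sample_size = min(len(links), max(1, round(len(links) * sample_rate)))
    sample = random.Random(seed).sample(links, sample_size)
    return flagged, sample


report = {
    "entities": [{
        "identifier": "john.target",
        "confidence": 0.92,
        "linked_accounts": [{"source": "sherlock", "platform": "GitHub",
                             "value": "john.target", "confidence": 0.95}],
    }]
}
flagged, sample = build_review_queue(report)
print(flagged)
print(len(sample))
```

A fixed seed keeps the sample reproducible, so a second reviewer can audit the same items.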
how to use performing-ai-driven-osint-correlation

How to use performing-ai-driven-osint-correlation on Cursor

AI-first code editor with Composer

1. Prerequisites

Before installing skills in Cursor, ensure your development environment meets these requirements:

  • Cursor installed and configured on your development machine
  • Node.js version 16.0+ with npm package manager (verify with node --version)
  • Active project directory or workspace where you want to add performing-ai-driven-osint-correlation

2. Execute installation command

Execute the skills CLI command in your project's root directory to begin installation:

$ npx skills install mukul975/Anthropic-Cybersecurity-Skills/performing-ai-driven-osint-correlation

The skills CLI fetches performing-ai-driven-osint-correlation from GitHub repository mukul975/Anthropic-Cybersecurity-Skills and configures it for Cursor.

3. Select Cursor when prompted

The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:

◆ Which agents do you want to install to?
│ ── Universal (.agents/skills) ── always included ────
│ • Amp
│ • Antigravity
│ • Cline
│ • Codex
│ ● Cursor (selected)
│ • Windsurf

4. Verify installation

Confirm successful installation by checking the skill directory location:

.cursor/skills/performing-ai-driven-osint-correlation

Reload or restart Cursor to activate performing-ai-driven-osint-correlation. Access the skill through slash commands (e.g., /performing-ai-driven-osint-correlation) or your agent's skill management interface.

Security & Verification Notice

We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.

Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.


Use Cases

Task Automation & Efficiency

Automate repetitive workflows and reduce manual effort

Example

Generate reports, summarize documents, draft communications

Save 3-5 hours per week on routine tasks

Knowledge Enhancement

Learn new skills, understand complex topics, get expert guidance

Example

Explain concepts, provide examples, suggest learning resources

Accelerate learning and skill development by 2x

Quality Improvement

Enhance output quality through reviews, suggestions, and refinements

Example

Review drafts, suggest improvements, catch errors

Improve work quality by 30-40% with less effort

Implementation Guide

Prerequisites

  • Claude Desktop or compatible AI client with skill support
  • Clear understanding of task or problem to solve
  • Willingness to iterate and refine outputs

Time Estimate

15-45 minutes depending on use case complexity

Installation Steps

  1. Install the skill using the provided installation command
  2. Test with a simple use case relevant to your work
  3. Evaluate output quality and relevance
  4. Iterate on prompts to improve results
  5. Integrate into your regular workflow if valuable

Common Pitfalls

  • Expecting perfect results without iteration
  • Not providing enough context in prompts
  • Using skill for tasks outside its intended scope
  • Accepting outputs without review and validation

Best Practices

✓ Do

  • Start with clear, specific prompts
  • Provide relevant context and constraints
  • Review and refine all outputs before using
  • Iterate to improve output quality
  • Document successful prompt patterns

✗ Don't

  • Don't use without understanding skill limitations
  • Don't skip validation of outputs
  • Don't share sensitive information in prompts
  • Don't expect skill to replace human judgment

💡 Pro Tips

  • Be specific about desired format and style
  • Ask for multiple options to choose from
  • Request explanations to understand reasoning
  • Combine AI efficiency with human expertise

When to Use This

✓ Use When

Use when skill capabilities match your task, clear ROI on time saved, and you can validate outputs. Best for repetitive tasks, learning, and quality improvement.

✗ Avoid When

Avoid when task requires deep expertise you can't validate, involves sensitive decisions, or when learning process is more valuable than speed of completion.

Learning Path

  1. Familiarize yourself with skill capabilities and limitations
  2. Start with low-risk, non-critical tasks
  3. Progress to more complex and valuable use cases
  4. Build expertise through regular use and experimentation

general reviews

Ratings

4.6 · 38 reviews
  • Neel Singh· Dec 20, 2024

    Keeps context tight: performing-ai-driven-osint-correlation is the kind of skill you can hand to a new teammate without a long onboarding doc.

  • Neel Srinivasan· Dec 4, 2024

    Useful defaults in performing-ai-driven-osint-correlation — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.

  • Neel Anderson· Nov 11, 2024

    performing-ai-driven-osint-correlation is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.

  • Naina Diallo· Oct 2, 2024

    Solid pick for teams standardizing on skills: performing-ai-driven-osint-correlation is focused, and the summary matches what you get after install.

  • Anaya Li· Sep 25, 2024

    We added performing-ai-driven-osint-correlation from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.

  • Isabella Robinson· Sep 9, 2024

    performing-ai-driven-osint-correlation reduced setup friction for our internal harness; good balance of opinion and flexibility.

  • Sakshi Patil· Sep 5, 2024

    performing-ai-driven-osint-correlation fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Lucas Sethi· Aug 28, 2024

    We added performing-ai-driven-osint-correlation from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.

  • Chaitanya Patil· Aug 24, 2024

    performing-ai-driven-osint-correlation has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Tariq Anderson· Aug 16, 2024

    performing-ai-driven-osint-correlation reduced setup friction for our internal harness; good balance of opinion and flexibility.
