RAG & retrieval curriculum — grounded answers at enterprise scale

Retrieval is where many ‘ChatGPT for documents’ programs quietly fail. This track emphasizes offline evals, chunking strategy, and operator feedback loops over embedding hype.

About the Instructor

Yash Thakker

AI Instructor & Product Leader

Yash Thakker has 12+ years of experience building AI products and has taught 160,000+ students across 50+ courses. He facilitates corporate AI training for enterprises including Tata, PayPal, and Fortune 500 teams. Yash holds an MBA from SIMSREE and a B.Tech in Information Technology. Based in Mumbai, he delivers programs globally, specializing in Claude AI, generative AI, and practical AI implementation for regulated industries.

Credentials

  • MBA, SIMSREE (Sydenham Institute of Management Studies)
  • B.Tech, Information Technology, University of Mumbai
  • 12+ years building AI products
  • 160,000+ students trained across 50+ courses

program objectives

  • Choose chunking and metadata strategies aligned to ticket volume and doc heterogeneity.
  • Define groundedness tests (is each answer supported by the retrieved sources?) and human QA sampling rates before reporting production results.
  • Understand when to re-rank vs. expand queries vs. switch retrievers entirely.
  • Integrate analytics that tie retrieval quality to business KPIs (resolution time, after-contact work).

how we deliver

  1. Discovery call & problem framing

    We align on sponsors, success metrics, and constraints (2026 tool landscape, data rules, procurement gates) before anything is scheduled company-wide.

  2. Stakeholder interviews & day-in-the-life context

    Short conversations with practitioners (not only leadership) so scenarios reflect real workflows rather than generic slide demos.

  3. Curriculum design & artifacts

    Modular agenda, exercise scripts, evaluation rubrics, and governance checkpoints matched to your vocabulary (banking, FMCG, engineering, etc.).

  4. Engaged, hands-on delivery

    Facilitation-led sessions with live exercises, breakout prompts, and documented failure modes, with minimal passive lecture time.

  5. Post-session support: documentation & next steps

    Written recap, pilot backlog, links to explainx.ai courses for scaled upskilling, and optional office hours so momentum doesn’t stop at the workshop.

modules

Designing the knowledge substrate

Docs, wikis, PDFs, and structured stores: agree on a normalization plan before indexing anything.

session outline

  • Content inventory: authoritative sources vs. stale mirrors.
  • Metadata contracts: ACLs, freshness, ownership.
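
To make the idea of a metadata contract concrete, here is a minimal sketch of what one might look like per chunk. The class name, field names, and the 365-day freshness window are all illustrative assumptions, not part of the curriculum; the point is only that ACLs, freshness, and ownership travel with every chunk and are checked before retrieval.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ChunkMetadata:
    """Hypothetical per-chunk contract; field names are illustrative."""
    source_uri: str
    owner: str                  # team accountable for the content
    allowed_groups: frozenset   # ACL: groups permitted to see this chunk
    last_reviewed: date         # freshness signal


def is_retrievable(meta, user_groups, today, max_age_days=365):
    """Return True only if the user may see the chunk and it is not stale."""
    has_access = bool(meta.allowed_groups & set(user_groups))
    is_fresh = (today - meta.last_reviewed) <= timedelta(days=max_age_days)
    return has_access and is_fresh
```

Filtering on the contract before ranking, rather than after generation, is one way to keep restricted or stale content out of the prompt entirely.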

labs

  • Sketch chunk boundaries for two real document types you bring (anonymized).
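
As a starting point for the lab, the sketch below shows one simple boundary rule: pack whole paragraphs into chunks up to a size limit and never cut inside a paragraph. The function name and the 500-character default are assumptions for illustration; real documents (tables, headings, scanned pages) usually need type-specific rules.

```python
def chunk_paragraphs(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own oversized
    chunk rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate      # paragraph fits, or chunk is still empty
        else:
            chunks.append(current)   # close the chunk at a paragraph boundary
            current = para
    if current:
        chunks.append(current)
    return chunks
```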

beyond-catalog topics (custom)

  • Hybrid lexical + dense retrieval pairings for regulated vocabularies.
  • Handling tables, scanned PDFs, and multi-language corpora common in India + Middle East rollouts.
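
One common way to pair a lexical ranker with a dense one is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions. The sketch below is an illustration of that general technique, not necessarily the pairing used in the program; the function name is hypothetical, and `k=60` is the constant conventionally used with RRF.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by doc id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


lexical = ["d1", "d2", "d3"]   # e.g. keyword-ranker output (made-up ids)
dense = ["d3", "d1", "d4"]     # e.g. embedding-ranker output (made-up ids)
fused = reciprocal_rank_fusion([lexical, dense])
```

Because RRF ignores raw scores, it avoids having to calibrate keyword scores against embedding similarities, which can help when exact regulated terms must match but paraphrases also matter.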

Eval harness & continuous improvement

What to measure weekly.

session outline

  • Offline vs. online metrics; when A/B tests mislead in low-traffic B2B settings.
  • Human labeling throughput vs. spot checks—a pragmatic hybrid.
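
To illustrate why low-traffic A/B tests can mislead, the sketch below computes a 95% Wilson score interval for a success rate. The scenario (40 queries per arm, 28 vs. 32 resolved) is made up for illustration. Overlapping intervals are not a formal significance test, but they give a quick sense of how little a small sample can distinguish.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n == 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical low-traffic experiment: 40 queries per arm.
low_a, high_a = wilson_interval(28, 40)   # arm A: 70% resolved
low_b, high_b = wilson_interval(32, 40)   # arm B: 80% resolved
intervals_overlap = low_b <= high_a       # True means the arms are hard to separate
```

With these numbers the two intervals overlap substantially, so a 10-point difference at this volume is weak evidence on its own; an offline eval on a fixed labeled set can be a more stable signal.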

labs

  • Draft a minimal eval rubric with acceptance thresholds.
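
A minimal rubric can be as small as a table of metric names and minimum acceptable values, plus a check that reports which ones fail. In the sketch below, the metric names and threshold values are placeholders chosen for illustration, not recommended targets; each team would set its own.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant doc ids that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Placeholder acceptance thresholds; agree on real values with stakeholders.
RUBRIC = {
    "recall_at_5": 0.85,
    "groundedness": 0.90,
}


def check_rubric(measured, rubric):
    """Return (passed, failures); failures maps metric -> (value, minimum)."""
    failures = {
        name: (measured.get(name), minimum)
        for name, minimum in rubric.items()
        if measured.get(name) is None or measured[name] < minimum
    }
    return (not failures), failures


passed, failures = check_rubric(
    {"recall_at_5": 0.90, "groundedness": 0.80}, RUBRIC
)
```

Treating a missing metric as a failure is a deliberate choice here: a release should not pass simply because a measurement was skipped.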

beyond-catalog topics (custom)

  • Routing policies when user intent is ambiguous (multi-intent enterprise search).

quick contact

Scope or pilot this curriculum

Share sponsor, headcount, and cities — we reply with timing and options. Rough budget helps us match the right depth.

faq

Is this only for engineers?

No. It is designed for a mixed cohort: engineers own implementation details, while PMs and the center-of-excellence (CoE) team own metrics and labeling budgets.
