ai-ml-data-science
vasilyu1983/ai-agents-public · updated Apr 8, 2026
Data Science Engineering Suite - Quick Reference
This skill turns raw data and questions into validated, documented models ready for production:
- EDA workflows: Structured exploration with drift detection
- Feature engineering: Reproducible feature pipelines with leakage prevention and train/serve parity
- Model selection: Baselines first; strong tabular defaults; escalate complexity only when justified
- Evaluation & reporting: Slice analysis, uncertainty, model cards, production metrics
- SQL transformation: SQLMesh for staging/intermediate/marts layers
- MLOps: CI/CD, CT (continuous training), CM (continuous monitoring)
- Production patterns: Data contracts, lineage, feedback loops, streaming features
Modern emphasis (2026): Feature stores, automated retraining, drift monitoring (Evidently), train-serve parity, and agentic ML loops (plan -> execute -> evaluate -> improve). Tools: LightGBM, CatBoost, scikit-learn, PyTorch, Polars (lazy eval for larger-than-RAM datasets), lakeFS for data versioning.
Quick Reference
| Task | Tool/Framework | Command | When to Use |
|---|---|---|---|
| EDA & Profiling | Pandas, Great Expectations | df.describe(), ge.validate() | Initial data exploration and quality checks |
| Feature Engineering | Pandas, Polars, Feature Stores | df.transform(), Feast materialization | Creating lag, rolling, categorical features |
| Model Training | Gradient boosting, linear models, scikit-learn | lgb.train(), model.fit() | Strong baselines for tabular ML |
| Hyperparameter Tuning | Optuna, Ray Tune | optuna.create_study(), tune.run() | Optimizing model parameters |
| SQL Transformation | SQLMesh | sqlmesh plan, sqlmesh run | Building staging/intermediate/marts layers |
| Experiment Tracking | MLflow, W&B | mlflow.log_metric(), wandb.log() | Versioning experiments and models |
| Model Evaluation | scikit-learn, custom metrics | metrics.roc_auc_score(), slice analysis | Validating model performance |
Data Lake & Lakehouse
For comprehensive data lake/lakehouse patterns (beyond SQLMesh transformation), see data-lake-platform:
- Table formats: Apache Iceberg, Delta Lake, Apache Hudi
- Query engines: ClickHouse, DuckDB, Apache Doris, StarRocks
- Alternative transformation: dbt (alternative to SQLMesh)
- Ingestion: dlt, Airbyte (connectors)
- Streaming: Apache Kafka patterns
- Orchestration: Dagster, Airflow
This skill focuses on ML feature engineering and modeling. Use data-lake-platform for general-purpose data infrastructure.
Related Skills
For adjacent topics, reference:
- ai-mlops - APIs, batch jobs, monitoring, drift, data ingestion (dlt)
- ai-llm - LLM prompting, fine-tuning, evaluation
- ai-rag - RAG pipelines, chunking, retrieval
- ai-llm-inference - LLM inference optimization, quantization
- ai-ml-timeseries - Time series forecasting, backtesting
- qa-testing-strategy - Test-driven development, coverage
- data-sql-optimization - SQL optimization, index patterns (complements SQLMesh)
- data-lake-platform - Data lake/lakehouse infrastructure (ClickHouse, Iceberg, Kafka)
Decision Tree: Choosing Data Science Approach
User needs ML for: [Problem Type]
- Tabular data?
  - Small-medium (<1M rows)? -> LightGBM (fast, efficient)
  - Large and complex (>1M rows)? -> LightGBM first, then NN if needed
  - High-dim sparse (text, counts)? -> Linear models, then shallow NN
- Time series?
  - Seasonality? -> LightGBM, then see ai-ml-timeseries
  - Long-term dependencies? -> Transformers (see ai-ml-timeseries)
- Text or mixed modalities?
  - LLMs/Transformers -> See ai-llm
- SQL transformations?
  - SQLMesh (staging/intermediate/marts layers)
Rule of thumb: For tabular data, tree-based gradient boosting is a strong baseline, but must be validated against alternatives and constraints.
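The decision tree above can be read as a small lookup function. The sketch below is only an illustration: the function name `suggest_starting_point`, its parameters, and the order in which branches are checked are assumptions, while the 1M-row boundary and the recommendations are taken from the tree. The tree does not say what to do at exactly 1M rows, so the sketch arbitrarily treats that case as small-medium.

```python
def suggest_starting_point(data_kind, n_rows=0, sparse_high_dim=False,
                           long_term_dependencies=False):
    """Return the first approach to try, mirroring the decision tree.

    Hypothetical helper: names and argument layout are illustrative only.
    """
    if data_kind == "tabular":
        if sparse_high_dim:
            return "Linear models, then shallow NN"
        if n_rows > 1_000_000:
            return "LightGBM first, then NN if needed"
        return "LightGBM"
    if data_kind == "time_series":
        if long_term_dependencies:
            return "Transformers (see ai-ml-timeseries)"
        return "LightGBM, then see ai-ml-timeseries"
    if data_kind == "text_or_mixed":
        return "LLMs/Transformers (see ai-llm)"
    if data_kind == "sql":
        return "SQLMesh (staging/intermediate/marts layers)"
    raise ValueError(f"unknown data_kind: {data_kind!r}")
```

Whatever the function returns is only a starting point; per the rule of thumb, the choice still has to be validated against alternatives and constraints.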
Core Concepts (Vendor-Agnostic)
- Problem framing: define success metrics, baselines, and decision thresholds before modeling.
- Leakage prevention: ensure all features are available at prediction time; split by time/group when appropriate.
- Uncertainty: report confidence intervals and stability (fold variance, bootstrap) rather than single-point metrics.
- Reproducibility: version code/data/features, fix seeds, and record the environment.
- Operational handoff: define monitoring, retraining triggers, and rollback criteria with MLOps.
Implementation Practices (Tooling Examples)
- Track experiments and artifacts (run id, commit hash, data version).
- Add data validation gates in pipelines (schema + distribution + freshness).
- Prefer reproducible, testable feature code (shared transforms, point-in-time correctness).
- Use datasheets/model cards and eval reports as deployment prerequisites (Datasheets for Datasets: https://arxiv.org/abs/1803.09010; Model Cards: https://arxiv.org/abs/1810.03993).
Do / Avoid
Do
- Do start with baselines and a simple model to expose leakage and data issues early.
- Do run slice analysis and document failure modes before recommending deployment.
- Do keep an immutable eval set; refresh training data without contaminating evaluation.
Avoid
- Avoid random splits for temporal or user-correlated data.
- Avoid "metric gaming" (optimizing the number without validating business impact).
- Avoid training on labels created after the prediction timestamp (silent future leakage).
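The two leakage rules above (no random splits for temporal data, no labels created after the prediction timestamp) can be sketched as a single split function. This is a minimal illustration, assuming each row carries an `event_time` and a `label_time` (when the label became known); those field names and the function `temporal_split` are hypothetical.

```python
from datetime import datetime

def temporal_split(rows, cutoff):
    """Split rows into (train, test) around `cutoff` without future leakage."""
    train, test = [], []
    for row in rows:
        if row["event_time"] >= cutoff:
            test.append(row)       # events at or after the cutoff are held out
        elif row["label_time"] < cutoff:
            train.append(row)      # event and label were both known before the cutoff
        # Otherwise the label arrived after the cutoff: keep it out of training.
    return train, test

rows = [
    {"id": 1, "event_time": datetime(2024, 1, 1), "label_time": datetime(2024, 1, 5)},
    {"id": 2, "event_time": datetime(2024, 1, 20), "label_time": datetime(2024, 2, 10)},
    {"id": 3, "event_time": datetime(2024, 2, 3), "label_time": datetime(2024, 2, 9)},
]
train, test = temporal_split(rows, cutoff=datetime(2024, 2, 1))
```

Row 2 is dropped from training because its label was created after the cutoff, even though the event itself happened before it.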
Core Patterns (Overview)
Pattern 1: End-to-End DS Project Lifecycle
Use when: Starting or restructuring any DS/ML project.
Stages:
- Problem framing - Business objective, success metrics, baseline
- Data & feasibility - Sources, coverage, granularity, label quality
- EDA & data quality - Schema, missingness, outliers, leakage checks
- Feature engineering - Per data type with feature store integration
- Modelling - Baselines first, then LightGBM, then complexity as needed
- Evaluation - Offline metrics, slice analysis, error analysis
- Reporting - Model evaluation report + model card
- MLOps - CI/CD, CT (continuous training), CM (continuous monitoring)
Detailed guide: EDA Best Practices
Pattern 2: Feature Engineering
Use when: Designing features before modelling or during model improvement.
By data type:
- Numeric: Standardize, handle outliers, transform skew, scale
- Categorical: One-hot/ordinal (low cardinality), target/frequency/hashing (high cardinality)
- Feature Store Integration: Store encoders, mappings, statistics centrally
- Text: Cleaning, TF-IDF, embeddings, simple stats
- Time: Calendar features, recency, rolling/lag features
Key Modern Practice: Use feature stores (Feast, Tecton, Databricks) for versioning, sharing, and train-serve parity.
Detailed guide: Feature Engineering Patterns
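As a rough illustration of two items from this pattern, the sketch below builds lag/rolling features that only look backwards and a frequency encoder that is fitted once on training data and reused at serving time. It uses only the standard library; the function names are hypothetical, and a real pipeline would more likely use Pandas or Polars and keep the fitted mapping in a feature store.

```python
from collections import Counter

def lag_and_rolling_mean(values, lag=1, window=3):
    """Return (lag value, mean of the previous `window` values) per position.

    Only values strictly before position t are used, so the feature at t
    never sees values[t] itself.
    """
    features = []
    for t in range(len(values)):
        lagged = values[t - lag] if t - lag >= 0 else None
        past = values[max(0, t - window):t]
        rolling = sum(past) / len(past) if past else None
        features.append((lagged, rolling))
    return features

def fit_frequency_encoder(train_values):
    """Learn category -> relative frequency from training data only."""
    counts = Counter(train_values)
    total = len(train_values)
    return {category: count / total for category, count in counts.items()}

def encode_frequency(value, mapping, default=0.0):
    """Apply the stored mapping; unseen categories fall back to `default`."""
    return mapping.get(value, default)
```

Reusing the same fitted `mapping` for training and serving is what keeps the two paths in parity; refitting it on serving traffic would silently change the feature.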
Pattern 3: Data Contracts & Lineage
Use when: Building production ML systems with data quality requirements.
Components:
- Contracts: Schema + ranges/nullability + freshness SLAs
- Lineage: Track source -> feature store -> train -> serve
- Feature store hygiene: Materialization cadence, backfill/replay, encoder versioning
- Schema evolution: Backward/forward-compatible migrations with shadow runs
Detailed guide: Data Contracts & Lineage
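A contract of the kind described above (schema, ranges/nullability, freshness SLA) can be expressed as plain data and checked row by row. The following is a minimal sketch, not a replacement for a validation framework; the `CONTRACT` layout, the column names, and the 24-hour SLA are all made-up examples.

```python
from datetime import datetime, timedelta

# Hypothetical contract: column -> type, nullability, optional range.
CONTRACT = {
    "user_id": {"type": int, "nullable": False},
    "amount": {"type": float, "nullable": False, "min": 0.0, "max": 10_000.0},
    "country": {"type": str, "nullable": True},
}
MAX_STALENESS = timedelta(hours=24)  # assumed freshness SLA

def validate_row(row, contract):
    """Return a list of human-readable contract violations for one row."""
    errors = []
    for name, rule in contract.items():
        if name not in row:
            errors.append(f"{name}: missing column")
            continue
        value = row[name]
        if value is None:
            if not rule.get("nullable", False):
                errors.append(f"{name}: null not allowed")
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"{name}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name}: above maximum {rule['max']}")
    return errors

def is_fresh(last_updated, now, max_staleness=MAX_STALENESS):
    """True if the data was updated within the freshness SLA."""
    return now - last_updated <= max_staleness
```

A pipeline gate could then fail the run whenever `validate_row` returns a non-empty list or `is_fresh` returns False.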
Pattern 4: Model Selection & Training
Use when: Picking model families and starting experiments.
Decision guide (modern benchmarks):
- Tabular: Start with a strong baseline (linear/logistic, then gradient boosting) and iterate based on error analysis
- Baselines: Always implement simple baselines first (majority class, mean, naive forecast)
- Train/val/test splits: Time-based (forecasting), group-based (user/item leakage), or random (IID)
- Hyperparameter tuning: Start manual, then Bayesian optimization (Optuna, Ray Tune)
- Overfitting control: Regularization, early stopping, cross-validation
Detailed guide: Modelling Patterns
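Two of the points above, a majority-class baseline and a group-based split, fit in a few lines of standard-library Python. This is a sketch under assumptions: the helper names and the row layout (`x`, `label`, a group column) are hypothetical, and the hash-bucket split only approximates the requested test share. In practice a library splitter such as scikit-learn's `GroupKFold` would be the more usual choice.

```python
import hashlib
from collections import Counter

def group_split(rows, group_key, test_percent=20):
    """Send every row of a group to the same side, using a stable hash."""
    train, test = [], []
    for row in rows:
        digest = hashlib.sha256(str(row[group_key]).encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100          # stable bucket in 0..99
        (test if bucket < test_percent else train).append(row)
    return train, test

def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]

    def predict(_features):
        return majority

    return predict

def accuracy(predict, rows):
    """Share of rows whose label matches the prediction."""
    return sum(predict(r["x"]) == r["label"] for r in rows) / len(rows)
```

Any candidate model should beat `majority_class_baseline` on the held-out groups before its extra complexity is considered justified.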
Pattern 5: Evaluation & Reporting
Use when: Finalizing a model candidate or handing over to production.
Key components:
- Metric selection: Primary (ROC-AUC, PR-AUC, RMSE) + guardrails (calibration, fairness)
- Threshold selection: ROC/PR curves, cost-sensitive, F1 maximization
- Slice analysis: Performance by geography, user segments, product categories
- Error analysis: Collect high-error examples, cluster by error type, identify systematic failures
- Uncertainty: Confidence intervals (bootstrap where appropriate), variance across folds, and stability checks
- Evaluation report: 8-section report (objective, data, features, models, metrics, slices, risks, recommendation)
- Model card: Documentation for stakeholders (intended use, data, performance, ethics, operations)
Detailed guide: Evaluation Patterns
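The uncertainty and slice-analysis items can be sketched together: a percentile bootstrap interval around a metric, plus the same metric computed per slice. This is an illustrative sketch only; the function names and the record layout (`y`, `pred`, and a slice column) are assumptions, accuracy stands in for whatever primary metric the project uses, and the plain percentile bootstrap is just one of several interval methods.

```python
import random
from collections import defaultdict

def accuracy(pairs):
    """Fraction of (y_true, y_pred) pairs that agree."""
    return sum(1 for y_true, y_pred in pairs if y_true == y_pred) / len(pairs)

def bootstrap_ci(pairs, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` over resampled pairs."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    scores = sorted(
        metric([rng.choice(pairs) for _ in pairs]) for _ in range(n_resamples)
    )
    lower = scores[int(alpha / 2 * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper

def slice_scores(records, slice_key, metric):
    """Compute `metric` separately for each value of `slice_key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[slice_key]].append((record["y"], record["pred"]))
    return {name: metric(pairs) for name, pairs in groups.items()}
```

Reporting the interval and the per-slice table side by side makes it harder for a good aggregate number to hide a segment where the model fails.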
Pattern 6: Reproducibility & MLOps
Use when: Ensuring experiments are reproducible and production-ready.
Modern MLOps (CI/CD/CT/CM):
- CI (Continuous Integration): Automated testing, data validation, code quality
- CD (Continuous Delivery): Environment-specific promotion (dev -> staging -> prod), canary deployment
- CT (Continuous Training): Drift-triggered and scheduled retraining
- CM (Continuous Monitoring): Real-time data drift, performance, system health
Versioning:
- Code (git commit), data (DVC, lakeFS), features (feature store), models (MLflow Registry)
- Seeds (reproducibility), hyperparameters (experiment tracker)
Detailed guide: Reproducibility Checklist
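One lightweight way to tie the versioning items together is a run manifest written next to each experiment. The sketch below is an assumption-laden example, not the checklist itself: `build_run_manifest` and its fields are hypothetical, the commit hash is passed in by the caller (for example from `git rev-parse HEAD`), and the data fingerprint assumes the rows are JSON-serializable and small enough to hash in memory. An experiment tracker such as MLflow would normally store this information instead.

```python
import hashlib
import json
import platform
import random
import sys

def seed_stdlib_rng(seed):
    """Seed Python's built-in RNG; NumPy, PyTorch, etc. need their own seeding."""
    random.seed(seed)

def fingerprint_data(rows):
    """Stable SHA-256 over the rows, independent of dict key order."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_run_manifest(seed, params, data_rows, git_commit="unknown"):
    """Collect what is needed to re-run an experiment later."""
    return {
        "seed": seed,
        "params": params,
        "data_sha256": fingerprint_data(data_rows),
        "git_commit": git_commit,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
```

Writing the manifest with `json.dump` alongside the model artifact gives a reviewer the seed, code version, data version, and environment in one place.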
Pattern 7: Feature Freshness & Streaming
Use when: Managing real-time features and streaming pipelines.
Components:
- Freshness contracts: Define freshness SLAs per feature, monitor lag, alert on breaches
- Batch + stream parity: Same feature logic across batch/stream, idempotent upserts
- Schema evolution: Version schemas, add forward/backward-compatible parsers, backfill with rollback
- Data quality gates: PII/format checks, range checks, distribution drift (KL, KS, PSI)
Detailed guide: Feature Freshness & Streaming
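Of the drift measures listed above, PSI (population stability index) is simple enough to sketch directly. The version below is a minimal illustration, assuming equal-width bins derived from the reference sample, clipping of out-of-range values into the edge bins, and a small `eps` floor to avoid division by zero; the function name and defaults are hypothetical. A commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift, but thresholds are better calibrated per feature.

```python
import math

def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population stability index between a reference and a current sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0   # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1   # clip into edge bins
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in reference]
drift_score = psi(reference, shifted)
```

Because the bins come from the reference sample, the same bin edges can be stored with the feature and reused for every later comparison.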
Pattern 8: Production Feedback Loops
Use when: Capturing production signals and implementing continuous improvement.
Components:
- Signal capture: Log predictions + user edits/acceptance/abandonment (scrub PII)
- Labeling: Route failures/edge cases to human review, create balanced sets
- Dataset refresh: Periodic refresh (weekly/monthly) with lineage, protect eval set
- Online eval: Shadow/canary new models, track solve rate, calibration, cost, latency
Detailed guide: Production Feedback Loops
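The signal-capture and dataset-refresh items can be sketched as two small helpers: one that logs a prediction together with the user's reaction after scrubbing obvious PII, and one that refreshes training data while keeping frozen evaluation ids out. Everything here is hypothetical (field names, the `[EMAIL]` placeholder, the helper names), and the email regex is deliberately simplistic; real PII scrubbing needs far broader coverage.

```python
import re

# Simplistic email matcher, for illustration only.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scrub_pii(text):
    """Replace email-like substrings before the text is logged."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)

def build_feedback_event(request_id, prediction, user_action, user_text=""):
    """Bundle a prediction with the user's reaction (accepted/edited/abandoned)."""
    return {
        "request_id": request_id,
        "prediction": prediction,
        "user_action": user_action,
        "user_text": scrub_pii(user_text),
    }

def refresh_training_set(current_train, new_events, frozen_eval_ids):
    """Append new events, skipping duplicates and anything in the eval set."""
    seen = {row["request_id"] for row in current_train}
    refreshed = list(current_train)
    for event in new_events:
        rid = event["request_id"]
        if rid in frozen_eval_ids or rid in seen:
            continue
        refreshed.append(event)
        seen.add(rid)
    return refreshed
```

Filtering on `frozen_eval_ids` at refresh time is one simple way to keep the evaluation set uncontaminated as new production data flows in.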
Resources (Detailed Guides)
For comprehensive operational patterns and checklists, see:
- EDA Best Practices - Structured workflow for exploratory data analysis
- Feature Engineering Patterns - Operational patterns by data type
- Data Contracts & Lineage - Data quality, versioning, feature store ops
- Modelling Patterns - Model selection, hyperparameter tuning, train/test splits
- Evaluation Patterns - Metrics, slice analysis, evaluation reports, model cards
- Reproducibility Checklist - Experiment tracking, MLOps (CI/CD/CT/CM)
- Feature Freshness & Streaming - Real-time features, schema evolution
- Production Feedback Loops - Online learning, labeling, canary deployment
- Class Imbalance Patterns - Resampling, cost-sensitive learning, threshold tuning, evaluation for skewed datasets
- Hyperparameter Optimization - Bayesian optimization, early stopping, search strategies, budget allocation
- Interpretability & Explainability - SHAP, LIME, feature importance, model cards for regulated domains
Templates
Use these as copy-paste starting points:
Project & Workflow Templates
- Standard DS project template: assets/project/template-standard.md
- Quick DS experiment template: assets/project/template-quick.md
Feature Engineering & EDA
- Feature engineering template: assets/features/template-feature-engineering.md
- EDA checklist & notebook template: assets/eda/template-eda.md
Evaluation & Reporting
- Model evaluation report: assets/evaluation/template-evaluation-report.md
- Model card: assets/evaluation/template-model-card.md
- ML experiment review: assets/review/experiment-review-template.md
SQL Transformation (SQLMesh)
For SQL-based data transformation and feature engineering:
- SQLMesh project setup: ../data-lake-platform/assets/transformation/sqlmesh/template-sqlmesh-project.md
- SQLMesh model types: ../data-lake-platform/assets/transformation/sqlmesh/template-sqlmesh-model.md (FULL, INCREMENTAL, VIEW)
- Incremental models: ../data-lake-platform/assets/transformation/sqlmesh/template-sqlmesh-incremental.md
- DAG and dependencies: ../data-lake-platform/assets/transformation/sqlmesh/template-sqlmesh-dag.md
- Testing and data quality: ../data-lake-platform/assets/transformation/sqlmesh/template-sqlmesh-testing.md
Use SQLMesh when:
- Building SQL-based feature pipelines
- Managing incremental data transformations
- Creating staging/intermediate/marts layers
- Testing SQL logic with unit tests and audits
For data ingestion (loading raw data), use:
- ai-mlops skill (dlt templates for REST APIs, databases, warehouses)
Navigation
Resources
- references/reproducibility-checklist.md
How to use ai-ml-data-science on Cursor
AI-first code editor with Composer
1. Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- Cursor installed and configured on your development machine
- Node.js version 16.0+ with npm package manager (verify with node --version)
- Active project directory or workspace where you want to add ai-ml-data-science
2. Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
$ npx skills add https://github.com/vasilyu1983/ai-agents-public --skill ai-ml-data-science
The skills CLI fetches ai-ml-data-science from GitHub repository vasilyu1983/ai-agents-public and configures it for Cursor.
3. Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
◆ Which agents do you want to install to?
│
│ ── Universal (.agents/skills) ── always included ────
│ • Amp
│ • Antigravity
│ • Cline
│ • Codex
│ ● Cursor (selected)
│ • Cursor
│ • Windsurf
4. Verify installation
Confirm successful installation by checking the skill directory location:
.cursor/skills/ai-ml-data-science
Reload or restart Cursor to activate ai-ml-data-science. Access the skill through slash commands (e.g., /ai-ml-data-science) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.
Use Cases
Task Automation & Efficiency
Automate repetitive workflows and reduce manual effort
Example
Generate reports, summarize documents, draft communications
✓ Save 3-5 hours per week on routine tasks
Knowledge Enhancement
Learn new skills, understand complex topics, get expert guidance
Example
Explain concepts, provide examples, suggest learning resources
✓ Accelerate learning and skill development by 2x
Quality Improvement
Enhance output quality through reviews, suggestions, and refinements
Example
Review drafts, suggest improvements, catch errors
✓ Improve work quality by 30-40% with less effort
Implementation Guide
Prerequisites
- Claude Desktop or compatible AI client with skill support
- Clear understanding of task or problem to solve
- Willingness to iterate and refine outputs
Time Estimate
15-45 minutes depending on use case complexity
Installation Steps
1. Install skill using provided installation command
2. Test with simple use case relevant to your work
3. Evaluate output quality and relevance
4. Iterate on prompts to improve results
5. Integrate into regular workflow if valuable
Common Pitfalls
- Expecting perfect results without iteration
- Not providing enough context in prompts
- Using skill for tasks outside its intended scope
- Accepting outputs without review and validation
Best Practices
✓ Do
- Start with clear, specific prompts
- Provide relevant context and constraints
- Review and refine all outputs before using
- Iterate to improve output quality
- Document successful prompt patterns
✗ Don't
- Don't use without understanding skill limitations
- Don't skip validation of outputs
- Don't share sensitive information in prompts
- Don't expect skill to replace human judgment
💡 Pro Tips
- Be specific about desired format and style
- Ask for multiple options to choose from
- Request explanations to understand reasoning
- Combine AI efficiency with human expertise
When to Use This
✓ Use When
Use when skill capabilities match your task, clear ROI on time saved, and you can validate outputs. Best for repetitive tasks, learning, and quality improvement.
✗ Avoid When
Avoid when task requires deep expertise you can't validate, involves sensitive decisions, or when learning process is more valuable than speed of completion.
Learning Path
1. Familiarize yourself with skill capabilities and limitations
2. Start with low-risk, non-critical tasks
3. Progress to more complex and valuable use cases
4. Build expertise through regular use and experimentation
Ratings
4.6 ★★★★★ (35 reviews)
- ★★★★★ Layla Robinson · Dec 28, 2024
Registry listing for ai-ml-data-science matched our evaluation — installs cleanly and behaves as described in the markdown.
- ★★★★★ Mia Lopez · Dec 12, 2024
Keeps context tight: ai-ml-data-science is the kind of skill you can hand to a new teammate without a long onboarding doc.
- ★★★★★ Pratham Ware · Dec 8, 2024
ai-ml-data-science reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★ Yash Thakker · Nov 27, 2024
I recommend ai-ml-data-science for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
- ★★★★★ Chinedu Abbas · Nov 19, 2024
Useful defaults in ai-ml-data-science — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★ Layla Gupta · Nov 3, 2024
ai-ml-data-science has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★ Chinedu Ndlovu · Oct 22, 2024
Solid pick for teams standardizing on skills: ai-ml-data-science is focused, and the summary matches what you get after install.
- ★★★★★ Dhruvi Jain · Oct 18, 2024
Useful defaults in ai-ml-data-science — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★ Layla Iyer · Oct 10, 2024
I recommend ai-ml-data-science for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
- ★★★★★ Sakura Garcia · Sep 25, 2024
ai-ml-data-science fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.