Tuesday, June 23, 2026
Merged timeline of 266 items: blog publish times and listing timestamps, cut at midnight. Page 5 of 6.
Diagnose failed or unhealthy Dynamo deployments. Use when pods, model-cache jobs, PVCs, workers, frontend/router health, endpoints, or benchmark jobs fail; use recipe-runner/router-starter before this for normal bring-u…
Install Holoscan SDK v4.3+ via Conda in a CUDA 13 environment. Use for Conda installs; redirect CUDA 12 hosts to container/wheel.
Select, validate, patch, and deploy existing NVIDIA Dynamo Kubernetes recipes. Use for model/backend/GPU/deployment-mode recipe bring-up; use router-starter for router-only mode work and troubleshoot for broken deployme…
Build Holoscan SDK from source via the in-tree ./run script. Use only when published packages don't meet the user's needs.
Install Holoscan SDK natively on Ubuntu via apt. Use for C++ installs on Ubuntu; pair with /holoscan-install-wheel for Python.
Install cuOpt for Python, C, or server via pip, conda, or Docker; verify the install. For building cuOpt from source, see cuopt-developer.
CUDA-Q onboarding guide for installation, test programs, GPU simulation, QPU hardware, and quantum applications.
Stage 1 of Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.
Used for header-only preflight of one DICOM series folder before conversion or inference. Not for de-identification or clinical clearance.
Start or patch Dynamo router modes and run router endpoint smoke checks. Use for round-robin, KV-aware, least-loaded, or device-aware routing setup; use recipe-runner for recipe deployment and troubleshoot for failure d…
cuOpt REST server — start server, endpoints, Python/curl client examples. Use when the user is deploying or calling the REST API.
Stage 4 of the Clinical ASR Flywheel. Use when priority KER is above 0.3 to run stock NeMo SFT on Parakeet TDT v2 and offline cycle N+1 re-eval. NOT for generic word boosting (use /finetune-asr).
Validate that a Dynamo deployment's NIXL/UCX/NCCL interconnect is ready for disaggregated serving over RDMA/NVLink. Use after recipe-runner brings a deployment up (especially disagg/multi-node) to confirm the KV transpo…
Used for extracting selected metadata from one DICOM file and flagging standard-tag PHI presence. Not for anonymization or clinical use.
Vehicle routing (VRP, TSP, PDP) — problem types and data requirements. Domain concepts; no API or interface.
Stage 3 of Clinical ASR Flywheel. Score a NeMo manifest, produce the five-section KER leaderboard (by-ipa_source diagnostic). Not for ASR auth (/riva-asr).
Use when a user asks to build, optimize, backtest, rebalance, or analyze a stock portfolio with Mean-CVaR, efficient frontiers, scenario generation, or NVIDIA cuOpt.
NVIDIA DeepStream SDK 9.0 development with Python pyservicemaker API. Use when building video analytics pipelines, GStreamer-based video processing, TensorRT inference integration, object detection/tracking, or Kafka/me…
Modify, build, test, debug, and contribute to NVIDIA cuOpt (C++/CUDA, Python, server, CI). Use for solver internals, PRs, DCO, and code conventions.
LP, MILP, and QP (beta) with cuOpt — C API only. Use when the user is embedding LP, MILP, or QP in C/C++.
Vehicle routing (VRP, TSP, PDP) with cuOpt — Python API only. Use when the user is building or solving routing in Python.
cuOpt REST server — what it does and how requests flow. Domain concepts; no deploy or client code.
Solve LP, MILP, QP (beta) with cuOpt Python API — linear/quadratic objectives, integer variables, scheduling, portfolio, least squares.
LP, MILP, and QP (beta) with cuOpt — CLI only (MPS files, cuopt_cli). Use when the user is solving LP, MILP, or QP from MPS via command line.
After solving a non-trivial problem, detect generalizable learnings and propose skill updates. Always active — applies to every interaction.
Load a sharded, on-disk dataset (sharded .npy, Parquet/Arrow, raw binary, sharded HDF5, custom layouts) into a distributed cuPyNumeric ndarray via a manual partition + leaf @task launch with CPU/OMP/GPU variants. Use wh…
Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.
Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy pa…
DALI imperative dynamic mode (`nvidia.dali.experimental.dynamic`, ndd): use when working on ndd code or migrating pipelines; skip pipeline-only tasks.
Install and verify cuPyNumeric for Python — requirements, commands, verification. Source builds are out of scope.
Base rules for end users calling NVIDIA cuOpt (routing/LP/MILP/QP/install/server). Not for cuOpt internals — use cuopt-developer for those.
Official NVIDIA-authored guidance for NVIDIA cuDF GPU DataFrames, pandas acceleration, dask-cuDF, ETL, joins, groupby, CSV/Parquet I/O, nullable semantics, and multi-GPU DataFrame workloads.
Skybridge is a comprehensive open-source framework designed for building multi-channel applications using React.
readywhen acts as your 24/7 AI Chief of Staff, managing commitments and follow-ups efficiently.
AgentX simplifies the evaluation of AI agents, allowing users to identify and resolve issues with a single click.
HAQQ Legal AI on Mobile democratizes legal understanding, making it accessible to anyone with a smartphone.
Alai 2.0 serves as an AI design partner, assisting users in creating presentations, social media posts, and more.
VibeThinker-3B is a compact dense model with 3B parameters designed for verifiable reasoning in small language models. It achieves frontier-level performance on demanding tasks.
Mercury 2 is the world's fastest reasoning language model, designed for real-time AI applications. It utilizes diffusion-based reasoning for rapid response generation.
Over 30 million people now use AI companions daily. Some are processing grief. Some are practicing social skills. Some have fallen in love. The psychology is real, the risks are real, and the ethics are complicated. Here's what's actually happening when your chatbot becomes your companion.
The standard assumption for running a 70B model locally: you need 140GB of VRAM. AirLLM breaks that assumption by loading layers one at a time from disk, holding only one layer in GPU memory at any moment. 21K+ GitHub stars, three lines of code to start. Here is what it actually buys you and what it costs you.
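The layer-at-a-time idea in the item above can be illustrated with a toy sketch. This is not AirLLM's API or implementation: the helper names (`save_layers`, `apply_layer`, `forward_layer_by_layer`, `fp16_weight_gb`) are hypothetical, the "layers" are simple scale-and-shift pairs rather than transformer blocks, and the 140GB figure is assumed to come from 2 bytes per parameter (16-bit weights).

```python
import os
import pickle
import tempfile


def save_layers(layers, directory):
    """Write each layer's weights to its own file; return the paths in order."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths


def apply_layer(weights, x):
    """Toy stand-in for a model layer: y = w * x + b, element-wise."""
    w, b = weights
    return [w * v + b for v in x]


def forward_layer_by_layer(paths, x):
    """Run a forward pass while holding only one layer's weights at a time."""
    for path in paths:
        with open(path, "rb") as f:
            weights = pickle.load(f)  # load this layer from disk
        x = apply_layer(weights, x)   # only the activations carry forward
        del weights                   # drop the layer before loading the next
    return x


def fp16_weight_gb(num_params):
    """Weight memory in GB, assuming 2 bytes per parameter."""
    return num_params * 2 / 1e9


with tempfile.TemporaryDirectory() as tmp:
    layer_paths = save_layers([(2.0, 1.0), (0.5, 0.0), (1.0, -1.0)], tmp)
    result = forward_layer_by_layer(layer_paths, [1.0, 2.0])

print(result)                          # [0.5, 1.5]
print(fp16_weight_gb(70_000_000_000))  # 140.0
```

The trade-off the sketch makes visible is that every forward pass re-reads every layer from disk, so peak memory drops to roughly one layer plus activations while disk throughput becomes the likely bottleneck.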
Baidu's Unlimited-OCR lands on GitHub and Hugging Face with 1.8k stars overnight. The model parses entire PDFs, multi-page scans, and dense documents in one shot — no chunking, no stitching — and ships with both a Transformers and a high-throughput SGLang backend.
A Berkeley professor published an Atlantic essay arguing against rushing GPT-6 on harm grounds — while disclosing her own cancer history. Marc Andreessen said "Did cancer write this?" Matthew Berman said "Psychopath." The argument underneath the outrage is worth engaging with. Here's what both sides are getting right and what they're missing.
Claude Code starts every session knowing nothing about your project. CLAUDE.md is the only signal that survives across sessions — and most developers write ones that are nearly useless. This post shows the difference between a generic CLAUDE.md (that changes nothing) and a specific one (that eliminates boilerplate answers) with before/after examples.
Firecrawl is not another scraping library. It is a web context layer between the messy, JS-rendered, CAPTCHA-gated internet and LLMs that need clean data. The Agent endpoint — describe what you want, get it — is the interesting part. 137K stars and counting.