HIW-500: 500 Hours of Humanoid Teleop Data From 12 Real Homes

BitRobot, Unitree, and Hugging Face released HIW-500 — 500+ hours, 23K+ episodes, 10+ TB of Unitree G1 teleop in Southeast Asian homes. LeRobot format cuts it to 2.15 TB. Full guide.

Jun 23, 2026·7 min read·Yash Thakker
Robotics · Physical AI · Hugging Face · Unitree · Datasets · Imitation Learning

On June 23, 2026, BitRobot (@BitRobotNetwork) released HIW-500 — Humanoids-in-the-Wild 500 — calling it the largest open-source humanoid teleoperation dataset collected in real homes.

Built with Unitree and Hugging Face / LeRobot, the drop lands the same day as Nori L2's iPhone-priced robot teaser and a week after Genesis Eno. The through-line in June 2026 robotics: real environments, not lab demos, are where the data wars are being fought.

"Built for learning from real homes, not lab-only scenes." — HIW-500 project page

TL;DR

| Spec | HIW-500 |
| --- | --- |
| Release | June 2026 (V1) |
| Robot | Unitree G1 (29-DoF + grippers) |
| Collection | 12 real homes, Southeast Asia |
| Hours | 500+ teleoperation |
| Episodes | 23,743 |
| Frames | 40,839,947 |
| Tasks | 11 household skills |
| Subtask labels | 161 labels, 148K+ annotations |
| Raw size | ~10+ TB (ROS bag / MCAP) |
| LeRobot size | ~2.15 TB (v3.0, same trajectories) |
| License framing | Research + commercial training (see HF page) |
| Download | BitRobot/HIW-500-LeRobot |
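
The headline figures can be cross-checked with simple arithmetic. The sketch below derives average episode length from the episode and frame counts in the table. Two things here are assumptions rather than published values: `HOURS_CLAIMED` uses the 500-hour floor instead of an exact total, and `ASSUMED_FPS` borrows the 30 FPS camera rate reported later in this post, which may not be the rate at which dataset frames are indexed.

```python
# Back-of-the-envelope stats from the published HIW-500 figures.
EPISODES = 23_743
FRAMES = 40_839_947
HOURS_CLAIMED = 500   # "500+" in the release, so treat this as a lower bound
ASSUMED_FPS = 30      # assumption: camera rate; the frame index may use another rate

frames_per_episode = FRAMES / EPISODES
seconds_per_episode = HOURS_CLAIMED * 3600 / EPISODES
implied_fps = FRAMES / (HOURS_CLAIMED * 3600)
hours_at_assumed_fps = FRAMES / ASSUMED_FPS / 3600

print(f"avg frames per episode: {frames_per_episode:,.0f}")
print(f"avg episode length:     {seconds_per_episode:.1f} s (at 500 h total)")
print(f"implied frame rate:     {implied_fps:.1f} fps (at 500 h total)")
print(f"hours at assumed rate:  {hours_at_assumed_fps:.0f} h")
```

If the implied and assumed rates disagree, the `fps` field in the dataset's metadata is the value to trust when converting frame counts to wall-clock time.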

Why in-the-wild humanoid data matters

Lab datasets teach robots to repeat. Home datasets teach robots to generalize.

HIW-500 explicitly targets variation that controlled benchmarks strip out:

  • Layout — different room geometries across 12 homes
  • Clutter — object density and placement change episode to episode
  • Lighting — natural indoor conditions, not studio rigs
  • Operator style — multiple human teleoperators with different habits
  • Object state — fridges half-full, pillows moved, trash in different bins

That is the gap between Figure Helix-02 tidying a staged bedroom and a robot that works in your kitchen. Policy researchers have argued for years that pooling diverse teleop data drives generalist policies: Open X-Embodiment reported that training on pooled cross-embodiment data raised average success rates in small-data domains by about 50%. HIW-500 is the humanoid-home slice of that bet at scale.

Unitree's response on X:

"We're excited to support BitRobot in open-sourcing the largest humanoid whole-body teleoperation dataset collected in real homes. We hope it accelerates progress toward general-purpose humanoid robots." — @UnitreeRobotics


What is in each episode

Each episode records human whole-body teleoperation of a Unitree G1 in a real home — not joystick-only arm control, but locomotion + bimanual manipulation together.

Camera streams

| Stream | Resolution | FPS | Notes |
| --- | --- | --- | --- |
| Head camera | 480 × 1280 RGB | 30 | Stereo scene context, navigation |
| Left wrist | 480 × 640 | 30 | Stereo IR, close-range manipulation |
| Right wrist | 480 × 640 | 30 | Stereo IR, close-range manipulation |

Videos in the LeRobot release are encoded as AV1 (yuv420p, CRF 30).

Robot state and actions

From the LeRobot dataset card:

  • observation.state — 29-DoF joint positions (hips, knees, ankles, waist, arms, wrists)
  • observation.state.wbc — whole-body control state: pivot velocity, roll/pitch/yaw, height, left/right end-effector poses, gripper triggers
  • action — 23-D whole-body teleop commands mirroring human operator inputs
  • IMU + odometry — in raw recordings
  • language_persistent / language_events — language annotations per episode

This is VLA-ready multimodal data: three video streams + proprioception + language labels in a single LeRobot v3.0 schema (codebase_version: v3.0, robot_type: unitree_g1).
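
Before wiring a dataset like this into a training loop, it helps to check that each frame matches the layout you expect. The following is a minimal sketch, not the LeRobot API: `check_frame` and `FrameCheck` are hypothetical helpers, the camera key `observation.images.head` is an assumed name, and the synthetic frame stands in for a real sample. Only the 29 and 23 dimensions come from the list above.

```python
from dataclasses import dataclass

# Dimensions quoted in this post.
STATE_DIM = 29    # observation.state: joint positions
ACTION_DIM = 23   # action: whole-body teleop command


@dataclass
class FrameCheck:
    ok: bool
    problems: list


def check_frame(frame: dict) -> FrameCheck:
    """Validate one frame dict against the layout described above."""
    problems = []
    expected = {"observation.state": STATE_DIM, "action": ACTION_DIM}
    for key, dim in expected.items():
        if key not in frame:
            problems.append(f"missing key: {key}")
        elif len(frame[key]) != dim:
            problems.append(f"{key}: expected {dim} values, got {len(frame[key])}")
    # At least one camera stream should appear under observation.images.*
    if not any(k.startswith("observation.images.") for k in frame):
        problems.append("no observation.images.* stream found")
    return FrameCheck(ok=not problems, problems=problems)


# Synthetic frame standing in for a real sample (all values are placeholders).
fake_frame = {
    "observation.state": [0.0] * 29,
    "action": [0.0] * 23,
    "observation.images.head": "placeholder-for-image-tensor",  # hypothetical key
}
result = check_frame(fake_frame)
print(result.ok, result.problems)

# A frame whose action vector has the wrong length is flagged.
bad_frame = dict(fake_frame, action=[0.0] * 29)
print(check_frame(bad_frame).problems)
```

Running a check like this on the first few samples catches a mismatched schema before any compute is spent on training.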


The 11 household tasks

BitRobot's project page lists every task in V1:

| Task | Category |
| --- | --- |
| Building children's table | Assembly / furniture |
| Hang hanger | Closet organization |
| Clean up the room | General tidying |
| Setting the table | Dining prep |
| Restocking fridge | Kitchen manipulation |
| Kitchen organization | Storage / sorting |
| Hang keys on a hook | Fine placement |
| Move pillow to sofa | Soft-object handling |
| Sweep floor | Tool use + navigation |
| Picking trash | Grasp + dispose |
| Clothes washing | Laundry workflow |

161 subtask labels and 148,000+ subtask annotations break episodes into fine-grained action segments, which makes the data useful for hierarchical policies, skill discovery, and evaluation rubrics, not only end-to-end black-box imitation.
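
To make the hierarchical-policy use concrete, here is one way subtask annotations could be turned into labelled frame ranges. This is a sketch under assumptions: the `(start_frame, label)` event format, the label strings, and the 1,720-frame episode are all invented for illustration and do not reflect the actual annotation schema in HIW-500.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    label: str
    start: int   # first frame index (inclusive)
    end: int     # last frame index (exclusive)


def events_to_segments(events, episode_length):
    """Turn (start_frame, label) events into contiguous labelled segments.

    Each segment runs until the next event starts; the last one runs to
    the end of the episode.
    """
    ordered = sorted(events)
    segments = []
    for i, (start, label) in enumerate(ordered):
        end = ordered[i + 1][0] if i + 1 < len(ordered) else episode_length
        if end > start:  # skip zero-length segments
            segments.append(Segment(label, start, end))
    return segments


# Hypothetical annotations for one "Hang keys on a hook" episode.
events = [
    (0, "walk to table"),
    (420, "grasp keys"),
    (610, "walk to hook"),
    (1300, "place keys"),
]
for seg in events_to_segments(events, episode_length=1720):
    print(f"{seg.label:<14} frames {seg.start:>4}-{seg.end:<4} ({seg.end - seg.start} frames)")
```

Segments like these can serve as training targets for a high-level policy that picks the next subtask, or as per-step checkpoints when scoring a rollout.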


Two Hugging Face formats

| Format | Dataset | Size | Best for |
| --- | --- | --- | --- |
| Raw | BitRobot/HIW-500 | ~10+ TB | Full ROS bag / MCAP, custom pipelines |
| LeRobot v3.0 | BitRobot/HIW-500-LeRobot | ~2.15 TB | ACT, diffusion policies, HF training stacks |

LeRobot's team re-encoded the full corpus:

"We re-encoded the full dataset into LeRobot format: ~10TB → ~2TB, no loss of fidelity. Same trajectories, a fraction of the footprint, far easier to stream." — @LeRobotHF

For researchers already on LeRobot — the same stack the Columbia Nori Bot paper uses for ACT training — HIW-500-LeRobot is the practical entry point. Parquet chunks at 100 MB, video chunks at 200 MB, 1,000 episodes per chunk.
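
The size difference matters most at download time. The sketch below estimates transfer time for both formats from the sizes quoted above. The link speeds are hypothetical, `download_hours` is an illustrative helper, and real sustained throughput is usually lower than the nominal rate.

```python
# Sizes quoted in this post, in terabytes (decimal: 1 TB = 1e12 bytes).
RAW_TB = 10.0       # "~10+ TB", so a lower bound
LEROBOT_TB = 2.15


def download_hours(size_tb: float, link_mbps: float) -> float:
    """Hours to transfer size_tb at a sustained link_mbps (megabits per second)."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6)
    return seconds / 3600


ratio = RAW_TB / LEROBOT_TB
print(f"size ratio raw/LeRobot: {ratio:.2f}x")

# Hypothetical link speeds in Mbit/s.
for mbps in (100, 1_000, 10_000):
    raw_h = download_hours(RAW_TB, mbps)
    lr_h = download_hours(LEROBOT_TB, mbps)
    print(f"{mbps:>6} Mbit/s: raw {raw_h:7.1f} h, LeRobot {lr_h:6.1f} h")
```

Disk space follows the same ratio, so the LeRobot release fits on a single large drive where the raw recordings generally need an array.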

Quick load check

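# Import path for recent LeRobot releases; older releases used
# lerobot.common.datasets.lerobot_dataset instead.
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Without an episode filter this may fetch the full multi-terabyte corpus.
dataset = LeRobotDataset("BitRobot/HIW-500-LeRobot")
print(dataset.num_episodes)   # dataset card lists 23,743
print(dataset.num_frames)     # dataset card lists 40,839,947
sample = dataset[0]           # one frame as a dict
print(sample.keys())          # observation.images.*, action, language_*, etc.

Verify against your installed LeRobot version: HIW-500 targets the LeRobot v3.0 dataset format, and the module path above has moved between releases.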


Who built it and who backs it

| Partner | Role |
| --- | --- |
| BitRobot | Dataset lead; "world's open robotics lab" |
| Unitree | G1 hardware platform |
| Hugging Face / LeRobot | Hosting + LeRobot conversion |
| Virtuals Protocol | Public supporter on X |

BitRobot's roadmap shows V1 in June 2026 (this release) and V2 with more tasks and environments later.


How HIW-500 fits the 2026 robotics stack

| Layer | HIW-500 role |
| --- | --- |
| Imitation learning | 500+ hours of human demos → ACT / diffusion / VLA training |
| Mobile manipulation | Whole-body teleop: walk, reach, grasp in one trajectory |
| Generalist policies | Home diversity beats lab repetition for OOD generalization |
| Simulation bridge | Feed NVIDIA Cosmos or Genesis-style sim for sim-to-real |
| Data collection ops | Contrast with Shift-free cleaning data collection in NYC: different geography, same "real world" thesis |

Consumer vs research gap: Nori L2 targets sub-$1k manipulators with OpenClaw scheduling. HIW-500 targets Unitree G1-class humanoids learning from hundreds of home hours. Same month, opposite ends of the cost curve — both need data.


Citation

@misc{hiw500_2026,
  title={HIW-500: Humanoids In-the-Wild Dataset for Robot Learning},
  author={BitRobot and Unitree and Hugging Face},
  year={2026},
  howpublished={\url{https://bitrobot-foundation.github.io/humanoids-in-the-wild-500-hours/}}
}

The project page includes a Rerun live preview for browsing episodes before downloading terabytes.

For commercial access beyond the public release, BitRobot offers a Request Data Access option on its project site.


Related ExplainX guides

Humanoids and datasets:

  • Nori L2: iPhone-price robot + LeRobot stack — sub-$1k manipulation vs G1 humanoid scale
  • Figure Helix-02 collaborative humanoid tidying — policy demo vs teleop dataset
  • Figure AI: robots outnumber humans at BotQ — factory humanoid production context
  • Genesis AI Eno agentic robot — foundation-model robotics stack
  • Shift-free cleaning: NYC robotics data collection — another real-world data thesis

Physical AI and training:

  • NVIDIA Cosmos 3 physical AI world models — simulation for robot training
  • What are AI agents? — from teleop demos to autonomous execution
  • What is OpenClaw? — proactive scheduling on cheap manipulators (Nori Bot lineage)

Primary sources: HIW-500 project page · LeRobot dataset · Raw dataset · @BitRobotNetwork · @UnitreeRobotics · @LeRobotHF


Summary

HIW-500 is the largest public humanoid teleop dataset from real homes as of June 23, 2026: 500+ hours, 23,743 episodes, 11 household tasks, 12 Southeast Asian homes, Unitree G1 whole-body control, and 148K+ subtask annotations. Download raw at ~10 TB or LeRobot v3.0 at ~2.15 TB on Hugging Face — same trajectories, easier streaming.

For anyone training mobile manipulation, bimanual skills, or home generalization policies, this is the dataset drop to benchmark against. Lab scores still matter; in-the-wild hours are what close the gap between demo videos and deployable humanoids.


Episode counts, file sizes, and task list reflect BitRobot's June 2026 V1 release and the Hugging Face dataset card. Re-check huggingface.co/datasets/BitRobot/HIW-500-LeRobot before large downloads — the corpus is multi-terabyte.
