A foundation model that treats movement as a first-class sequence domain — learning the structure of how bodies move through the physical world.
Foundation models proved themselves on text. The race now is the physical world.
In late 2025, Jeff Bezos took his first operational role since Amazon to co-found Project Prometheus — an AI company, since valued near $41B, built to make models that understand physical reality rather than just language. It is the loudest signal yet of a broad turn: the next era of AI is embodied.
Language was the first domain to fall to the foundation-model recipe because it was the easiest to digitize. It will not be the last. Any domain that unfolds in time and carries hard structural constraints — markets, biological signals, industrial telemetry, and above all movement — is a candidate for the same treatment.
LMM Technologies builds the foundation model for one of the most fundamental physical signals there is: how a body moves through space. Not a CAD assistant, not a robot — the motor prior underneath all of them.
We are not building artificial minds. We are manufacturing artificial instinct — compressing the structure of a domain into priors the way evolution did for living things, only on a vastly shorter timescale.
A foal stands within hours; a trader reads a tape at a glance. Expertise lives in priors built from exposure, not deliberation. Large models manufacture exactly this kind of prior.
The same recipe — large-scale training, learned tokenization, attention over long sequences — applies wherever an expert sees what a novice cannot. Motion is a prime case, and a largely untouched one.
Today's models are the bottom layer: manufactured instinct. The half still missing is learning — adapting from little because it stands on a great deal.
A Large Movement Model treats motion itself as the primary data type — sequences of joint positions, trajectories, and timing — and learns how movement unfolds the way a language model learns how text unfolds.
Video or sensor streams from any camera, frame rate, or skeleton format.
OpenPose BODY_25 keypoints — 25 joints with per-joint confidence.
Hip-centered, torso-scaled, resampled to 15 FPS with QC scoring.
Frame, window, and body-part views — motion as multi-resolution tokens.
Hierarchical transformer or diffusion inference over the sequence.
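As a rough illustration of the normalization and multi-view steps listed above, here is a minimal NumPy sketch. It is not the LMM pipeline itself. The torso-scale definition (neck to mid-hip distance, clip median), the mean-confidence QC score, the window length and stride, and the `BODY_PARTS` grouping are all assumptions made for this example. Only the BODY_25 joint order, hip-centering, torso-scaling, and the 15 FPS target come from the description above. The views it returns are raw array slices, not learned tokens.

```python
import numpy as np

MID_HIP, NECK = 8, 1   # BODY_25 indices for the mid-hip and neck joints
TARGET_FPS = 15.0      # resampling rate named in the pipeline description

# Hypothetical grouping of the 25 BODY_25 joints into body parts.
BODY_PARTS = {
    "head": [0, 15, 16, 17, 18],
    "torso": [1, 8],
    "right_arm": [2, 3, 4],
    "left_arm": [5, 6, 7],
    "right_leg": [9, 10, 11, 22, 23, 24],
    "left_leg": [12, 13, 14, 19, 20, 21],
}


def normalize_clip(keypoints, src_fps, target_fps=TARGET_FPS):
    """Normalize a clip of shape (T, 25, 3) holding x, y, confidence per joint."""
    kp = np.asarray(keypoints, dtype=float)
    xy, conf = kp[..., :2], kp[..., 2]

    # Hip-center: express each joint relative to the mid-hip of the same frame.
    centered = xy - xy[:, MID_HIP:MID_HIP + 1, :]

    # Torso-scale (assumed definition): divide by the median neck-to-mid-hip
    # distance over the clip, so one bad frame does not distort the scale.
    torso = np.linalg.norm(centered[:, NECK, :], axis=-1)
    scaled = centered / max(float(np.median(torso)), 1e-8)

    # Resample to the target frame rate by linear interpolation over time.
    n_src = scaled.shape[0]
    t_src = np.arange(n_src) / src_fps
    n_out = int(np.floor(t_src[-1] * target_fps + 1e-9)) + 1
    t_out = np.arange(n_out) / target_fps
    flat = scaled.reshape(n_src, -1)
    cols = [np.interp(t_out, t_src, flat[:, j]) for j in range(flat.shape[1])]
    resampled = np.stack(cols, axis=1).reshape(n_out, 25, 2)

    # QC score (assumed definition): mean detector confidence over the clip.
    qc = float(conf.mean())
    return resampled, qc


def motion_views(clip, window=15, stride=5):
    """Return frame, window, and body-part views; assumes at least `window` frames."""
    frames = clip.reshape(clip.shape[0], -1)                     # (T, 50)
    starts = range(0, clip.shape[0] - window + 1, stride)
    windows = np.stack([frames[s:s + window] for s in starts])   # (W, window, 50)
    parts = {name: clip[:, idx, :] for name, idx in BODY_PARTS.items()}
    return frames, windows, parts
```

With a two-second clip captured at 30 FPS (60 frames), `normalize_clip` returns 30 frames at 15 FPS with the mid-hip fixed at the origin, and `motion_views` then yields one 50-value vector per frame, four overlapping one-second windows, and six per-part joint streams.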
See the full technical overview — architecture, results, and cross-domain transfer →
The pipeline makes no assumption about what kind of movement it sees. In Phase I, motion dynamics learned on dance video transferred — essentially intact — to clinical motion capture the model had never seen.
Movement-quality analytics, recovery tracking, and compensatory-pattern detection across gait and balance.
Technique assessment, fatigue signals, injury-risk estimation, and biomechanical pattern analysis.
A pretrained human-motion prior for imitation learning, retargeting, and anticipatory human–robot coordination.
Workplace movement monitoring, physical-demand quantification, and real-time anomaly detection.
Gait recognition, anomalous-movement detection, and intent inference from body language.
Motion-aware interfaces that read intent from how a person moves, not just what they touch.
LMM internalizes that shoulders have range-of-motion limits, that gait is periodic, that balance demands continuous postural adjustment. This knowledge exists independently of any particular body.
Most humanoid and assistive robots are trained from scratch in simulation — expensive, with brittle sim-to-real transfer — or rely on hand-coded primitives. A model that already understands joint coordination, plausible trajectories, and multi-timescale dynamics gives a robot a head start: fine-tune from a rich prior instead of learning from zero.
The architecture lines up with the problem in a way that's hard to ignore: the HTT's three temporal levels correspond to a robot's actuator control, trajectory planning, and task sequencing, and the body-part stream is the kind of coordination signal that bimanual manipulation and whole-body balance demand. In principle, a 1–2 second forecast horizon is the right window for a robot to anticipate a human collaborator rather than merely react. Whether the motion prior transfers this way is a direction we're pursuing — not a result we're claiming.
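To make the forecast horizon concrete, the short sketch below converts it into frames. It assumes forecasting runs at the same 15 FPS the pipeline resamples to, which the text above does not state. `horizon_frames` is a hypothetical helper written for this example.

```python
import math

PIPELINE_FPS = 15  # assumed: forecasts run at the pipeline's resampled rate


def horizon_frames(seconds, fps=PIPELINE_FPS):
    """Number of future frames needed to cover a forecast horizon in seconds."""
    return math.ceil(seconds * fps)


lo_frames = horizon_frames(1.0)
hi_frames = horizon_frames(2.0)
print(lo_frames, hi_frames)  # 15 30
```

Under that assumption, a 1–2 second horizon means predicting 15 to 30 frames ahead.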
Pipeline, architectures, and Phase I results, including a diffusion model that cuts forecasting error by 30% and motion dynamics that transfer across capture modalities.
Read the research →
Why the word learning may be the wrong word, and what it means to build instinct first, deliberation second.
Read the note →
If today's models are manufactured instinct, the part still missing is learning, and the measure of it is how little data it takes.
Read the note →
A change of pace. When the grid goes down and "just Google it" stops working, the most valuable item in your kit might be a local AI model. On building an offline knowledge archive.
Read the dispatch →
For research, partnership, or pilot inquiries across rehabilitation, sports, robotics, and embodied systems.
contact@lmmtech.ai