AI can now predict your next move - with remarkable accuracy
12-09-2025

Self-driving cars are getting better at seeing. OmniPredict, a new AI system, aims to make them better at understanding. 

Instead of treating people as moving pixels, the system, created by researchers at Texas A&M University and KAIST, fuses what a camera sees with the broader context of the scene.

It then uses that information to forecast what pedestrians are likely to do next – hesitate, step off the curb, cross, or retreat.

“Cities are unpredictable. Pedestrians can be unpredictable,” said study co-author Srikanth Saripalli. “Our new model is a glimpse into a future where machines don’t just see what’s happening, they anticipate what humans are likely to do, too.”

OmniPredict is among the first systems to bring a multimodal large language model – the same class of technology behind advanced chatbots and visual question answering – into the pedestrian prediction loop. 

The result is an engine that recognizes posture, gaze, and motion cues. It also weighs those signals against context – such as road layout, vehicle approach, partial occlusions, and social cues – to generate real-time behavioral forecasts.

OmniPredict builds street smarts

Traditional autonomy stacks rely on vision networks trained on massive datasets to classify and track objects, then hand off to a separate module that predicts motion from recent trajectories.

That works well when the world behaves like the training set. It struggles when it doesn’t. 
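
As a rough illustration of that conventional prediction step (a minimal sketch, not the researchers' code), a constant-velocity extrapolation over a pedestrian's recent track might look like this; the function name, time step, and two-second horizon are illustrative assumptions.

```python
import numpy as np

def extrapolate_constant_velocity(track_xy, dt=0.1, horizon_s=2.0):
    """Predict future positions by extending the average recent velocity.

    track_xy: (N, 2) array of recent pedestrian positions, oldest first.
    dt: seconds between observations; horizon_s: how far ahead to predict.
    This is the kind of purely kinematic forecast a classical prediction
    module produces; it knows nothing about intent or scene context.
    """
    track_xy = np.asarray(track_xy, dtype=float)
    velocity = (track_xy[-1] - track_xy[0]) / (dt * (len(track_xy) - 1))
    steps = int(horizon_s / dt)
    future_t = dt * np.arange(1, steps + 1)[:, None]
    return track_xy[-1] + velocity * future_t  # (steps, 2) predicted positions

# Example: a pedestrian drifting along the curb at about 1 m/s.
recent = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]
print(extrapolate_constant_velocity(recent)[:3])
```

A forecast like this breaks down exactly where the article says it does: it extends past motion forward, so a pedestrian who pauses, turns, or hesitates is invisible to it until the trajectory has already changed.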

“Weather changes, people behaving unexpectedly, rare events and the chaotic elements of a city street all could possibly affect even the most sophisticated vision-based systems,” Saripalli noted.

OmniPredict tackles that fragility by combining perception with reasoning. Visual inputs are paired with contextual descriptors such as where the crosswalk is, whether a pedestrian is half-occluded by a parked van, or whether someone is looking toward the vehicle.

Those paired inputs are then passed to a multimodal LLM that has been adapted to the driving domain. Instead of merely extrapolating motion vectors, the model interprets the scene as a set of human intentions unfolding in time.

In practical terms, that allows it to distinguish “standing near the curb while chatting” from “weight shifting forward to step into the street,” and to adjust its forecast on the fly as cues change frame by frame.
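
The study's exact prompting and model interface are not reproduced here, but the general pattern of pairing a camera crop with scene descriptors and asking a multimodal LLM for an intent label could be sketched as follows. Every name below – the SceneContext fields, the intent labels, the query_multimodal_llm stand-in – is a hypothetical placeholder, not the authors' API.

```python
from dataclasses import dataclass

INTENT_LABELS = ["waiting", "hesitating", "crossing", "retreating"]  # illustrative label set

@dataclass
class SceneContext:
    """Contextual descriptors paired with the camera crop (hypothetical fields)."""
    crosswalk_nearby: bool
    pedestrian_occluded: bool
    looking_at_vehicle: bool
    vehicle_speed_mps: float

def build_prompt(ctx: SceneContext) -> str:
    """Render the structured scene context as text the multimodal LLM can reason over."""
    return (
        "You see a pedestrian in the attached image crop.\n"
        f"Crosswalk nearby: {ctx.crosswalk_nearby}. "
        f"Pedestrian partially occluded: {ctx.pedestrian_occluded}. "
        f"Looking toward the vehicle: {ctx.looking_at_vehicle}. "
        f"Ego vehicle speed: {ctx.vehicle_speed_mps:.1f} m/s.\n"
        f"Answer with exactly one label from {INTENT_LABELS}: "
        "what is the pedestrian most likely to do next?"
    )

def query_multimodal_llm(image: bytes, text: str) -> str:
    """Stand-in for the actual model call; swap in a real multimodal LLM client here."""
    return "hesitating"  # dummy answer so the sketch runs end to end

def predict_intent(image_crop: bytes, ctx: SceneContext) -> str:
    """Classify likely pedestrian behavior from the image crop plus scene context."""
    answer = query_multimodal_llm(image=image_crop, text=build_prompt(ctx)).strip().lower()
    return answer if answer in INTENT_LABELS else "waiting"  # fall back to a safe default

# Example: half-occluded pedestrian near a crosswalk, glancing at the approaching car.
ctx = SceneContext(crosswalk_nearby=True, pedestrian_occluded=True,
                   looking_at_vehicle=True, vehicle_speed_mps=8.5)
print(predict_intent(b"<jpeg bytes>", ctx))
```

The point of the pattern is that the language model receives the same kind of evidence a human driver uses – where the person is, what they can see, how fast the car is approaching – rather than a bare trajectory.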

Why anticipation matters

Humans make eye contact, read body language, and game out possibilities: if that person speeds up, if that stroller turns, if that cyclist wobbles. 

Autonomous vehicles need a comparable anticipatory layer to avoid over-braking, hesitating, or making abrupt maneuvers that cause their own hazards. 

“It opens the doors for safer autonomous vehicle operation, fewer pedestrian-related incidents and a shift from reacting to proactively preventing danger,” Saripalli said. 

The psychological dividend could be real, too. Picture standing at a crosswalk knowing the vehicle has already factored in your likely next move.

“Fewer tense standoffs. Fewer near-misses. Streets might even flow more freely. All because vehicles understand not only motion, but most importantly, motives,” noted Saripalli.

Human cues in complex settings

The team sees applications well beyond city streets. Systems that can read posture changes, hesitation, stress signals, and threat cues could assist soldiers and first responders.

These systems may also offer valuable support to security teams working in fast-moving, high-stakes environments.

“We are opening the door for exciting applications,” Saripalli said. “For instance, the possibility of a machine to capably detect, recognize, and predict outcomes of a person displaying threatening cues could have important implications.” 

“Our goal in the project isn’t to replace humans, but to help augment them with a smarter partner.”

Putting OmniPredict to the test

To gauge whether this reasoning-heavy approach generalizes, the researchers evaluated OmniPredict on two of the toughest pedestrian behavior benchmarks – JAAD and WiDEVIEW – without specialized fine-tuning for those datasets. 

The model delivered 67% accuracy, outperforming state-of-the-art baselines by about 10 percent.

Crucially, it held up when the team added harder cases: partially hidden walkers, people glancing toward the vehicle, mixed lighting, and varied street geometries. 

Response latency stayed low, and performance transferred across different road contexts, both encouraging signs for eventual field deployment.
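
For a sense of what that accuracy figure measures, here is a minimal scoring loop. It assumes a simplified data format of (frames, context, true_label) tuples rather than the benchmarks' actual schemas, and a model queried as-is with no fine-tuning.

```python
def zero_shot_accuracy(clips, predict_fn):
    """Fraction of clips whose predicted behavior label matches the annotation.

    clips: iterable of (frames, context, true_label) tuples (simplified format);
    predict_fn: the prediction model, used without fine-tuning on the benchmark.
    """
    correct = total = 0
    for frames, context, true_label in clips:
        total += 1
        correct += int(predict_fn(frames, context) == true_label)
    return correct / total if total else 0.0
```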

“OmniPredict’s performance is exciting, and its flexibility hints at much broader real-world potential,” Saripalli said.

Driving that feels more human-aware

OmniPredict doesn’t discard classical autonomy. It complements it. Camera and sensor perception still identify agents and localize the vehicle. Planning still obeys traffic rules and comfort limits. 

The upgrade happens in the middle: the prediction layer becomes context-aware and intention-sensitive.

That shift lets planners be both safer and smoother – leaving margin for a tentative step off the curb, committing sooner when a pedestrian is clearly yielding, and slowing preemptively when body language signals uncertainty.

The payoff is fewer hard brakes, fewer ambiguous nudges, and driving that feels more human-aware.
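
To make that hand-off concrete, here is one way an intention-sensitive prediction could bias a planner's speed decision. This is purely a sketch: the intent labels, confidence thresholds, and speed values are made up for illustration, not taken from the study.

```python
def target_speed_mps(current_speed, intent, confidence,
                     cruise=11.0, cautious=6.0, crawl=2.0):
    """Map a predicted pedestrian intent to a target speed for the planner.

    intent and confidence come from the prediction layer; the thresholds
    and speeds here are illustrative, not calibrated values from the paper.
    """
    if intent == "crossing" and confidence > 0.6:
        return crawl                              # commit to yielding early and smoothly
    if intent == "hesitating" or confidence < 0.4:
        return min(current_speed, cautious)       # uncertainty: slow preemptively
    if intent in ("waiting", "retreating") and confidence > 0.7:
        return cruise                             # pedestrian is clearly yielding: proceed
    return min(current_speed, cautious)           # default to the conservative option

# Example: a hesitating pedestrian near the curb keeps the vehicle at 6 m/s.
print(target_speed_mps(current_speed=9.0, intent="hesitating", confidence=0.8))
```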

Limitations, ethics, and the road ahead

For now, OmniPredict is a research prototype, not a production system. It needs broader testing across cultures, weather, and urban designs, as well as guardrails to prevent over-confidence and careful attention to privacy when interpreting human cues. 

Any tool that “reads” people must be transparent about what it infers and why, and must be evaluated for bias across body types, mobility aids, clothing styles, and behaviors.

Still, the direction is clear: autonomy that leans less on brute-force pattern matching and more on behavioral reasoning. 

“OmniPredict doesn’t just see what we do, it understands why we do it and can now predict when we are likely to do an action,” Saripalli said. 

If vehicles can reliably anticipate the next few seconds of human behavior – and plan with that in mind – the industry edges closer to streets that feel calmer, safer, and more cooperative for everyone on them.

The study is published in the journal Computers & Electrical Engineering.
