Humans glance at a scene and instantly know what action to take – stroll, pedal, or dive. Artificial intelligence (AI), despite headline‑grabbing advances, still struggles with that snap judgment.
PhD student Clemens Bartnik of the University of Amsterdam and colleagues used brain‑scanning to show why the gap remains.
In 1979, psychologist James Gibson coined the term affordances to describe the actions that objects and environments invite. The new Amsterdam work places that idea squarely in the living human brain.
Participants lay in a scanner and viewed snapshots of shorelines, staircases, and alleyways. They pressed a button to pick walking, cycling, driving, swimming, boating, or climbing while the machine tracked blood flow in visual areas.
Activity patterns in the visual cortex changed not just with what was visible but with what the body could do in the scene. “These action possibilities are therefore processed automatically,” said lead scientist Iris Groen.
The signature appeared even when volunteers made no explicit choice about movement. That means the brain tags potential actions as part of its basic visual processing, well before conscious deliberation.
Earlier work hinted at such fast coding for grasping tools, yet locomotion is broader and demands constant spatial updating.
By isolating the signal in higher‑order scene regions, the team showed that it reflects a dedicated circuit rather than a by‑product of object recognition.
Even in early development, humans link sight with movement. Babies crawl toward open spaces and avoid drop-offs not because they understand height, but because their bodies learn consequences through trial and error.
This tight loop between action and feedback trains the brain to anticipate what a space allows. By the time we’re adults, these patterns run automatically, helping us judge what’s possible in a split second.
Vision systems built on deep neural networks excel at labeling objects or entire scenes. But when the researchers fed the same photos to leading models, the machines misjudged the feasible actions about one‑quarter of the time.
“Even the best AI models don’t give exactly the same answers as humans,” Groen noted. Large language‑vision hybrids such as GPT‑4 improve only after extra training on affordance labels.
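To get a feel for what such a probe looks like in practice, here is a minimal sketch that scores a single photo against the study's six action labels using an off‑the‑shelf vision‑language model (CLIP, loaded through the Hugging Face transformers library). The model choice and the image path scene.jpg are placeholder assumptions; the snippet illustrates the general idea of zero‑shot affordance labeling, not the authors' actual evaluation pipeline.

```python
# Minimal sketch: zero-shot scoring of a photo against the study's six
# action labels with a pretrained vision-language model (CLIP).
# Illustrative only -- not the authors' evaluation pipeline.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

ACTIONS = ["walking", "cycling", "driving", "swimming", "boating", "climbing"]
PROMPTS = [f"a place suitable for {a}" for a in ACTIONS]

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("scene.jpg")  # placeholder path to a scene photo

inputs = processor(text=PROMPTS, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, 6)
probs = logits.softmax(dim=-1).squeeze(0)

# Print the actions ranked by how plausible the model finds them.
for action, p in sorted(zip(ACTIONS, probs.tolist()), key=lambda x: -x[1]):
    print(f"{action:10s} {p:.2f}")
```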
Analysis of the networks’ hidden layers revealed weak alignment with the fMRI patterns. The difference suggests current architectures ignore geometric and bodily constraints that matter to humans.
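Alignment of this kind is commonly measured with representational similarity analysis: build a matrix of pairwise dissimilarities between images from the brain responses, do the same for a network layer, and correlate the two. The sketch below uses random arrays as stand‑ins for the real fMRI patterns and layer activations; it shows the shape of the analysis, not the study's data or code.

```python
# Minimal sketch of representational similarity analysis (RSA).
# Random arrays stand in for real fMRI voxel patterns and layer activations.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_images = 90  # placeholder number of scene photos

fmri_patterns = rng.normal(size=(n_images, 500))        # images x voxels
layer_activations = rng.normal(size=(n_images, 2048))   # images x network units

# Representational dissimilarity matrices: pairwise distances between images.
brain_rdm = pdist(fmri_patterns, metric="correlation")
model_rdm = pdist(layer_activations, metric="correlation")

# Alignment score: rank correlation between the two RDMs.
rho, _ = spearmanr(brain_rdm, model_rdm)
print(f"brain-model RDM correlation: {rho:.3f}")
```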
What makes the human edge even sharper is that we’ve spent our entire lives testing these environments. The sensorimotor system doesn’t just interpret images; it overlays them with memories of movement, pain, balance, and success.
AI models don’t grow up in a world of slippery floors, steep curbs, or off‑trail adventures. They haven’t fallen on ice or scrambled over rocks, and that limits their ability to map pictures to possible actions with the same nuance.
Training gargantuan models consumes megawatt‑hours and tons of carbon. If engineers can borrow the brain’s lean affordance code, future systems might reach better decisions with fewer parameters.
Robots navigating rubble, drones flying through forests, and wheelchairs plotting ramps all need that frugal insight. Instead of photographing every walkway on Earth, designers could hard‑wire a few spatial heuristics and let each system learn the rest on‑site.
Energy savings translate to slimmer batteries and wider access outside big tech campuses. Hospitals, schools, and small‑town emergency crews stand to gain from models that think more like the people they serve.
Disaster‑response robots already use lidar and stereo cameras, yet they fail when smoke or dust hides surfaces. A cortex‑inspired layer could fill gaps by inferring where treads may grip or where water flows.
Virtual‑reality therapists are also watching the work. Stroke patients relearn walking faster when simulations adjust paths to match perceived affordances, not textbook dimensions.
Self‑driving cars face the nuance of a bike lane merging with a crosswalk at dusk. Embedding affordance‑aware scene understanding might cut false positives and avoid the abrupt braking that unnerves riders.
Researchers still debate whether affordance maps arise from vision alone or feed back from motor plans. Future experiments will likely combine fMRI with muscle recordings to trace the loop.
Another unknown is how culture tweaks perception. A skateboarder and a hiker read the same staircase differently, and algorithms may need similar flexibility.
The findings remind us that seeing is inseparable from doing. Our eyes deliver a running forecast of possible moves, shaping intuition long before words enter the chat box.
Acknowledging that layered wisdom could steer AI toward tools that extend rather than replace human ability. Nature’s shortcut may yet teach silicon to tread lightly while thinking ahead.
The study is published in Proceedings of the National Academy of Sciences.