Satellites have learned to find things on their own. Here’s what this means:

For the first time, an Earth observation satellite has found what it was looking for on its own, without help from human analysts on the ground. This milestone, which occurred in April, marks the first reported use of a vision language model in orbit and provides a glimpse into how AI could fundamentally transform the capabilities and value of space-based sensors.

Typically, satellites downlink large amounts of data to analysts on Earth, who use machine learning algorithms or their own eyes to figure out what’s going on. But Yam-9, a spacecraft built by space infrastructure company Loft Orbital, carried a software package built by NASA’s Jet Propulsion Laboratory that responded to natural language queries to identify areas of interest.

The vision language model (VLM) that powered the demonstration, Google DeepMind’s Gemma 3, was built specifically for edge applications. That is, it is designed to run on limited hardware far from data centers. VLMs combine the contextual understanding of large language models with image analysis capabilities. Researchers asked the model to find, for example, places in the sensor data where the natural environment meets human development, or to identify infrastructure around rail hubs, and it did.

This demonstration is important for two reasons. In the short term, performing initial data classification from orbit could make space sensors much more useful by reducing the flood of raw data that analysts currently have to process. In the long term, this is a proof point for running large-scale AI infrastructure in space.

“This opens the door to enabling an always-on, patrol layer in space,” Paul Lasserre, head of AI at Loft, told TechCrunch. “If you have a VLM, you can interact with the satellite with logic like, ‘Monitor this perimeter and let me know if anything looks suspicious.’”

Loft’s spacecraft are designed as platforms for third-party customers. The business model is closer to infrastructure-as-a-service than traditional satellite manufacturing. In one recent deal, the company agreed to build, launch and operate six new satellites for EarthDaily, which will analyze and sell data collected from the spacecraft. Yam-9 was launched in the fall of 2025 as a pathfinder for the company’s orbital AI project and carries Nvidia’s Jetson AGX Orin, one of the key processors used in space computing.

Juan Delfa Victoria, technical lead of NASA JPL’s AI group, led the development of NAVI-Orbital, a software package that serves as a harness for the Gemma 3 VLM. Gemma 3 is already publicly available, but software engineers needed to streamline the package to reduce the number of libraries and the amount of memory required.

Although this is the first reported use of a VLM in orbit, other companies are expected to follow suit. Planet Labs flies satellites with Jetson Orin processors. For now, it’s using them for simpler object detection tasks, but a spokesperson said research is underway on other AI applications, including VLMs.

Kepler Communications, which operates the largest GPU cluster in space, declined to say whether it has deployed a VLM in space, citing NDAs with partners, but noted that there have been “several undisclosed use cases for computing environments” since the spacecraft launched in January.

“Now that we have proven the concept, this is really the direction of travel,” Lasserre said. The goal is to build a constellation of 50 to 100 satellites like Yam-9 to ensure real-time coverage anywhere on Earth. (Loft currently operates 12 spacecraft in orbit.)

The lessons learned from deploying these small models in orbit will inform how companies attempt to deploy large-scale computing infrastructure in space, especially in the mundane but essential areas of power and memory management.

It may also pave the way for new scientific tools. The idea for NAVI-Space originated with JPL researcher Taran Cyriac John, who was thinking about a digital assistant for astronauts exploring the Moon or Mars.

“We’re thinking, we have astronauts in compression suits and we know they can’t tap on a keyboard. Whatever they want is complicated,” Delfa Victoria said. “So what about providing an assistant that can view conversational AI like in a video game or movie?”

Don’t call it HAL 9000.
