Talking to Earth: a first-generation AI digital assistant

A recent initiative funded by ESA Φ-lab focuses on creating an AI-powered digital assistant that allows users to access and explore complex EO data through a natural language interface. In the long term, the aim is to integrate this tool into digital twins of Earth, supporting decision-making in areas such as climate monitoring, disaster management and urban planning.
A digital twin of the Earth environment is an interactive “digital replica” that allows us to understand the various relationships between the physical and natural Earth environments and society. It enables scientists to quantify past, present and future events on our planet, integrating models, observations, and technologies such as Artificial Intelligence (AI) to improve our understanding of the human impact on the global environment and society.
Through data and simulations, digital twins allow for real-time prediction, monitoring, control and optimisation of Earth’s natural and physical processes. Two related flagship programmes are the European Commission’s Destination Earth (DestinE) and the European Space Agency’s Digital Twin Earth (DTE).
In recent years, rapid advances in Earth Observation (EO) have increased the amount of EO data available to be processed for the benefit of society, creating an opportunity to harness new technologies such as AI. The growing European EO capability is delivering a unique and dynamic picture of Earth, but this information remains largely unexploited because of current limitations in querying and retrieval capabilities, and because non-experts lack an appropriate interface for interacting with digital twins.
Technologies such as Computer Vision and Natural Language Processing – used for object detection/EO image segmentation and for interpretation of human language, respectively – have been treated as separate areas, without benefitting from each other. But what if we could use these technologies together to create a new way to interact with and understand EO data?
To that end, ESA Φ-lab is funding the creation of a digital assistant interface for digital twins of Earth – Demonstrator Precursor Digital Assistant Interface for Digital Twin Earth (DA4DTE) – in collaboration with e-GEOS (Italy), the National and Kapodistrian University of Athens (Greece) and the Technical University of Berlin (Germany). Φ-lab already has a proven track record in the development of foundation models, which serve as the basis for digital assistants.
The developed digital assistant prototype interacts with users via two modalities, text and satellite images, and consists of four back-end engines: search-by-image, search-by-text, knowledge graph question answering (KGQA), and visual question answering (VQA) engines. These engines are orchestrated by the task interpreter to answer complex requests of users looking for EO data.
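The article does not describe how the task interpreter decides which engine to call, so the following is only an illustrative sketch of the routing idea, assuming a simple rule-based dispatcher. The function name `route`, the engine labels and the rules themselves are hypothetical and are not taken from DA4DTE.

```python
# Hypothetical sketch: a rule-based stand-in for a task interpreter that
# picks one of the four back-end engines from the two input modalities.
# The real DA4DTE interpreter is not described in the article.

def route(text=None, image=None):
    """Return the name of the engine that should handle the request."""
    has_text = bool(text and text.strip())
    if image is not None and has_text:
        # A question or instruction about a supplied image.
        return "vqa"
    if image is not None:
        # An image on its own is treated as a query-by-example.
        return "search-by-image"
    if has_text and text.strip().endswith("?"):
        # Assumption: free-standing questions go to the knowledge graph.
        return "kgqa"
    if has_text:
        # Any other text is treated as a description of wanted images.
        return "search-by-text"
    raise ValueError("a request needs text, an image, or both")


if __name__ == "__main__":
    print(route(text="Show me 3 pictures of rivers in Italy"))
    print(route(text="Count the number of buildings in this area",
                image=b"<image bytes>"))
```

A production interpreter would more plausibly use a language model to classify and decompose requests; the fixed rules above only make the orchestration role concrete.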
The Knowledge Graph Question-Answering Engine is built on a knowledge graph that integrates geospatial data from widely used geographical knowledge bases, such as OpenStreetMap, along with metadata from Copernicus missions. This knowledge graph is combined with Large Language Models (LLMs) such as Llama, GPT, and Mistral.
The search-by-image engine is based on a newly developed technique called Cross-Modal Masked AutoEncoder (CM-MAE). This method features modal-agnostic, self-supervised feature characterisation, making it well-suited for cross-modal image retrieval tasks. Additionally, the engine incorporates deep hashing modules to map cross-modal embeddings into compact binary hash codes, ensuring efficient and scalable data storage and retrieval.
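To illustrate why binary hash codes make retrieval compact and fast, here is a minimal sketch of hash-based search. It assumes a fixed projection followed by a sign threshold in place of the engine's learned deep hashing modules, and it ranks archive items by Hamming distance. All names (`hash_code`, `hamming`, `search`) and the toy data are hypothetical.

```python
# Minimal sketch of retrieval with binary hash codes. The projection here is
# a plain matrix supplied by the caller; it stands in for the learned deep
# hashing modules mentioned in the article, which are not reproduced.

def hash_code(embedding, projection):
    """Map a real-valued embedding to an integer whose bits form the code."""
    code = 0
    for row in projection:
        dot = sum(w * x for w, x in zip(row, embedding))
        # One bit per projection row: 1 if the projection is non-negative.
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def search(query_code, archive, k=3):
    """Return the names of the k archive items closest to the query code."""
    return sorted(archive, key=lambda name: hamming(query_code, archive[name]))[:k]


if __name__ == "__main__":
    projection = [[1.0, 0.0], [0.0, 1.0]]          # toy 2-bit projection
    archive = {
        "tile_a": hash_code([0.9, 0.8], projection),
        "tile_b": hash_code([-0.7, -0.2], projection),
        "tile_c": hash_code([0.5, -0.3], projection),
    }
    query = hash_code([0.6, 0.4], projection)
    print(search(query, archive, k=2))
```

Because each item is stored as a short bit string and compared with an XOR and a bit count, the archive stays small and lookups stay cheap even as it grows.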
Ultimately, the idea is that users, whether EO experts or not, will be able to run semantic queries on EO data archives, such as “Show me 3 pictures of rivers in Italy, with a vegetation coverage over 20%, taken after May 2020” or “Count the number of buildings in this area”. The digital assistant will help to answer questions in several EO-related data domains – agriculture, forest, urban, marine and cryosphere, among others – contributing to improved decision-making.
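As a rough illustration of what such a request might look like once it has been interpreted, the sketch below expresses the first example query as a structured filter over image metadata. The `EOQuery` fields, the record layout and the reading of “after May 2020” as strictly after 31 May 2020 are all assumptions made for this example, not the DA4DTE data model.

```python
# Hypothetical sketch: the rivers-in-Italy request as a structured filter.
# Field names and the metadata layout are invented for illustration.
from dataclasses import dataclass
from datetime import date


@dataclass
class EOQuery:
    land_feature: str          # e.g. "river"
    country: str               # e.g. "Italy"
    min_vegetation_pct: float  # vegetation coverage must exceed this value
    acquired_after: date       # acquisition date must be later than this
    limit: int                 # maximum number of images to return


def run_query(query, records):
    """Return up to query.limit metadata records that match every condition."""
    hits = [
        r for r in records
        if query.land_feature in r["labels"]
        and r["country"] == query.country
        and r["vegetation_pct"] > query.min_vegetation_pct
        and r["acquired"] > query.acquired_after
    ]
    return hits[:query.limit]


if __name__ == "__main__":
    # Assumption: "after May 2020" means after the last day of that month.
    query = EOQuery("river", "Italy", 20.0, date(2020, 5, 31), 3)
    records = [
        {"id": "img1", "labels": ["river"], "country": "Italy",
         "vegetation_pct": 35.0, "acquired": date(2021, 3, 2)},
        {"id": "img2", "labels": ["river"], "country": "Italy",
         "vegetation_pct": 10.0, "acquired": date(2021, 7, 9)},
    ]
    print([r["id"] for r in run_query(query, records)])
```

In the assistant itself, turning the sentence into such a structure is the job of the language and retrieval engines; the sketch only shows the kind of conditions the sentence encodes.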
“As digital twins of Earth become increasingly sophisticated, the ability to interact with them is crucial. The digital assistant for Digital Twin Earth represents a major step forward in making complex Earth Observation data more accessible”, comments Nicolas Longépé, Earth Observation Data Scientist at ESA Φ-lab. “By allowing users to ask questions in natural language and receive insightful, data-driven responses, this tool lowers the barrier to entry and accelerates the use of EO data for decision-making. Whether it is for climate monitoring, disaster response, or environmental management, having an intelligent interface to navigate and interpret vast datasets is essential for informed action.”
All the developed back-end engines are available as open source with a permissive licence here. The precursor digital assistant prototype is available here. A presentation titled “A digital assistant for digital twins of the Earth” will take place during the Living Planet Symposium, held from 23 to 27 June 2025 in Vienna, Austria.
To learn more: ESA Φ-lab, Digital Twin Earth, Destination Earth
Photo courtesy of Unsplash/Carl Wang