Opportunities in AI/ML for the Rubin LSST Dark Energy Science Collaboration
Authors
LSST Dark Energy Science Collaboration, Eric Aubourg, Camille Avestruz, Matthew R. Becker, Biswajit Biswas, Rahul Biswas, Boris Bolliet, Adam S. Bolton, Clecio R. Bom, Raphaël Bonnet-Guerrini, Alexandre Boucaud, Jean-Eric Campagne, Chihway Chang, Aleksandra Ćiprijanović, Johann Cohen-Tanugi, Michael W. Coughlin, John Franklin Crenshaw, Juan C. Cuevas-Tello, Juan de Vicente, Seth W. Digel, Steven Dillmann, Mariano Javier de León Dominguez Romero, Alex Drlica-Wagner, Sydney Erickson, Alexander T. Gagliano, Christos Georgiou, Aritra Ghosh, Matthew Grayling, Kirill A. Grishin, Alan Heavens, Lindsay R. House, Mustapha Ishak, Wassim Kabalan, Arun Kannawadi, François Lanusse, C. Danielle Leonard, Pierre-François Léget, Michelle Lochner, Yao-Yuan Mao, Peter Melchior, Grant Merz, Martin Millon, Anais Möller, Gautham Narayan, Yuuki Omori, Hiranya Peiris, Laurence Perreault-Levasseur, Andrés A. Plazas Malagón, Nesar Ramachandra, Benjamin Remy, Cécile Roucelle, Jaime Ruiz-Zapatero, Stefan Schuldt, Ignacio Sevilla-Noarbe, Ved G. Shah, Tjitske Starkenburg, Stephen Thorp, Laura Toribio San Cipriano, Tilman Tröster, Roberto Trotta, Padma Venkatraman, Amanda Wasserman, Tim White, Justine Zeghal, Tianqing Zhang, Yuanyuan Zhang
Abstract
The Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST) will produce unprecedented volumes of heterogeneous astronomical data (images, catalogs, and alerts) that challenge traditional analysis pipelines. The LSST Dark Energy Science Collaboration (DESC) aims to derive robust constraints on dark energy and dark matter from these data, requiring methods that are statistically powerful, scalable, and operationally reliable. Artificial intelligence and machine learning (AI/ML) are already embedded across DESC science workflows, from photometric redshifts and transient classification to weak lensing inference and cosmological simulations. Yet their utility for precision cosmology hinges on trustworthy uncertainty quantification, robustness to covariate shift and model misspecification, and reproducible integration within scientific pipelines. This white paper surveys the current landscape of AI/ML across DESC's primary cosmological probes and cross-cutting analyses, revealing that the same core methodologies and fundamental challenges recur across disparate science cases. Since progress on these cross-cutting challenges would benefit multiple probes simultaneously, we identify key methodological research priorities, including Bayesian inference at scale, physics-informed methods, validation frameworks, and active learning for discovery. With an eye on emerging techniques, we also explore the potential of the latest foundation model methodologies and LLM-driven agentic AI systems to reshape DESC workflows, provided their deployment is coupled with rigorous evaluation and governance. Finally, we discuss critical software, computing, data infrastructure, and human capital requirements for the successful deployment of these new methodologies, and consider associated risks and opportunities for broader coordination with external actors.
Concepts
The Big Picture
Starting in 2025, a camera the size of a small car began scanning the entire southern sky every few nights from a mountaintop in Chile. The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) will spend a decade photographing 20 billion galaxies, generating roughly 20 terabytes of data every single night. That’s like downloading the entire Library of Congress twice, daily, for ten years.
The goal: understanding the two biggest mysteries in physics. Dark energy, the unknown force accelerating the universe’s expansion. Dark matter, the invisible substance shaping how galaxies cluster and move. But traditional analysis tools will buckle under LSST’s data volume. A new white paper from the LSST Dark Energy Science Collaboration (DESC), with more than 60 scientists from four continents, outlines how AI and machine learning must evolve to meet this challenge.
Key Insight: The same core AI/ML challenges (uncertain predictions, distributional shift, and lack of interpretability) appear across every major dark energy measurement technique. Focused investment in a few methodological areas could produce gains across the entire field at once.
How It Works
DESC isn’t starting from scratch. AI and machine learning are already woven into its pipelines.
Neural networks already estimate photometric redshifts, gauging galaxy distances from broadband colors rather than time-consuming spectroscopy. ML classifiers sort millions of nightly alerts into supernovae, asteroids, and active galactic nuclei in real time. And for weak gravitational lensing, where tiny distortions in background galaxy shapes trace the distribution of dark matter, deep learning is increasingly taking over shape measurement.
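To make the photo-z idea concrete, here is a toy sketch in Python: a small neural network learns to map band magnitudes to redshift. The catalog, the color-redshift trend, and the network size are all invented for illustration; DESC's production estimators are far more sophisticated.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Toy stand-in for a training catalog: six LSST-like band magnitudes
# (ugrizy) following a made-up monotonic color-redshift trend plus noise.
rng = np.random.default_rng(42)
n = 5_000
z = rng.uniform(0.0, 2.0, size=n)
mags = 22.0 + np.outer(z, np.linspace(0.3, 1.2, 6)) + rng.normal(0.0, 0.1, (n, 6))

X_train, X_test, z_train, z_test = train_test_split(mags, z, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X_train, z_train)
print("test scatter:", np.std(model.predict(X_test) - z_test))
```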

The paper spots a gap, though. Most current AI tools aren’t ready for precision cosmology, which demands error bars you can actually trust. If a neural network reports a 90% confidence interval for a galaxy’s redshift, the true redshift needs to fall inside that interval 90% of the time, not just on the training set, but on real LSST data that will look different from any simulation used to train it. That mismatch is covariate shift, and it shows up everywhere: simulations never perfectly match reality, and instruments drift as they age.
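What trustworthy error bars mean can be checked in a few lines. Here is a minimal sketch, assuming each galaxy comes with a Gaussian predictive distribution: measure empirical coverage, i.e., how often the stated 90% interval actually contains the truth. On shifted data, a miscalibrated model will fail this test.

```python
import numpy as np
from scipy import stats

def empirical_coverage(z_true, z_pred, sigma_pred, level=0.90):
    """Fraction of galaxies whose true redshift lands inside the central
    `level` interval of a Gaussian predictive distribution."""
    half_width = stats.norm.ppf(0.5 + level / 2.0) * sigma_pred
    return np.mean(np.abs(z_true - z_pred) <= half_width)

# Toy held-out set: an honest, unbiased predictor should score close to 0.90.
rng = np.random.default_rng(0)
z_true = rng.uniform(0.1, 1.5, size=10_000)
sigma = 0.05 * (1 + z_true)               # typical photo-z scatter scaling
z_pred = z_true + rng.normal(0.0, sigma)  # errors consistent with stated sigma
print(f"coverage at 90%: {empirical_coverage(z_true, z_pred, sigma):.3f}")
```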
Four methodological priorities cut across all the science cases:
- Bayesian inference at scale: Simulation-based inference (SBI) learns the link between model parameters and data from thousands of simulations, sidestepping likelihoods that can’t be written down. Normalizing flows flexibly model full probability distributions rather than single best-fit values. Both can replace unreliable point estimates, but scaling them to LSST’s volume requires new algorithmic approaches (a minimal sketch follows this list).
- Physics-informed methods: Embedding known physical laws directly into neural network architectures (how gravity bends light, how galaxies cluster) reduces training data requirements and improves reliability when data falls outside the training distribution (second sketch below).
- Validation frameworks: Protocols to test whether AI systems fail gracefully or catastrophically when faced with unexpected inputs, before such failures corrupt a decade of measurements.
- Active learning for discovery: Intelligently selecting which objects warrant expensive spectroscopic follow-up, rather than sampling at random, could multiply the scientific yield of complementary surveys (third sketch below).
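To ground the first bullet, here is a minimal sketch of neural posterior estimation, the workhorse behind SBI: train a conditional density model on (parameter, simulated data) pairs, then evaluate it at the observed data. To stay short, this sketch uses a Gaussian head in place of a normalizing flow, and the one-parameter simulator and uniform prior are toy assumptions.

```python
import torch
from torch import nn

# Toy simulator: data x is the parameter theta plus Gaussian noise.
def simulate(theta):
    return theta + 0.2 * torch.randn_like(theta)

# Conditional density q(theta | x): a tiny net outputs a Gaussian's mean and
# log-std. Real SBI pipelines would use a normalizing flow here instead.
net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2_000):
    theta = torch.rand(256, 1) * 2 - 1  # prior: Uniform(-1, 1)
    x = simulate(theta)
    mu, log_sigma = net(x).chunk(2, dim=1)
    # Minimize the negative log of q(theta | x): the standard NPE loss.
    loss = (0.5 * ((theta - mu) / log_sigma.exp()) ** 2 + log_sigma).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# "Observe" x = 0.3; the learned posterior should sit near N(0.3, 0.2).
mu, log_sigma = net(torch.tensor([[0.3]])).chunk(2, dim=1)
print(f"posterior ~ N({mu.item():.2f}, {log_sigma.exp().item():.2f})")
```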
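For the physics-informed bullet, the canonical toy is a network fit to sparse noisy data while being penalized for violating a known law. Below, the decay equation df/dt + f = 0 stands in for the astrophysical constraints mentioned above; the law, the data, and the loss weighting are all illustrative.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# Only three noisy observations of the true solution f(t) = exp(-t).
t_data = torch.tensor([[0.0], [1.0], [2.0]])
f_data = torch.exp(-t_data) + 0.05 * torch.randn_like(t_data)

for step in range(3_000):
    data_loss = ((net(t_data) - f_data) ** 2).mean()  # fit the scarce data
    # Physics term: penalize the residual of df/dt + f = 0 at random points.
    t_col = (torch.rand(64, 1) * 3.0).requires_grad_(True)
    f = net(t_col)
    dfdt = torch.autograd.grad(f.sum(), t_col, create_graph=True)[0]
    phys_loss = ((dfdt + f) ** 2).mean()
    loss = data_loss + phys_loss
    opt.zero_grad(); loss.backward(); opt.step()

print("f(1.5) ≈", net(torch.tensor([[1.5]])).item(),
      "vs exp(-1.5) =", torch.exp(torch.tensor(-1.5)).item())
```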
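And for the active-learning bullet, the simplest strategy is uncertainty sampling: train a classifier on the small labeled pool, then queue the objects it is least sure about for costly spectroscopic follow-up. The features, pool sizes, and entropy criterion here are placeholder choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
# Toy pools; in practice the features might be colors or light-curve summaries.
X_labeled = rng.normal(size=(200, 5))
y_labeled = (X_labeled[:, 0] + X_labeled[:, 1] > 0).astype(int)
X_pool = rng.normal(size=(10_000, 5))

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_labeled, y_labeled)

# Rank unlabeled objects by predictive entropy; the most ambiguous ones
# are the most informative targets for spectroscopic follow-up.
p = clf.predict_proba(X_pool)
entropy = -(p * np.log(np.clip(p, 1e-12, None))).sum(axis=1)
follow_up = np.argsort(entropy)[-50:]
print("queued for follow-up:", follow_up[:5], "...")
```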
The paper also looks ahead to foundation models (massive AI systems pre-trained on enormous datasets and then fine-tuned for many tasks) and agentic AI systems. The idea mirrors what large language models did for natural language processing: a single model trained on broad astronomical data could serve as a universal backbone for dozens of downstream analyses.
Early examples like AstroCLIP and UniverseNet have shown promise. LLM-driven agents that write code, run analyses, interpret results, and iterate autonomously might eventually handle entire pipeline segments, from raw images all the way to cosmological parameters.
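A flavor of how such a backbone gets trained, in miniature: CLIP-style contrastive learning pulls embeddings of two views of the same object (say, an image and a spectrum) toward each other, so one shared representation can later feed many downstream tasks. This is a generic sketch of the technique with placeholder linear encoders, not AstroCLIP's actual architecture.

```python
import torch
import torch.nn.functional as F
from torch import nn

img_enc = nn.Linear(32, 16)   # stand-in for a real image encoder
spec_enc = nn.Linear(48, 16)  # stand-in for a real spectrum encoder

imgs, specs = torch.randn(128, 32), torch.randn(128, 48)  # matched pairs

zi = F.normalize(img_enc(imgs), dim=1)
zs = F.normalize(spec_enc(specs), dim=1)
logits = zi @ zs.t() / 0.07       # cosine similarities over a temperature
labels = torch.arange(len(imgs))  # matched pairs sit on the diagonal
# Symmetric InfoNCE: each image must pick out its own spectrum, and vice versa.
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
print("contrastive loss:", loss.item())
```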
The paper tempers this optimism, though. Foundation models trained on biased simulations could propagate subtle systematic errors across every downstream analysis. Serious evaluation and governance need to come before any such system enters a production cosmology pipeline.
Why It Matters
The questions DESC is trying to answer sit among the deepest in fundamental physics: whether dark energy is Einstein’s cosmological constant (a fixed background energy woven into the fabric of spacetime) or something stranger, and whether general relativity holds on cosmic scales. Getting the right answer requires not just better telescopes but better statistical machinery.
The AI methods being developed for LSST push hard on uncertainty quantification (measuring how confident a prediction actually is), domain adaptation (making models work reliably on data that differs from their training set), and physics-constrained machine learning. Drug discovery, climate modeling, and particle physics all face the same problems.
What sets this paper apart is its collaborative scope. Rather than advocating for a single lab’s approach, it synthesizes the priorities of an entire international community and flags the need for open-source tools, shared validation datasets, and reproducible pipelines.
There’s also the underappreciated challenge of compute equity: as analyses grow more computationally intensive, smaller institutions risk being locked out of frontier science. And the field needs scientists who can speak both physics and machine learning fluently.
Bottom Line: LSST will overwhelm conventional analysis methods with data. This roadmap shows how AI, built on trustworthy uncertainty quantification, physics-informed design, and careful validation, can turn that flood into the most precise measurements of dark energy ever made.
IAIFI Research Highlights
This white paper unifies AI/ML methodology research with precision cosmology, showing that advances in machine learning robustness and uncertainty quantification directly enable better measurements of dark energy and dark matter at cosmic scales.
The work identifies simulation-based inference, physics-informed neural networks, and foundation models for scientific data as key frontiers, pushing the limits of how reliably AI can be deployed when the data volumes are massive and the error tolerances are tight.
The roadmap lays the groundwork for the tightest-ever constraints on dark energy's equation of state and for tests of gravity on the largest observable scales, all through trustworthy AI-driven cosmological analysis.
With LSST operations underway, the methodological investments outlined here will shape whether a decade of data yields definitive answers about the nature of dark energy; the full roadmap is available at [arXiv:2601.14235](https://arxiv.org/abs/2601.14235).
Original Paper Details
Opportunities in AI/ML for the Rubin LSST Dark Energy Science Collaboration
arXiv:2601.14235