ORACLE: A Real-Time, Hierarchical, Deep-Learning Photometric Classifier for the LSST
Authors
Ved G. Shah, Alex Gagliano, Konstantin Malanchev, Gautham Narayan, Alex I. Malz, The LSST Dark Energy Science Collaboration
Abstract
We present ORACLE, the first hierarchical deep-learning model for real-time, context-aware classification of transient and variable astrophysical phenomena. ORACLE is a recurrent neural network with Gated Recurrent Units (GRUs), and has been trained using a custom hierarchical cross-entropy loss function to provide high-confidence classifications along an observationally-driven taxonomy with as little as a single photometric observation. Contextual information for each object, including host galaxy photometric redshift, offset, ellipticity and brightness, is concatenated to the light curve embedding and used to make a final prediction. Training on $\sim$0.5M events from the Extended LSST Astronomical Time-Series Classification Challenge, we achieve a top-level (Transient vs Variable) macro-averaged precision of 0.96 using only 1 day of photometric observations after the first detection in addition to contextual information, for each event; this increases to $>$0.99 once 64 days of the light curve have been obtained, and 0.83 at 1024 days after first detection for 19-way classification (including supernova sub-types, active galactic nuclei, variable stars, microlensing events, and kilonovae). We also compare ORACLE with other state-of-the-art classifiers and report comparable performance for the 19-way classification task, in addition to delivering accurate top-level classifications much earlier. The code and model weights used in this work are publicly available at our associated GitHub repository (https://github.com/uiucsn/ELAsTiCC-Classification).
Concepts
The Big Picture
Imagine you’re a triage nurse in the world’s busiest emergency room, except instead of patients, thousands of exploding stars and warping spacetime events are rushing through the door every night. You have seconds to decide which ones need immediate attention. That’s the situation astronomers face with the Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST), which will discover far more transient celestial events than any previous survey.
The arithmetic is brutal. Spectroscopy (spreading starlight into its component wavelengths to read chemical fingerprints and velocities) is the gold standard for identifying what an object actually is. But LSST will generate so many events that spectroscopic follow-up will be possible for less than 1% of them. Miss the window on a rare kilonova, the kind of explosion that forges gold and uranium, and that science is gone forever. Wait too long for a confident classification and the source has dimmed beyond reach.
A team led by Ved G. Shah, including IAIFI researcher Alex Gagliano, built ORACLE (Online Ranked Astrophysical CLass Estimator), a neural network that classifies cosmic events almost from the moment they appear, using as little as a single observation. The trick: ORACLE gets less specific early and more specific as data accumulates, always matching its confidence to what the data can actually support.
How It Works
ORACLE’s core is a recurrent neural network built with gated recurrent units (GRUs), specialized memory cells that decide what to remember and what to discard as new observations arrive. Unlike classifiers that wait for a complete picture, GRUs update their internal state with each new measurement. This makes them a natural fit for astronomical data, where a light curve (an object’s brightness record over time) arrives point by point over days, months, or years.
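To make the gating idea concrete, here is a minimal NumPy sketch of a single GRU cell processing photometric points one at a time. This is not ORACLE's architecture: the hidden size, feature count, random weights, and the particular gate convention are all illustrative assumptions, chosen only to show how the update and reset gates decide what to remember as each new observation arrives.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, W):
    """One GRU update: gates decide how much old state to keep."""
    z = sigmoid(W["z"] @ np.concatenate([h, x]))  # update gate: keep vs overwrite
    r = sigmoid(W["r"] @ np.concatenate([h, x]))  # reset gate: how much history to use
    h_tilde = np.tanh(W["h"] @ np.concatenate([r * h, x]))  # candidate new state
    return (1 - z) * h + z * h_tilde              # blend old state with candidate

rng = np.random.default_rng(0)
hidden, n_features = 8, 4  # illustrative sizes, not the paper's
W = {k: rng.normal(scale=0.1, size=(hidden, hidden + n_features)) for k in "zrh"}

h = np.zeros(hidden)  # internal state before any data
for obs in rng.normal(size=(5, n_features)):  # five photometric points arrive
    h = gru_step(h, obs, W)                   # state is refined after each one
```

The key property is that `h` is a usable summary of the light curve after every step, which is what lets a classifier built on top of it emit a prediction after one observation, or a hundred.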

The real innovation is the training objective. Shah et al. developed a hierarchical cross-entropy loss function that encodes 19 astrophysical classes in a tree structure, teaching the model to be correct at multiple levels of detail simultaneously.
At the top of the tree, everything is either a “transient” (something that changes and fades, like a supernova) or a “variable” (something that oscillates repeatedly, like a pulsating star). Below that, branches split into supernova subtypes, active galactic nuclei, microlensing events, variable star classes, and kilonovae. The loss function rewards correctness at every level, not just at the leaves. So what does this hierarchy buy you in practice?
- After 1 day of data: 0.96 macro-averaged precision at the top level (transient vs. variable), already actionable for urgent follow-up decisions
- After 64 days: top-level precision exceeds 0.99
- After 1024 days: 0.83 macro-averaged precision across all 19 classes
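The hierarchical loss described above can be sketched as a weighted sum of cross-entropy terms, one per level of the taxonomy. This is a simplified illustration, not the paper's exact formulation: the class counts, probabilities, and equal level weights below are made-up assumptions.

```python
import numpy as np

def cross_entropy(probs, true_idx):
    """Negative log-probability assigned to the true class."""
    return -np.log(probs[true_idx])

def hierarchical_loss(level_probs, level_truths, weights=(1.0, 1.0, 1.0)):
    """Sum weighted cross-entropies over the levels of the class tree."""
    return sum(w * cross_entropy(p, t)
               for w, p, t in zip(weights, level_probs, level_truths))

# Level 1: transient vs variable; deeper levels: branches and leaves
# (class counts here are illustrative, not the real taxonomy).
level_probs = [
    np.array([0.9, 0.1]),            # confident: it's a transient
    np.array([0.6, 0.3, 0.1]),       # less sure about the branch
    np.array([0.3, 0.3, 0.2, 0.2]),  # undecided at the leaves
]
level_truths = [0, 0, 0]  # true class at each level
loss = hierarchical_loss(level_probs, level_truths)
```

Because the top-level term contributes even when the leaf call is wrong, the model earns credit for a correct "transient" flag on day one and only pays for leaf-level uncertainty it genuinely has.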
ORACLE also ingests host galaxy context: photometric redshift (distance), the transient’s spatial offset from the galaxy’s center, ellipticity, and brightness. These features get combined with the light curve summary before the final prediction. A transient in the outskirts of a dim dwarf galaxy is probably not the same phenomenon as one sitting dead-center in a massive elliptical, and the model learns these associations from data rather than hand-coded rules.
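The fusion step amounts to a simple concatenation before the classification head. A minimal sketch, assuming an illustrative 64-dimensional light-curve embedding and made-up host values (the real feature set and dimensions may differ):

```python
import numpy as np

# Final GRU hidden state summarizing the light curve (illustrative size).
light_curve_embedding = np.random.default_rng(1).normal(size=64)

# Host-galaxy context features (values are invented for illustration).
host_context = np.array([
    0.12,   # photometric redshift of the host
    1.8,    # transient's offset from the host center (arcsec)
    0.35,   # host ellipticity
    21.4,   # host brightness (magnitudes)
])

# Concatenate light-curve summary with context before the final prediction.
fused = np.concatenate([light_curve_embedding, host_context])  # shape (68,)
```

A small dense head operating on `fused` would then produce the class scores, letting the network weigh "where it lives" alongside "how it brightens and fades."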
Training used roughly half a million simulated events from the Extended LSST Astronomical Time-series Classification Challenge (ELAsTiCC), a community benchmark designed to stress-test classifiers against the realistic variety of LSST’s anticipated data stream.
Why It Matters
Most existing classifiers treat classification as a single-shot problem: accumulate enough data, then output a label. ORACLE treats it as a continuous dialogue between telescope and algorithm, committing only to the level of specificity the data can support at each moment.
Benchmarked against state-of-the-art competitors, ORACLE delivers comparable performance on the 19-way task while outperforming rivals at early times on the top-level triage decision.
The payoff goes well beyond supernovae. Kilonovae evolve on timescales of hours to days; an early, confident “transient” flag could trigger spectroscopic resources before the optical emission fades. The same logic applies to tidal disruption events, fast blue optical transients, and any rare phenomenon whose secrets live in early-phase behavior.
As LSST comes online, classifiers like ORACLE will be the first filter between millions of nightly alerts and the limited pool of follow-up telescopes. With 0.96 precision distinguishing transients from variables after just one day, and 0.83 across all 19 classes once full light curves are in hand, ORACLE gives astronomers a working triage system for the biggest sky survey ever attempted.
IAIFI Research Highlights
ORACLE applies deep sequence modeling to a core problem in time-domain astrophysics, embedding the physical taxonomy directly into the loss function so the network learns astrophysical class structure alongside the data.
The hierarchical cross-entropy loss function is a generalizable technique for any domain where class relationships form a tree, from medical diagnosis to ecological species identification.
By enabling near-instantaneous classification of rare events like kilonovae, ORACLE supports multi-messenger astronomy and the study of neutron star mergers, where the universe forges heavy elements through r-process nucleosynthesis.
The team plans to deploy ORACLE within LSST's live alert broker infrastructure. Code and model weights are publicly available at [github.com/uiucsn/ELAsTiCC-Classification](https://github.com/uiucsn/ELAsTiCC-Classification), and the work appears in *The Astrophysical Journal* (2025). [arXiv:2501.01496](https://arxiv.org/abs/2501.01496)
Original Paper Details
ORACLE: A Real-Time, Hierarchical, Deep-Learning Photometric Classifier for the LSST
[2501.01496](https://arxiv.org/abs/2501.01496)