SPLASH: A Rapid Host-Based Supernova Classifier for Wide-Field Time-Domain Surveys
Authors
Adam Boesky, V. Ashley Villar, Alexander Gagliano, Brian Hsu
Abstract
The upcoming Legacy Survey of Space and Time (LSST) conducted by the Vera C. Rubin Observatory will detect millions of supernovae (SNe) and generate millions of nightly alerts, far outpacing available spectroscopic resources. Rapid, scalable photometric classification methods are therefore essential for identifying young SNe for follow-up and enabling large-scale population studies. We present SPLASH, a host-based classification pipeline that infers supernova classes using only host galaxy photometry. SPLASH first associates SNe with their hosts (yielding a redshift estimate), then infers host galaxy stellar mass and star formation rate using deep learning, and finally classifies SNe using a random forest trained on these inferred properties, along with host-SN angular separation and redshift. SPLASH achieves a binary (Type Ia vs. core-collapse) classification accuracy of $76\%$ and an F1-score of $69\%$, comparable to other state-of-the-art methods. By selecting only the most confident predictions, SPLASH can return highly pure subsets of all major SN types, making it well-suited for targeted follow-up. Its efficient design allows classification of $\sim 500$ SNe per second, making it ideal for next-generation surveys. Moreover, its intermediate inference step enables selection of transients by host environment, providing a tool not only for classification but also for probing the demographics of stellar death.
The Big Picture
Imagine trying to identify every species of bird on Earth, not by watching them fly or listening to their calls, but by looking at the trees they nest in. That sounds like a long shot, yet astronomers have built an AI system called SPLASH that does something equivalent: classifying exploding stars by studying where they live, not how they look.
Every night, somewhere in the universe, a star dies in a cataclysmic explosion called a supernova. These events scatter heavy elements across galaxies, power cosmic chemical evolution, and act as “standard candles” (objects with known intrinsic brightness that let us measure cosmic distances). Studying them in detail requires spectroscopy, which amounts to splitting the explosion’s light into a chemical fingerprint. But spectroscopic time on the world’s telescopes is precious, and it’s about to get a lot more strained.
The Vera C. Rubin Observatory, launching its Legacy Survey of Space and Time (LSST) in 2025, will detect millions of supernovae and generate millions of nightly transient alerts: automated flags for any object in the sky that suddenly brightens or changes. Today, astronomers can follow up roughly one in ten detected transients with spectroscopy. With LSST’s flood of data, that ratio drops to about one in five hundred. Astronomers need a way to quickly sort the promising candidates from the rest, and SPLASH does exactly that, classifying around 500 supernovae per second using nothing but photometry of the host galaxy, that is, its brightness measured through several filters.
Key Insight: SPLASH shows that you don’t need to watch a supernova evolve to classify it. The neighborhood it explodes in carries enough information for a useful first classification, available before the explosion is even a day old.
How It Works
The SPLASH pipeline has three stages, each feeding into the next.

Stage 1: Finding the host. When LSST flags a new transient, SPLASH identifies which galaxy it belongs to using directional light radius (DLR) association. Rather than picking the nearest galaxy on the sky, this method scales the angular separation between the supernova and each candidate host by that galaxy’s size along the direction to the supernova, so the galaxy’s shape and orientation are accounted for. This step also yields a photometric redshift, an estimate of how far away the supernova is, derived from the host galaxy’s brightness in multiple filters.
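To make the DLR idea concrete, here is a minimal sketch of one common formulation, not the SPLASH implementation itself. It assumes flat-sky offsets in arcseconds, galaxies modeled as ellipses with semi-axes `a` and `b` and a position angle `pa` measured from the x-axis, and a cutoff of 4 DLR beyond which no host is assigned. The function names, the dictionary layout, and the cutoff value are all illustrative assumptions.

```python
import math

def directional_light_radius(a, b, theta):
    """Radius of an ellipse (semi-axes a >= b) along a direction theta from its major axis."""
    return a * b / math.hypot(a * math.sin(theta), b * math.cos(theta))

def normalized_separation(sn_xy, galaxy):
    """SN-galaxy separation in units of the galaxy's DLR toward the SN."""
    dx = sn_xy[0] - galaxy["x"]
    dy = sn_xy[1] - galaxy["y"]
    separation = math.hypot(dx, dy)
    if separation == 0.0:
        return 0.0
    # Angle between the galaxy's major axis and the direction to the SN.
    theta = math.atan2(dy, dx) - galaxy["pa"]
    return separation / directional_light_radius(galaxy["a"], galaxy["b"], theta)

def associate_host(sn_xy, galaxies, max_d_dlr=4.0):
    """Pick the candidate with the smallest normalized separation, or None if all are too far."""
    if not galaxies:
        return None
    best = min(galaxies, key=lambda g: normalized_separation(sn_xy, g))
    return best if normalized_separation(sn_xy, best) <= max_d_dlr else None

# Hypothetical candidates: a large edge-on galaxy and a small round one.
candidates = [
    {"name": "elongated", "x": 0.0, "y": 0.0, "a": 5.0, "b": 1.0, "pa": 0.0},
    {"name": "compact", "x": 6.0, "y": 0.0, "a": 1.0, "b": 1.0, "pa": 0.0},
]
host = associate_host((4.0, 0.0), candidates)
print(host["name"])
```

In this toy case the supernova is closer on the sky to the compact galaxy (2 arcsec versus 4 arcsec), but it sits well inside the elongated galaxy's light along its major axis, so the elongated galaxy has the smaller normalized separation and is chosen as the host.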
Stage 2: Reading the galaxy’s vital signs. SPLASH feeds multi-band photometric measurements of the host galaxy into a neural network trained on nearly three million galaxies from the Zou et al. (2022) catalog. The network infers two physical properties:
- Stellar mass (M★): roughly how much “stuff” is in the galaxy, a proxy for its age and evolutionary history
- Star formation rate (SFR): how actively the galaxy is currently making new stars
These aren’t just convenient numbers. Old, massive, “red and dead” elliptical galaxies with low star formation rates overwhelmingly host Type Ia supernovae, which arise from white dwarfs (the dense, Earth-sized remnants of burned-out stars) that accrete mass from a companion until they explode. Young, gas-rich, actively star-forming galaxies preferentially host core-collapse supernovae (CCSNe), the deaths of massive, short-lived stars. The galaxy is, in a real sense, a record of which kinds of stars it’s been making.
Stage 3: The random forest classifier. SPLASH takes the inferred stellar mass, star formation rate, host-SN angular separation, and redshift and feeds them into a random forest, an ensemble of decision trees that votes on the most likely supernova type. The classifier handles both binary classification (Type Ia vs. core-collapse) and finer-grained multi-class scenarios covering subtypes like Type Ib/c, Type II, and superluminous supernovae (SLSNe).
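The voting mechanism can be illustrated with a small from-scratch forest. This is a toy sketch under stated assumptions, not the SPLASH classifier: the training data are synthetic, the class means, tree depth, and number of trees are invented for illustration, and a real pipeline would more likely use a library implementation such as scikit-learn's `RandomForestClassifier`. Each row holds (log stellar mass, log SFR, host-SN separation, redshift).

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(rows, labels, rng, depth=0, max_depth=4, n_try=2):
    """Grow one tree, drawing a random feature subset at every node."""
    split = None
    if depth < max_depth:
        split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left],
                      rng, depth + 1, max_depth, n_try),
            grow_tree([rows[i] for i in right], [labels[i] for i in right],
                      rng, depth + 1, max_depth, n_try))

def predict_tree(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(rows, labels, n_trees=15, seed=0):
    """Train each tree on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def class_probabilities(forest, row):
    """Fraction of trees voting for each class."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return {label: n / len(forest) for label, n in votes.items()}

def toy_sample(rng, n_per_class=60):
    """Synthetic hosts: Ia in massive quiescent galaxies, CC in low-mass star-forming ones."""
    rows, labels = [], []
    for label, mass_mu, sfr_mu in (("Ia", 10.8, -1.0), ("CC", 9.5, 0.3)):
        for _ in range(n_per_class):
            rows.append((rng.gauss(mass_mu, 0.3), rng.gauss(sfr_mu, 0.5),
                         rng.uniform(0.0, 10.0), rng.uniform(0.01, 0.5)))
            labels.append(label)
    return rows, labels

rows, labels = toy_sample(random.Random(1))
forest = fit_forest(rows, labels)
# A massive, quiescent host: the vote fractions act as class probabilities.
probs = class_probabilities(forest, (11.0, -1.5, 2.0, 0.1))
print(probs)
```

The vote fractions returned by `class_probabilities` are what make confidence cuts possible: a candidate that nearly every tree labels the same way is a safer follow-up target than one the forest splits on.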

The results hold up well against state-of-the-art photometric classifiers: 76% accuracy on the binary (Type Ia vs. core-collapse) task and a 69% F1-score, the harmonic mean of precision and recall, which penalizes both false alarms and missed detections. Users can also raise the classifier’s confidence threshold to return highly pure samples. If you only want confident Type Ia candidates for a cosmology study, you can dial up the purity at the cost of completeness. That kind of flexibility matters for real observational programs.
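The purity-completeness trade-off is easy to see with a few lines of arithmetic. The sketch below uses made-up classifier probabilities and labels (not SPLASH outputs) and hypothetical function names. Purity here is the fraction of selected objects that are truly the target class (precision), and completeness is the fraction of true targets that were selected (recall).

```python
def purity_completeness(probs, truths, target="Ia", threshold=0.5):
    """Purity and completeness of the sample with P(target) >= threshold."""
    selected = [t for p, t in zip(probs, truths) if p >= threshold]
    true_positives = sum(t == target for t in selected)
    n_target = sum(t == target for t in truths)
    purity = true_positives / len(selected) if selected else float("nan")
    completeness = true_positives / n_target if n_target else float("nan")
    return purity, completeness

def f1_score(purity, completeness):
    """Harmonic mean of purity (precision) and completeness (recall)."""
    if purity + completeness == 0:
        return 0.0
    return 2 * purity * completeness / (purity + completeness)

# Invented P(Ia) values and true classes for eight supernovae.
probs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.20]
truths = ["Ia", "Ia", "Ia", "CC", "Ia", "CC", "Ia", "CC"]

for threshold in (0.5, 0.75):
    purity, completeness = purity_completeness(probs, truths, threshold=threshold)
    print(threshold, round(purity, 3), round(completeness, 3),
          round(f1_score(purity, completeness), 3))
```

With these invented numbers, raising the threshold from 0.5 to 0.75 lifts purity from 4/6 to 3/3 while completeness falls from 4/5 to 3/5.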
Why It Matters
The practical payoff is speed. Light-curve-based classifiers need to track how a supernova’s brightness changes over days or weeks before making a call. SPLASH can classify a supernova the moment it’s detected, because it only needs photometry of the host galaxy, which can be measured long before the explosion. That makes it well suited for rapid-response programs targeting supernovae in their first hours, when signatures of the star’s pre-explosion environment are still visible.
SPLASH is more than a fast classifier, though. Because it infers physical host properties as an intermediate step, it doubles as a selection tool for studying stellar demographics. Want a sample of supernovae in low-mass, high-star-formation-rate galaxies? SPLASH can filter for that directly.
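As a sketch of what such a filter might look like, the snippet below selects events by inferred host properties. The record layout, field names, and threshold values are hypothetical and chosen only for illustration; they are not taken from the SPLASH interface.

```python
def select_by_host(events, max_log_mass=9.5, min_log_sfr=0.0):
    """Keep events whose inferred host is low-mass and actively star-forming."""
    return [event for event in events
            if event["log_mass"] <= max_log_mass
            and event["log_sfr"] >= min_log_sfr]

# Invented events with inferred log10(M*/Msun) and log10(SFR / Msun per yr).
events = [
    {"id": "SN-a", "log_mass": 9.1, "log_sfr": 0.4},
    {"id": "SN-b", "log_mass": 10.9, "log_sfr": -1.2},
    {"id": "SN-c", "log_mass": 9.3, "log_sfr": -0.8},
]
selected = select_by_host(events)
print([event["id"] for event in selected])
```

Because the cut is applied to inferred physical quantities rather than to class labels, the same filter works regardless of how the downstream classifier labels each event.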
The upshot is that population-level studies of how stellar death rates vary with galactic environment become practical at scale. As LSST’s dataset grows over its ten-year run, tools like SPLASH will be needed to carve massive catalogs into scientifically meaningful subpopulations.
Bottom Line: SPLASH classifies ~500 supernovae per second using only host galaxy photometry, achieving 76% accuracy on the binary Type Ia vs. core-collapse task at the moment of detection. It’s a natural first-response tool for the data flood that LSST is about to produce.
IAIFI Research Highlights
SPLASH lives at the intersection of astrophysics and machine learning. Deep neural networks extract galaxy physics from photometry, then feed those physical priors into a downstream classifier. The pipeline architecture follows the actual causal chain from stellar environment to stellar death.
Inferring interpretable physical properties before classifying, rather than classifying directly from raw photometry, shows that embedding domain knowledge into an ML pipeline can improve generalization while also producing a scientifically useful intermediate product.
Rapid, environment-aware classification of millions of supernovae pushes forward multiple research fronts: constraining progenitor physics, measuring the dark energy equation of state through Type Ia cosmology, and mapping how star formation history shapes transient demographics.
Future work will incorporate photometric light-curve data alongside host properties to push accuracy further. The pipeline can scale with LSST's alert stream from day one. See [arXiv:2506.00121](https://arxiv.org/abs/2506.00121) (Boesky et al. 2025).