Nonperturbative renormalization for the neural network-QFT correspondence
Authors
Harold Erbin, Vincent Lahoche, Dine Ousmane Samary
Abstract
In a recent work arXiv:2008.08601, Halverson, Maiti and Stoner proposed a description of neural networks in terms of a Wilsonian effective field theory. The infinite-width limit is mapped to a free field theory, while finite-$N$ corrections are taken into account by interactions (non-Gaussian terms in the action). In this paper, we study two related aspects of this correspondence. First, we comment on the concepts of locality and power-counting in this context. Indeed, these usual space-time notions may not hold for neural networks (since inputs can be arbitrary); however, the renormalization group provides natural notions of locality and scaling. Moreover, we comment on several subtleties, for example that data components may not have a permutation symmetry: in that case, we argue that random tensor field theories could provide a natural generalization. Second, we improve the perturbative Wilsonian renormalization from arXiv:2008.08601 by providing an analysis in terms of the nonperturbative renormalization group using the Wetterich-Morris equation. An important difference with the usual nonperturbative RG analysis is that only the effective (IR) 2-point function is known, which requires setting up the problem with care. Our aim is to provide a useful formalism to investigate neural network behavior beyond the large-width limit (i.e. far from the Gaussian limit) in a nonperturbative fashion. A major result of our analysis is that changing the standard deviation of the neural network weight distribution can be interpreted as a renormalization flow in the space of networks. We focus on translation-invariant kernels and provide preliminary numerical results.
Concepts
The Big Picture
You want to understand why a crowded city works the way it does. You could track every person’s movements, every coffee run, every commute. Or you could zoom out and describe the city through traffic patterns, density flows, and aggregate behavior. This second approach, ignoring microscopic details to reveal large-scale structure, is what physicists do with the renormalization group: a mathematical framework for systematically coarsening your description from small scales to large ones. Erbin, Lahoche, and Ousmane Samary have now applied this technique to neural networks, uncovering a precise connection between machine learning and quantum field theory, the mathematical language of fundamental particles and forces.
Neural networks are opaque. They work, often brilliantly, but we struggle to explain why or how to systematically improve them. A 2020 result by Halverson, Maiti, and Stoner (arXiv:2008.08601) showed that infinitely wide neural networks are mathematically equivalent to the simplest possible quantum field theories, free theories with no interactions, where outputs follow a tidy bell-curve pattern.
Real networks are finite and messy, though. The authors set out to close that gap using some of the sharpest tools in theoretical physics. Their central result is a nonperturbative renormalization group analysis of neural networks, an approach that handles networks far from the well-behaved infinite-width regime and captures behaviors that simpler approximations miss.
Key Insight: Changing the standard deviation of a neural network’s weight distribution is mathematically equivalent to flowing along a renormalization group trajectory through the space of possible networks, connecting mundane design choices directly to the physics of phase transitions.
How It Works
The foundation is the NN-QFT correspondence: a neural network’s statistical behavior, encoded in its correlation functions (measurements of how different outputs relate to each other), maps onto the mathematical structure of a quantum field theory. In the infinite-width limit this is a free field theory, equivalent to saying the network outputs follow a Gaussian process fully described by a mean and a covariance kernel.
Add finite-width corrections and you get interaction terms, the non-Gaussian parts that make both the theory and the network genuinely complex.
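To make this concrete, here is a schematic version of the correspondence in our own notation (not the paper’s exact conventions): the infinite-width output $f(x)$ is governed by a Gaussian action built from the network’s kernel $K$, and finite width adds interaction terms whose couplings are suppressed by inverse powers of the width $N$,

$$
S[f] \;=\; \frac{1}{2}\int dx\, dx'\; f(x)\,K^{-1}(x,x')\,f(x') \;+\; \frac{g_4}{4!}\int dx\; f(x)^4 \;+\; \cdots, \qquad g_4 \sim \frac{1}{N}.
$$

The free part reproduces the Gaussian process, $\langle f(x)\,f(x')\rangle = K(x,x')$, while the $f^4$ and higher terms are the non-Gaussian corrections the RG analysis is designed to handle.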
Previous work used Wilsonian perturbative renormalization, a systematic expansion that treats interaction terms as small corrections and builds up a solution order by order, like approximating a complicated curve with a polynomial. It works near the Gaussian limit. Many interesting network behaviors live far from that limit, where the expansion breaks down.
The fix: the Wetterich-Morris equation, a cornerstone of nonperturbative RG (NPRG). Rather than assuming interactions are small, NPRG tracks how a theory’s effective behavior changes as you gradually integrate out finer and finer details, scale by scale.
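For reference, the Wetterich-Morris equation governing this flow is compact. Writing $\Gamma_k$ for the scale-dependent effective action and $R_k$ for the regulator that suppresses modes below the scale $k$,

$$
\partial_k \Gamma_k \;=\; \frac{1}{2}\,\mathrm{Tr}\!\left[\left(\Gamma_k^{(2)} + R_k\right)^{-1}\partial_k R_k\right],
$$

where $\Gamma_k^{(2)}$ is the second functional derivative of $\Gamma_k$ with respect to the field. The trace runs over the theory’s modes, which in this setting are eigenmodes of the kernel rather than spacetime momenta.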

Here’s the twist. In standard NPRG applications (condensed matter, particle physics) you know the microscopic theory and derive the macroscopic one. With neural networks the situation is inverted: only the large-scale correlation between any two network outputs, the infrared (IR) 2-point function, is known. The authors had to reformulate the problem with boundary conditions at the large-scale end rather than the small-scale end.
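Schematically, and again in our notation rather than the paper’s: a standard NPRG calculation fixes the microscopic action at a UV cutoff scale $\Lambda$ and integrates down to $k \to 0$, whereas here the known quantity is the IR 2-point function, essentially the Gaussian-process kernel $K$,

$$
\text{standard: } \Gamma_{k=\Lambda} = S_{\text{micro}} \quad\longrightarrow\quad \text{here: } \Gamma^{(2)}_{k\to 0} \simeq K^{-1},
$$

so the flow must be anchored at the large-scale end and reconstructed toward finer scales.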
The analysis distinguishes two flavors of RG flow:
- Passive RG: flowing through the space of field theories by integrating out momentum modes, the standard physics picture
- Active RG: flowing through the space of neural networks themselves, where changing hyperparameters like weight standard deviation moves you along a trajectory in network space
The active RG is where it gets interesting. When you tune a network’s weight initialization (a routine practical choice), you are performing a renormalization group transformation in a precise mathematical sense.

The team focuses on translation-invariant kernels, which let momentum-space methods apply cleanly, and works within the local potential approximation (LPA), a controlled simplification that preserves the essential physics while keeping the equations tractable. They analyze both symmetric and symmetry-broken phases, tracking fixed points and flows numerically.
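As a rough sketch of what the LPA means in practice (our notation): the scale-dependent effective action is restricted to a standard kinetic term plus a running local potential,

$$
\Gamma_k[\phi] \;\simeq\; \int dx\left[\frac{1}{2}\,(\partial\phi)^2 + U_k(\phi)\right],
$$

so the exact functional flow collapses to an equation for $U_k(\phi)$ alone. Expanding $U_k$ around its minimum in a few couplings then gives the ordinary differential equations whose fixed points and trajectories are followed numerically in the symmetric and symmetry-broken phases.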
The paper also tackles a subtle issue around locality. In standard QFT, locality means fields interact only at the same spacetime point, rooted in causality. Neural network inputs are arbitrary vectors with no natural spacetime structure, so that constraint doesn’t apply. The RG itself generates a notion of locality, with modes organized by their eigenvalues under the kernel rather than by spacetime position. When input data lacks permutation symmetry, the authors argue that random tensor field theories provide the right generalization, connecting to an active area of mathematical physics.
Why It Matters
If weight initialization standard deviation corresponds to a position in theory space, then finding good initializations becomes a question about the geometry of that space. Which fixed points are stable? Which flows lead to networks that learn well? The renormalization group could, in principle, give systematic guidance for hyperparameter selection, architecture design, and generalization.
The techniques developed here (reformulating NPRG with IR rather than UV boundary conditions, distinguishing passive and active RG flows) are useful beyond this specific application. They apply to any system where only coarse-grained information is available at the outset, pushing the NN-QFT correspondence into genuinely nonperturbative territory.
The suggestion that tensor field theories might describe networks processing structured data opens a separate direction: connecting tensor models, used in quantum gravity and combinatorics, to machine learning.
Bottom Line: By applying the Wetterich-Morris nonperturbative renormalization group to neural networks, this paper turns a mathematical analogy into a precise, computable framework. Tuning a network’s weight distribution traces out a renormalization flow in the space of networks, and the analysis extends the NN-QFT correspondence well beyond its infinite-width origins.
IAIFI Research Highlights
This work connects quantum field theory and machine learning by extending the NN-QFT correspondence into the nonperturbative regime, showing that neural network hyperparameter choices have precise counterparts in RG flow equations.
The framework provides a principled, physics-based formalism for analyzing neural network behavior beyond the Gaussian limit, with potential applications to initialization strategies, generalization, and architecture design.
The paper adapts the Wetterich-Morris exact RG equation to a novel setting where only IR data is available, advancing nonperturbative QFT methods and connecting them to tensor field theories relevant to quantum gravity.
Future directions include extending beyond the local potential approximation, incorporating derivative corrections, and applying tensor field theory methods to networks with structured inputs; the paper is available as [arXiv:2108.01403](https://arxiv.org/abs/2108.01403).
Original Paper Details
Nonperturbative renormalization for the neural network-QFT correspondence
2108.01403
Harold Erbin, Vincent Lahoche, Dine Ousmane Samary