Smoothness Errors in Dynamics Models and How to Avoid Them

Foundational AI

Authors

Edward Berman, Luisa Li, Jung Yeon Park, Robin Walters

Abstract

Modern neural networks have shown promise for solving partial differential equations over surfaces, often by discretizing the surface as a mesh and learning with a mesh-aware graph neural network. However, graph neural networks suffer from oversmoothing, where a node's features become increasingly similar to those of its neighbors. Unitary graph convolutions, which are mathematically constrained to preserve smoothness, have been proposed to address this issue. Despite this, in many physical systems, such as diffusion processes, smoothness naturally increases and unitarity may be overconstraining. In this paper, we systematically study the smoothing effects of different GNNs for dynamics modeling and prove that unitary convolutions hurt performance for such tasks. We propose relaxed unitary convolutions that balance smoothness preservation with the natural smoothing required for physical systems. We also generalize unitary and relaxed unitary convolutions from graphs to meshes. In experiments on PDEs such as the heat and wave equations over complex meshes and on weather forecasting, we find that our method outperforms several strong baselines, including mesh-aware transformers and equivariant neural networks.

Concepts

graph neural networks, relaxed unitary convolutions, oversmoothing, mesh-based PDE solving, spectral methods, surrogate modeling, geometric deep learning, equivariant neural networks, symmetry preservation, neural operators, physics-informed neural networks, loss function design

The Big Picture

Imagine teaching a child to paint by telling them: “Never let colors blur together.” That rule works great for crisp geometric art. But hand them a watercolor brush and ask them to paint a sunset, and suddenly the rule becomes a straitjacket. The colors need to bleed into one another.

Researchers at Northeastern University’s Geometric Learning Lab hit exactly this mismatch when using neural networks to simulate physical systems. The networks in question are graph neural networks (GNNs), models that treat physical simulations as graphs: collections of points connected by edges.

GNNs are powerful tools for modeling how heat spreads across a metal plate or how wind patterns evolve across a planet. But they carry a troublesome habit called oversmoothing. After repeated rounds of information-sharing between neighboring points, every location starts to look identical to its neighbors. Distinct features blur into a gray average, and your carefully initialized simulation melts into mush.

The proposed fix, unitary convolutions, which constrain each layer so that signals can never blur into their neighbors, turned out to be its own kind of straitjacket. Edward Berman, Luisa Li, Jung Yeon Park, and Robin Walters proved that when the physics itself requires smoothing, forcing the network to never smooth anything produces systematic errors in the other direction. Think heat diffusion, where hot spots naturally spread and equalize. They call this failure mode undersmoothing.

Their solution: a new class of relaxed unitary convolutions that smooth exactly as much as the physics demands and no more.

Key Insight: Too much smoothing (oversmoothing) and too little smoothing (undersmoothing) are both failure modes for neural network PDE solvers. The same mathematical constraint meant to fix the first actively causes the second.

How It Works

The Rayleigh quotient measures how much a signal varies between neighboring points on a graph. High values mean the signal is rough and spiky; low values mean it’s smooth and uniform. Standard GNNs drive the Rayleigh quotient toward zero (oversmoothing), while unitary convolutions keep it frozen in place.
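On a concrete graph, this quantity is easy to compute: for graph Laplacian L and signal x, the Rayleigh quotient is xᵀLx / xᵀx, and the numerator expands to the sum of squared differences across edges. A minimal sketch on a four-node path graph:

```python
import numpy as np

def rayleigh_quotient(L, x):
    """Rayleigh quotient x^T L x / x^T x for graph Laplacian L and signal x."""
    return float(x @ L @ x) / float(x @ x)

# Path graph on 4 nodes; combinatorial Laplacian L = D - A
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A

spiky  = np.array([1.0, -1.0, 1.0, -1.0])  # alternating: rough across every edge
smooth = np.array([1.0, 1.0, 1.0, 1.0])    # constant: perfectly smooth

print(rayleigh_quotient(L, spiky))   # 3.0 — high, signal flips on each edge
print(rayleigh_quotient(L, smooth))  # 0.0 — a constant signal has no variation
```

Oversmoothing, in this language, is the Rayleigh quotient of a network's hidden features collapsing toward zero as depth grows.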

The team’s first theoretical result is a lower bound on approximation error for unitary networks. If your physical system’s solution changes its total “energy” in a direction-dependent way, a strictly unitary network cannot represent that behavior accurately. Heat diffusion is exactly such a system: total heat energy decreases as hot spots spread and cool. No amount of training fixes this. Unitarity isn’t just suboptimal; it’s provably wrong for these tasks.

The fix comes in two steps:

  1. Relax the constraint: Instead of forcing the network’s internal parameters to be exactly unitary, the network learns a unitary component plus a small correction term. The Rayleigh quotient can now decrease when the physics calls for it, while still resisting the runaway smoothing that plagues standard GNNs.
  2. Generalize to meshes: Real surfaces like an armadillo figurine or a weather simulation grid aren’t simple graphs. The team extended the framework from abstract graphs to triangular meshes, where a shape’s surface is tiled by small connected triangles, accounting for the varying geometry of each face.
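Numerically, step 1 can be sketched as follows. This assumes a real-valued setting, where "unitary" means orthogonal; the additive correction and the `eps` knob are illustrative choices for exposition, not the paper's exact parameterization (QR is just a convenient way to produce an orthogonal matrix here, whereas unitary layers are often parameterized via the matrix exponential of a skew-symmetric matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8

# Exactly orthogonal weight matrix U (preserves norms, hence smoothness)
U, _ = np.linalg.qr(rng.normal(size=(d, d)))
C = rng.normal(size=(d, d))   # small correction term (would be learned)
eps = 0.05                    # relaxation strength: hypothetical tuning knob
W = U + eps * C               # relaxed weight: near-unitary, not exactly

x = rng.normal(size=d)
print(np.linalg.norm(U @ x) / np.linalg.norm(x))  # 1.0 (up to float): norm locked
print(np.linalg.norm(W @ x) / np.linalg.norm(x))  # near 1: free to shrink or grow
```

The strictly unitary map can only rotate the signal, so its "energy" never changes; the small correction gives the layer room to dissipate energy the way diffusion does, without reopening the door to runaway smoothing.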

The resulting architecture, R-UniMesh (Relaxed Unitary Mesh convolutions), splits the difference between the two failure modes.

Figure 1

The figure shows the payoff. On the heat equation benchmark over the armadillo mesh at timestep 190, the competing EMAN model has blurred the temperature field into a featureless smear (classic oversmoothing). The Hermes model swings the other way, preserving sharp artificial gradients that shouldn’t exist (undersmoothing). R-UniMesh tracks the ground truth closely, capturing both the smooth spreading of heat and the finer structure along the mesh’s ridges.

Why It Matters

The immediate practical stakes are weather forecasting and climate modeling. On real-world atmospheric data, R-UniMesh outperforms strong baselines including mesh-aware transformers and equivariant neural networks that bake in physical symmetries like rotation invariance. Getting smoothness right turns out to matter as much as getting geometric symmetry right, at least on these benchmarks.

The same smoothness mismatch should lurk anywhere researchers use neural networks to simulate PDEs on irregular geometries. Blood flow through arteries. Electromagnetic fields around antenna structures. Seismic waves through geological strata.

The Rayleigh quotient framework also works as a diagnostic tool: measure how your network changes the smoothness of its inputs, compare that to how the true physics changes smoothness, and you’ll know whether your architecture is over- or under-constrained. That’s a reusable lens, not just a one-off fix.
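A minimal sketch of that diagnostic on a toy path graph, with one explicit-Euler step of heat diffusion standing in for the true dynamics. The `smoothing_diagnostic` helper and the "frozen" stand-in model are illustrative names, not from the paper:

```python
import numpy as np

def rayleigh(L, x):
    """Graph-signal smoothness: x^T L x / x^T x."""
    return float(x @ L @ x) / float(x @ x)

def smoothing_diagnostic(L, x0, model_step, true_step):
    """Ratio of post-step to pre-step Rayleigh quotient for model vs. physics.
    A ratio below 1 means the step smoothed the signal; a model ratio well
    above the true ratio signals undersmoothing, well below it oversmoothing."""
    r0 = rayleigh(L, x0)
    return rayleigh(L, model_step(x0)) / r0, rayleigh(L, true_step(x0)) / r0

# Path graph on 3 nodes; true dynamics: one Euler step of du/dt = -Lu
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
heat   = lambda x: x - 0.1 * (L @ x)  # diffusion smooths the signal
frozen = lambda x: x                  # a "never smooth" model, unitary-style

x0 = np.array([1.0, 0.5, -1.0])
m, t = smoothing_diagnostic(L, x0, frozen, heat)
print(m, t)  # model ratio 1.0 vs. true ratio < 1: undersmoothing detected
```

The same comparison runs unchanged with a trained network in place of `frozen`, making it a cheap architecture-level sanity check before any accuracy evaluation.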

Open questions remain. The relaxation parameter that controls how much smoothing is allowed currently requires per-task tuning. Learning to set it automatically, perhaps by estimating the physics’ natural smoothing rate from data, is the obvious next step. Extending the framework to the non-triangular discretizations common in finite element analysis is another.

Bottom Line: By proving that “never smooth” is just as broken as “always smooth,” this work gives neural PDE solvers a mathematically grounded middle path and backs it up on everything from abstract heat equations to real atmospheric forecasting.

IAIFI Research Highlights

Interdisciplinary Research Achievement
This work connects differential geometry and deep learning, formalizing physical smoothing behavior (a classical PDE concept) as a trainable property of neural network architectures on scientific meshes.
Impact on Artificial Intelligence
Relaxed unitary convolutions provide a new architectural primitive for graph and mesh neural networks that provably avoids both oversmoothing and undersmoothing, outperforming mesh-aware transformers and equivariant networks on dynamics benchmarks.
Impact on Fundamental Interactions
More accurate neural surrogates for PDEs on complex surfaces speed up simulation of physical systems, from heat and wave propagation to atmospheric dynamics, that are central to fundamental physics modeling.
Outlook and References
Future work will explore automatic tuning of the relaxation parameter and extension to non-triangular mesh discretizations; see the paper ([arXiv:2602.05352](https://arxiv.org/abs/2602.05352)) and code at [github.com/EdwardBerman/rayleigh_analysis](https://github.com/EdwardBerman/rayleigh_analysis).