Researchers at UT Austin Introduce Panda: A Foundation Model for Nonlinear Dynamics Pretrained on 20,000 Chaotic ODE Discovered via Evolutionary Search

Chaotic systems, such as fluid dynamics or brain activity, are highly sensitive to initial conditions, making long-term predictions difficult. Even minor errors in modeling these systems can rapidly grow, which limits the effectiveness of many scientific machine learning (SciML) approaches. Traditional forecasting methods rely on models trained on specific time series or broad datasets lacking true dynamical structure. However, recent work has demonstrated the potential for local forecasting models to predict chaotic systems more accurately over longer timeframes by learning the numerical rules governing these systems. The real challenge is achieving out-of-domain generalization—creating models that can adapt and forecast new, previously unseen dynamical systems. This would require integrating prior knowledge with the ability to adapt locally. Still, the need for task-specific data constrains current methods and often overlooks key dynamical system properties such as ergodicity, channel coupling, and conserved quantities.

Machine learning for dynamical systems (MLDS) utilizes the unique properties of such systems as inductive biases. These include fixed relationships among system variables and invariant statistical measures, like strange attractors or conserved quantities. MLDS models use these properties to build more accurate and generalizable models, sometimes incorporating probabilistic or latent variable techniques. While datasets of dynamical systems have been curated and new systems are often generated by tweaking parameters or using symbolic methods, these approaches typically don’t ensure diverse or stable dynamics. Structural stability is a challenge—small changes may not yield new behaviors, while large ones can cause trivial dynamics. Foundation models aim to address this by enabling transfer learning and zero-shot inference. Still, most current models perform comparably to standard time series models or are limited in generating meaningful, dynamic variety. Some progress has been made through techniques like embedding spaces or symbolic discovery, but a richer, more diverse sampling of dynamical behaviors remains an open challenge.

Researchers at the Oden Institute, UT Austin, introduce Panda (Patched Attention for Nonlinear Dynamics), a pretrained model trained solely on synthetic data from 20,000 algorithmically-generated chaotic systems. These systems were created using an evolutionary algorithm based on known chaotic ODEs. Despite training only on low-dimensional ODEs, Panda shows strong zero-shot forecasting on real-world nonlinear systems—including fluid dynamics and electrophysiology—and unexpectedly generalizes to PDEs. The model incorporates innovations like masked pretraining, channel attention, and kernelized patching to capture dynamical structure. A neural scaling law also emerges, linking Panda’s forecasting performance to the diversity of training systems.

The researchers generated 20,000 new chaotic systems using a genetic algorithm that evolves from a curated set of 135 known chaotic ODEs. These systems are mutated and recombined using a skew product approach, with only truly chaotic behaviors retained through rigorous tests. Augmentations like time-delay embeddings and affine transformations expand the dataset while preserving its dynamics. A separate set of 9,300 unseen systems is held out for zero-shot testing. The model, Panda, is built on PatchTST and enhanced with features like channel attention, temporal-channel attention layers, and dynamic embeddings using polynomial and Fourier features, inspired by Koopman operator theory.

Panda demonstrates strong zero-shot forecasting capabilities on unseen nonlinear dynamical systems, outperforming models like Chronos-SFT across various metrics and prediction horizons. Trained solely on 3D systems, it generalizes to higher-dimensional ones due to channel attention. Despite never encountering PDEs during training, Panda also succeeds on real-world experimental data and chaotic PDEs, such as the Kuramoto-Sivashinsky and von Kármán vortex street. Architectural ablations confirm the importance of channel attention and dynamics embeddings. The model exhibits neural scaling with increased dynamical system diversity and forms interpretable attention patterns, suggesting resonance and attractor-sensitive structure. This indicates Panda’s broad generalization across complex dynamical behaviors.

In conclusion, Panda is a pretrained model designed to uncover generalizable patterns in dynamical systems. Trained on a large, diverse set of synthetic chaotic systems, Panda demonstrates strong zero-shot forecasting on unseen real-world data and even partial differential equations, despite only being trained on low-dimensional ODEs. Its performance improves with system diversity, revealing a neural scaling law. The model also shows emergent nonlinear resonance in attention patterns. While focused on low-dimensional dynamics, the approach may extend to higher-dimensional systems by leveraging sparse interactions. Future directions include alternative pretraining strategies to improve rollout performance forecasting chaotic behaviors.

Check out the Paper. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 95k+ ML SubReddit and Subscribe to our Newsletter.

The post Researchers at UT Austin Introduce Panda: A Foundation Model for Nonlinear Dynamics Pretrained on 20,000 Chaotic ODE Discovered via Evolutionary Search appeared first on MarkTechPost.

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签