Atomistic molecular dynamics (MD) simulations are essential for understanding the dynamics of biological systems, but achieving accurate simulations remains a challenge. The most accurate route, ab initio MD (AIMD) based on quantum chemistry, is too computationally expensive and too slow to simulate large systems at the time scales relevant to biology.
The alternatives have limitations of their own. Classical force fields (FF) are empirical models that are fast but do not reach quantum accuracy. Machine learning (ML) models, in the form of neural network potentials (NNPs), often suffer from limited transferability: existing NNPs struggle with condensed-phase systems, particularly charged species such as ions in solution. To address these problems, we developed FeNNix-Bio1, a foundation (i.e., universal) neural network potential trained exclusively on synthetic quantum chemistry data. It is designed to provide predictive condensed-phase MD simulations, including nuclear quantum effects in the dynamics for greater accuracy. The model comes in two versions: FeNNix-Bio1(S), a lightweight version optimized for high throughput, and FeNNix-Bio1(M), a larger version offering increased accuracy.
KEY CAPABILITIES OF THE MODEL
FeNNix-Bio1 demonstrates unprecedented transferability and accuracy for biological systems.
Water properties and ions in solution: the model successfully describes liquid water and charged species, a historical weak point of NNPs. As a result, it is able to predict, for the first time, properties such as free energies of hydration (FEH): the (M) version achieves an average error of only 0.37 kcal/mol on a set of 25 molecules, surpassing the accuracy of state-of-the-art force fields.
Complex environments: the model describes free energy landscapes very well. It faithfully reproduces the free energy profiles of the alanine dipeptide as a function of its Ramachandran angles, confirming its high transferability. It is the first foundation ML model capable of simulating the reversible folding of a protein (chignolin) and recovering the expected metastable states. It is also the first model capable of computing the affinity of a pharmaceutical ligand for its protein target (i.e., an absolute binding free energy calculation), allowing direct comparison of these predictions with experimental data.
Chemical reactivity: being based on quantum mechanics, FeNNix-Bio1 natively integrates the ability to simulate bond breaking and formation, and can be effectively fine-tuned on specific reaction data.
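To make the notion of a free energy profile over Ramachandran angles concrete, here is a minimal Python sketch of the standard Boltzmann inversion, F = -kT ln P, applied to sampled (phi, psi) dihedral angles. It is only an illustration and not the analysis pipeline used for FeNNix-Bio1: the function name, the 30 degree bin width, the 300 K temperature, and the toy samples are all assumptions chosen for the example, and real studies typically rely on enhanced sampling and reweighting rather than raw histograms.

```python
import math
from collections import Counter

# Boltzmann constant in kcal/(mol*K)
KB_KCAL_PER_MOL_K = 0.0019872041


def free_energy_surface(phi_psi_deg, temperature_k=300.0, bin_width_deg=30.0):
    """Boltzmann-invert sampled (phi, psi) angles into a free energy map.

    Hypothetical helper for illustration. Returns a dict mapping
    (phi_bin, psi_bin) indices to a free energy in kcal/mol, shifted so
    that the most populated bin sits at zero. Unvisited bins are absent.
    """
    if not phi_psi_deg:
        raise ValueError("need at least one (phi, psi) sample")
    n_bins = int(round(360.0 / bin_width_deg))
    counts = Counter()
    for phi, psi in phi_psi_deg:
        # Shift angles from [-180, 180) to [0, 360) and wrap before binning.
        i = int(((phi + 180.0) % 360.0) // bin_width_deg) % n_bins
        j = int(((psi + 180.0) % 360.0) // bin_width_deg) % n_bins
        counts[(i, j)] += 1
    kt = KB_KCAL_PER_MOL_K * temperature_k
    max_count = max(counts.values())
    # F = -kT ln(P / P_max) = kT ln(count_max / count)
    return {b: kt * math.log(max_count / c) for b, c in counts.items()}


# Toy data (made up): 90 frames in one basin, 10 frames in another.
samples = [(-60.0, -45.0)] * 90 + [(-120.0, 130.0)] * 10
for bin_index, energy in sorted(free_energy_surface(samples).items()):
    print(bin_index, round(energy, 3), "kcal/mol")
```

With these toy samples, the more populated basin defines the zero of free energy and the other basin lies higher by kT ln 9. Comparing such maps between a reference method and a model is one simple way to judge how well a free energy landscape is reproduced.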
A MODEL THAT IS INEXPENSIVE TO TRAIN, FAST, AND SCALABLE
FeNNix-Bio1 was trained exclusively on synthetic quantum chemistry data. No experimental data were used for the initial training of the foundation model. This approach ensures that the model intrinsically learns fundamental physics (the laws of quantum mechanics) without being biased by the errors or limitations of empirical force fields or experimental data. The architecture of the NNP is optimized for rapid inference, and the training cost is economical, i.e., less than 48 hours on a compute node. Thanks to a massively parallel, GPU-optimized implementation, inference with FeNNix-Bio1 is much faster than with previously developed NNPs. It allows the simulation of massive systems of up to 7 million atoms (for example, the complete SARS-CoV-2 spike glycoprotein) and scales efficiently across multiple compute nodes.
TOWARD REACTIVE SIMULATIONS OF LARGE BIOLOGICAL SYSTEMS
FeNNix-Bio1 sets a new benchmark by providing fast, accurate, and inherently reactive MD simulations for biological systems. By combining the speed of empirical force fields with the precision of quantum mechanics, it represents an advancement for atomistic simulations in drug design, offering the possibility to capture crucial dynamic and chemical phenomena that were previously inaccessible.
The upcoming arrival of exascale machines will enable high-precision, reactive modeling of large systems of medical interest, further advancing the digitization of drug discovery.
References: T. Plé, O. Adjoua, A. Benali, E. Posenitskiy, C. Villot, L. Lagardère, J.-P. Piquemal, A Foundation Model for Accurate Atomistic Simulations in Drug Design, preprint, 2025. DOI: 10.26434/
A key figure:
"The training cost is economical, i.e., less than 48 hours on a compute node."
A researcher's quote/epigraph:
"FeNNix-Bio1 was trained exclusively on synthetic quantum chemistry data. No experimental data were used for the initial training of the foundation model."