High-accuracy data for models that understand matter.
Many Body generates high-accuracy molecular and materials datasets so that pharma, materials, and chemicals teams can train scientific foundation models that reflect real electronic behavior.
The data your models wish they were trained on.
We generate and curate structured molecular and materials data at electronic-structure accuracy, then package it so it slots into real ML pipelines.
Scientific AI is advancing, but progress is uneven. Most success occurs where physics is smooth and approximations behave predictably. The hardest regimes, where collective electronic behavior and many-body interactions dominate, remain underexplored. These are also the regimes with the highest scientific and economic value.
Today’s models inherit the limitations of their training data. DFT reshaped computational science, but what began as an approximation gradually came to be treated as ground truth. Learning systems trained on this data reproduce its behavior, including its blind spots.
Scientific AI does not suffer from a lack of data volume. It suffers from a lack of coverage. What is required is reference-grade, post-approximation data that captures true electronic behavior where it matters most.
Many Body exists to build this missing data layer.
Many Body is for teams that are already serious about AI for Science and that believe their next edge will come from their data, not just their model size.
- You’re training, or planning to train, a molecular or materials foundation model.
- You see your models fail on edge cases or on chemistries dominated by strong correlation, metal coordination, or non-ground-state effects.
- You want a clean way to bring higher-accuracy physics into your training and evaluation loop.
Examples of teams we support:
· Pharma / biotech building generative models for small molecules or biologics.
· Materials & battery teams optimising stability, transport, or interfaces.
· Industrial chemistry groups working on catalysts, process conditions, or surface phenomena.
If you’re somewhere on that map, we’d be happy to explore where high-fidelity data could move the needle for your applications.
We’re starting with a small group of partners. If you’d like to gain a competitive advantage early, reach out.