07.08.2025 - 18:08:48

Basecamp Research launches ZymCTRL, a world-first, open-source generative AI tool that designs enzymes for more sustainable industrial processes

and ChemBioChem, peer-reviewed scientific journals. In ChemBioChem, researchers at the Institute of Biochemistry at Austria's Graz University of Technology cited ZymCTRL's efficiency and ease of use. "ZymCtrl designs putative enzyme variants on consumer GPUs within seconds and, remarkably, it creates these sequences with only an EC number as input," wrote Horst Lechner, principal investigator at the institute, which focuses on enzyme design that differs from what is seen in nature.

Basecamp Research is sharing ZymCTRL as open source with researchers and sees an array of potential applications, including designing enzymes for disease treatment and diagnostics, biofuel production, sustainable agriculture innovations and much more.

While ZymCTRL was initially trained on publicly available datasets, it can also be integrated with other datasets, including Basecamp Research's proprietary BaseGraph database, to further optimise the model and improve sequence outputs.

Highlights

- ZymCTRL was first trained on the BRENDA enzyme database, comprising 37M enzyme sequences.
- From this, the team generated sets of carbonic anhydrases, enzymes that accelerate the conversion of carbon dioxide to bicarbonate and so help capture and store CO2, and lactate dehydrogenases, enzymes that help convert sugar into energy in our cells, with no further fine-tuning of the AI model.
- After the proteins were produced and purified, several showed enzyme activity despite less than 40% of their sequences resembling proteins seen in the public database, again with no additional adjustments to the model.
- To correct for potential biases in public databases, which have uneven sampling due to a lack of biodiversity, ZymCTRL was adjusted using a wider range of lactate dehydrogenase sequences from Basecamp Research's proprietary BaseGraph dataset.
- With this fine-tuning, the team created lactate dehydrogenases with higher quality scores in silico (in computer simulations), such as better predicted local distance difference test (pLDDT) values, compared to sequences generated with no prior training.
- Remarkably, active enzymes continued to show significant activity at a high temperature of 45°C as well as across a broad pH range of 4.5 to 9.5, meaning they can work or stay stable in slightly acidic to slightly basic environments. This offers significant industry advantages over naturally occurring lactate dehydrogenases: such pH tolerance allows a single enzyme to be used in many different processes with different pH levels, making it adaptable to many applications.
- Two of the artificial lactate dehydrogenase enzymes were produced in larger amounts and successfully freeze-dried. They kept their activity and showed they could work in complex reactions under harsh conditions, supporting their potential for industrial use.
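For readers who want to make the "less than 40%" figure in the highlights concrete, resemblance between two protein sequences is commonly expressed as percent identity over an alignment. The Python sketch below is illustrative only: it assumes the two sequences have already been aligned (gaps written as "-"), the function name `percent_identity` and the toy sequences are invented for this example, and it is not necessarily how the preprint's authors measured similarity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over aligned columns with no gap.

    Assumption: both inputs come from the same pairwise alignment,
    so they have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy, made-up sequences (not real enzymes, not from the study).
generated = "MKV-LTAGHE"
reference = "MRVALSAG-E"
print(f"{percent_identity(generated, reference):.1f}% identity")  # 75.0% identity
```

In this convention, a generated enzyme described as resembling known proteins by less than 40% would score below 40.0 against its closest database match. Other conventions, such as dividing by the full alignment length, give somewhat different numbers.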

"Beyond the obvious excitement of being able to generate truly de novo proteins, the results are a further testament to the ability of Basecamp Research's dataset to produce better results compared to publicly available datasets, which barely scratch the surface of the Earth's immense biodiversity," added Dr. Glen Gowers, co-founder of Basecamp Research. "Earlier we were able to show that our BaseFold model, also powered by our dataset, outperformed AlphaFold2 in predicting protein structures. Generative AI is going to have a huge impact across biotech, and we're dedicated to collecting the data and tools needed to make its potential a reality."

The full preprint can be found here: https://www.biorxiv.org/content/10.1101/2024.05.03.592223v1

Basecamp Research invites the research community to try ZymCTRL and has released it for public use on Hugging Face: https://huggingface.co/AI4PD/ZymCTRL

For media and other inquiries, please contact press@basecamp-research.com, +44 07867 488769

About Basecamp Research

Basecamp Research is a leader in mapping biodiversity for AI-based design of biological systems. We match and refine novel proteins for our partners' exact industrial, therapeutic or diagnostic applications using BaseGraph™, a new generation of AI design that is powered by the first-ever high-resolution map of global genetic biodiversity.

Understanding the full genetic, evolutionary, and environmental context of each protein allows Basecamp Research to design tailored proteins for specific applications without the need for expensive and time-consuming directed evolution campaigns. We're a team of explorers, scientists and policy experts driven by our ambition to protect and learn from nature's diversity, whilst delivering life-changing breakthroughs to those who need them most. 

For more information, visit www.basecamp-research.com.

View original content: https://www.prnewswire.co.uk/news-releases/basecamp-research-launches-zymctrl-a-world-first-open-source-generative-ai-tool-that-designs-enzymes-for-more-sustainable-industrial-processes-302174359.html