
The researchers showed that the redox cofactor preference of malic enzymes can be switched by combining phylogenetic analysis with machine learning, without experimental screening. The machine learning model takes the amino acid sequences of structurally homologous but functionally distinct enzymes as input to navigate efficiently toward the target function, and could provide new fundamental insights into enzyme–substrate specificity. Credit: 2022, Teppei Niide, Logistic Regression-guided Identification of Cofactor Specificity-contributing Residues in Enzyme with Sequence Datasets Partitioned by Catalytic Properties, ACS Synthetic Biology
Enzymes perform a number of useful functions that can be leveraged in applications such as pharmaceutical and biofuel production. For enzymes to function effectively in laboratory or industrial settings, rather than within their natural cellular environment, mutations must be introduced that alter the amino acid sequence in a way that retains or improves function in the new environment. Identifying which amino acids to mutate often requires extensive trial and error, and while computer-based methods can help, they typically rely on the crystal structures of enzymes, which may not be available. Researchers from Osaka University have now developed a new AI method that uses phylogenetic analysis to better identify candidate amino acids for mutation without needing the enzyme’s crystal structure.
For their study, the researchers focused on amino acid residues that contribute to substrate and cofactor specificity. The team used a dataset containing nearly 1,000 amino acid sequences of malic enzymes from different species to train a logistic regression model to identify and rank candidates for mutation. Across diverse species, these enzymes have the same fold structure but exhibit different cofactor specificity due to amino acid mutations occurring throughout the phylogenetic tree. By identifying amino acid sequences that did not change over the course of evolution, the researchers could single out the mutations that are adaptations to different cellular conditions in different species. The AI model ranks mutations by their contribution to cofactor specificity, which can help engineers target sequences with less guesswork in their experiments.
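The general idea of ranking residues with a logistic regression model can be illustrated with a minimal sketch. This is not the authors' code: the toy alignment, the labels, the one-hot encoding, the training settings and the helper names (`one_hot_features`, `train_logreg`, `rank_positions`) are all assumptions made for illustration, and the real study used nearly 1,000 sequences. Each aligned position-residue pair becomes a feature, the model is trained to separate NADP+-dependent from NAD+-dependent sequences, and positions are ranked by the size of their largest weight.

```python
import math

# Hypothetical toy alignment (not real malic enzyme data).
# Label 1 = NADP+-dependent, label 0 = NAD+-dependent.
SEQS = [
    ("MKRGA", 1), ("MKRGS", 1), ("MRRGA", 1), ("MRRGT", 1),
    ("MKDGA", 0), ("MRDGS", 0), ("MKDGT", 0), ("MRDGA", 0),
]

def one_hot_features(seqs):
    """One-hot encode every (position, residue) pair seen in the alignment."""
    keys = sorted({(i, aa) for s, _ in seqs for i, aa in enumerate(s)})
    index = {k: j for j, k in enumerate(keys)}
    rows = []
    for s, _ in seqs:
        row = [0.0] * len(keys)
        for i, aa in enumerate(s):
            row[index[(i, aa)]] = 1.0
        rows.append(row)
    return keys, rows

def train_logreg(X, y, lr=0.5, l2=0.01, epochs=500):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gwj / n + l2 * wj) for wj, gwj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def rank_positions(keys, w):
    """Rank alignment positions by their largest absolute feature weight."""
    score = {}
    for (pos, _aa), wj in zip(keys, w):
        score[pos] = max(score.get(pos, 0.0), abs(wj))
    return sorted(score, key=score.get, reverse=True)

keys, X = one_hot_features(SEQS)
y = [label for _, label in SEQS]
w, b = train_logreg(X, y)
ranking = rank_positions(keys, w)
print(ranking[0])  # top-ranked position (0-based) in this toy alignment
```

In this toy data only position 2 (0-based) differs consistently between the two groups, so it receives the largest weights, while conserved or uninformative positions stay near zero. A real analysis would start from a proper multiple sequence alignment and would likely use an established library rather than a hand-written optimizer.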
The team demonstrated the utility of their AI method by using the model to identify the amino acid sequences that drive redox cofactor specificity in NADP+-dependent and NAD+-dependent malic enzymes, which enabled them to switch the Escherichia coli malic enzyme from NADP+ to NAD+ dependence. The crystal structure of the E. coli malic enzyme has not yet been elucidated, and the model showed that the residues contributing most to cofactor specificity are ones that may be difficult to identify from crystal structure observations, the authors wrote. The researchers’ machine learning model can therefore be valuable for engineering enzymes both with and without available crystal structures. The study was published in ACS Synthetic Biology.
“By using artificial intelligence, we identified unexpected amino acid residues in malic enzyme that correspond to the enzyme’s use of different redox cofactors. This helped us understand the substrate specificity mechanism of the enzyme and will facilitate optimal engineering of the enzyme in laboratories,” said co-senior author Hiroshi Shimizu.
This work could dramatically accelerate efforts to substantially reconfigure an enzyme’s specific mode of action without fundamentally altering its function, and improve their success rate, which could aid future advances in fields such as biomedicine and biofuel production, the researchers said.