Jack Gilbert is a UCSD Professor in Pediatrics and at the Scripps Institution of Oceanography (SIO), Deputy Director for Research at SIO, Associate Vice Chancellor for Marine Sciences at UCSD and Director of both the Microbiome and Metagenomics Center and the Microbiome Core Facility. He cofounded the Earth Microbiome Project and the American Gut Project. In 2021 Dr Gilbert became the UCSD PI for the National Institutes of Health’s $175M Nutrition for Precision Medicine program. In 2023 he became President of Applied Microbiology International and won the IFF Microbiome Science Prize.
Dear Jack,
What could be achieved if there were a public or nonprofit AI effort with the same scale and level of funding as the current large private efforts? What would be the benefits for society? You work in a complex biological field that is important both for human health and for the environment, the study of the microbiome.
Jack:
The key problem with the microbiome is that it is incredibly high-dimensional. You have hundreds to thousands to millions of species in a given ecosystem. Each one of those species has a genetic repertoire of maybe 7,000 genes, and each one of those genes embodies a pathway which is interconnected to the potential generation of hundreds of thousands if not millions or billions of chemical products. So, the dimensionality of the microbiome is incredible. The other problem is that it changes all the time.
The microbiome has an important role in many fields: human medicine, environmental management, pollution remediation, carbon dioxide sequestration with microbiome products, improving agricultural sustainability, improving aquaculture sustainability and productivity, climate gas activations or removing pollution from the air, and all the applied programs which would hit the UN Sustainable Development Goals. Using microbial solutions to fix these problems is hampered by the fact that we don't fundamentally understand how microbes interact and communicate.
Some of the best work on this problem is being done by Karsten Zengler, at UCSD. His work is demonstrating that you can predict which organisms a microbe might compete with.
If you have a community of 100 species and add in the 101st species, which organisms will that species compete with, which organisms will it have synergies with, and how will that affect the abundance of the organisms that were there to begin with? This seems like a simple problem, but it turns out to be highly intractable computationally. You can make some predictions, but you have to make huge numbers of assumptions, and your predictive accuracy is somewhere around 60 to 65 percent.
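To make the "101st species" question concrete, here is a minimal sketch using a generalized Lotka-Volterra model, a standard textbook model of interacting populations. It is an illustration only and is not the method used by the researchers mentioned above. Every number in it (the interaction strength `sigma`, the growth rates, the starting abundances) is an assumption chosen so that the toy community settles to a steady state. Real communities lack such a known interaction matrix, which is exactly the difficulty described here.

```python
import random


def simulate_glv(x0, r, A, dt=0.05, steps=400):
    """Forward-Euler integration of dx_i/dt = x_i * (r_i + sum_j A[i][j] * x_j)."""
    x = list(x0)
    for _ in range(steps):
        growth = [r_i + sum(a * x_j for a, x_j in zip(row, x))
                  for r_i, row in zip(r, A)]
        # Abundances cannot go negative, so clip at zero after each step.
        x = [max(0.0, x_i + dt * x_i * g) for x_i, g in zip(x, growth)]
    return x


def random_interactions(n, sigma, rng):
    """Hypothetical interaction matrix: weak random cross-effects, self-limitation of -1."""
    return [[-1.0 if i == j else rng.gauss(0.0, sigma) for j in range(n)]
            for i in range(n)]


def add_species(A, sigma, rng):
    """Return a copy of A extended by one newcomer (new last row and column)."""
    n = len(A)
    # New last column: the newcomer's effect on each resident.
    B = [row + [rng.gauss(0.0, sigma)] for row in A]
    # New last row: each resident's effect on the newcomer, plus self-limitation.
    B.append([rng.gauss(0.0, sigma) for _ in range(n)] + [-1.0])
    return B


rng = random.Random(0)
n, sigma = 100, 0.02  # assumed values, not measurements

A = random_interactions(n, sigma, rng)
resident = simulate_glv([0.1] * n, [1.0] * n, A)

B = add_species(A, sigma, rng)
after = simulate_glv(resident + [0.01], [1.0] * (n + 1), B)

# Change in each resident's abundance caused by the newcomer.
shift = [after[i] - resident[i] for i in range(n)]
ranked = sorted(range(n), key=lambda i: shift[i])
print("most suppressed residents:", ranked[:5])
print("most boosted residents:", ranked[-5:][::-1])
print("newcomer abundance:", round(after[n], 3))
```

In this toy setting the answer falls out directly because the interaction matrix is given. In practice each of its roughly 10,000 entries would have to be inferred from data, and that is where the assumptions pile up.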
So how do you improve that?
In my opinion you need to use AI to interrogate experimental systems that are well parameterized, to identify causal relationships between metabolic interacting pathways.
We talk about that a lot, we put it in our reviews, we put it in our perspectives. There are some basic efforts to try to leverage AI platforms to facilitate that, but the problem is that nearly all the statistical approaches that would allow us to identify those metabolic relationships generate huge numbers of potential positive hits.
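A small sketch of why pairwise screening produces so many hits, and of one standard remedy. The figure of 1,000 pathways is a hypothetical illustration, not a number from this conversation, and the Benjamini-Hochberg procedure shown is one common way to control the false discovery rate, not necessarily what any particular group uses.

```python
import math


def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of hypotheses kept at the given false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under the BH line.
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical screen: test every pair among 1,000 metabolic pathways.
n_pathways = 1000
n_pairs = math.comb(n_pathways, 2)
alpha = 0.05
# If no pair were truly related, this many would still pass an uncorrected test.
expected_false_hits = alpha * n_pairs
print(n_pairs, "pairwise tests, about", round(expected_false_hits), "false hits expected")

# Toy p-values for ten candidate interactions.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
naive_hits = [i for i, p in enumerate(p_values) if p <= alpha]
kept = benjamini_hochberg(p_values, fdr=0.05)
print("uncorrected hits:", naive_hits)
print("hits after correction:", kept)
```

Correction of this kind trims the list, but it only ranks statistical associations. It does not say which of the surviving pairs reflects a causal metabolic interaction.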
We might be able to leverage AI to look at which interactions might be most probable and interrogate ever-increasing data set sizes, especially longitudinal data sets, data sets over time where we can track and see how metabolism is shifting within a community.
If we were able to generate those data, or leverage existing data and add more data, and use AI to identify probable causal metabolic interactions, and then use the outcome of this probability assessment to determine predictive accuracy for how microbial communities communicate metabolically and compete and synergize, then I think we'd have the ability to restrict the hypotheses and the needed evidence to a limited number of experiments that could be tractable in a lab.
Otherwise, we would be just searching such an enormous state space, such an enormous hyperdimensional volume of potential experiments, that it becomes next to impossible.
Our problem is so much bigger than genetics. The human genome is paltry in comparison to the complexity of dealing with a microbial community. Also, human genetics doesn't change much. There is epigenetics, there are some mutations over time, but not a lot compared to the microbiome. The microbiome is undergoing changes every hour metabolically and in terms of abundance every day, so it's just much more complex.
So that's where I see AI being the biggest utility. And what can we apply that to? Well, I want to give two examples.
Right now, I think we have identified a microorganism which promotes colorectal cancer.
This is just one of many examples I can give in human health. We hypothesized that this microorganism stimulates tumorigenesis using an enzyme called collagenase, which attacks collagen in the gastrointestinal lining and causes a change in extracellular polysaccharide synthesis, which leads to a shift in the oxidative stress of the cell. This can trigger cancerous cell formation.
This mechanism of action for causing colorectal cancer is also held by lots of other species. The ability of this microorganism to do damage appears to be related to its abundance, the abundance of other species, the diet and genetic predisposition and overall oxidative stress, which is associated with pollution and psychological stress and many other factors.
Even if I had a drug that could knock out this bug, or if I had another bug competing against it, I couldn't easily predict the consequences because the interaction space and number of confounding variables to predict efficacy for that drug are incredibly large. It might work in 40% of the people, but in 60% of the people it is going to be completely useless because of these other confounding variables.
So not only do I need to be able to identify how this microbe is interacting biochemically with other microbes in the environment, I need to understand the dose-dependent response by which it will impact colorectal cancer formation. And then I need to understand its interactions with the human environment in terms of diet and lifestyle.
So, everything is connected and there is no easy way to find a solution. You need to do a massive AI-based interrogation.
The other application I want to highlight is to the soil and to the environment.
We have identified lots of different bacterial strains which appear to promote the growth of plants and make plants more resilient to drought. They change the carbon and nitrogen metabolic pathways in the soil to help create a more porous soil and a soil that's more likely to support plants under current climate change and reduce erosion.
We are trying to manipulate the microbiology of soil, but this is a very big problem, because soil is the most diverse microbial ecosystem on the planet. We're trying to link how microbes that we introduce into the soil will affect and change the carbon, nitrogen, phosphorus and sulfur metabolism inside the soil. That is a metabolic interaction like we saw in the gut, but, additionally, the plants and crops that we work with pump out their own incredibly diverse cornucopia of chemical compounds, mostly carbon-based compounds, which stimulate microbial activity in the soil.
What we're trying to do is re-engineer the microbial ecosystem of the soil and the rhizosphere of the plants. A particular crop species produces these 500 different carbon substrates out of its roots, it exudes them into the environment, and then we're looking to take bacteria from our collections, which can grow on those substrates and create a metabolic cascade into the soil environment to help to lock away carbon and produce glomalin compounds, which will make the soil healthier and more resilient.
One plant of a given variety can produce 500 different types of carbon and interact with hundreds of species of probiotic bacteria to change the metabolic chemistry of a cubic foot of soil, by stimulating the productivity of potentially 100,000 different bacterial, archaeal and fungal species.
Numerically, that's an insurmountable computational problem to solve. We cannot predict how the interactions will lead to impacts. We have to simplify it to the point of it being almost overfitted and unusable for precision agriculture.
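A rough back-of-envelope count shows the scale of that problem. It uses the 500 root substrates and 100,000 soil taxa mentioned above. The figure of 200 probiotic strains is an assumption standing in for "hundreds", and the count covers only pairwise relationships, so it understates the real space.

```python
import math

n_substrates = 500      # carbon compounds exuded by the roots (from the text)
n_soil_taxa = 100_000   # bacterial, archaeal and fungal species (from the text)
n_probiotics = 200      # assumed stand-in for "hundreds" of probiotic strains

# Which taxon can use which substrate?
substrate_taxon_pairs = n_substrates * n_soil_taxa
# Which introduced strain affects which resident taxon?
probiotic_taxon_pairs = n_probiotics * n_soil_taxa
# Which resident taxa affect each other (unordered pairs)?
taxon_taxon_pairs = math.comb(n_soil_taxa, 2)

total_pairwise = substrate_taxon_pairs + probiotic_taxon_pairs + taxon_taxon_pairs
print(f"substrate-taxon pairs: {substrate_taxon_pairs:,}")
print(f"probiotic-taxon pairs: {probiotic_taxon_pairs:,}")
print(f"taxon-taxon pairs:     {taxon_taxon_pairs:,}")
print(f"total pairwise terms:  {total_pairwise:,}")
```

Even this simplified tally runs to about five billion candidate relationships for a single plant variety, before time, soil chemistry or higher-order interactions are considered.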
For the soil project, we recently received three million to launch a new program with field trials and greenhouse studies to generate the data to see if we can answer these kinds of questions. And for the colorectal cancer project, we have a $400,000 clinical trial starting this month, which will hopefully allow us to do a better investigation of how these microbes interact and how ecological dynamics occur in the guts of people who are at risk of this type of cancer.
There is one more thing we would like to ask you.
We are encouraging researchers at different career stages to share ideas about complex science problems that could benefit from a large-scale AI effort. We found that motivation and recognition could be provided if you and other well-known scientists were willing to talk to the people who suggest the best ideas. You would be the judge and decide whether any idea is, in your view, deserving of attention. Any scientist selected might receive advice but could also be a potential collaborator. Many ideas will be produced, and society will take notice.
Jack:
I would be happy to be involved. I love having those discussions. I love stimulating ideas.
I have always been very collaborative, and there's way more that needs to be done than we can ever handle. I can even come up with very discrete research programs, and I'm happy to share those if anybody wants to take our data and do something fun with it in the AI space.
REFERENCES
- Gaines, Sara, et al. "Western diet promotes intestinal colonization by collagenolytic microbes and promotes tumor formation after colorectal surgery." Gastroenterology 158.4 (2020): 958-970.
- Zarraonaindia, Iratxe, et al. "Response of horticultural soil microbiota to different fertilization practices." Plants 9.11 (2020): 1501.