Giovanni Paternostro and Guy Salvesen have spoken with Gene Yeo. Gene is Professor of Cellular and Molecular Medicine at the University of California San Diego (UCSD), the founding Director for UCSD’s Center for RNA Technologies and Therapeutics, the founding Chief Scientific Advisor for Sanford Laboratories for Innovative Medicine, a founding member of the Institute for Genomic Medicine and member of the UCSD Stem Cell Program and of Moores Cancer Center. He is also a co-founder of several biotech companies and serves on the scientific advisory board of the Allen Institute of Immunology. Gene is the founder of the SCREEN (San Diego Covid-19 Research Enterprise Network, 2020) and founding member of the SEARCH (San Diego Epidemiology and Research for Covid Health, 2020) alliances in San Diego. SCREEN had ~1000 scientist members in San Diego focusing on grassroots research coordination and community outreach. His primary research interest is in understanding how RNA processing is regulated and the roles that RNA binding proteins (RBPs) play in development and disease. He is organizing a Symposium on RNA and AI that will take place in La Jolla on October 29th:
https://rnacenter.ucsd.edu/events/rna-and-ai-symposium.html.
Dear Gene,
What could be achieved if there were a public or nonprofit AI effort with the same scale and level of funding as the current large private efforts? What would be the benefits for society?
Gene:
AI is obviously contributing to so many different aspects of science, but there are two problems that especially resonate with me.
One is the "virtual cell", a project which was initiated by Sidney Brenner. He worked on this with Terry Sejnowski and with many other scientists. What they were asking is: if you were to build a model of every chemical and biochemical reaction, and use every available biomedical data set, could you predict a cell's response in different conditions?
In my own work, I have tried to understand the problem of cellular resilience and how it translates into organismal resilience. Why do people age at different rates? Why are some more immune to diseases than others? Can you develop a digital avatar of yourself, starting from cells all the way up to the individual, and use it to predict your future health? You could predict if someone will have a disease before they get it, and then you could start thinking about how to repair it prophylactically. In the RNA space, can you provide genetic therapies decades before the disease will appear? This would require an understanding of how all the parts and components act together holistically and then come together into your digital health twin. If we could do that, it would be amazing, and AI can play an important part.
In biology the bottleneck is in the data. We're collecting data types at higher throughput now, but without enough consistency and without using the right sort of models. For life sciences, I think the problem is not in the algorithms, but in generating a wealth of data at sufficiently high quality for AI models.
The other exciting area is the development of therapeutics. Biotech and pharma companies work on therapeutics that make money. In a nonprofit setting, we could work on therapeutics that cure people and may or may not lead to a direct profit for a company. I think there's a lot of room for smart therapeutics in a nonprofit environment.
We have another question.
We are encouraging researchers at different career stages to share ideas about complex science problems that could benefit from a large-scale AI effort. We found that motivation and recognition could be provided if you and other well-known scientists were willing to talk to people who suggest the best ideas. You would be the judge and decide whether any idea deserves your attention. Any scientist selected might receive advice but could also be a potential collaborator. Many ideas will be produced, and society will take notice. Would you be willing to talk to any of these scientists?
Gene:
Yes, I am always happy to participate in interesting scientific discussions. I think people are happy to share ideas if they realize that the problems are too hard to be solved by any individual. We can think of efforts like the Human Genome Project, or the deciphering of the Enigma machine codes during World War 2, or the Manhattan Project. In these cases, very large teams of people could put aside their egos to solve a problem together, because the reward mattered for everybody. The problem must be big enough.
As I mentioned, in the life sciences the challenge is that high-throughput data sets come from many individual groups, and they're not always on the same system, or collected in the same way, with the right sort of controls. So you cannot just absorb the world's collective data sets and then build something similar to a large language model. There's so much variation that the algorithms would just be learning the variation. The bottleneck is to pick the right questions and to fund the right consolidated efforts to generate the data that AI needs.
Let me give you a very simple example in the RNA space. There are about 140 different chemical modifications you can make on RNA. How many of these can we read by any sequencing technology? Only very few. Why not the other 120 or 130? I think the technologies we use today could read them. We just don't have the right training data. There are companies that could make the training data. It would be expensive, but they could chemically synthesize these different modifications. Why don't they do that? Because there's no market. There's no market because they don't know what diseases they could diagnose using the RNA modifications.
The commercial players won't invest because there's no market, but this is exactly where government and philanthropy come in: to fill the gap, to create enough fundamental knowledge to lower the barrier for private efforts.
Another point is the potential role of patrons in furthering scientific knowledge, as they have often done in the arts over the centuries. The challenge is to convert scientific knowledge into something that everyone appreciates and therefore something that has value for potential patrons. Many people don't understand science or appreciate it and sometimes are even suspicious of it. I think we have been remiss in how we communicate the importance of what we do to the public.
The historical section of the website presents an analysis of the role of patrons in the rise of modern science that might interest you.
Gene:
I look forward to reading it.
Regarding data for AI, many have started to worry that LLMs might not keep improving at the present rate because they might be close to having used all the information available on the internet. Scientists could be an almost endless source of data, both by suggesting ideas and by producing the experimental data.
Gene:
Yes, but we must produce the right type of data, and in the right format.
Many thanks, we really enjoyed the discussion, and we look forward to continuing it.