Key Takeaways
Facts: it takes an average of 15 years from the moment a drug molecule is discovered to commercial use, and only 1 in 10 drugs makes it through to the finish line.
🏆 Holy grail: can AI increase the chance of a drug making it to patients to >10%? Back of the envelope calculation: assuming peak sales of $1bn (low bar), a 10% increase in the Phase 3 PoS is incremental rNPV of ~$400M.
⭐ Low hanging fruit: can AI reduce the drug discovery time i.e. the time it takes to discover a viable drug candidate to < 5 years? For now I only focus on drug discovery. Same back of the envelope calculation, $1bn peak sales, each year saved in drug discovery is rNPV of ~$30-35M (time value 10+ years and 10% PoS).
🧬 AI models like representative, diverse, well structured data. Human biology is complex, dynamic and chaotic. An average human body has seven octillion atoms and any one of these atoms can cause, prevent, or cure a disease.
Lack of rich multimodal data across biological scale (cell to tissue to organ to human) and disease states makes one conclude that truly predictive models that can make a dent on the PoS of clinical trials are a few years away.
We are not in a place where AI can predict whether a molecule will fulfill its promise or won’t cause unwanted/unmanageable toxicity in clinical trials.
⏩ What about the low hanging fruit?
Absolutely yes. Chemistry and physics are more amenable to computational tools than biology and AI is inverting the traditional drug discovery process. According to BCG, AI has already reduced drug discovery timelines by as much as 35% to 40%. In an industry where monetization is constrained by finite patent life, each additional year on the market is significant.
🔸 AI: Guardian Angel or Cognitive Amplifier
Cognitive amplifier 100%, Guardian Angel not yet, but that’s the future both literally (impact on health outcomes) and figuratively (life science doesn’t lag other sectors in shareholder returns).
A basic primer on how to think about AI in drug discovery and development without getting lost in the complexity of the space.
Detailed views if you’re curious about the basics
This is meant for those who are not in the details of the techbio, biotech and pharma industries but are curious and would like to understand where we are today and the possibilities for the future.
We have all heard and read that drug discovery and development is long, complex, and high-risk. More specifically, it takes an average of 15 years from the moment a drug molecule is discovered to its approval for commercial use. And it’s not just the time: only 1 in 10 drugs makes it through to commercial use.
Drug discovery is the first stage and takes approximately 5 years out of that 15-year timeline. The goal is to find and design a molecule that can modulate (inhibit or activate) the function of a biological target that is thought to cause the disease.
Why does drug discovery take so long?
It usually starts with understanding the driving mechanisms of disease. An average human body has seven octillion (7×10^27) atoms and any one of these atoms can cause, prevent, or cure a disease.
Let’s assume the underlying disease drivers are well understood (big if). It is then a game of lego at the molecular level. The universe of potential drug-like molecules is estimated to be between 10^30 and 10^60.
How do we find the molecules that bind to the biological target and modify its activity? If we find a molecule that binds to the target, it also needs to be synthesizable, stable enough to survive the acidic conditions of the stomach and the liver enzymes trying to break it down, and soluble (i.e. hydrophilic) yet hydrophobic enough to bind to its target.
This is like searching for a needle in a haystack in the 10^30 to 10^60 combinatorial space of potential molecules. Historically it has involved a lot of back and forth, and 3 years or more, before a molecule with viable properties that can bind to the biological target of interest is identified.
Once a drug candidate is identified then it moves through a series of additional in vitro and in vivo tests including animal testing to make sure it is safe for human trials. This takes another 18-24 months including manufacturing the drug for human trials. Around 50% of the drugs fail at this stage due to safety issues that arise during animal testing.
If a molecule has made it this far then it gets into human trials which take an average of 7-10 years and only 10% of drugs entering clinical trials make it to commercialization.
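To see how the stage figures above compound, here is a rough sketch that simply multiplies and adds them up. It assumes the stages are independent and takes the "3 years or more" search phase as exactly 3 years; the function and variable names are made up for illustration.

```python
from math import prod

# Survival odds quoted in the text for each stage after a candidate is found.
SURVIVAL = {
    "preclinical safety testing": 0.50,  # ~50% fail on safety in animal testing
    "clinical trials": 0.10,             # ~10% of trial entrants reach the market
}

# (low, high) stage durations in years, as quoted in the text.
DURATION_YEARS = {
    "finding a viable molecule": (3.0, 3.0),  # "3 years or more", taken as 3 (assumption)
    "preclinical testing": (1.5, 2.0),        # 18-24 months
    "clinical trials": (7.0, 10.0),
}


def overall_success(survival):
    """Chance a candidate reaches the market, assuming independent stages."""
    return prod(survival.values())


def total_timeline(durations):
    """Sum the low and high ends of every stage duration."""
    low = sum(lo for lo, _ in durations.values())
    high = sum(hi for _, hi in durations.values())
    return low, high


print(f"Candidate-to-market odds: {overall_success(SURVIVAL):.0%}")
print("Timeline range (years):", total_timeline(DURATION_YEARS))
```

On these numbers, roughly 1 in 20 identified candidates reaches patients, and the stage durations add up to somewhere between 11.5 and 15 years.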
Why do such a high proportion of drugs fail in clinical trials?
Analyses of clinical trial data from 2010 to 2017 show that the 90% of failures in drug development are due to: lack of clinical efficacy (40%–50%), unmanageable toxicity (30%), poor drug-like properties (10%–15%), and lack of commercial need and poor strategic planning (10%) [1].
The above data on timelines and failure rates is pre-AI, so the obvious question becomes: can the use of AI in drug discovery and development deliver on the following?
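A quick way to make those percentages concrete is to translate them into head counts for a cohort of 100 drugs entering clinical trials. The sketch below is only arithmetic on the figures quoted above; the cohort size and the names are illustrative assumptions.

```python
FAILURE_RATE = 0.90  # share of drugs entering trials that never reach the market

# Share of failures attributed to each cause, as (low, high) from the text.
FAILURE_SHARES = {
    "lack of clinical efficacy": (0.40, 0.50),
    "unmanageable toxicity": (0.30, 0.30),
    "poor drug-like properties": (0.10, 0.15),
    "commercial need / strategic planning": (0.10, 0.10),
}


def failures_per_cohort(cohort=100):
    """Return {cause: (low, high)} counts of failed drugs for a cohort."""
    failed = cohort * FAILURE_RATE
    return {
        cause: (round(failed * lo, 1), round(failed * hi, 1))
        for cause, (lo, hi) in FAILURE_SHARES.items()
    }


for cause, (lo, hi) in failures_per_cohort().items():
    print(f"{cause}: {lo} to {hi} of 100 drugs")
```

Out of 100 drugs entering trials, that is roughly 36 to 45 lost to efficacy and about 27 lost to toxicity, which is why those two causes dominate the rest of this discussion.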
The holy grail with the biggest impact on returns and health outcomes:
- increase the chance of a drug making it to patients to >10% by substantially improving the efficacy of drugs in humans and simultaneously keeping toxicity low/manageable?
& the low hanging fruit:
- reduce the drug discovery time i.e. the time it takes to discover a viable drug candidate to < 5 years?
AI, with its ability to process vast datasets and uncover hidden patterns, is the technology that can search for the needle in the haystack of 10^30 to 10^60 potential molecules. But is it as simple as embracing AI, and will that automatically translate into drugs that don’t fail in clinical trials due to lack of efficacy or unwanted toxicity?
The efficacy of a molecule depends on how closely the biological target the molecule is designed to modulate is related to the disease. Human biology is complex, dynamic and chaotic. The 37 trillion cells in the human body are organized in a hierarchical structure from cells to tissues to organs to the human and all these structures are constantly communicating through electrical and chemical signals.
Current AI models are not trained on the dynamic hierarchical molecular interactions but on static and limited data sets. We do not have rich multimodal data across biological scale (cell to tissue to organ to human) and disease states which limits AI models of human biology. For many complex diseases such as Alzheimer’s, and other neurodegenerative diseases, even the underlying mechanism is poorly understood.
We can safely say that no current AI model is anywhere near the capability needed to improve the baseline of ‘40%-50% of drugs fail due to lack of clinical efficacy’.
Can AI ever model the dynamic and complex nature of human biology? Priscilla Chan and Mark Zuckerberg via their Biohub announced their intention to go all in to solve this problem.
“Advances in artificial intelligence are already starting to give us new tools to understand and engineer biology,” said Head of Science Alex Rives. “As we bring together frontier artificial intelligence with frontier biology, it will become possible to build predictive models of biology that could greatly accelerate the rate at which fundamental new scientific discoveries can be made.”
What about toxicity, the next biggest driver of failed clinical trials?
Toxicity is a fine balance: the drug needs high exposure in disease-targeted tissues to achieve adequate efficacy, but minimal exposure in healthy tissues to avoid toxicity even at high doses. Traditionally, toxicity assessment was delayed until several drug candidates were identified. In vitro (outside the living organism) and in vivo (inside the living organism) models were then used to assess the toxicity of a molecule.
AI in toxicology assessment has moved the identification of potential tox issues earlier in the drug discovery process, i.e. while the drug is being designed in silico and shortlisted as a potential candidate, instead of after the pool of drug candidates has been shortlisted.
But AI’s ability to predict a drug’s likely toxicity profile in clinical trials is hampered by the same data challenges that plague AI for biology. Data from in vitro environments doesn’t model the complex realities of human biology. If we don’t understand the cascade of events when a drug hits its target, i.e. not just what happens to the cell but how that affects the other pathways associated with the target, then how can we predict off-target or on-target toxicity with confidence? We are not yet at the stage where we can claim victory and meaningfully move the needle away from ‘30% of drugs fail due to unmanageable toxicity’.
So is AI in drug discovery and development hype or is it delivering tangible results?
Back to the low hanging fruit:
‘Can we reduce the time it takes to discover a viable drug candidate to less than the historical average of 5 years?’
Compressing drug discovery timelines is where AI is having the biggest direct impact. Even though we have only mapped around 10% to 15% of human biology, chemistry can be handled much better computationally. AI can predict the precise 3D structure of a disease-related target and computational chemists can immediately begin designing drugs that can bind to the target effectively, a process that pre-AI would take years of lab work.
AI is inverting the traditional drug discovery workflow: computational screening and optimized design parameters are used to shortlist the pool of candidates that are tested in the lab, instead of a lab- and labor-intensive ‘whack a mole’ lead optimization process.
According to BCG, AI has already reduced the time it takes to identify and validate promising targets by 35% to 40%, i.e. a significant reduction in the ‘on average’ 5-year drug discovery timeline. In an industry where monetization is constrained by finite patent life, each additional year on the market is significant.
Conclusion
AI models are only as good as the quality and volume of the data they are trained on, and the lack of high quality data sets that can model the complexity of human biology is the biggest bottleneck in meaningfully reducing the technical risk associated with drug discovery.
However, AI’s impact on condensing drug discovery cycle time and increasing the number of years of commercial life should not be underestimated. Take a very simple hypothetical example where a drug has peak sales of USD 1bn (simply the threshold for what is considered a blockbuster; most blockbusters have multi-billion dollar peak sales). If drug discovery timelines are shortened by 2 years, the drug gains 2 additional years at peak sales, i.e. an additional USD 2bn. The risk-adjusted NPV of that gain, i.e. NPV adjusted for the probability of a drug discovery program reaching commercialization and discounted at a 10% cost of capital, is USD 65M+, or roughly USD 30-35M per year gained. The numbers are small because the impact is on the earliest stages of drug discovery, where the probability of reaching the market is only 10% or less.
The above back of the envelope calculation also highlights why the real gains from AI will come from increasing the probability of success, i.e. reducing the number of drugs that fail due to lack of efficacy or unmanageable toxicity.
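The back of the envelope figure can be sketched in a few lines. This is one possible reading, not the exact model behind the numbers above: it treats peak sales as the cash flow (no margins or taxes), uses the 10% probability of success and 10% cost of capital from the text, and assumes the two extra years of sales land 11 and 12 years from today, which is my own interpretation of "10+ years". The function names are hypothetical.

```python
def risk_adjusted_pv(cash_flow, pos, discount_rate, years_out):
    """Present value of one future cash flow, weighted by probability of success."""
    return cash_flow * pos / (1 + discount_rate) ** years_out


def value_of_years_gained(peak_sales, years_gained, first_year_out,
                          pos=0.10, discount_rate=0.10):
    """Sum the risk-adjusted value of each extra year of peak sales."""
    return sum(
        risk_adjusted_pv(peak_sales, pos, discount_rate, first_year_out + i)
        for i in range(years_gained)
    )


# Assumption: the two extra sales years arrive 11 and 12 years from now.
total = value_of_years_gained(peak_sales=1e9, years_gained=2, first_year_out=11)
print(f"rNPV of 2 years gained: USD {total / 1e6:.1f}M")
print(f"Per year gained:        USD {total / 2e6:.1f}M")
```

Under those assumptions the total comes out in the mid USD 60M range, or a little over USD 30M per year gained. Moving the cash flows a year earlier or later shifts the result by about 10%, so the figure is best read as an order of magnitude.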
Let’s take the same example i.e. a drug with peak sales of USD 1bn. Currently only 50% of drugs in Phase 3 make it to the finish line and the remaining 50% fail due to lack of efficacy (50%), toxicity (30%), poor drug design (10%). A 10% improvement in the probability of success due to better predictive capabilities of AI would be incremental rNPV of USD400M.
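The shape of this second calculation can be sketched the same way. The post does not spell out its inputs, so everything beyond the USD 1bn peak sales and 10% discount rate is my own illustrative assumption: the improvement is read as 10 percentage points of Phase 3 success probability, launch comes 3 years after Phase 3 starts, and the drug then sells at a flat USD 1bn a year for 8 years, with sales treated as cash flow.

```python
def npv_of_sales(peak_sales, years_to_launch, years_on_market, discount_rate):
    """Discount a flat stream of annual sales that starts after launch."""
    first = years_to_launch + 1
    last = years_to_launch + years_on_market
    return sum(peak_sales / (1 + discount_rate) ** t
               for t in range(first, last + 1))


def incremental_rnpv(delta_pos, peak_sales, years_to_launch,
                     years_on_market, discount_rate):
    """Extra risk-adjusted value from raising the probability of success."""
    return delta_pos * npv_of_sales(
        peak_sales, years_to_launch, years_on_market, discount_rate
    )


# All timing inputs below are hypothetical, chosen only for illustration.
gain = incremental_rnpv(
    delta_pos=0.10,       # +10 percentage points of Phase 3 success
    peak_sales=1e9,
    years_to_launch=3,
    years_on_market=8,
    discount_rate=0.10,
)
print(f"Incremental rNPV: USD {gain / 1e6:.0f}M")
```

With these inputs the result is about USD 400M, in line with the figure above, though that agreement depends on the timing assumptions I picked. The broader point holds across reasonable inputs: a late-stage cash flow is discounted over only a few years and is multiplied by the full value of the drug, so a gain in probability of success is worth several times more than a year saved in discovery.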
The verdict
AI: Guardian Angel or Cognitive Amplifier or Both? It’s already a cognitive amplifier, but the future is AI as a guardian angel: literally, by fulfilling its promise to find cures for diseases, and figuratively, by bringing shareholder returns in biotech and big pharma in lockstep with tech firms.
