Invited Session 1: Beyond Data: AI-Driven Insights for the Next Era of Drug Development

10:15 AM - 12:00 PM, May 16, 2025; Chem-H Rotunda E241

Chair: Dacheng Liu
Speakers: Bin Chen, James Zou, Herb Pang, Jimeng Sun

Chair: Dacheng Liu

Dacheng Liu is a Highly Distinguished Therapeutic Area and Methodology Statistician at Boehringer Ingelheim, with 20 years of experience in the pharmaceutical industry. He provides leadership in driving statistical quality and fostering innovation in companywide clinical development programs across all therapeutic areas. He represents Boehringer Ingelheim in industry-wide groups and leads collaborations with US partners from both industry and academia. Before his current role, Dacheng held positions as the Global Head of Clinical Data Sciences and the US Head of Statistics, where he led both US and global teams in clinical drug development for the company pipeline. Dacheng has extensive experience leading early and late-phase development in multiple disease areas. He has over 40 publications in areas of clinical research, trial design, statistical methodologies, and AI/machine learning.


Title: AI applications for evidence generation and drug development in industry

Speaker: Herb Pang, Ph.D., Roche/Genentech

Dr Herb Pang is an expert statistical scientist (technical Senior Director level equivalent) at Genentech/Roche. He is an adjunct faculty member in the Department of Biostatistics and Bioinformatics at Duke University School of Medicine and an honorary associate professor at the University of Hong Kong. He was promoted to associate professor with tenure in 2019 at the University of Hong Kong. He obtained his PhD in Biostatistics from Yale University in 2008 and his BA in Mathematics and Computer Science from the University of Oxford in 2002.

His primary research interests include antimicrobial resistance, big data, cancer genomics, classification, comparative effectiveness outcomes research, data science, design and analysis of clinical trials, machine learning, meta-analysis, metagenomics, multi-omics data integration, predictive models, real-world evidence for drug development, and translational medicine. He is a principal investigator on an HHS U01 FDA grant, entitled "Applying novel statistical approaches to develop a decision framework for hybrid randomized controlled trial designs which combine internal control arms with patients' data from real-world data source". He was a principal investigator on an NIH R21 grant, entitled "Translational Meta-analysis for Elderly Lung Cancer Patients", from the National Institute on Aging. He was a principal investigator on an RGC grant entitled "A Familial Meta-Omics Study of the Cutaneous Microbiome in Psoriasis", and HMRF grants entitled "Modelling chemotherapy-induced neutropenia and other hematologic toxicities in elderly patients" and "Translational meta-analysis of immunotherapy studies in lung and liver cancer".

Dr Pang has published over 150 methodological and translational peer-reviewed research articles on statistics, genetics, genomics, bioinformatics, and clinical trials. His work has been published in top journals including Bioinformatics, JAMA, Journal of Clinical Oncology, Journal of Thoracic Oncology, Lancet Oncology, Gastroenterology, Gut, Hepatology, and the Journal of the National Cancer Institute. He received the US Chinese Anti-Cancer Association (USCACA)-Asian Fund for Cancer Research (AFCR) 2015 Scholar Award. From January 2012 to December 2014, he served on the editorial board of the Journal of Clinical Oncology. He has also contributed as a reviewer for over 50 leading journals, such as Annals of Applied Statistics, Briefings in Bioinformatics, Bioinformatics, Biostatistics, JAMA Oncology, Nature Communications, Nucleic Acids Research, Science Translational Medicine, Statistics in Medicine, and Trends in Genetics. He is a member of the American Statistical Association and a fellow of the Royal Statistical Society.

Abstract

This talk covers the potential of AI in evidence generation and drug development in the pharmaceutical industry. Emphasizing the importance of collaboration with academia, it highlights examples of AI applications developed in partnership with universities. The focus will be on real-world applications, demonstrating how machine learning can make clinical trials and drug development more efficient. Additionally, the talk highlights some practical uses of large language models in accelerating drug development. The landscape of regulatory guidance and recent work by regulatory statisticians concerning AI in drug development will also be briefly discussed. This session aims to provide an overview of efforts and AI innovations that are shaping the future of drug development and ultimately improving patient outcomes.


Title: AI models for transcriptomics-based target identification and de-novo drug design

Speaker: Dr. Bin Chen, Michigan State University

Dr. Bin Chen is a tenured associate professor leading a multidisciplinary lab at Michigan State University, with a mission to leverage advanced machine learning and emerging big data to discover new therapeutics. He is also the Founding Director of the Center for AI-Enabled Drug Discovery in the College of Human Medicine at Michigan State University. He was previously a faculty member at UCSF and completed his postdoctoral training at Stanford. His current research areas include machine learning method development, integrative bioinformatics, and EHR mining. He has training in informatics, chemistry, and biology, and working experience in big pharmaceutical companies and small startups. His lab strives to pioneer transcriptomics-based drug discovery, develop foundation models to understand how individual cells respond to perturbations, and utilize massive real-world data to assess drug efficacy.

Abstract

Transcriptomics is one of the most widely used modalities for understanding disease mechanisms, yet its potential in drug discovery remains largely untapped. My lab has developed a suite of tools that leverage large-scale bulk, single-cell, and spatial transcriptomics for drug discovery. In this talk, I will present a deep learning-based drug discovery platform that uses transcriptomic features to screen large compound libraries and optimize lead compounds. To demonstrate its utility, I will share a case study in liver cancer, where we designed a novel compound that improved the IC₅₀ from 4 μM to 0.5 μM, with enhanced in vitro selectivity, favorable pharmacokinetics, and demonstrated in vivo activity. Additionally, I will introduce SPIDER, a zero-shot deep ensemble model for predicting surface protein abundance from single-cell transcriptomics. I will highlight its applications in biomarker and target identification in colon and liver cancers.


Title: AI scientists for drug discovery and development

Speaker: Dr. James Zou, Stanford University

James Zou is an associate professor of Biomedical Data Science at Stanford University. He works on advancing the foundations of ML and cutting-edge scientific and clinical applications. Many of his innovations are widely used in tech and biotech industries. He has received a Sloan Fellowship, the Overton Prize, an NSF CAREER Award, two Chan-Zuckerberg Investigator Awards, a Top Ten Clinical Achievement Award, several best paper awards, and faculty awards from Google, Amazon, Adobe and Apple. His research has also been profiled in the popular press including the NY Times, WSJ, and WIRED.

Abstract

This talk will explore how generative AI agents can enable drug discovery and development. I’ll introduce the Virtual Lab—a collaborative team of AI scientist agents conducting in silico research meetings to tackle open-ended R&D projects. The Virtual Lab designed new nanobody binders to recent Covid variants that we experimentally validated. Then I will discuss some interesting opportunities in designing and optimizing multi-agent interactions.


Title: Large language models for clinical trial design, execution, and analysis

Speaker: Dr. Jimeng Sun, University of Illinois Urbana-Champaign

Dr. Jimeng Sun is a Health Innovation Professor at the Siebel School of Computing and Data Science and Carle Illinois College of Medicine at the University of Illinois Urbana-Champaign. He is also the co-founder of Keiji AI, a pioneering company leveraging artificial intelligence to transform clinical trials through optimization and predictive modeling. His work at Keiji AI includes optimizing trial design, patient recruitment, and outcome prediction to accelerate drug development and improve success rates. Dr. Sun’s research centers on using AI to advance healthcare, with a special focus on improving clinical trials, clinical decision support, drug discovery, computational phenotyping, and clinical predictive modeling. He has been named one of the Top 100 AI Leaders in Drug Discovery and Advanced Healthcare and has an extensive academic impact, with over 400 publications, more than 36,000 citations, and an h-index of 97. Dr. Sun collaborates with top healthcare institutions, including Massachusetts General Hospital, Beth Israel Deaconess, Northwestern, and Vanderbilt, as well as industry leaders such as IQVIA, Medidata and GE Healthcare. He received his B.S. and M.Phil. in computer science from the Hong Kong University of Science and Technology and his Ph.D. from Carnegie Mellon University.

Abstract

Large language models (LLMs) hold great promise for accelerating clinical trials, but fully realizing their potential in medicine requires targeted applications across key research tasks. This talk focuses on three essential steps in clinical trials (design, execution, and analysis), highlighting literature research, clinical research, and data science research. I will first introduce LEADS and TrialMind, which build specialized LLMs for literature mining tasks and boost human-AI collaboration. Next, I will present TrialGPT, an LLM pipeline that streamlines patient recruitment in clinical trials. Finally, I will discuss how LLMs can accelerate hypothesis validation through code generation and analysis of medical and biomedical data.