Keynotes,
Panels & Demos
AAAI 2024 Symposium on Clinical Foundation Models
25 – 27 March, 2024 at Stanford University
Day 3
Time Series Foundation Models
Keynote 6, 9:00 AM
Mononito Goswami, Ph.D. student in Robotics at Carnegie Mellon University
Abstract: Large language and vision models pre-trained on vast quantities of text and image data have many desirable properties: these models perform well on a variety of tasks on data from diverse domains, with little or no supervision, and can be tuned to perform well on specific tasks. There is a growing interest in unlocking these key capabilities for time-series data through time-series foundation models. I will begin this talk by breaking down some characteristics of a time-series foundation model. I will then discuss key challenges in building and pre-training these models, and recent approaches including MOMENT, MOIRAI, Lag-Llama, TimesFM, and TimeGPT. I will also review strategies to reprogram large language models for time-series forecasting and other prediction tasks. Finally, I will end the talk by discussing opportunities for future work, in particular on multimodal time-series and text foundation models, and on holistic benchmarking and evaluation.
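Several of the models named above (MOMENT in particular) are pre-trained with a masked-patch reconstruction objective. As a rough, hypothetical sketch of that idea only, not any model's actual code, the series is split into patches, a random subset is hidden, and the loss is computed on the hidden patches alone:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, purely for reproducibility

def mask_patches(series, patch_len=8, mask_frac=0.3):
    """Split a 1-D series into non-overlapping patches and zero out a
    random subset of them; returns (masked series, boolean patch mask)."""
    n_patches = len(series) // patch_len
    x = series[: n_patches * patch_len].reshape(n_patches, patch_len).copy()
    mask = rng.random(n_patches) < mask_frac
    x[mask] = 0.0
    return x.reshape(-1), mask

def reconstruction_loss(pred, target, mask, patch_len=8):
    """MSE computed only on the masked patches -- the pretraining signal."""
    if not mask.any():
        return 0.0
    n_patches = len(target) // patch_len
    p = pred[: n_patches * patch_len].reshape(n_patches, patch_len)
    t = target[: n_patches * patch_len].reshape(n_patches, patch_len)
    return float(((p[mask] - t[mask]) ** 2).mean())
```

A model that reconstructs the hidden patches from the visible ones drives this loss toward zero; the patch length and masking fraction here are illustrative placeholders, not values from any published model.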
Fantastic LVLMs and How to Ground Them
Keynote 7, 9:30 AM
Dr. Erhan Bas, Head of Foundational AI at GE Healthcare
Abstract: Large Vision Language Models (LVLMs) have emerged as powerful models capable of achieving impressive zero-shot generalization on a wide range of downstream tasks. Their full potential is yet to be realized in the healthcare domain, where they could optimize processes and transform workflows for technologists and clinicians. In this talk, I will discuss some of the recent advancements in LVLMs, their effectiveness, generalization capabilities, and shortcomings.
How to increase the adoption of (generative) AI in healthcare?
Panel 3, 11:00 AM
Panelists from GE Healthcare, Microsoft, and Roche
Day 2
Advancing Clinical Trial Development with Generative AI
Keynote 4, 9:30 AM
Prof. Jimeng Sun, Health Innovation Professor at the University of Illinois Urbana-Champaign
Abstract: Recent advancements in Generative AI have shown great potential in streamlining various aspects of clinical trial development. In this talk, we present three works that showcase how Generative AI can help clinical trial design and operation. First, we introduce TrialGPT, which utilizes Large Language Models (LLMs) to match patients to clinical trials. By analyzing patients' medical notes, TrialGPT accurately predicts their suitability for various trials, thereby accelerating the recruitment process and ensuring better patient-trial fit. Next, we discuss AutoTrial, a tool that simplifies the design of eligibility criteria for clinical trials. AutoTrial employs language models to generate clear and concise criteria, adapts to new information, and provides transparent explanations for its decisions, ultimately reducing the complexity of the trial design process. Finally, we present the Trial Foundation Model, a fine-tuned LLM called Panorama. Trained on millions of trial documents and publications, Panorama demonstrates superior performance in various trial design tasks. We showcase its potential to enhance clinical trial design.
Advancing Health at the Speed of AI
Keynote 5, 2:00 PM
Dr. Hoifung Poon, General Manager at Health Futures in Microsoft Research
Abstract: The dream of precision health is to develop a data-driven, continuous learning system where new health information is instantly incorporated to optimize care delivery and accelerate biomedical discovery. In reality, the health ecosystem is plagued by overwhelming unstructured data and unscalable manual processing. Self-supervised AI such as large language models (LLMs) can supercharge structuring of biomedical data and accelerate transformation towards precision health. In this talk, I'll present our research progress on generative AI for precision health, spanning biomedical LLMs, multimodal learning, and causal discovery. This enables us to extract knowledge from tens of millions of publications, structure multimodal real-world data for millions of cancer patients, and apply the extracted knowledge and real-world evidence to advancing precision oncology in deep partnerships with real-world stakeholders.
How to make an impact in the practice of healthcare?
Panel 2, 11:00 AM
Panelists from Microsoft, Gradient, Nixtla, and Mayo Clinic
Plenary Volunteer
Interested in presenting a brief 5-min update at the plenary? Let us know…
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Demo 3, 4:00 PM
Gauthier Guinet, Applied Scientist at Amazon
Abstract: We propose a new method to measure the task-specific accuracy of Retrieval-Augmented Large Language Models (RAG). Evaluation is performed by scoring the RAG on an automatically generated synthetic exam composed of multiple-choice questions based on the corpus of documents associated with the task. Our method is an automated, scalable, interpretable, and cost-efficient strategy to select the optimal components for a RAG system. We leverage Item Response Theory (IRT) to estimate the quality of an exam and its informativeness on task-specific accuracy. IRT also provides a natural way to iteratively improve the exam by eliminating the exam questions that are not sufficiently informative about a model's ability. We demonstrate our approach on five new open-ended question-answering tasks based on arXiv abstracts, StackExchange questions, AWS DevOps troubleshooting guides, medical data, and SEC filings. In addition, our experiments reveal more general insights into factors impacting RAG performance, such as model size, retrieval mechanism, prompting, and fine-tuning. Most notably, our findings show that choosing the right retrieval algorithm often leads to bigger performance gains than simply using a larger language model.
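The IRT machinery behind this pruning step can be sketched with the standard two-parameter logistic (2PL) model. The following is a minimal, hypothetical illustration of the general technique, not the authors' implementation: each exam question has a discrimination `a` and difficulty `b`, its Fisher information at a given ability level measures how much it tells us about a model at that level, and low-information questions are dropped:

```python
import math

def p_correct(theta, a, b):
    """2PL IRT: probability a model with ability theta answers an item
    with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of one item at ability theta:
    I(theta) = a^2 * P * (1 - P). Higher means more informative."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def prune_exam(items, theta, min_info=0.1):
    """Keep only questions that are sufficiently informative about a model
    of ability theta. `items` is a list of (a, b) pairs, one per question;
    the 0.1 threshold is an arbitrary placeholder."""
    return [(a, b) for (a, b) in items
            if item_information(theta, a, b) >= min_info]
```

For example, at ability theta = 0, a well-discriminating item near the model's level such as (a=1.5, b=0.0) survives pruning, while a weakly discriminating, off-target item such as (a=0.2, b=2.0) is eliminated.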
Using the Gradient AI Agent Stack to Transform Healthcare Operations
Demo 4, 4:30 PM
Dr. Leo Pekelis, Chief Scientist at Gradient AI
Abstract: Companies use Gradient to build powerful medical benefits knowledge bases that save customer service teams up to 70% of their time. In our demo, we dive into how to leverage the Gradient platform (including PDF extraction and RAG), our proprietary medical foundational model (Nightingale), and our optimized medical embeddings model.
Day 1
Data-Efficient Foundation Models: From Pretraining to Adaptation
Keynote 1, 9:10 AM
Prof. Frederic Sala, Assistant Professor, University of Wisconsin-Madison
Abstract: Powerful and typically massive 'foundation' models offer the promise of serving as a base for diverse applications, including in clinical domains. Unfortunately, it turns out that training and adapting these models for downstream tasks tends to be difficult and expensive, often requiring collecting and annotating substantial quantities of domain-specific data. In this talk, I will describe my group's work on addressing this challenge. First, we will discuss skill-based training, enabling language model pretraining and fine-tuning with substantially smaller corpora. Next, when adapting vision-language models like CLIP to deal with spurious correlations, we show how to self-guide the adaptation process, without any additional data. Then, we show how to integrate relational structures like knowledge graphs into model prediction pipelines, enabling models to adapt to new domains unseen during training, without additional annotated examples. Finally, in the most challenging scenarios, when the model must be fine-tuned on labeled data, we show how to obtain this data efficiently through techniques from weak supervision.
How might LLMs help us scale world-class healthcare to everyone?
Keynote 2, 9:30 AM
Research Scientist, Google
Abstract: In recent years, the field of AI has been revolutionized by the emergence of Transformers and Large Language Models. However, perhaps nowhere is their impact likely to be more profound than in biomedicine, where they have the potential to act as care multipliers, improve our understanding of biology, and alleviate the burden of disease. In this talk, I will introduce recent works from my team at Google AI, Med-PaLM, Med-PaLM 2, Med-PaLM M, and AMIE, which I believe are key milestones towards such a future. Med-PaLM and Med-PaLM 2 were the first AI systems to obtain passing and expert-level scores, respectively, on US Medical Licensing Exam questions, a long-standing grand challenge in AI. Med-PaLM M was the first demonstration of a generalist, multimodal, biomedical AI system. More recently, two studies have highlighted AMIE's promising capabilities. In a double-blind, randomized study, AMIE performed competitively against primary care physicians in text consultations. Additionally, a separate study demonstrated AMIE's significant assistive potential for clinicians facing complex diagnostic challenges. I will outline the motivation, principles, and technical innovations underpinning these systems. Finally, I will sketch out a vision for how we might be able to leverage such powerful systems to help scale world-class healthcare to everyone and make medicine a humane endeavor again.
Foundation Models for Clinical Decision Support in Critical Care
Keynote 3, 2:00 PM
Professor of Critical Care, Mathematics, and Chemical Engineering, University of Pittsburgh
Abstract: The modern intensive care unit is the prime example of a complex environment where decisions require rapid integration of large amounts of rapidly changing data originating from bedside monitors, other devices, electronic health records, images, and a growing number of other potential sources. Importantly, the data is dynamic, reflecting changing health conditions. It has proven challenging to develop and deploy decision support systems that demonstrate performance criteria suitable for the ICU environment. Foundation models provide a new opportunity to integrate multi-domain time series for diagnostic, prognostic, and prescriptive tasks. In particular, high-frequency time-series data is routinely obtained, is potentially less biased than other data sources, and has been distinctly underexploited as a source of information. Traditional modeling approaches are challenged by such data. Pre-trained foundation models will likely play an important role in democratizing decision support in both critical and non-critical environments.
Foundation Models for Clinical AI: Challenges and Opportunities
Panel 1, 11:00 AM
TimeGPT: A Foundation Model for Time-series by Nixtla
Demo 1, 4:00 PM
Max Mergenthaler Canseco, Azul Garza, Cristian Challu, Co-founders at Nixtla
Company Bio: Nixtla is to time-series what Anthropic or OpenAI are to language and images. We are the creators of TimeGPT. With our pre-trained model, an enterprise can upload its data and predict, saving millions of dollars and months of development and maintenance. TimeGPT was trained on over 100 billion rows of financial, weather, energy, and web data, and democratizes the power of time-series analysis. Before TimeGPT, Nixtla created the most comprehensive time-series ecosystem, with more than 5 million downloads. Nixtla's software is currently used in production by Fortune 500 companies like Amazon, Walmart, Meta, and Accenture.
Drug and adverse event extraction using GenAI
Demo 2, 4:30 PM
Dr. Hoang Tran, Research Scientist
Abstract: In the new world of off-the-shelf generative AI models, you can just grab a model pre-trained by OpenAI, Google, Hugging Face, etc., and start generating predictions. And these predictions can be large chunks of generated content! This leaves many data scientists wondering: where does my data actually add value in the development of production AI healthcare applications? In this demo, you'll see how unique data is critical to developing high-quality generative AI applications, and learn where data can be used and how it should be prepared, managed, and applied to deliver real-world value for your organization.