AI at Work: Redefining the Data Science Landscape
December 17, 2024 - by Rebecca Handler
On December 3, 2024, Stanford University hosted a symposium exploring how artificial intelligence (AI) can be integrated into data science workflows. Co-hosted by Stanford’s Quantitative Sciences Unit and Stanford Data Science, the event brought together leading experts in biostatistics, epidemiology, health policy, informatics, and public health to discuss the opportunities and challenges AI presents for data science and public health.
Opening Remarks and Keynote Address
The day began with remarks from Manisha Desai, PhD, Associate Dean for Quantitative and Data Sciences at Stanford and Director of the Quantitative Sciences Unit. Desai defined “data scientist” broadly, encompassing anyone who relies critically on data in clinical, policy, or research contexts. While she highlighted AI’s promise in streamlining workflows, such as using ChatGPT to draft statistical analysis plans, she cautioned about its implications for rigor and reproducibility. “AI can be a really powerful tool for data scientists, but there are some threats, particularly to rigor,” she said.
Ruth O’Hara, PhD, Senior Associate Dean for Research, drew comparisons to the genetic revolution, noting the importance of establishing shared standards and ethical practices for AI’s responsible use. “The meticulous effort required for genome sequencing must be replicated for AI to deliver on its promise,” she said.
In his keynote, Michael Pencina, PhD, Chief Data Scientist for Duke Health, emphasized AI’s duality as both thrilling and risky. He described the field as a “wild west” of development, where innovation often outpaces regulation. Pencina advocated for robust governance frameworks, including lifecycle management and algorithm registries, to ensure safety and efficacy. “The technology is moving fast, but our frameworks must keep up,” he stated.
Associate Dean for Quantitative and Data Sciences, Manisha Desai, gives opening remarks.
Ruth O’Hara, Senior Associate Dean for Research, also delivered opening remarks at the symposium.
Keynote speaker Michael Pencina, Chief Data Scientist for Duke Health.
Left to right: James Zou, PhD; Jade Benjamin-Chung, PhD, MPH; Sara Singer, PhD, MBA; Nigam Shah, MBBS, PhD; Manisha Desai, PhD
Panel Discussion: Challenges and Solutions for Integrating AI into the Data Scientist’s Workflow
A morning panel, moderated by Desai, tackled the complexities of incorporating AI into data science workflows. Panelists included Nigam Shah, MBBS, PhD, Jade Benjamin-Chung, PhD, MPH, Sara Singer, PhD, MBA, and James Zou, PhD.
Shah, Professor of Medicine and Chief Data Scientist for Stanford Health Care, highlighted the need to test entire workflows rather than isolated AI tools. “Many tools perform well in isolation but fail in real-world workflows,” he explained. Singer, Professor of Health Policy and Medicine, shared insights from her iPath project, which uses AI to analyze qualitative research in diabetes care. She warned that AI’s “hallucinations” – plausible but incorrect outputs – pose risks to research integrity.
Zou, Associate Professor of Biomedical Data Science, introduced his Virtual Lab, where generative AI collaborates on research idea generation. While promising, Zou cautioned against over-reliance, which could dilute research rigor. Benjamin-Chung, Assistant Professor of Epidemiology and Population Health, advocated for embedding AI into deterministic coding frameworks to enhance transparency and reproducibility. Together, the panel underscored the need for robust documentation and interdisciplinary collaboration to address AI’s limitations.
Left to right: Mark Musen, MD, PhD; Laurence Baker, PhD; Bryan Bunning; Eleni Linos, MD, MPH, DrPH; Steven Goodman, MD, MHS, PhD
Panel Discussion: Educating the Next Generation of Data Scientists in the Era of AI
The second panel explored how AI is reshaping education and training for data scientists. Moderated by Mark Musen, Chief of Biomedical Informatics Research, the discussion underscored the interdisciplinary nature of AI education. Panelists included Laurence Baker, PhD, Eleni Linos, MD, MPH, DrPH, Steven Goodman, MD, MHS, PhD, and Bryan Bunning, a Biomedical Informatics PhD student.
Baker, Professor of Health Policy, stressed the importance of integrating AI across the research process, from hypothesis generation to presentation. Linos, Director of Stanford’s Center for Digital Health, emphasized maintaining rigorous quantitative skills while prioritizing communication, ethics, and teamwork. “We can’t lose sight of foundational values like collaboration and critical thinking,” Linos said.
Goodman, Associate Dean for Clinical and Translational Research, called for rethinking what defines a responsible scientist. He advocated for oral evaluations to assess students’ understanding of AI tools and their limitations. PhD candidate Bryan Bunning noted that while tools like ChatGPT lower barriers to analysis, students must still understand the biases and assumptions underlying these tools. “AI can help us work faster, but we can’t afford to lose the fundamentals,” he remarked.
Panelists agreed that curricula must adapt to blend traditional rigor with emerging technologies, fostering teamwork and real-world applications. As Musen concluded, “AI challenges us to rethink education—not just in content but in the way we learn from each other.”
Dean Lloyd Minor’s Remarks
Dean Lloyd B. Minor, MD, kicked off the afternoon session focused on public health by offering an inspiring vision for AI’s transformative potential in healthcare.
“AI represents the biggest moment for healthcare and public health since the discovery of antibiotics,” he declared. Minor highlighted innovations like synthetic control arms for clinical trials, which reduce costs and improve representation of marginalized groups.
“Change occurs at the speed of trust,” he reminded the audience, emphasizing the need for transparency and ethical oversight.
Panel Discussion: AI for Public Health
Moderated by Melissa Bondy, PhD, Chair of Epidemiology and Population Health, a third panel focused on AI’s role in public health. Bondy emphasized the need for practical roadmaps to integrate AI while ensuring trust and equity. Panelists included John Auerbach, Ivor Horn, MD, MPH, and Michelle Williams, SM, ScD.
Dean Lloyd Minor at 2024 Symposium on AI for Data Science
Left to right: Melissa Bondy, PhD; John Auerbach, MBA; Ivor Horn, MD, MPH; Michelle Williams, SM, ScD
Auerbach, Senior Vice President for Health at ICF International, shared examples like Chicago’s AI-driven restaurant inspections, which prioritize high-risk establishments for limited resources. “AI enables earlier interventions, preventing crises before they escalate,” he noted. Auerbach also stressed the importance of trust among policymakers and communities to support AI adoption.
Williams, a Stanford visiting professor and former Dean of Faculty Affairs at the Harvard School of Public Health, noted that “these (AI) tools will not replace the human capacity, the knowledge, the sheer tenacity of understanding the community they are serving, and we have to also use these examples to illustrate we are augmenting their capacity, not replacing.”
Horn, former Chief Health Equity Officer at Google, underscored the necessity of embedding equity into AI development. “Data without context is incomplete at best and harmful at worst,” she warned. Horn called for engaging historically marginalized communities in shaping AI tools to address social determinants of health. Together, panelists envisioned AI as a proactive tool for reducing health disparities while ensuring no community is left behind.
A Call for Thoughtful Integration
The symposium closed with a recurring theme: AI’s promise hinges on thoughtful integration into science and society. Desai noted the importance of a “shared commitment to approaching these technologies with responsibility and ethics at the forefront.” While AI’s potential to revolutionize workflows and address inequities is immense, the day’s discussions reinforced the importance of balancing innovation with accountability, with key parties – including the communities being served – at the table when decisions are made about AI’s development and deployment. AI’s impact will depend on how rigorously and ethically it is implemented to ensure it truly serves the greater good.