Using a generative AI assistant to interpret pharmacogenetic test results
Genetic test results can be hard to understand and interpret for people without a background in genetics. Investigators at Baylor College of Medicine are studying whether an artificial intelligence (AI) assistant could help answer questions about these results for patients and physicians. They developed a generative AI assistant trained on a knowledge base comprising the latest Clinical Pharmacogenetics Implementation Consortium (CPIC) data for statins and tested its accuracy against OpenAI's ChatGPT 3.5. The findings are published in the Journal of the American Medical Informatics Association.
"We created a chatbot that can provide guidance on general pharmacogenomic testing, dosage implications and the side effects of therapeutics, and address patient concerns. We see this tool as a superpowered assistant that can increase accessibility and help both physicians and patients answer questions about genetic test results," said first author Mullai Murugan, director of software engineering and programming at the Human Genome Sequencing Center at Baylor.
The study focused on pharmacogenomic testing for statins, which indicates whether a person is genetically predisposed to have a better or worse response to different statin medications used to treat high cholesterol. To interpret these results, the Baylor researchers developed their own AI assistant. Despite the popularity of ChatGPT, the team knew the chatbot had a major limitation that would impact the accuracy of its responses.
"The training cutoff date for ChatGPT 3.5 is January 2022, so that system won't have access to any guidelines published after that date. It happens that the key publication on statin pharmacogenomics was published in May 2022," Murugan said.
The Baylor AI assistant uses Retrieval-Augmented Generation (RAG) and is trained on a knowledge base of CPIC data and publications, which includes the most recent guidelines.
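For readers unfamiliar with RAG, the general pattern is to retrieve the most relevant passages from a knowledge base and include them in the prompt sent to the language model, so answers are grounded in up-to-date source material rather than only the model's training data. The sketch below is a minimal illustration of that pattern, not the Baylor team's actual system; the guideline snippets, the word-overlap retrieval heuristic, and the prompt format are assumptions made purely for demonstration.

```python
# Minimal, illustrative RAG sketch. The passages, retrieval heuristic, and
# prompt wording are hypothetical stand-ins, not the study's implementation.

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(text.lower().split())

def retrieve(question: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Rank knowledge-base passages by word overlap with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(q_tokens & tokenize(passage)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Combine the retrieved passages and the user question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the guideline excerpts below.\n"
        f"Guideline excerpts:\n{context}\n"
        f"Question: {question}\n"
    )

# Hypothetical CPIC-style snippets standing in for the real knowledge base.
knowledge_base = [
    "SLCO1B1 decreased function is associated with higher simvastatin myopathy risk.",
    "Guidelines suggest considering an alternative statin or lower dose for affected genotypes.",
    "Statins lower LDL cholesterol by inhibiting HMG-CoA reductase.",
]

question = "What does a decreased-function SLCO1B1 result mean for my simvastatin dose?"
prompt = build_prompt(question, retrieve(question, knowledge_base))
print(prompt)  # This augmented prompt would then be passed to the generative model.
```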
The team compared their AI assistant to ChatGPT 3.5 by giving both chatbots a set of questions designed to reflect typical inquiries from patients and healthcare providers. A panel of four experts in pharmacogenomics and cardiology judged the responses from both chatbots based on accuracy, relevancy, risk management, language clarity and other factors.
The Baylor AI assistant scored higher in accuracy and relevancy, and the largest gaps between the two chatbots were seen in questions from healthcare providers. In that category, Baylor's chatbot scored 85% in accuracy and 81% in relevancy compared to ChatGPT's 58% in accuracy and 62% in relevancy. Both chatbots scored similarly in language clarity.
Despite initial promising results, the researchers stress that this technology is not ready for clinical use. The model still struggles to recognize some biomedical terms that don't use typical words and characters. In addition, while the model is trained on pharmacogenomic data, it lacks training in typical language used by genetic counselors to explain results. Lastly, researchers emphasize a need to address ethical, regulatory and safety concerns before the tool can be used in a clinical setting.
"We are working to fine-tune the chatbot to better respond to certain questions, and we want to get feedback from real patients," Murugan said. "Based on this study, it is very clear that there is a lot of potential here."
"This study underscores generative AI's potential for transforming healthcare provider support and patient accessibility to complex pharmacogenomic information," said senior author Dr. Richard Gibbs, director of the Human Genome Sequencing Center and Wofford Cain Chair and Professor of Molecular and Human Genetics at Baylor. "With further development, these tools could augment healthcare expertise, provider productivity and the promise of equitable precision medicine."
Other authors of this work are Bo Yuan, Eric Venner, Christie M. Ballantyne, Katherine M. Robinson, James C. Coons, Liwen Wang and Philip E. Empey. They are affiliated with one or more of the following institutions: Baylor College of Medicine, the University of Pittsburgh and UPMC Presbyterian-Shadyside Hospital.
This work was partially funded by the National Institutes of Health's All of Us Research Program.