Xia “Ben” Hu is a Rice University associate professor whose recent research has focused not only on computational foundations but also on helping computer scientists identify the best use cases and applications for the latest surge of large language models (LLMs). His group developed an LLM survey — like a consumer guide to the best food processors of 2023, except the food processors are giant neural networks that process, predict, and sometimes create words and text.
Hu said, “The recent progress in large language models has resulted in highly effective models — like OpenAI's ChatGPT — that have demonstrated exceptional performance in various tasks, including question answering, essay writing, and code generation. Determining the appropriate model that will best solve a specific problem can be a challenge; we’ve provided a short description of the top LLMs to have evolved from BERT through ChatGPT, then added the most appropriate uses for each.”
When their paper, “Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond,” was first announced, their animated GIF showing the history of LLMs raced across social media platforms.
Large Language Models in healthcare
The DATA Lab, Hu’s team of Rice Ph.D. researchers, also considers biomedical applications for LLMs. At the November American Medical Informatics Association (AMIA) conference in New Orleans, Rice Ph.D. student Ruixiang (Ryan) Tang will present “Does Synthetic Data Generation of LLMs Help Clinical Text Mining?” The joint paper showcases the results of their research on using large language models like ChatGPT in the healthcare sector.
Tang said, “While LLMs have shown promise in tasks such as question-answering and code generation, their direct application in healthcare raises concerns about limited generalizability and data privacy. To address these issues, my recent research demonstrates that large language models can be combined with domain-specific expertise to generate a substantial corpus of high-quality, labeled synthetic data. This allows for fine-tuning a local model, improving performance in downstream tasks like biomedical named entity recognition and relation extraction, while also mitigating data privacy concerns.”
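The approach Tang describes can be pictured as a two-step pipeline: prompt an LLM to write clinical-style sentences with the entity mentions already marked up, then convert that markup into token-level labels for training a small local model. The sketch below illustrates the idea only; the tag format, the stubbed generate_synthetic_sentences function (standing in for a real LLM API call), and the example sentences are hypothetical and not taken from the paper.

```python
# Illustrative sketch only -- not the DATA Lab's actual pipeline.
# Step 1: ask an LLM to write clinical-style sentences with entities marked.
# Step 2: convert the markup to token-level BIO labels for a local NER model.
import re

def generate_synthetic_sentences(n):
    """Stand-in for an LLM API call (e.g., prompting ChatGPT to wrap disease
    mentions in [DIS]...[/DIS] and drug mentions in [DRUG]...[/DRUG]).
    Stubbed with fixed examples so this sketch runs offline."""
    examples = [
        "The patient with [DIS]type 2 diabetes[/DIS] was started on [DRUG]metformin[/DRUG].",
        "[DRUG]Lisinopril[/DRUG] was prescribed to manage [DIS]hypertension[/DIS].",
    ]
    return examples[:n]

TAG_RE = re.compile(r"\[(DIS|DRUG)\](.+?)\[/\1\]")

def to_bio(sentence):
    """Convert a tagged sentence into (tokens, BIO labels) for NER training."""
    tokens, labels = [], []
    cursor = 0
    for match in TAG_RE.finditer(sentence):
        # Plain text before the entity gets the "outside" label O.
        for tok in sentence[cursor:match.start()].split():
            tokens.append(tok)
            labels.append("O")
        entity_type = match.group(1)
        entity_tokens = match.group(2).split()
        tokens.extend(entity_tokens)
        labels.extend([f"B-{entity_type}"] +
                      [f"I-{entity_type}"] * (len(entity_tokens) - 1))
        cursor = match.end()
    for tok in sentence[cursor:].split():
        tokens.append(tok)
        labels.append("O")
    return tokens, labels

if __name__ == "__main__":
    for sentence in generate_synthetic_sentences(2):
        tokens, labels = to_bio(sentence)
        print(list(zip(tokens, labels)))
    # The resulting (token, label) pairs can fine-tune a small local token
    # classifier, so no real patient text ever has to leave the institution.
```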
Hu has been impressed with Tang’s recent work in trustworthy AI. “I have great respect for Ruixiang's vision of constructing a trustworthy AI system. Realizing the full potential of AI necessitates establishing trust throughout its entire life cycle. Ruixiang's work has addressed many key issues concerning data privacy, model security, and transparency. I am confident that his line of work will attract considerable attention from both industry and academia,” said Hu.
Accessible LLMs for smaller resource budgets
Many computer scientists have watched the primary wave of AI advancements with concern because the progress has been driven by an abundance of high-quality data and ever-larger model sizes. For example, OpenAI's GPT-3, which has 175 billion parameters and was trained on 45 terabytes of data, requires more than 1,000 GPUs for training. This massive scale makes ML models inefficient and unsustainable, requiring more computational power than common hardware can provide. Researchers and small businesses, often operating with limited budgets, find themselves unable to procure the necessary hardware to keep pace with the advancements.
A recent NeurIPS 2023 paper by Ph.D. student Zirui Liu and team explores this issue. They propose using randomized algorithms that, while sacrificing some level of certainty, can notably decrease computational complexity. These algorithms enable users to fine-tune LLMs on their own data using affordable devices, e.g., a single GPU with limited memory.
Liu said, “Our paper was inspired by the observation that almost all neural networks are optimized using first-order stochastic optimization. Then why would we spend resources on obtaining exact gradients during fine-tuning? Inspired by this question, we replace the matrix multiplication operations in LLMs with noisy but cheap approximations. This adaptation allows for the fine-tuning of LLMs on downstream data using hardware that's accessible and affordable for most individuals and research labs.”
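Liu's full method is more involved, but the flavor of a "noisy but cheap" matrix multiplication can be captured with the classic column-row sampling estimator: draw a few column-row pairs with probability proportional to their norms, rescale so the estimate is unbiased, and sum the sampled outer products. The NumPy sketch below shows only that textbook scheme; the function name sampled_matmul, the matrix sizes, and the sample count are illustrative and not taken from the paper.

```python
# A minimal sketch of randomized (column-row sampled) matrix multiplication,
# the classic idea behind trading exact results for cheaper, noisy ones.
# Not the paper's exact estimator -- just the textbook sampling scheme.
import numpy as np

def sampled_matmul(A, B, k, seed=None):
    """Approximate A @ B using k sampled column-row pairs.

    Columns of A (and matching rows of B) are drawn with probability
    proportional to the product of their norms, and each sampled outer
    product is rescaled so the estimate is unbiased."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    weights = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    probs = weights / weights.sum()
    idx = rng.choice(n, size=k, replace=True, p=probs)
    # Sum of rescaled outer products A[:, i] B[i, :] / (k * p_i).
    scale = 1.0 / (k * probs[idx])
    return (A[:, idx] * scale) @ B[idx, :]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((256, 1024))
    B = rng.standard_normal((1024, 256))
    exact = A @ B
    approx = sampled_matmul(A, B, k=128, seed=1)
    err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
    print(f"relative error with 128 of 1024 column-row pairs: {err:.3f}")
```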
Hu was impressed by Zirui’s ability to cover a broad spectrum of topics. He said, “Zirui is excellent across hardware, algorithms, and machine learning. As the direction of large-scale machine learning increasingly requires interdisciplinary collaboration, his skill in mastering these distinct areas is invaluable. Zirui’s work has greatly bridged the gap between stalled hardware capacity and the ever-growing demands of machine learning. He has done an excellent job of uncovering new frontiers in machine learning. His versatility not only serves him well but also benefits our entire group.”
Expanding LLM usability in healthcare
Recently, a study led by Ph.D. student Jiayi Yuan, which was selected as a finalist for the best student paper at AMIA 2023, explored the potential of large language models to enhance the accuracy of clinical trial decisions while ensuring data privacy and trustworthiness. Building on this research, the team is determined to further innovate by developing Foundation Models tailored for precise and reliable clinical decision-making.
Yuan said the anticipated outcomes of this research direction are multifaceted:
- Accelerated Medical Research: Automating certain methods can drastically cut down the duration of clinical trials, paving the way for swifter scientific breakthroughs.
- Enhanced Patient Care: This acceleration means patients can access promising treatments sooner.
- Economic Benefits: Streamlining clinical decisions can lead to significant financial savings for healthcare institutions.
“Understanding the intricate balance between enhancing accuracy and ensuring data privacy was one of the most challenging aspects of our research. However, seeing the potential impact on patient care and the broader medical community was incredibly satisfying,” said Yuan.
Hu was impressed with Yuan’s project leadership. He said, “Jiayi kept the focus on our goal, ensuring that every step we took was in the best interest of the patients and the integrity of the research. It was remarkable to witness his dedication and clarity throughout the process.”
“Our research team is filled with such exceptional students that I suspect I would not be competitive if I were to apply today,” Hu said with a laugh. “When I compare my resume of 15 years ago — as I was applying to grad school — and the accomplishments of students like Jiayi, Zirui, and Ruixiang, I don’t think I’d be admitted to Rice with that kind of fierce competition!”
Hu added, “Our Ph.D. student Khushbu Pahwa is also working on explainable AI and LLMs, particularly with healthcare data. Her work has the potential to assist doctors in gaining a deeper understanding of patients' diseases.”
Microsoft funding underwrites LLM research for patients and medical professionals
Yuan is also part of the DATA Lab research project recently funded by Microsoft. The grant is part of a new pilot research program enabling researchers to gain a better understanding of general AI while driving model innovation, ensuring social benefit, transforming scientific discovery and extending human capabilities across different domains (e.g., education, health, law, society).
Hu noted that the support from Microsoft will enhance the team's ability to explore the practical applications of large language models, especially in the healthcare domain. Their goal is to maximize the benefits for both medical professionals and patients, particularly with regard to the clinical trial matching problem.
“In addition to funding for some of our researchers, Microsoft is providing Azure cloud computing resources. With access to Azure as well as OpenAI, we should have enough capacity to build and run our models. The funding is not linked to specific deliverables, but we still have the benefit of Microsoft’s expertise, and one of their researchers is our point of contact,” said Hu.