New NSF grant targets large language models and generative AI, exploring how they work and their long-term societal impacts
The U.S. National Science Foundation has announced a grant of $9 million to Northeastern University for research to investigate how large language models (LLMs) and generative AI operate, focusing on the computing process called deep inference and AI’s long-term societal impacts.
The research seeks to establish a National Deep Inference Fabric (NDIF), a collaborative research platform. The goal is to provide U.S. researchers with access to cutting-edge LLMs within a transparent experimental platform that reveals the systems’ internal computations — a capability currently unavailable in academia.
“Chatbots have transformed society’s relationship with AI, but how they operate is yet to be fully understood,” said Sethuraman Panchanathan, director of the U.S. National Science Foundation. “With NDIF, U.S. researchers will be able to peer inside the ‘black box’ of large language models, gaining new insights into how they operate and greater awareness of their potential impacts on society.”
LLMs, such as those behind chatbots like ChatGPT, have emerged as a transformative force in AI with profound capabilities including general-purpose knowledge and language skills. However, their sheer size — often more than 100 billion parameters — poses a significant challenge for researchers investigating how the systems operate, especially without adequate computing resources.
The NDIF project aims to address that gap by creating a platform for exploring such large-scale AI systems. NDIF will leverage the resources of the National Center for Supercomputing Applications’ DeltaAI project, a high-capacity, AI-focused cluster in the NSF computing portfolio that combines high-performance data processing with high-speed storage, making it well suited for designing and deploying new software resources to study LLM inference.
The broader impacts of the NDIF project extend beyond academia. The project has significant implications for a range of scientific disciplines, including computing, medicine, neuroscience, linguistics, social sciences and the humanities.
"In the always evolving realm of emerging technologies like AI, understanding the inner workings of LLMs is paramount,” said Dilma DaSilva, NSF's acting assistant director for Computer and Information Science and Engineering. “As we navigate the complexities of these transformative systems, we must ensure transparency and accountability, paving the way for inclusive and beneficial applications of AI-powered solutions. This investment in deep inference research not only empowers academic exploration, it fosters a culture of ethical and responsible AI development."
NDIF will actively collaborate with public interest technology groups to ensure that innovations in AI uphold principles of ethics, transparency and social responsibility in the deployment of LLMs.
"Understanding the mechanisms behind the strength of modern LLMs is one of the most pressing mysteries facing scientists today, raising fundamental questions about cognition as well as the many societal impacts of AI," said David Bau, principal investigator and assistant professor of computer science at Northeastern University. "NDIF defines a new, fully transparent way to access the largest models that is both powerful and cost-effective. It will democratize scientific access to the largest open AI models."
In addition, the project will help build a next-generation, AI-enabled workforce by providing comprehensive training to students and equipping them to serve in networks of experts. NDIF will support and train graduate students in AI, disseminate educational materials for university and professional use, and make tools for the science of large-scale AI widely available to users in other disciplines.
“NDIF will create a research framework that helps the research community explore the factors that ensure LLMs are safe and secure, which aligns with the goals of the White House Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence,” said Eleni Miltsakaki, the NSF program director for NDIF. “Launching NDIF will break new ground in the scientific study of very large language models, equipping researchers with deep inference tools that will enable them to study the internal mechanisms of these models. This investment in knowledge itself will unlock important research problems in every field impacted by large-scale AI.”