New NSF awards will advance theoretical foundations of data science research through interdisciplinary collaborations
Data science is an expanding field that requires the expertise of computer scientists, engineers, mathematicians, and statisticians to handle the complex analysis of ever-larger data sets. Data affect how industry, academia and government operate. The U.S. National Science Foundation is committed to leading the nation in foundational data science research.
As part of this effort, NSF announces two new Transdisciplinary Research in Principles of Data Science, or TRIPODS, Phase II awards totaling $20 million that bring together scientists and engineers from different research communities to further the theoretical foundations of data science through integrated research and training activities. TRIPODS is tied to NSF's Harnessing the Data Revolution Big Idea, which aims to accelerate discovery and innovation in data science algorithms, data infrastructure, and education and workforce development.
Phase II of the program will continue to support the development of collaborative institutes to delve deeper into foundational issues in data science, such as the design of algorithms for analyzing large, complex, noisy and changing data sets; data that include historical biases and elements influenced by self-interested and possibly malicious parties; and the need for fair, ethical and understandable results from complex data-driven decision-making processes.
TRIPODS awards aim to achieve these goals and other long-term major impacts in areas ranging from basic science to commerce and law by bringing together electrical engineering, mathematics, statistics, and theoretical computer science communities in synergistic and mutually beneficial ways to develop a unified foundation for data science.
NSF is supporting two new teams over five years focused on these topics, bringing the total number of TRIPODS Phase II Institutes to four. Previous TRIPODS Institutes were announced in 2020.
The Institute for Emerging CORE Methods in Data Science, or EnCORE, is led by the University of California San Diego in collaboration with the University of California, Los Angeles; University of Pennsylvania; and The University of Texas at Austin. EnCORE brings together scientists from multiple disciplines such as statistics, mathematics, electrical engineering, theoretical computer science, machine learning and health science, among others.
EnCORE's team will focus on the four CORE pillars of data science: C for complexities of data, O for optimization, R for responsible learning, and E for education and engagement. The institute is fostering a plan for outreach and broadening participation by engaging students of diverse backgrounds at all levels, from K-12 to postdocs and junior faculty. The project aims to reach a broad range of students by offering collaborative courses across its partner universities and a flexible co-mentorship plan for multidisciplinary research.
To bring theoretical development into practice, EnCORE will work with industry partners and domain scientists and will forge strong connections with other NSF Harnessing the Data Revolution Institutes across the nation.
Institute for Emerging CORE Methods in Data Science awards: UC San Diego, UCLA, University of Pennsylvania, and The University of Texas at Austin.
The Institute for Data, Econometrics, Algorithms, and Learning, or IDEAL, is a multi-institution and transdisciplinary institute led by the University of Illinois Chicago in collaboration with Northwestern University; Toyota Technological Institute at Chicago; the University of Chicago; and Illinois Institute of Technology, in partnership with members of the Learning Theory team at Google. The institute involves more than 50 researchers working on key aspects of the foundations of data science across computer science, electrical engineering, mathematics, statistics, and fields such as economics, operations research and law.
Research will center around the foundations of machine learning, high-dimensional data analysis and inference, and data science and society. Topics include foundations of deep learning, reinforcement learning, machine learning and logic, network inference, high-dimensional data analysis, trustworthiness and reliability, fairness, and data science with strategic agents.
The institute will broaden research and education participation from underrepresented groups by organizing activities that engage diverse communities of students at all levels, including high school and undergraduate students, as well as teachers (through a partnership with Math Circles of Chicago) and the public (via lectures in partnership with the Museum of Science and Industry).
Institute for Data, Econometrics, Algorithms, and Learning awards: UIC, Toyota Technological Institute, The University of Chicago, IIT, Northwestern University.
"The new 2022 TRIPODS awards address foundational challenges in data science at the core of data-driven discovery and decision making," said NSF Division Director for Computing and Communication Foundations (CCF) Dilma Da Silva. "CCF is pleased to be able to support these impactful projects."