Deep learning is an increasingly popular form of artificial intelligence that’s routinely used in products and services that impact hundreds of millions of lives, despite the fact that no one quite understands how it works.
The Office of Naval Research has awarded a five-year, $7.5 million grant to a group of engineers, computer scientists, mathematicians and statisticians who think they can unravel the mystery. Their task: develop a theory of deep learning based on rigorous mathematical principles.
The grant to researchers from Rice University, Johns Hopkins University, Texas A&M University, the University of Maryland, the University of Wisconsin, UCLA and Carnegie Mellon University was made through the Department of Defense’s Multidisciplinary University Research Initiative (MURI).
Richard Baraniuk, the Rice engineering professor who’s leading the effort, has spent nearly three decades studying signal processing in general and machine learning in particular, the branch of AI to which deep learning belongs. Rice computer science professor Moshe Vardi is co-principal investigator.
According to Baraniuk, there’s no question deep learning works, but there are big question marks over its future.
“Deep learning has radically advanced the field of AI, and it is surprisingly effective over a wide range of problems,” said Baraniuk, Rice’s Victor E. Cameron Professor of Electrical and Computer Engineering. “But virtually all of the progress has come from empirical observations, hacks and tricks. Nobody understands exactly why deep neural networks work or how.”
Deep neural networks are made of artificial neurons, pieces of computer code that can learn to perform specific tasks using training examples. “Deep” networks contain millions or even billions of neurons in many layers. Remarkably, deep neural networks don’t need to be explicitly programmed to make humanlike decisions. They learn by themselves, based on the information they are given during training.
Because people don’t understand exactly how deep networks learn, it is impossible to say why they make the decisions they make after they are fully trained. This has raised questions about when it is appropriate to use such systems, and it makes it impossible to predict how often a trained network will make an improper decision and under what circumstances.
Baraniuk said the lack of theoretical principles is holding deep learning back, particularly in application areas like the military, where reliability and predictability are crucial.
“As these systems are deployed – in robots, driverless cars or systems that decide who should go to jail and who should get a credit card or loan – there’s a huge imperative to understand how and why they work so that we can also know how and why they fail,” said Baraniuk, the principal investigator on the MURI grant.
His team includes co-principal investigators Moshe Vardi of Rice, Rama Chellappa of Johns Hopkins, Ronald DeVore of Texas A&M, Thomas Goldstein of the University of Maryland, Robert Nowak of the University of Wisconsin, Stanley Osher of UCLA and Ryan Tibshirani of Carnegie Mellon.
Baraniuk said they will attack the problem from three different perspectives.
“One is mathematical,” he said. “It turns out that deep networks are very easy to describe locally. If you look at what’s going on in a specific neuron, it’s actually easy to describe. But we don’t understand how those pieces – literally millions of them – fit together into a global whole. We call that local to global understanding.”
A second perspective is statistical. “What happens when the input signals, the knobs in the networks, have randomness?” Baraniuk asked. “We’d like to be able to predict how well a network will perform when we turn the knobs. That’s a statistical question and will offer another perspective.”
The third perspective is formal methods, or formal verification, a field that deals with the problem of verifying whether systems are functioning as intended, especially when they are so large or complex that it is impossible to check each line of code or individual component. This component of the MURI research will be led by Vardi, a leading expert in the field.
“Over the past 40 years, formal-methods researchers have developed techniques to reason about and analyze complex computing systems,” Vardi said. “Deep neural networks are essentially large, complex computing systems, so we are going to analyze them using formal-methods techniques.”
Baraniuk said the MURI investigators have each previously worked on pieces of the overall solution, and the grant will enable them to collaborate and draw upon one another’s work to go in new directions. Ultimately, the goal is to develop a set of rigorous principles that can take the guesswork out of designing, building, training and using deep neural networks.
“Today, it’s like people have a bunch of Legos, and you just put a bunch of them together and see what works,” he said. “If I ask, ‘Why are you putting a yellow Lego there?’ then the answer might be, ‘That was the next one in the pile,’ or, ‘I have a hunch that yellow will be best,’ or, ‘We tried other colors, and we don’t know why, but yellow works best.’”
Baraniuk contrasted this design approach with the approaches found in fields like signal processing or control, which are grounded in established theories.
“Instead of just putting the Legos together in semirandom ways and then testing them, there would be an established set of principles that guide people in putting together a system,” he said. “If someone says, ‘Hey, why are you using red bricks there?’ you’d say, ‘Because the ABC principle says that it makes sense,’ and you could explain, precisely, why that is the case.
“Those principles not only guide the design of the system but also allow you to predict its performance before you build it.”
Baraniuk said the COVID-19 pandemic hasn’t slowed the project, which is already underway.
“Our plans call for an annual workshop, but we’re a distributed team and the majority of our communication was to be done by remote teleconferencing,” he said.