Teaching robots to think on the fly

A research team from Princeton, the University of Texas, and Northeastern University is working under a $7.5 million grant from the U.S. Air Force Office of Scientific Research to pave the way for creating such a system. The basic research the team is doing could someday extend to aircraft controls and many other applications, including controlling disease epidemics or making more accurate predictions about climate change or species survival, said Amir Ali Ahmadi, a professor of operations research and financial engineering at Princeton and a member of the research team.

The goal is to exert measures of control over a “dynamical system,” one that changes as it moves. Most dynamical systems are notoriously difficult to predict and manage. Ahmadi, along with colleagues Charles Fefferman, the Herbert E. Jones, Jr. ’43 University Professor of Mathematics, and Clarence Rowley, the Sin-I Cheng Professor in Engineering Science, is trying to design algorithms that can learn the behavior of dynamical systems from data.

“A dynamical system is any entity in some space that evolves through time,” Ahmadi said. “So, an airplane is a dynamical system; a robot is a dynamical system; the spread of a virus is a dynamical system.”

Gaining control is particularly tough when data is limited, said Ahmadi. In the case of a damaged aircraft, “the plane has changed, and you have less than a minute to come up with a new model of control,” he said.

Predicting future performance based on extremely sparse data is a common problem. It is hard to recommend the best response to a disease outbreak, for example, when very little is known about the spread of illness.

In a recent article in the SIAM Review, Ahmadi’s research team presented an approach that uses additional information to rapidly respond to changing conditions in which little data is available for decision-making. This additional information, which mathematicians call side information, acts in the same way that experience or professional expertise does for a human. For example, a doctor might never have seen a particular disease before, but years of experience will help her make a good judgment on how to treat the patient.

“That is what this entire project is about,” Ahmadi said. “It is about learning a system from very little data and eventually controlling it in a way that we desire.”
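
To make the idea of side information concrete, here is a minimal sketch in Python. It is a toy illustration, not the method from the SIAM Review article. It assumes a hypothetical one-variable system of the form x[t+1] = a * x[t], three made-up observations, and one piece of side information: the quantity cannot grow on its own, so a must lie between 0 and 1. The function name and the data are invented for the example.

```python
def fit_scalar_dynamics(xs, lower=None, upper=None):
    """Least-squares fit of a in x[t+1] = a * x[t].

    lower/upper encode optional side information as bounds on a.
    For a single unknown, the constrained least-squares answer is
    the unconstrained answer clipped to the allowed interval.
    """
    num = sum(x0 * x1 for x0, x1 in zip(xs, xs[1:]))
    den = sum(x0 * x0 for x0 in xs[:-1])
    if den == 0:
        raise ValueError("need at least one nonzero state to fit")
    a = num / den
    if lower is not None:
        a = max(a, lower)
    if upper is not None:
        a = min(a, upper)
    return a


# Three noisy, made-up observations of a system that should not grow.
observations = [1.0, 1.2, 1.1]

# Data alone: the noise makes the system look like it is growing.
a_plain = fit_scalar_dynamics(observations)

# Data plus side information (assumed here): 0 <= a <= 1.
a_side = fit_scalar_dynamics(observations, lower=0.0, upper=1.0)

print("fit from data only:        ", a_plain)
print("fit with side information: ", a_side)
```

With so few points, the noise alone pushes the plain fit above 1, which would predict runaway growth. The bound plays the role the article assigns to expertise: it rules out answers that the data cannot yet exclude.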

Starting simple

Long-term goals, such as aircraft controls, are beyond the scope of the immediate project. Rather, the work under the Air Force grant is focusing on much simpler examples in the hopes of learning more about controlling a system riddled with unknowns.

“In standard control theory, you understand what the controls do. We’re trying to make a more powerful version of that theory in which you don’t know what the controls do, but you learn by applying them,” Fefferman said. He is working with Rowley on relatively simple sub-problems of dynamical systems — for instance, trying to temporarily halt an object as it moves along a straight line at a constant speed. In addition, the researchers want to use as little energy as possible to exert control — just as a pilot would want to do in a plane with limited fuel.

Another problem they may tackle is an advanced version of a problem commonly assigned to undergraduate mechanical engineering majors: controlling an inverted pendulum, which is similar to trying to balance a broomstick in the palm of your hand. The controller would learn the behaviors of the system almost instantly and without knowing where its mass is centered. To do that, the researchers would create equations for the controls based on a few seconds of observation, then modify the controls after recording what they do. The model would be designed to rapidly go through several learn-and-control iterations.
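
The learn-and-control cycle described above can be sketched in a few lines of Python. This is a simplified stand-in, not the team's algorithm. It assumes a hypothetical one-variable unstable system x_next = a * x + b * u (a crude proxy for a pendulum tipping away from upright), with a and b hidden from the controller. The controller applies two small probing inputs, fits a and b by least squares, and then picks the input that its current model says will bring the state to zero. All names and numbers are invented for the example.

```python
# Hypothetical "true" system, hidden from the controller.
TRUE_A = 1.2   # > 1, so the state grows if left alone
TRUE_B = 0.5   # how strongly the control input acts


def step(x, u):
    """Advance the hidden system by one time step."""
    return TRUE_A * x + TRUE_B * u


def estimate(data):
    """Least-squares fit of (a, b) from (x, u, x_next) triples.

    Returns None when the data cannot separate a from b yet.
    """
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        return None
    a_hat = (sxy * suu - sxu * suy) / det
    b_hat = (sxx * suy - sxu * sxy) / det
    return a_hat, b_hat


def learn_and_control(x0, probes, n_steps):
    """Alternate between observing, refitting the model, and controlling."""
    data, history = [], [x0]
    x, model = x0, None
    for t in range(n_steps):
        usable = model is not None and abs(model[1]) > 1e-9
        if t < len(probes) or not usable:
            u = probes[t % len(probes)]   # explore: small probing input
        else:
            a_hat, b_hat = model
            u = -a_hat * x / b_hat        # exploit: drive the state to zero
        x_next = step(x, u)
        data.append((x, u, x_next))       # record what the control did
        new_model = estimate(data)
        if new_model is not None:
            model = new_model             # modify the model after observing
        x = x_next
        history.append(x)
    return model, history


model, history = learn_and_control(x0=1.0, probes=[0.1, -0.1], n_steps=6)
print("estimated (a, b):", model)
print("state over time: ", history)
```

In this noise-free toy, the state drifts away from zero during the two probing steps and is pulled back as soon as the fitted model is used. A real pendulum has several state variables, measurement noise, and nonlinear behavior, all of which this sketch leaves out.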

Knowledge versus control

The problems the team explores involve tradeoffs between exploring functionality and exploiting the knowledge gained, Rowley said. “If you exploit your knowledge too soon, the model may not be good enough to land a plane. But if you spend too much time learning its behavior, the plane may crash.”

There is no single technique for controlling a system with unknown dynamics, said Ufuk Topcu, a team member and associate professor at the University of Texas. But one of the keys is to select the most valuable data to work on. “You have to tackle it from multiple angles and chop the big problem into more manageable pieces to identify what’s worth learning,” he said.

When the grant ends, the researchers expect to have algorithms for controlling at least some aspects of a dynamical system. Though their model may not be fast enough to operate in real time, it should be able to show which controls are possible in a changing system and with what degree of certainty they can succeed, Ahmadi said.

The article “Learning Dynamical Systems with Side Information” was published in the February issue of the Society for Industrial and Applied Mathematics Review. Besides Ahmadi, authors include Bachir El Khadir, a former graduate student at Princeton who is now at the investment firm Two Sigma. Support for the project was provided in part by the Air Force Office of Scientific Research, the National Science Foundation, DARPA, the Sloan Research Fellowship, the Google Faculty Award, and an Innovation Award of the Princeton School of Engineering and Applied Science.