Page 27 – MIT EECS

Precision home robots learn with real-to-sim-to-real

Posted on August 7, 2024 by Jane Halpern - News

At the top of many automation wish lists is a particularly time-consuming task: chores.

The moonshot of many roboticists is cooking up the proper hardware and software combination so that a machine can learn “generalist” policies (the rules and strategies that guide robot behavior) that work everywhere, under all conditions. Realistically, though, if you have a home robot, you probably don’t care much about it working for your neighbors. MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) researchers decided, with that in mind, to attempt to find a solution to easily train robust robot policies for very specific environments.

“We aim for robots to perform exceptionally well under disturbances, distractions, varying lighting conditions, and changes in object poses, all within a single environment,” says Marcel Torne Villasevil, MIT CSAIL research assistant in the Improbable AI lab and lead author on a recent paper about the work. “We propose a method to create digital twins on the fly using the latest advances in computer vision. With just their phones, anyone can capture a digital replica of the real world, and the robots can train in a simulated environment much faster than the real world, thanks to GPU parallelization. Our approach eliminates the need for extensive reward engineering by leveraging a few real-world demonstrations to jump-start the training process.”

Taking your robot home

The team’s system, RialTo, is of course a little more complicated than a simple wave of a phone and (boom!) home bot at your service. It begins with a scan of the target environment, captured on your device with tools like NeRFStudio, ARCode, or Polycam. Once the scene is reconstructed, users can upload it to RialTo’s interface to make detailed adjustments, add necessary joints to the robots, and more.

The refined scene is exported and brought into the simulator. Here, the aim is to develop a policy based on real-world actions and observations, such as one for grabbing a cup on a counter. These real-world demonstrations are replicated in the simulation, providing some valuable data for reinforcement learning. “This helps in creating a strong policy that works well in both the simulation and the real world. An enhanced algorithm using reinforcement learning helps guide this process, to ensure the policy is effective when applied outside of the simulator,” says Torne.

Testing showed that RialTo created strong policies for a variety of tasks, whether in controlled lab settings or more unpredictable real-world environments, improving 67 percent over imitation learning with the same number of demonstrations. The tasks involved opening a toaster, placing a book on a shelf, putting a plate on a rack, placing a mug on a shelf, opening a drawer, and opening a cabinet. For each task, the researchers tested the system’s performance under three increasing levels of difficulty: randomizing object poses, adding visual distractors, and applying physical disturbances during task executions. When paired with real-world data, the system outperformed traditional imitation-learning methods, especially in situations with lots of visual distractions or physical disruptions.

“These experiments show that if we care about being very robust to one particular environment, the best idea is to leverage digital twins instead of trying to obtain robustness with large-scale data collection in diverse environments,” says Pulkit Agrawal, director of Improbable AI Lab, MIT electrical engineering and computer science (EECS) associate professor, MIT CSAIL principal investigator, and senior author on the work.

As for limitations, RialTo currently takes three days to train fully. To speed this up, the team proposes improving the underlying algorithms and using foundation models. Training in simulation also has its limits: sim-to-real transfer is not yet effortless, and deformable objects and liquids remain difficult to simulate.

The next level

So what’s next for RialTo’s journey? Building on previous efforts, the scientists are working on preserving robustness against various disturbances while improving the model’s adaptability to new environments. “Our next endeavor is this approach to using pre-trained models, accelerating the learning process, minimizing human input, and achieving broader generalization capabilities,” says Torne.

“We’re incredibly enthusiastic about our ‘on-the-fly’ robot programming concept, where robots can autonomously scan their environment and learn how to solve specific tasks in simulation. While our current method has limitations — such as requiring a few initial demonstrations by a human and significant compute time for training these policies (up to three days) — we see it as a significant step towards achieving ‘on-the-fly’ robot learning and deployment,” says Torne. “This approach moves us closer to a future where robots won’t need a preexisting policy that covers every scenario. Instead, they can rapidly learn new tasks without extensive real-world interaction. In my view, this advancement could expedite the practical application of robotics far sooner than relying solely on a universal, all-encompassing policy.”

“To deploy robots in the real world, researchers have traditionally relied on methods such as imitation learning from expert data, which can be expensive, or reinforcement learning, which can be unsafe,” says Zoey Chen, a computer science PhD student at the University of Washington who wasn’t involved in the paper. “RialTo directly addresses both the safety constraints of real-world RL [reinforcement learning], and efficient data constraints for data-driven learning methods, with its novel real-to-sim-to-real pipeline. This novel pipeline not only ensures safe and robust training in simulation before real-world deployment, but also significantly improves the efficiency of data collection. RialTo has the potential to significantly scale up robot learning and allows robots to adapt to complex real-world scenarios much more effectively.”

“Simulation has shown impressive capabilities on real robots by providing inexpensive, possibly infinite data for policy learning,” adds Marius Memmel, a computer science PhD student at the University of Washington who wasn’t involved in the work. “However, these methods are limited to a few specific scenarios, and constructing the corresponding simulations is expensive and laborious. RialTo provides an easy-to-use tool to reconstruct real-world environments in minutes instead of hours. Furthermore, it makes extensive use of collected demonstrations during policy learning, minimizing the burden on the operator and reducing the sim2real gap. RialTo demonstrates robustness to object poses and disturbances, showing incredible real-world performance without requiring extensive simulator construction and data collection.”

Torne wrote this paper alongside senior authors Abhishek Gupta, assistant professor at the University of Washington, and Agrawal. Four other CSAIL members are also credited: EECS PhD student Anthony Simeonov SM ’22, research assistant Zechu Li, undergraduate student April Chan, and Tao Chen PhD ’24. Improbable AI Lab and WEIRD Lab members also contributed valuable feedback and support in developing this project.

This work was supported, in part, by the Sony Research Award, the U.S. government, and Hyundai Motor Co., with assistance from the WEIRD (Washington Embodied Intelligence and Robotics Development) Lab. The researchers presented their work at the Robotics: Science and Systems (RSS) conference earlier this month.

Study: When allocating scarce resources with AI, randomization can improve fairness

Posted on July 29, 2024 by Jane Halpern - News

Organizations are increasingly utilizing machine-learning models to allocate scarce resources or opportunities. For instance, such models can help companies screen resumes to choose job interview candidates or aid hospitals in ranking kidney transplant patients based on their likelihood of survival.

When deploying a model, users typically strive to ensure its predictions are fair by reducing bias. This often involves techniques like adjusting the features a model uses to make decisions or calibrating the scores it generates.

However, researchers from MIT and Northeastern University argue that these fairness methods are not sufficient to address structural injustices and inherent uncertainties. In a new paper, they show how randomizing a model’s decisions in a structured way can improve fairness in certain situations.

For example, if multiple companies use the same machine-learning model to rank job interview candidates deterministically — without any randomization — then one deserving individual could be the bottom-ranked candidate for every job, perhaps due to how the model weighs answers provided in an online form. Introducing randomization into a model’s decisions could prevent one worthy person or group from always being denied a scarce resource, like a job interview.
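
The arithmetic behind this example can be sketched in a few lines. The snippet below is a toy illustration, not the researchers' analysis: the function names, the number of employers, and the per-employer selection probability are all assumptions chosen for illustration. It compares a shared deterministic ranking, under which the bottom-ranked candidate is rejected everywhere, with independent lotteries in which the same candidate keeps a small chance at each employer.

```python
def p_rejected_everywhere_deterministic(is_bottom_ranked: bool) -> float:
    """One shared deterministic ranking gives the same outcome at every employer."""
    return 1.0 if is_bottom_ranked else 0.0


def p_rejected_everywhere_lottery(p_select: float, n_employers: int) -> float:
    """With independent lotteries, the rejection probabilities multiply: (1 - p)^n."""
    if not 0.0 <= p_select <= 1.0:
        raise ValueError("p_select must be a probability in [0, 1]")
    if n_employers < 0:
        raise ValueError("n_employers must be non-negative")
    return (1.0 - p_select) ** n_employers


if __name__ == "__main__":
    # Hypothetical numbers: 10 employers, a 5% chance of an interview at each one.
    print(p_rejected_everywhere_deterministic(True))
    print(round(p_rejected_everywhere_lottery(0.05, 10), 3))
```

Under these made-up numbers, the candidate who is shut out with certainty by the deterministic ranking is shut out of all ten employers only about 60 percent of the time once each employer runs its own lottery.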

Through their analysis, the researchers found that randomization can be especially beneficial when a model’s decisions involve uncertainty or when the same group consistently receives negative decisions.

They present a framework one could use to introduce a specific amount of randomization into a model’s decisions by allocating resources through a weighted lottery. This method, which an individual can tailor to fit their situation, can improve fairness without hurting the efficiency or accuracy of a model.

“Even if you could make fair predictions, should you be deciding these social allocations of scarce resources or opportunities strictly off scores or rankings? As things scale, and we see more and more opportunities being decided by these algorithms, the inherent uncertainties in these scores can be amplified. We show that fairness may require some sort of randomization,” says Shomik Jain, a graduate student in the Institute for Data, Systems, and Society (IDSS) and lead author of the paper.

Jain is joined on the paper by Kathleen Creel, assistant professor of philosophy and computer science at Northeastern University; and senior author Ashia Wilson, the Lister Brothers Career Development Professor in the Department of Electrical Engineering and Computer Science and a principal investigator in the Laboratory for Information and Decision Systems (LIDS). The research will be presented at the International Conference on Machine Learning.

Considering claims

This work builds off a previous paper in which the researchers explored harms that can occur when one uses deterministic systems at scale. They found that using a machine-learning model to deterministically allocate resources can amplify inequalities that exist in training data, which can reinforce bias and systemic inequality.

“Randomization is a very useful concept in statistics, and to our delight, satisfies the fairness demands coming from both a systemic and individual point of view,” Wilson says.

In this paper, they explored the question of when randomization can improve fairness. They framed their analysis around the ideas of philosopher John Broome, who wrote about the value of using lotteries to award scarce resources in a way that honors all claims of individuals.

A person’s claim to a scarce resource, like a kidney transplant, can stem from merit, deservingness, or need. For instance, everyone has a right to life, and their claims on a kidney transplant may stem from that right, Wilson explains.

“When you acknowledge that people have different claims to these scarce resources, fairness is going to require that we respect all claims of individuals. If we always give someone with a stronger claim the resource, is that fair?” Jain says.

That sort of deterministic allocation could cause systemic exclusion or exacerbate patterned inequality, which occurs when receiving one allocation increases an individual’s likelihood of receiving future allocations. In addition, machine-learning models can make mistakes, and a deterministic approach could cause the same mistake to be repeated.

Randomization can overcome these problems, but that doesn’t mean all decisions a model makes should be randomized equally.

Structured randomization

The researchers use a weighted lottery to adjust the level of randomization based on the amount of uncertainty involved in the model’s decision-making. A decision that is less certain should incorporate more randomization.
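
A weighted lottery of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework from the paper: the function name `weighted_lottery`, the `temperature` parameter (a stand-in for how uncertain the scores are), and the softmax-style weighting are all hypothetical choices. A temperature near zero approaches a deterministic top-k allocation, while a larger temperature spreads the chances more evenly.

```python
import math
import random


def weighted_lottery(scores, k, temperature, rng=None):
    """Pick k distinct recipients; higher scores get better odds.

    scores: dict mapping candidate name -> model score (hypothetical inputs).
    temperature: must be > 0; larger values mean more randomization.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of candidates")
    rng = rng or random.Random()
    remaining = dict(scores)
    winners = []
    for _ in range(k):
        names = list(remaining)
        # Subtract the current best score so the largest weight is exactly 1.0,
        # which keeps the weights from all underflowing to zero.
        top = max(remaining.values())
        weights = [math.exp((remaining[n] - top) / temperature) for n in names]
        pick = rng.choices(names, weights=weights, k=1)[0]
        winners.append(pick)
        del remaining[pick]  # sample without replacement
    return winners


if __name__ == "__main__":
    toy_scores = {"a": 0.9, "b": 0.5, "c": 0.1}  # made-up scores
    print(weighted_lottery(toy_scores, k=2, temperature=1e-6))  # nearly deterministic
    print(weighted_lottery(toy_scores, k=2, temperature=1.0))   # noticeably random
```

In a setting like the one described, the temperature would be tied to an estimate of how uncertain the model's scores are; how to make that link is exactly the design question the weighted-lottery framing leaves to the people running the allocation.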

“In kidney allocation, usually the planning is around projected lifespan, and that is deeply uncertain. If two patients are only five years apart, it becomes a lot harder to measure. We want to leverage that level of uncertainty to tailor the randomization,” Wilson says.

The researchers used statistical uncertainty quantification methods to determine how much randomization is needed in different situations. They show that calibrated randomization can lead to fairer outcomes for individuals without significantly affecting the utility, or effectiveness, of the model.

“There is a balance to be had between overall utility and respecting the rights of the individuals who are receiving a scarce resource, but oftentimes the tradeoff is relatively small,” says Wilson.

However, the researchers emphasize there are situations where randomizing decisions would not improve fairness and could harm individuals, such as in criminal justice contexts.

But there could be other areas where randomization can improve fairness, such as college admissions, and the researchers plan to study other use cases in future work. They also want to explore how randomization can affect other factors, such as competition or prices, and how it could be used to improve the robustness of machine-learning models.

“We are hoping our paper is a first move toward illustrating that there might be a benefit to randomization. We are offering randomization as a tool. How much you are going to want to do it is going to be up to all the stakeholders in the allocation to decide. And, of course, how they decide is another research question altogether,” says Wilson.

Study across multiple brain regions discerns Alzheimer’s vulnerability and resilience factors

Posted on July 29, 2024 by Jane Halpern - News

An open-access MIT study published today in Nature provides new evidence for how specific cells and circuits become vulnerable in Alzheimer’s disease, and homes in on other factors that may help some people show resilience to cognitive decline, even amid clear signs of disease pathology.

To highlight potential targets for interventions to sustain cognition and memory, the authors engaged in a novel comparison of gene expression across multiple brain regions in people with or without Alzheimer’s disease, and conducted lab experiments to test and validate their major findings.

All brain cells carry the same DNA; what makes them differ, in both identity and activity, is the pattern in which they express their genes. The new analysis measured gene expression differences in more than 1.3 million cells of more than 70 cell types in six brain regions from 48 tissue donors, 26 of whom died with an Alzheimer’s diagnosis and 22 of whom did not. As such, the study provides a uniquely large, far-ranging, and yet detailed accounting of how brain cell activity differs amid Alzheimer’s disease by cell type, by brain region, by disease pathology, and by each person’s cognitive assessment while still alive.

“Specific brain regions are vulnerable in Alzheimer’s and there is an important need to understand how these regions or particular cell types are vulnerable,” says co-senior author Li-Huei Tsai, Picower Professor of Neuroscience and director of The Picower Institute for Learning and Memory and the Aging Brain Initiative at MIT. “And the brain is not just neurons. It’s many other cell types. How these cell types may respond differently, depending on where they are, is something fascinating we are only at the beginning of looking at.”

Co-senior author Manolis Kellis, professor of computer science and head of MIT’s Computational Biology Group, likens the technique used to compare gene expression, single-cell RNA profiling, to a much more advanced “microscope” than the ones that first allowed Alois Alzheimer to characterize the disease’s pathology more than a century ago.

“Where Alzheimer saw amyloid protein plaques and phosphorylated tau tangles in his microscope, our single-cell ‘microscope’ tells us, cell by cell and gene by gene, about thousands of subtle yet important biological changes in response to pathology,” says Kellis. “Connecting this information with the cognitive state of patients reveals how cellular responses relate with cognitive loss or resilience, and can help propose new ways to treat cognitive loss. Pathology can precede cognitive symptoms by a decade or two before cognitive decline becomes diagnosed. If there’s not much we can do about the pathology at that stage, we can at least try to safeguard the cellular pathways that maintain cognitive function.”

Hansruedi Mathys, a former MIT postdoc in the Tsai Lab who is now an assistant professor at the University of Pittsburgh; Carles Boix PhD ’22, a former graduate student in Kellis’s lab who is now a postdoc at Harvard Medical School; and Leyla Akay, a graduate student in Tsai’s lab, led the study analyzing the prefrontal cortex, entorhinal cortex, hippocampus, anterior thalamus, angular gyrus, and the midtemporal cortex. The brain samples came from the Religious Orders Study and the Rush Memory and Aging Project at Rush University.

Neural vulnerability and Reelin

Some of the earliest signs of amyloid pathology and neuron loss in Alzheimer’s occur in memory-focused regions called the hippocampus and the entorhinal cortex. In those regions, and in other parts of the cerebral cortex, the researchers were able to pinpoint a potential reason why. One type of excitatory neuron in the hippocampus and four in the entorhinal cortex were significantly less abundant in people with Alzheimer’s than in people without. Individuals with depletion of those cells performed significantly worse on cognitive assessments. Moreover, many vulnerable neurons were interconnected in a common neuronal circuit. And just as importantly, several either directly expressed a protein called Reelin, or were directly affected by Reelin signaling. In all, therefore, the findings distinctly highlight especially vulnerable neurons, whose loss is associated with reduced cognition, that share a neuronal circuit and a molecular pathway.

Tsai notes that Reelin has become prominent in Alzheimer’s research because of a recent study of a man in Colombia. He had a rare mutation in the Reelin gene that caused the protein to be more active, and was able to stay cognitively healthy at an advanced age despite having a strong family predisposition to early-onset Alzheimer’s. The new study shows that loss of Reelin-producing neurons is associated with cognitive decline. Taken together, these findings might mean that the brain benefits from Reelin, but that neurons that produce it may be lost in at least some Alzheimer’s patients.

“We can think of Reelin as having maybe some kind of protective or beneficial effect,” Akay says. “But we don’t yet know what it does or how it could confer resilience.”

In further analysis, the researchers also found that specifically vulnerable inhibitory neuron subtypes, identified in the prefrontal cortex in a previous study from this group, were also involved in Reelin signaling, further reinforcing the significance of the molecule and its signaling pathway.

To further check their results, the team directly examined the human brain tissue samples and the brains of two kinds of Alzheimer’s model mice. Sure enough, those experiments also showed a reduction in Reelin-positive neurons in the human and mouse entorhinal cortex.

Resilience associated with choline metabolism in astrocytes

To find factors that might preserve cognition, even amid pathology, the team examined which genes, in which cells, and in which regions, were most closely associated with cognitive resilience, which they defined as residual cognitive function, above the typical cognitive loss expected given the observed pathology.
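
One common way to operationalize a "residual" of this kind is to regress cognition on pathology and keep what is left over. The sketch below assumes a simple one-variable least-squares fit; the study's actual statistical model is not described here, and the function name and the toy numbers are hypothetical. A positive residual means more cognitive function than the fitted line predicts for that level of pathology.

```python
def resilience_residuals(pathology, cognition):
    """Return cognition minus the value a least-squares line predicts from pathology."""
    n = len(pathology)
    if n != len(cognition) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(pathology) / n
    mean_y = sum(cognition) / n
    sxx = sum((x - mean_x) ** 2 for x in pathology)
    if sxx == 0:
        raise ValueError("pathology values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(pathology, cognition))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(pathology, cognition)]


if __name__ == "__main__":
    # Made-up scores for illustration only; not data from the study.
    pathology = [1.0, 2.0, 3.0, 4.0]
    cognition = [9.0, 8.5, 5.0, 5.5]
    for p, r in zip(pathology, resilience_residuals(pathology, cognition)):
        print(p, round(r, 2))
```

In this toy data, the second and fourth individuals sit above the fitted line and would count as comparatively resilient, while the third sits below it.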

Their analysis yielded a surprising and specific answer: across several brain regions, astrocytes that expressed genes associated with antioxidant activity and with choline metabolism and polyamine biosynthesis were significantly associated with sustained cognition, even amid high levels of tau and amyloid. The results reinforced previous research findings led by Tsai and Susan Lindquist in which they showed that dietary supplementation with choline helped astrocytes cope with the dysregulation of lipids caused by the most significant Alzheimer’s risk gene, the APOE4 variant. The antioxidant findings also pointed to a molecule that can be found as a dietary supplement, spermidine, which may have anti-inflammatory properties, although such an association would need further work to be established causally.

As before, the team went beyond the predictions from the single-cell RNA expression analysis to make direct observations in the brain tissue samples. Those that came from cognitively resilient individuals indeed showed increased expression of several of the astrocyte-expressed genes predicted to be associated with cognitive resilience.

Expression of the gene GPCPD1 in astrocyte cells is associated with cognitive resilience in people with Alzheimer’s pathology. Here, white arrows indicate instances of GPCPD1 expression (blue) in astrocyte cells (denoted by AQP4 staining in magenta). There is much more expression in tissue from the cognitively resilient person (right). Image: Tsai Lab/The Picower Institute

New analysis method, open dataset

To analyze the mountains of single-cell data, the researchers developed a new robust methodology based on groups of coordinately-expressed genes (known as “gene modules”), thus exploiting the expression correlation patterns between functionally-related genes in the same module.

“In principle, the 1.3 million cells we surveyed could use their 20,000 genes in an astronomical number of different combinations,” explains Kellis. “In practice, however, we observe a much smaller subset of coordinated changes. Recognizing these coordinated patterns allows us to infer much more robust changes, because they are based on multiple genes in the same functionally-connected module.”

He offered this analogy: With many joints in their bodies, people could move in all kinds of crazy ways, but in practice they engage in many fewer coordinated movements like walking, running, or dancing. The new method enables scientists to identify such coordinated gene expression programs as a group.
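
The idea of reading a module as a group, rather than gene by gene, can be illustrated with a simple module score. This is a generic sketch, not the methodology the researchers developed: it assumes the module's gene list is already known, and the function names, gene labels, and expression values are made up. Each gene is standardized across cells, and the standardized values are averaged within the module, so the score reflects a change shared by several genes rather than noise in any single one.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize one gene's expression across cells."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant gene carries no signal
    return [(v - mu) / sigma for v in values]


def module_score(expression, module_genes):
    """Average the z-scored expression of the module's genes, cell by cell.

    expression: dict mapping gene name -> list of per-cell values (toy data).
    module_genes: names of the genes assumed to belong to one module.
    """
    if not module_genes:
        raise ValueError("module_genes must not be empty")
    standardized = [zscores(expression[g]) for g in module_genes]
    return [mean(cell_values) for cell_values in zip(*standardized)]


if __name__ == "__main__":
    # Three cells, three hypothetical genes; geneA and geneB rise together.
    toy_expression = {
        "geneA": [1.0, 2.0, 3.0],
        "geneB": [2.0, 4.0, 6.0],
        "geneC": [5.0, 5.0, 5.0],
    }
    print(module_score(toy_expression, ["geneA", "geneB"]))
```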

While Kellis and Tsai’s labs already reported several noteworthy findings from the dataset, the researchers expect that many more possibly significant discoveries still wait to be found in the trove of data. To facilitate such discovery the team posted handy analytical and visualization tools along with the data on Kellis’s website.

“The dataset is so immensely rich. We focused on only a few aspects that are salient that we believe are very, very interesting, but by no means have we exhausted what can be learned with this dataset,” Kellis says. “We expect many more discoveries ahead, and we hope that young researchers (of all ages) will dive right in and surprise us with many more insights.”

Going forward, Kellis says, the researchers are studying the control circuitry associated with the differentially expressed genes, to understand the genetic variants, the regulators, and other driver factors that can be modulated to reverse disease circuitry across brain regions, cell types, and different stages of the disease.

Additional authors of the study include Ziting Xia, Jose Davila Velderrain, Ayesha P. Ng, Xueqiao Jiang, Ghada Abdelhady, Kyriaki Galani, Julio Mantero, Neil Band, Benjamin T. James, Sudhagar Babu, Fabiola Galiana-Melendez, Kate Louderback, Dmitry Prokopenko, Rudolph E. Tanzi, and David A. Bennett.

Support for the research came from the National Institutes of Health, The Picower Institute for Learning and Memory, The JPB Foundation, the Cure Alzheimer’s Fund, The Robert A. and Renee E. Belfer Family Foundation, Eduardo Eurnekian, and Joseph DiSabato.

The Department of EECS announces new Career Development chairs

Posted on July 25, 2024 by Jane Halpern - EECS Celebrates Awards, News

The Department is pleased to announce the new crop of career development chair recipients, who include:

Marzyeh Ghassemi has been named the Germeshausen Career Development Professor, effective July 1. Ghassemi earned two bachelor’s degrees in computer science and electrical engineering from New Mexico State University as a Goldwater Scholar; her MSc in biomedical engineering from Oxford University as a Marshall Scholar; and her PhD in computer science from MIT. Following stints as a Visiting Researcher with Alphabet’s Verily and an Assistant Professor at University of Toronto, Ghassemi joined EECS and the Institute for Medical Engineering & Science (IMES) as an assistant professor in July 2021. (IMES is the home of the Harvard-MIT Program in Health Sciences and Technology.) She is affiliated with LIDS, the Jameel Clinic, IDSS and CSAIL. Ghassemi’s research in the Healthy ML Group creates a rigorous quantitative framework in which to design, develop and place ML models in a way that is robust and fair, focusing on health settings. Her contributions range from socially-aware model construction; to improving subgroup- and shift-robust learning methods; to identifying important insights in model deployment scenarios that have implications in policy, health practice and equity.

Among other awards, Ghassemi has been named one of MIT Tech Review’s 35 Innovators Under 35; and has been awarded the 2018 Seth J. Teller Award, the 2023 MIT Prize for Open Data, a 2024 NSF CAREER Award, and the Google Research Scholar Award. She founded the non-profit Association for Health Learning and Inference (AHLI) and her work has been featured in popular press such as Forbes, Fortune, MIT News, and The Huffington Post.

Kaiming He has been named the Douglas Ross (1954) Career Development Professor of Software Technology, effective July 1. He earned his BS from Tsinghua University in 2007 and his PhD from the Chinese University of Hong Kong in 2011 before joining Microsoft Research Asia (MSRA) as a Researcher and then Facebook AI Research (FAIR) as a Research Scientist. He joined the Department of EECS as an associate professor in February, and is affiliated with the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research areas include deep learning and computer vision. He is best known for his work on Deep Residual Networks (ResNets), which have had a significant impact on computer vision and broader artificial intelligence; on visual object detection and segmentation, including Faster R-CNN and Mask R-CNN; and on visual self-supervised learning.

Kaiming He’s awards include the PAMI Young Researcher Award in 2018; three best paper awards, at CVPR 2009, CVPR 2016, and ICCV 2017; two best paper honorable mentions, at ECCV 2018 and CVPR 2021; and an Everingham Prize for selfless contributions to computer vision.

Cheng-Zhi Anna Huang SM ’08, who will join MIT in a shared position between EECS and Music and Theater Arts as an assistant professor in September, has been named the Robert N. Noyce Career Development Professor. Huang holds a Canada CIFAR AI Chair at Mila, a BM in music composition and BS in computer science from the University of Southern California, an MS from the MIT Media Lab, and a PhD from Harvard University.

She will help develop graduate programming focused on music technology. Previously, Huang spent eight years with Magenta at Google Brain and DeepMind, spearheading efforts in generative modeling, reinforcement learning, and human-computer interaction to support human-AI partnerships in music-making. She is the creator of Music Transformer and Coconet (which powered the Bach Google Doodle), and was a judge and organizer for the AI Song Contest.

Kuikui Liu has been named the Elting Morison Career Development Professor, effective July 1. Liu earned his BS in mathematics and computer science in 2017, an MS in computer science in 2018, and a PhD in computer science in 2022, all from the University of Washington, before coming to MIT as a Foundations of Data Science Institute postdoc at MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) and an assistant professor in EECS. His research interests are in the design and analysis of Markov chains, with applications to statistical physics, high-dimensional geometry, and statistics. To study these complex stochastic dynamics, he develops and uses mathematical tools from fields such as high-dimensional expanders, geometry of polynomials, algebraic combinatorics, statistical physics, and more.

Among other honors, Liu was the co-recipient of a best paper award at STOC 2019, and received the William Chan Memorial Dissertation Award in 2022 and the 2023 EATCS Distinguished Dissertation Award.

Alexander Rives will become the Arthur J. Conner (1888) Career Development Professor, effective January 2025, when he is due to join the Department of EECS as an assistant professor, with a core membership in the Broad Institute of MIT and Harvard. Rives received his BS in philosophy and biology from Yale University and is completing his PhD in computer science at NYU.

In his research, Rives is focused on AI for scientific understanding, discovery, and design for biology. Rives has previously worked with Meta, where he founded and led the Evolutionary Scale Modeling team that developed large language models for proteins.

Department of EECS names new chair recipients

Posted on July 25, 2024 by Jane Halpern - EECS Celebrates Awards, News

The Department is pleased to announce the new crop of chair recipients, who include:

Adam Chlipala has been named the Arthur J. Conner (1888) Professor, effective July 1. Chlipala earned his BS from Carnegie Mellon University (CMU) in 2003, and his MS and PhD from Berkeley in 2004 and 2007, respectively. He spent time at Jane Street as a software developer and Harvard as a postdoc before joining MIT in 2011. Chlipala is the head of the Programming Languages and Verification Group in CSAIL, where his research focuses on developing methods for integrating the work of software and digital hardware design and verification. His recent work on formally verified compilation for cryptographic libraries has been adopted by Google for its Chrome web browser.

Among other honors, Chlipala has been awarded a 2013 NSF CAREER award, a Best Paper award at SOSP 2015 for his FSCQ work on file system verification, the Most Influential Paper award at the International Conference on Functional Programming (ICFP) 2018, and two Communications of the ACM (CACM) research highlights. He was named an ACM Distinguished Member in 2019. In 2023, he was awarded the Department of EECS’s Burgess (1952) & Elizabeth Jamieson Award for Excellence in Teaching.

Yael Kalai has been named the Ellen Swallow Richards (1873) Professor, effective July 1. Kalai completed her PhD at MIT in 2006, previously graduating from the Hebrew University of Jerusalem in 1997 and earning a master’s degree at the Weizmann Institute of Science in 2001. A member of CSAIL, Kalai currently focuses on both the theoretical and real-world applications of cryptography, including work on succinct and easily verifiable non-interactive proofs. Her extensive contributions to the field include the 2001 co-invention of ring signatures (a type of digital signature that can protect the identity of the signer) with Ron Rivest and Adi Shamir. Additionally, Kalai’s work on the widely adopted Fiat-Shamir heuristic established a better understanding of the paradigm’s security issues. Her work later evolved into key components of cryptocurrency systems, and is now used by projects such as CryptoNote, Monero and Ethereum.

Her awards include an Outstanding Master’s Thesis Prize from the Weizmann Institute of Science in 2001, the George M. Sprowls Award for Best Doctoral Thesis in Computer Science in 2007, an MIT Presidential Graduate Fellowship from 2003 to 2006, an IBM PhD Fellowship from 2004 to 2006, an International Association for Cryptologic Research (IACR) fellowship, and the 2022 ACM Prize in Computing.

Jing Kong has been named the Jerry McAfee (1940) Professor in Engineering, effective July 1. She received the B.S. in chemistry from Peking University in 1997 and the Ph.D. in chemistry from Stanford University in 2002. Following stints as a research scientist at NASA Ames Research Center and a postdoctoral researcher at Delft University, she joined the EECS faculty in 2004. Kong is a principal investigator in RLE, where she heads the Nanomaterials and Electronics Group. Her research interests focus on the vapor deposition synthesis of low-dimensional materials, including carbon nanotubes, graphene, and other 2D materials, as well as their characterization and potential applications.

Kong is a member of the IEEE. Among other awards, she has received the Jerome H. Saltzer Award for Excellence in Teaching, the HKN (Eta Kappa Nu) Best Instructor Award, the Foresight Distinguished Student Award in Nanotechnology in 2001, the Stanford Annual Reviews Prize in Physical Chemistry in 2002, and the MIT 3M Award in 2005.

Sendhil Mullainathan has been named the Peter de Florez Professor, effective July 1. Mullainathan received his BA in computer science, mathematics, and economics from Cornell University and his PhD from Harvard University, then spent five years at MIT before joining the faculty at Harvard in 2004, and then the University of Chicago in 2018; he joins the Department of EECS, with a dual appointment in Economics, this month. Mullainathan’s research builds algorithmic tools to understand complex problems in human behavior, social policy, and medicine.

Among other awards, he is a recipient of the MacArthur “Genius Grant,” has been designated a “Young Global Leader” by the World Economic Forum, was labeled a “Top 100 Thinker” by Foreign Policy Magazine, and was named to the “Smart List: 50 people who will change the world” by Wired Magazine (UK). He serves on the board of the MacArthur Foundation, is affiliated with the NBER and BREAD, and is a member of the American Academy of Arts and Sciences.

Yury Polyanskiy has been named the Leverett Howell Cutten ’07 And William King Cutten ’39 Professor, effective July 1. Polyanskiy received his M.S. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology in 2005, and his Ph.D. in electrical engineering from Princeton University in 2010. After a postdoc at Princeton, he joined MIT EECS in 2011; he currently serves as the Education Officer for AI+D within the Department of EECS. Polyanskiy’s research interests span information theory, statistical machine learning, error-correcting codes and wireless communication. His work is fundamentally rooted in a fascination with the flow of information: be it in the traditional setting of digital communication, or (more recently) in the domain of machine learning from data.

Polyanskiy’s excellence in research and teaching was recognized by the Department of EECS with the Jerome Saltzer Teaching Award in 2016, and by the IEEE Information Theory Society with the James L. Massey Award in 2020. In addition, he was named an Amazon Scholar in 2020 and elected an IEEE Fellow in 2024, and he received the 2013 NSF CAREER award and the 2011 IEEE Information Theory Society Paper Award.

Caroline Uhler has been named the Andrew (1956) and Erna Viterbi Professor of Engineering, effective July 1. Uhler holds BSc degrees in math and biology, an MSc in mathematics, and an MEd in mathematics education from the University of Zurich (years spanning 2004-7), and a PhD in statistics from UC Berkeley (2011). Before joining MIT as a faculty member in 2015, she spent three years as an assistant professor at IST Austria. She is a professor in both EECS and the Institute for Data, Systems, and Society (IDSS). She is also affiliated with the Laboratory for Information and Decision Systems (LIDS), the Statistics and Data Science Center, and the Operations Research Center (ORC). Additionally, Uhler is a core member of the Broad Institute of MIT and Harvard, where she is the director of the Eric and Wendy Schmidt Center. Uhler’s research focuses on machine learning methods for integrating and translating between vastly different data modalities and inferring causal or regulatory relationships from such data. She is particularly interested in using these methods to gain mechanistic insights into the link between genome packing and regulation in health and disease.

She is an elected member of the International Statistical Institute, and is the recipient of a Simons Investigator Award, a Sloan Research Fellowship, and an NSF CAREER Award. Recently, she was named a Fellow of the Society for Industrial and Applied Mathematics (SIAM), Class of 2023, and a Fellow of the Institute of Mathematical Statistics (IMS), 2024.

Vinod Vaikuntanathan has been named the Ford Foundation Professor of Engineering, effective July 1. He earned his BTech degree from the Indian Institute of Technology Madras in 2003, and his SM and PhD degrees from MIT in 2005 and 2009, respectively. After a postdoctoral stint at IBM Research, a year as a researcher at Microsoft, and two years as a faculty member at the University of Toronto, he joined the faculty of MIT EECS in September 2013. A principal investigator at CSAIL and the chief cryptographer at Duality Technologies, Vaikuntanathan’s research focuses upon the foundations of cryptography and its applications to theoretical computer science at large. He is known for his work on fully homomorphic encryption (a powerful cryptographic primitive that enables complex computations on encrypted data), as well as lattice-based cryptography (which lays down a new mathematical foundation for cryptography in the post-quantum world). Recently, he has been interested in the interactions of cryptography with quantum computing, as well as with statistics and machine learning.

Among many other awards, Vaikuntanathan has received the Harold E. Edgerton Faculty Award, the Gödel Prize, the Simons Investigator Award, the Distinguished Alumnus Award from IIT Madras, a Best Paper Award from CRYPTO 2024, and test of time awards from the IEEE FOCS and CRYPTO conferences. He was also named a MacVicar Faculty Fellow in 2024.

Large language models don’t behave like people, even though we may expect them to

Posted on July 25, 2024 by Jane Halpern - News

One thing that makes large language models (LLMs) so powerful is the diversity of tasks to which they can be applied. The same machine-learning model that can help a graduate student draft an email could also aid a clinician in diagnosing cancer.

However, the wide applicability of these models also makes them challenging to evaluate in a systematic way. It would be impossible to create a benchmark dataset to test a model on every type of question it can be asked.

In a new paper, MIT researchers took a different approach. They argue that, because humans decide when to deploy large language models, evaluating a model requires an understanding of how people form beliefs about its capabilities.

For example, the graduate student must decide whether the model could be helpful in drafting a particular email, and the clinician must determine which cases would be best to consult the model on.

Building off this idea, the researchers created a framework to evaluate an LLM based on its alignment with a human’s beliefs about how it will perform on a certain task.

They introduce a human generalization function — a model of how people update their beliefs about an LLM’s capabilities after interacting with it. Then, they evaluate how aligned LLMs are with this human generalization function.

Their results indicate that when models are misaligned with the human generalization function, a user could be overconfident or underconfident about where to deploy it, which might cause the model to fail unexpectedly. Furthermore, due to this misalignment, more capable models tend to perform worse than smaller models in high-stakes situations.

“These tools are exciting because they are general-purpose, but because they are general-purpose, they will be collaborating with people, so we have to take the human in the loop into account,” says study co-author Ashesh Rambachan, assistant professor of economics and a principal investigator in the Laboratory for Information and Decision Systems (LIDS).

Rambachan is joined on the paper by lead author Keyon Vafa, a postdoc at Harvard University; and Sendhil Mullainathan, an MIT professor in the departments of Electrical Engineering and Computer Science and of Economics, and a member of LIDS. The research will be presented at the International Conference on Machine Learning.

Human generalization

As we interact with other people, we form beliefs about what we think they do and do not know. For instance, if your friend is finicky about correcting people’s grammar, you might generalize and think they would also excel at sentence construction, even though you’ve never asked them questions about sentence construction.

“Language models often seem so human. We wanted to illustrate that this force of human generalization is also present in how people form beliefs about language models,” Rambachan says.

As a starting point, the researchers formally defined the human generalization function, which involves asking questions, observing how a person or LLM responds, and then making inferences about how that person or model would respond to related questions.

If someone sees that an LLM can correctly answer questions about matrix inversion, they might also assume it can ace questions about simple arithmetic. A model that is misaligned with this function — one that doesn’t perform well on questions a human expects it to answer correctly — could fail when deployed.

With that formal definition in hand, the researchers designed a survey to measure how people generalize when they interact with LLMs and other people.

They showed survey participants questions that a person or LLM got right or wrong and then asked if they thought that person or LLM would answer a related question correctly. Through the survey, they generated a dataset of nearly 19,000 examples of how humans generalize about LLM performance across 79 diverse tasks.
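To make the idea of measuring alignment concrete, here is a minimal Python sketch. It is not the researchers’ framework or their dataset: the record format, the function name alignment_summary, and the toy records are all assumptions made for illustration. Each record pairs a person’s prediction about whether a model would answer a related question correctly with what the model actually did.

```python
def alignment_summary(records):
    """Summarize how well human predictions match a model's actual results.

    Each record is a pair (human_predicted_correct, model_was_correct).
    This record format is hypothetical, chosen only for illustration.
    """
    total = len(records)
    agree = sum(1 for predicted, actual in records if predicted == actual)
    # Person expected a correct answer, but the model got it wrong.
    overconfident = sum(1 for predicted, actual in records if predicted and not actual)
    # Person expected a wrong answer, but the model got it right.
    underconfident = sum(1 for predicted, actual in records if actual and not predicted)
    return {
        "agreement": agree / total,
        "overconfident": overconfident / total,
        "underconfident": underconfident / total,
    }


# Made-up records for illustration only (not data from the study).
toy_records = [(True, True), (True, False), (False, False), (False, True)]
summary = alignment_summary(toy_records)
print(summary)
```

In this toy view, a model that is well aligned with how people generalize would show high agreement and few overconfident cases, which are the ones most likely to lead to unexpected failures after deployment.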

Measuring misalignment

They found that participants did quite well when asked whether a human who got one question right would answer a related question right, but they were much worse at generalizing about the performance of LLMs.

“Human generalization gets applied to language models, but that breaks down because these language models don’t actually show patterns of expertise like people would,” Rambachan says.

People were also more likely to update their beliefs about an LLM when it answered questions incorrectly than when it got questions right. They also tended to believe that LLM performance on simple questions would have little bearing on its performance on more complex questions.

In situations where people put more weight on incorrect responses, simpler models outperformed very large models like GPT-4.

“Language models that get better can almost trick people into thinking they will perform well on related questions when, in actuality, they don’t,” he says.

One possible explanation for why humans are worse at generalizing for LLMs could come from their novelty — people have far less experience interacting with LLMs than with other people.

“Moving forward, it is possible that we may get better just by virtue of interacting with language models more,” he says.

To this end, the researchers want to conduct additional studies of how people’s beliefs about LLMs evolve over time as they interact with a model. They also want to explore how human generalization could be incorporated into the development of LLMs.

“When we are training these algorithms in the first place, or trying to update them with human feedback, we need to account for the human generalization function in how we think about measuring performance,” he says.

In the meantime, the researchers hope their dataset could be used as a benchmark to compare how LLMs perform relative to the human generalization function, which could help improve the performance of models deployed in real-world situations.

“To me, the contribution of the paper is twofold. The first is practical: The paper uncovers a critical issue with deploying LLMs for general consumer use. If people don’t have the right understanding of when LLMs will be accurate and when they will fail, then they will be more likely to see mistakes and perhaps be discouraged from further use. This highlights the issue of aligning the models with people’s understanding of generalization,” says Alex Imas, professor of behavioral science and economics at the University of Chicago’s Booth School of Business, who was not involved with this work. “The second contribution is more fundamental: The lack of generalization to expected problems and domains helps in getting a better picture of what the models are doing when they get a problem ‘correct.’ It provides a test of whether LLMs ‘understand’ the problem they are solving.”

This research was funded, in part, by the Harvard Data Science Initiative and the Center for Applied AI at the University of Chicago Booth School of Business.

Department of EECS names Samuel Madden next faculty head of computer science

Posted on July 24, 2024 by Jane Halpern - News

Sam Madden will serve as the next Faculty Head of Computer Science, effective August 1, 2024. In this role, he succeeds longtime community member Arvind, who passed away suddenly on June 17.

Madden earned his BS and MEng from MIT in 1999 and his PhD from the University of California, Berkeley in 2003. A member of the MIT EECS faculty since 2004, he was recognized as the inaugural College of Computing Distinguished Professor of Computing in 2020. A principal investigator in CSAIL, Madden’s research interest is in database systems, focusing on database analytics and query processing, ranging from clouds to sensors to modern high-performance server architectures. He co-directs the Data Systems for AI Lab initiative and the Data Systems Group, investigating issues related to systems and algorithms for data focusing on applying new methodologies for processing data, including applying machine learning methods to data systems and engineering data systems for applying machine learning at scale.

Madden was named one of MIT Technology Review’s Top 35 Under 35 in 2005 and an ACM Fellow in 2020, and is the recipient of several awards, including an NSF CAREER award, a Sloan Foundation Fellowship, the ACM SIGMOD Edgar F. Codd Innovations Award, and “test of time” awards from VLDB, SIGMOD, SIGMOBILE, and SenSys. He is also the co-founder and Chief Scientist at Cambridge Mobile Telematics, which develops technology to make roads safer and drivers better.

AI model identifies certain breast tumor stages likely to progress to invasive cancer

Posted on July 23, 2024 by Jane Halpern - News

Ductal carcinoma in situ (DCIS) is a type of preinvasive tumor that sometimes progresses to a highly deadly form of breast cancer. It accounts for about 25 percent of all breast cancer diagnoses.

Because it is difficult for clinicians to determine the type and stage of DCIS, patients with DCIS are often overtreated. To address this, an interdisciplinary team of researchers from MIT and ETH Zurich developed an AI model that can identify the different stages of DCIS from a cheap and easy-to-obtain breast tissue image. Their model shows that both the state and arrangement of cells in a tissue sample are important for determining the stage of DCIS.

Because such tissue images are so easy to obtain, the researchers were able to build one of the largest datasets of its kind, which they used to train and test their model. When they compared its predictions to conclusions of a pathologist, they found clear agreement in many instances.

In the future, the model could be used as a tool to help clinicians streamline the diagnosis of simpler cases without the need for labor-intensive tests, giving them more time to evaluate cases where it is less clear if DCIS will become invasive.

“We took the first step in understanding that we should be looking at the spatial organization of cells when diagnosing DCIS, and now we have developed a technique that is scalable. From here, we really need a prospective study. Working with a hospital and getting this all the way to the clinic will be an important step forward,” says Caroline Uhler, a professor in the Department of Electrical Engineering and Computer Science (EECS) and the Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center at the Broad Institute of MIT and Harvard and a researcher at MIT’s Laboratory for Information and Decision Systems (LIDS).

Uhler, co-corresponding author of a paper on this research, is joined by lead author Xinyi Zhang, a graduate student in EECS and the Eric and Wendy Schmidt Center; co-corresponding author GV Shivashankar, professor of mechogenomics at ETH Zurich jointly with the Paul Scherrer Institute; and others at MIT, ETH Zurich, and the University of Palermo in Italy. The open-access research was published July 20 in Nature Communications.

Combining imaging with AI

Between 30 and 50 percent of patients with DCIS develop a highly invasive stage of cancer, but researchers don’t know the biomarkers that could tell a clinician which tumors will progress.

Researchers can use techniques like multiplexed staining or single-cell RNA sequencing to determine the stage of DCIS in tissue samples. However, these tests are too expensive to be performed widely, Shivashankar explains.

In previous work, these researchers showed that a cheap imaging technique known as chromatin staining could be as informative as the much costlier single-cell RNA sequencing.

For this research, they hypothesized that combining this single stain with a carefully designed machine-learning model could provide the same information about cancer stage as costlier techniques.

First, they created a dataset containing 560 tissue sample images from 122 patients at three different stages of disease. They used this dataset to train an AI model that learns a representation of the state of each cell in a tissue sample image, which it uses to infer the stage of a patient’s cancer.

However, not every cell is indicative of cancer, so the researchers had to aggregate them in a meaningful way.

They designed the model to create clusters of cells in similar states, identifying eight states that are important markers of DCIS. Some cell states are more indicative of invasive cancer than others. The model determines the proportion of cells in each state in a tissue sample.

Organization matters

“But in cancer, the organization of cells also changes. We found that just having the proportions of cells in every state is not enough. You also need to understand how the cells are organized,” says Shivashankar.

With this insight, they designed the model to consider proportion and arrangement of cell states, which significantly boosted its accuracy.
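A small Python sketch can show why proportions alone may not be enough. It is an illustration only, not the researchers’ model: the cell format (x, y, state), the function names, and the two toy tissues are assumptions. The two tissues contain the same mix of cell states, yet the states sit next to each other in different ways.

```python
import math
from collections import Counter


def state_proportions(cells, n_states):
    """Fraction of cells in each state; cells are (x, y, state) tuples."""
    counts = Counter(state for _, _, state in cells)
    return [counts[s] / len(cells) for s in range(n_states)]


def neighbor_pairs(cells, radius):
    """Count unordered pairs of states for cells within `radius` of each other."""
    pairs = Counter()
    for i, (xi, yi, si) in enumerate(cells):
        for xj, yj, sj in cells[i + 1:]:
            if math.hypot(xi - xj, yi - yj) <= radius:
                pairs[tuple(sorted((si, sj)))] += 1
    return pairs


# Two hypothetical tissues: same proportions, different arrangement.
tissue_mixed = [(0, 0, 0), (1, 0, 1), (2, 0, 0), (3, 0, 1)]
tissue_sorted = [(0, 0, 0), (1, 0, 0), (2, 0, 1), (3, 0, 1)]

print(state_proportions(tissue_mixed, 2), state_proportions(tissue_sorted, 2))
print(neighbor_pairs(tissue_mixed, 1.0), neighbor_pairs(tissue_sorted, 1.0))
```

Both tissues are half state 0 and half state 1, but the neighbor counts differ, which is the kind of spatial information a proportions-only summary would miss.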

“The interesting thing for us was seeing how much spatial organization matters. Previous studies had shown that cells which are close to the breast duct are important. But it is also important to consider which cells are close to which other cells,” says Zhang.

When they compared the results of their model with samples evaluated by a pathologist, it had clear agreement in many instances. In cases that were not as clear-cut, the model could provide information about features in a tissue sample, like the organization of cells, that a pathologist could use in decision-making.

This versatile model could also be adapted for use in other types of cancer, or even neurodegenerative conditions, which is one area the researchers are also currently exploring.

“We have shown that, with the right AI techniques, this simple stain can be very powerful. There is still much more research to do, but we need to take the organization of cells into account in more of our studies,” Uhler says.

This research was funded, in part, by the Eric and Wendy Schmidt Center at the Broad Institute, ETH Zurich, the Paul Scherrer Institute, the Swiss National Science Foundation, the U.S. National Institutes of Health, the U.S. Office of Naval Research, the MIT Jameel Clinic for Machine Learning and Health, the MIT-IBM Watson AI Lab, and a Simons Investigator Award.

AI method radically speeds predictions of materials’ thermal properties

Posted on July 22, 2024 by Jane Halpern - News

It is estimated that about 70 percent of the energy generated worldwide ends up as waste heat.

If scientists could better predict how heat moves through semiconductors and insulators, they could design more efficient power generation systems. However, the thermal properties of materials can be exceedingly difficult to model.

The trouble comes from phonons, which are subatomic particles that carry heat. Some of a material’s thermal properties depend on a measurement called the phonon dispersion relation, which can be incredibly hard to obtain, let alone utilize in the design of a system.

A team of researchers from MIT and elsewhere tackled this challenge by rethinking the problem from the ground up. The result of their work is a new machine-learning framework that can predict phonon dispersion relations up to 1,000 times faster than other AI-based techniques, with comparable or even better accuracy. Compared to more traditional, non-AI-based approaches, it could be 1 million times faster.

This method could help engineers design energy generation systems that produce more power, more efficiently. It could also be used to develop more efficient microelectronics, since managing heat remains a major bottleneck to speeding up electronics.

“Phonons are the culprit for the thermal loss, yet obtaining their properties is notoriously challenging, either computationally or experimentally,” says Mingda Li, associate professor of nuclear science and engineering and senior author of a paper on this technique.

Li is joined on the paper by co-lead authors Ryotaro Okabe, a chemistry graduate student; and Abhijatmedhi Chotrattanapituk, an electrical engineering and computer science graduate student; Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science at MIT; as well as others at MIT, Argonne National Laboratory, Harvard University, the University of South Carolina, Emory University, the University of California at Santa Barbara, and Oak Ridge National Laboratory. The research appears in Nature Computational Science.

Predicting phonons

Heat-carrying phonons are tricky to predict because they have an extremely wide frequency range, and the particles interact and travel at different speeds.

A material’s phonon dispersion relation is the relationship between energy and momentum of phonons in its crystal structure. For years, researchers have tried to predict phonon dispersion relations using machine learning, but there are so many high-precision calculations involved that models get bogged down.
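For intuition, the simplest textbook case has a closed-form dispersion relation: a one-dimensional chain of identical atoms joined by identical springs, where the angular frequency is ω(k) = 2·sqrt(K/m)·|sin(ka/2)|. The sketch below evaluates that formula in Python. It is a standard physics example, not the researchers’ method, and the parameter names (spring_k, mass, a) and their unit values are assumptions. Energy and momentum follow from frequency and wave vector as E = ħω and p = ħk.

```python
import math


def chain_dispersion(k, spring_k=1.0, mass=1.0, a=1.0):
    """Angular frequency of a 1D monatomic chain at wave vector k.

    spring_k: spring constant between neighboring atoms (assumed units)
    mass:     mass of each atom
    a:        spacing between atoms
    """
    return 2.0 * math.sqrt(spring_k / mass) * abs(math.sin(0.5 * k * a))


# Sample wave vectors across the first Brillouin zone, from -pi/a to pi/a.
n_points = 5
ks = [-math.pi + 2.0 * math.pi * i / (n_points - 1) for i in range(n_points)]
omegas = [chain_dispersion(k) for k in ks]

for k, omega in zip(ks, omegas):
    print(f"k = {k:+.3f}  omega = {omega:.3f}")
```

Real materials have many atoms per unit cell and three dimensions, so no such simple formula exists, which is part of why the calculation becomes so expensive.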

“If you have 100 CPUs and a few weeks, you could probably calculate the phonon dispersion relation for one material. The whole community really wants a more efficient way to do this,” says Okabe.

The machine-learning models scientists often use for these calculations are known as graph neural networks (GNNs). A GNN converts a material’s atomic structure into a crystal graph comprising multiple nodes, which represent atoms, connected by edges, which represent the interatomic bonding between atoms.
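As a rough illustration of the crystal-graph idea, the Python sketch below turns a list of atomic positions into nodes and edges by linking atoms that lie within a cutoff distance of each other. The function name build_crystal_graph, the cutoff rule, and the toy coordinates are assumptions, and the sketch ignores the periodic repetition of a real crystal.

```python
import math
from itertools import combinations


def build_crystal_graph(positions, cutoff):
    """Return (nodes, edges) for atoms at the given 3D positions.

    Nodes are atom indices. An edge (i, j, distance) links two atoms whose
    separation is at most `cutoff`. Periodic images are NOT considered here.
    """
    nodes = list(range(len(positions)))
    edges = []
    for i, j in combinations(nodes, 2):
        distance = math.dist(positions[i], positions[j])
        if distance <= cutoff:
            edges.append((i, j, distance))
    return nodes, edges


# Three hypothetical atoms on a line; only the first two are close enough to bond.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
nodes, edges = build_crystal_graph(positions, cutoff=1.5)
print(nodes, edges)
```

Because the number of nodes is fixed by the number of atoms, a network built on this graph alone has a fixed-size structure to work with.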

While GNNs work well for calculating many quantities, like magnetization or electrical polarization, they are not flexible enough to efficiently predict an extremely high-dimensional quantity like the phonon dispersion relation. Because phonons can travel around atoms on X, Y, and Z axes, their momentum space is hard to model with a fixed graph structure.

To gain the flexibility they needed, Li and his collaborators devised virtual nodes.

They create what they call a virtual node graph neural network (VGNN) by adding a series of flexible virtual nodes to the fixed crystal structure to represent phonons. The virtual nodes enable the output of the neural network to vary in size, so it is not restricted by the fixed crystal structure.

Virtual nodes are connected to the graph in such a way that they can only receive messages from real nodes. While virtual nodes will be updated as the model updates real nodes during computation, they do not affect the accuracy of the model.
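The one-way connection can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the VGNN architecture: features are single numbers, real nodes simply average themselves with their neighbors, and each virtual node reads the real nodes through a hypothetical weight list (virtual_weights). The point is that adding virtual nodes changes the size of the output without changing what the real nodes compute.

```python
def message_passing_layer(real_feats, real_edges, virtual_weights):
    """One toy message-passing step with one-way virtual nodes.

    real_feats:      one number per real node (atom)
    real_edges:      list of (i, j) pairs between real nodes
    virtual_weights: one weight list per virtual node (hypothetical), giving
                     how strongly it listens to each real node
    """
    n = len(real_feats)
    neighbors = {i: [] for i in range(n)}
    for i, j in real_edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Real nodes exchange messages only with other real nodes.
    new_real = []
    for i in range(n):
        group = [real_feats[i]] + [real_feats[j] for j in neighbors[i]]
        new_real.append(sum(group) / len(group))

    # Virtual nodes only receive from real nodes and never send anything back.
    new_virtual = [
        sum(w * f for w, f in zip(weights, real_feats))
        for weights in virtual_weights
    ]
    return new_real, new_virtual


real_feats = [1.0, 2.0, 3.0]
real_edges = [(0, 1), (1, 2)]

# Same crystal, with one virtual node and then with three.
real_a, virt_a = message_passing_layer(real_feats, real_edges, [[1.0, 0.0, 0.0]])
real_b, virt_b = message_passing_layer(
    real_feats, real_edges, [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
)
print(real_a, virt_a)
print(real_b, virt_b)
```

The real-node results are identical in both runs, while the number of virtual outputs goes from one to three.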

“The way we do this is very efficient in coding. You just generate a few more nodes in your GNN. The physical location doesn’t matter, and the real nodes don’t even know the virtual nodes are there,” says Chotrattanapituk.

Cutting out complexity

Since it has virtual nodes to represent phonons, the VGNN can skip many complex calculations when estimating phonon dispersion relations, which makes the method more efficient than a standard GNN.

The researchers proposed three different versions of VGNNs with increasing complexity. Each can be used to predict phonons directly from a material’s atomic coordinates.

Because their approach has the flexibility to rapidly model high-dimensional properties, they can use it to estimate phonon dispersion relations in alloy systems. These complex combinations of metals and nonmetals are especially challenging for traditional approaches to model.

The researchers also found that VGNNs offered slightly greater accuracy when predicting a material’s heat capacity. In some instances, prediction errors were two orders of magnitude lower with their technique.

A VGNN could be used to calculate phonon dispersion relations for a few thousand materials in just a few seconds with a personal computer, Li says.

This efficiency could enable scientists to search a larger space when seeking materials with certain thermal properties, such as superior thermal storage, energy conversion, or superconductivity.

Moreover, the virtual node technique is not exclusive to phonons, and could also be used to predict challenging optical and magnetic properties.

In the future, the researchers want to refine the technique so virtual nodes have greater sensitivity to capture small changes that can affect phonon structure.

“Researchers got too comfortable using graph nodes to represent atoms, but we can rethink that. Graph nodes can be anything. And virtual nodes are a very generic approach you could use to predict a lot of high-dimensional quantities,” Li says.

“The authors’ innovative approach significantly augments the graph neural network description of solids by incorporating key physics-informed elements through virtual nodes, for instance, informing wave-vector dependent band-structures and dynamical matrices,” says Olivier Delaire, associate professor in the Thomas Lord Department of Mechanical Engineering and Materials Science at Duke University, who was not involved with this work. “I find that the level of acceleration in predicting complex phonon properties is amazing, several orders of magnitude faster than a state-of-the-art universal machine-learning interatomic potential. Impressively, the advanced neural net captures fine features and obeys physical rules. There is great potential to expand the model to describe other important material properties: Electronic, optical, and magnetic spectra and band structures come to mind.”

This work is supported by the U.S. Department of Energy, National Science Foundation, a Mathworks Fellowship, a Sow-Hsin Chen Fellowship, the Harvard Quantum Initiative, and the Oak Ridge National Laboratory.

Creating and verifying stable AI-controlled systems in a rigorous and flexible way

Posted on July 22, 2024 by Jane Halpern - News

Neural networks have made a seismic impact on how engineers design controllers for robots, catalyzing more adaptive and efficient machines. Still, these brain-like machine-learning systems are a double-edged sword: Their complexity makes them powerful, but it also makes it difficult to guarantee that a robot powered by a neural network will safely accomplish its task.

The traditional way to verify safety and stability is through techniques called Lyapunov functions. If you can find a Lyapunov function whose value consistently decreases, then you can know that unsafe or unstable situations associated with higher values will never happen. For robots controlled by neural networks, though, prior approaches for verifying Lyapunov conditions didn’t scale well to complex machines.
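A tiny Python example shows what the decreasing condition looks like in practice. It is a sketch under assumptions, not the researchers’ method: the two-variable linear system in step and the candidate function V(x) = x1² + x2² are made up for illustration. The check only tests sampled points, so it can expose failures but cannot prove stability; closing that gap rigorously is the hard part.

```python
def step(x):
    """One time step of a hypothetical, stable two-variable linear system."""
    x1, x2 = x
    return (0.9 * x1 + 0.1 * x2, 0.8 * x2)


def V(x):
    """Candidate Lyapunov function: squared distance from the origin."""
    return x[0] ** 2 + x[1] ** 2


def find_violations(step_fn, lyapunov_fn, samples):
    """Return sampled nonzero states where the candidate fails to decrease."""
    return [
        x for x in samples
        if any(x) and lyapunov_fn(step_fn(x)) >= lyapunov_fn(x)
    ]


# A coarse grid of states in the square [-2, 2] x [-2, 2].
grid = [(i / 4, j / 4) for i in range(-8, 9) for j in range(-8, 9)]
violations = find_violations(step, V, grid)
print("violations found:", len(violations))

# Follow one trajectory and record V along the way.
state = (2.0, -1.5)
values = []
for _ in range(10):
    values.append(V(state))
    state = step(state)
print(values)
```

For this particular system the candidate decreases at every sampled state, so the values recorded along the trajectory shrink steadily toward zero.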

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and elsewhere have now developed new techniques that rigorously certify Lyapunov calculations in more elaborate systems. Their algorithm efficiently searches for and verifies a Lyapunov function, providing a stability guarantee for the system. This approach could potentially enable safer deployment of robots and autonomous vehicles, including aircraft and spacecraft.

To outperform previous algorithms, the researchers found a frugal shortcut to the training and verification process. They generated cheaper counterexamples — for example, adversarial data from sensors that could’ve thrown off the controller — and then optimized the robotic system to account for them. Understanding these edge cases helped machines learn how to handle challenging circumstances, which enabled them to operate safely in a wider range of conditions than previously possible. Then, they developed a novel verification formulation that enables the use of a scalable neural network verifier, α,β-CROWN, to provide rigorous worst-case scenario guarantees beyond the counterexamples.
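The loop of finding counterexamples and then adjusting can be illustrated with a toy Python sketch. It is a loose stand-in, not the researchers’ algorithm: the linear system, the one-parameter candidate V(x) = x1² + c·x2², the grid search, and the doubling rule are all assumptions, whereas the actual work trains neural networks and relies on a formal verifier for the final guarantee.

```python
def step(x):
    """One time step of a hypothetical two-variable linear system."""
    x1, x2 = x
    return (0.9 * x1 + 1.0 * x2, 0.8 * x2)


def make_candidate(c):
    """Candidate Lyapunov function V(x) = x1^2 + c * x2^2."""
    return lambda x: x[0] ** 2 + c * x[1] ** 2


def counterexamples(V, samples):
    """Sampled nonzero states where V fails to decrease after one step."""
    return [x for x in samples if any(x) and V(step(x)) >= V(x)]


grid = [(i / 4, j / 4) for i in range(-8, 9) for j in range(-8, 9)]

c = 1.0
history = []
for _ in range(20):  # safety cap on the number of adjustment rounds
    bad = counterexamples(make_candidate(c), grid)
    history.append((c, len(bad)))
    if not bad:
        break
    c *= 2.0  # crude adjustment: put more weight on the second variable

print(history)
```

Each round uses the counterexamples as evidence that the current candidate is inadequate and tries a new one. Passing the sampled check is still not a proof, which is why a separate verification step is needed for worst-case guarantees.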

“We’ve seen some impressive empirical performances in AI-controlled machines like humanoids and robotic dogs, but these AI controllers lack the formal guarantees that are crucial for safety-critical systems,” says Lujie Yang, MIT electrical engineering and computer science (EECS) PhD student and CSAIL affiliate who is a co-lead author of a new paper on the project alongside Toyota Research Institute researcher Hongkai Dai SM ’12, PhD ’16. “Our work bridges the gap between that level of performance from neural network controllers and the safety guarantees needed to deploy more complex neural network controllers in the real world,” notes Yang.

For a digital demonstration, the team simulated how a quadrotor drone with lidar sensors would stabilize in a two-dimensional environment. Their algorithm successfully guided the drone to a stable hover position, using only the limited environmental information provided by the lidar sensors. In two other experiments, their approach enabled the stable operation of two simulated robotic systems over a wider range of conditions: an inverted pendulum and a path-tracking vehicle. These experiments, though modest, are relatively more complex than what the neural network verification community could have done before, especially because they included sensor models.

“Unlike common machine learning problems, the rigorous use of neural networks as Lyapunov functions requires solving hard global optimization problems, and thus scalability is the key bottleneck,” says Sicun Gao, associate professor of computer science and engineering at the University of California at San Diego, who wasn’t involved in this work. “The current work makes an important contribution by developing algorithmic approaches that are much better tailored to the particular use of neural networks as Lyapunov functions in control problems. It achieves impressive improvement in scalability and the quality of solutions over existing approaches. The work opens up exciting directions for further development of optimization algorithms for neural Lyapunov methods and the rigorous use of deep learning in control and robotics in general.”

Yang and her colleagues’ stability approach has potentially wide-ranging applications where guaranteeing safety is crucial. It could help ensure a smoother ride for autonomous vehicles, like aircraft and spacecraft. Likewise, a drone delivering items or mapping out different terrains could benefit from such safety guarantees.

The techniques developed here are general and not specific to robotics; they could potentially assist with other applications, such as biomedicine and industrial processing, in the future.

While the technique is an upgrade from prior work in terms of scalability, the researchers are exploring how it can perform better in higher-dimensional systems. They’d also like to account for data beyond lidar readings, like images and point clouds.

As a future research direction, the team would like to provide the same stability guarantees for systems that are in uncertain environments and subject to disturbances. For instance, if a drone faces a strong gust of wind, Yang and her colleagues want to ensure it’ll still fly steadily and complete the desired task.

Also, they intend to apply their method to optimization problems, where the goal would be to minimize the time and distance a robot needs to complete a task while remaining steady. They plan to extend their technique to humanoids and other real-world machines, where a robot needs to stay stable while making contact with its surroundings.

Russ Tedrake, the Toyota Professor of EECS, Aeronautics and Astronautics, and Mechanical Engineering at MIT, vice president of robotics research at TRI, and CSAIL member, is a senior author of this research. The paper also credits University of California at Los Angeles PhD student Zhouxing Shi and associate professor Cho-Jui Hsieh, as well as University of Illinois Urbana-Champaign assistant professor Huan Zhang. Their work was supported, in part, by Amazon, the National Science Foundation, the Office of Naval Research, and the AI2050 program at Schmidt Sciences. The researchers’ paper will be presented at the 2024 International Conference on Machine Learning.