Efficient technique improves machine-learning models’ reliability

Powerful machine-learning models are being used to help people tackle tough problems such as identifying disease in medical images or detecting road obstacles for autonomous vehicles. But machine-learning models can make mistakes, so in high-stakes settings it’s critical that humans know when to trust a model’s predictions.

Uncertainty quantification is one tool that improves a model’s reliability; the model produces a score along with the prediction that expresses a confidence level that the prediction is correct. While uncertainty quantification can be useful, existing methods typically require retraining the entire model to give it that ability. Training involves showing a model millions of examples so it can learn a task. Retraining then requires millions of new data inputs, which can be expensive and difficult to obtain, and also uses huge amounts of computing resources.

Researchers at MIT and the MIT-IBM Watson AI Lab have now developed a technique that enables a model to perform more effective uncertainty quantification, while using far fewer computing resources than other methods, and no additional data. Their technique, which does not require a user to retrain or modify a model, is flexible enough for many applications.

The technique involves creating a simpler companion model that assists the original machine-learning model in estimating uncertainty. This smaller model is designed to identify different types of uncertainty, which can help researchers drill down on the root cause of inaccurate predictions.

“Uncertainty quantification is essential for both developers and users of machine-learning models. Developers can utilize uncertainty measurements to help develop more robust models, while for users, it can add another layer of trust and reliability when deploying models in the real world. Our work leads to a more flexible and practical solution for uncertainty quantification,” says Maohao Shen, an electrical engineering and computer science graduate student and lead author of a paper on this technique.

Shen wrote the paper with Yuheng Bu, a former postdoc in the Research Laboratory of Electronics (RLE) who is now an assistant professor at the University of Florida; Prasanna Sattigeri, Soumya Ghosh, and Subhro Das, research staff members at the MIT-IBM Watson AI Lab; and senior author Gregory Wornell, the Sumitomo Professor in Engineering who leads the Signals, Information, and Algorithms Laboratory in RLE and is a member of the MIT-IBM Watson AI Lab. The research will be presented at the AAAI Conference on Artificial Intelligence.

Quantifying uncertainty

In uncertainty quantification, a machine-learning model generates a numerical score with each output to reflect its confidence in that prediction’s accuracy. Incorporating uncertainty quantification by building a new model from scratch or retraining an existing model typically requires a large amount of data and expensive computation, which is often impractical. What’s more, existing methods sometimes have the unintended consequence of degrading the quality of the model’s predictions.

The MIT and MIT-IBM Watson AI Lab researchers have thus zeroed in on the following problem: Given a pretrained model, how can they enable it to perform effective uncertainty quantification?

They solve this by creating a smaller and simpler model, known as a metamodel, that attaches to the larger, pretrained model and uses the features the larger model has already learned to help it assess uncertainty.

“The metamodel can be applied to any pretrained model. It is better to have access to the internals of the model, because we can get much more information about the base model, but it will also work if you just have a final output. It can still predict a confidence score,” Sattigeri says.

They design the metamodel to produce the uncertainty quantification output using a technique that captures both types of uncertainty: data uncertainty and model uncertainty. Data uncertainty is caused by corrupted data or inaccurate labels and can only be reduced by fixing the dataset or gathering new data. With model uncertainty, the model is not sure how to explain newly observed data and might make incorrect predictions, most likely because it hasn’t seen enough similar training examples. This is an especially challenging but common problem when models are deployed: in real-world settings, they often encounter data that differ from the training dataset.
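The general shape of this idea can be sketched in a few lines. The toy below is an illustration of the concept, not the authors' method: a frozen "base model" exposes internal features, and a tiny logistic-regression metamodel is trained on those features to predict whether the base prediction is correct, producing a per-example confidence score. All weights, names, and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen, pretrained base model: it maps inputs to
# internal features and class logits. (Hypothetical toy weights.)
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))

def base_model(x):
    features = np.tanh(x @ W1)   # the representation the base model learned
    logits = features @ W2
    return features, logits

# Synthetic labelled data standing in for a held-out set
X = rng.normal(size=(500, 4))
y = rng.integers(0, 3, size=500)

feats, logits = base_model(X)
correct = (logits.argmax(axis=1) == y).astype(float)  # metamodel targets

# Metamodel: a tiny logistic regression on the base model's features,
# trained by gradient descent to predict whether the base prediction
# is correct. Its sigmoid output serves as a confidence score.
w, b = np.zeros(8), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
    grad = p - correct
    w -= 0.1 * (feats.T @ grad) / len(X)
    b -= 0.1 * grad.mean()

confidence = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
print(confidence.shape)          # one score per prediction
```

Note that the base model's weights are never touched; only the small metamodel is trained, which is what keeps the data and compute requirements low.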

“Has the reliability of your decisions changed when you use the model in a new setting? You want some way to have confidence in whether it is working in this new regime or whether you need to collect training data for this particular new setting,” Wornell says.

Validating the quantification

Once a model produces an uncertainty quantification score, the user still needs some assurance that the score itself is accurate. Researchers often validate accuracy by creating a smaller dataset, held out from the original training data, and then testing the model on the held-out data. However, this technique does not work well for evaluating uncertainty quantification because the model can achieve good prediction accuracy while still being over-confident, Shen says.

They created a new validation technique by adding noise to the data in the validation set — this noisy data is more like out-of-distribution data that can cause model uncertainty. The researchers use this noisy dataset to evaluate uncertainty quantifications.
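A minimal sketch of this kind of check, under assumed toy models (nothing below is from the paper): perturb a held-out set with noise and verify that the confidence score actually drops on the noisier, more out-of-distribution inputs. The distance-based scorer here is a hypothetical stand-in for any uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical confidence scorer: confidence falls as inputs drift
# away from the region the model was "trained" on (here, the origin).
train_mean = np.zeros(4)

def confidence(x):
    dist = np.linalg.norm(x - train_mean, axis=1)
    return np.exp(-0.5 * dist)   # in (0, 1], shrinks off-distribution

# Clean held-out validation set, drawn near the training distribution
X_val = rng.normal(scale=0.5, size=(1000, 4))

# Perturbed copy: added noise pushes it toward out-of-distribution data
X_noisy = X_val + rng.normal(scale=2.0, size=X_val.shape)

clean_conf = confidence(X_val).mean()
noisy_conf = confidence(X_noisy).mean()

# A trustworthy uncertainty score should drop on the noisy set;
# one that stays flat is likely over-confident.
print(f"mean confidence: clean {clean_conf:.3f}, noisy {noisy_conf:.3f}")
```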

They tested their approach by seeing how well a metamodel could capture different types of uncertainty for various downstream tasks, including out-of-distribution detection and misclassification detection. Their method not only outperformed all the baselines in each downstream task but also required less training time to achieve those results.

This technique could help researchers enable more machine-learning models to effectively perform uncertainty quantification, ultimately aiding users in making better decisions about when to trust predictions.

Moving forward, the researchers want to adapt their technique for newer classes of models, such as large language models that have a different structure than a traditional neural network, Shen says.

The work was funded, in part, by the MIT-IBM Watson AI Lab and the U.S. National Science Foundation.

Paying it forward: computer science and molecular biology major Sherry Nyeo

Since arriving at MIT in fall 2019, senior Sherry Nyeo has conducted groundbreaking work in multiple labs on campus, acted as a mentor to countless other students, and made a lasting mark on the Institute community. But despite her well-earned bragging rights, Nyeo isn’t one to boast. Instead, she takes every opportunity to express just how grateful she is to the professors, alumni, and fellow students who have helped and inspired her during her time at MIT. “I like helping people if I can,” says Nyeo, who is majoring in computer science and molecular biology, “because I got helped so much.”

Nyeo’s passion for science began when she applied for the Selective Science Program at Tainan First Senior High School, widely considered one of the most prestigious high schools in Taiwan. “Preparing for that process made me realize that biology was pretty cool,” she recalls.

When Nyeo was 16, her family moved from Taiwan to Colorado, where she continued to cultivate her interest in STEM. Although she excelled at biology, she initially struggled to master computer science. “[Programming] was really hard for me,” she says. “It was a completely different way of thinking.” When she arrived at MIT, she decided to pursue a degree in computer science precisely because she knew she would find it challenging and because she appreciates how vital data analysis is to the field of biology. After all, she says, when you’re working at the scale of cells and molecules, “you need a lot of data to describe what’s going on.”

In the winter of her first year at MIT, Nyeo began doing hands-on research in laboratories on campus through the Undergraduate Research Opportunities Program (UROP). Her work in the lab of Whitehead Fellow Silvi Rouskin sparked an enduring interest in RNA, which she has come to regard as her “favorite biomolecule.”

Nyeo’s work in the Rouskin lab focused on alternative RNA structures and the roles they play in human and viral biology. While DNA mostly exists as a double helix, RNA can fold itself into a huge variety of structures in order to fulfill different functions. During her time as a student researcher, Nyeo has demonstrated a similar ability to adapt to different circumstances. When the MIT campus was evacuated at the start of the Covid-19 pandemic in March 2020 and her UROP became entirely remote, she treated her time away from the lab as an opportunity to explore the computational side of research. Her work was subsequently included in a Nature Communications paper on the SARS-CoV-2 genome, on which she is listed as a co-author.

Since returning to campus, Nyeo has often worked in multiple labs simultaneously, conducting innovative research while also juggling classes, internships, and several demanding extracurriculars. Through it all, she has continued to pursue her fascination with RNA, a tiny, somewhat unassuming molecule that nonetheless has a massive impact on practically every aspect of our biology. Nyeo, who has shown herself to be equally multifaceted, seems especially well-suited to the study of this remarkable biomolecule.

Although Nyeo’s work in the life sciences keeps her busy, she finds time to nurture a diverse set of other passions. She took a class on experimental ethics, is working on an original screenplay, and has even picked up a minor in German. Since her sophomore year, she has also been a part of the New Engineering Education Transformation (NEET) program, which provides students with multidisciplinary interests the opportunity to collaborate across departments. Through NEET, currently directed by professor of biological engineering Mark Bathe, Nyeo has been able to pursue her interest in bioengineering research and connect to a vast community of students and professors. Most recently, she has been working within the Bathe BioNano Lab to use DNA to engineer new materials at the nanometer scale.

Nyeo hopes to put her skills to use by pursuing a career in biotechnology. She is currently minoring in management and dreams of one day starting her own company. But she doesn’t want to leave academia behind just yet and has begun working on applications for PhD programs in biology. “I originally came in thinking that I would just go straight into the biotech industry,” Nyeo explains. “And then I realized that I don’t dislike research and that I actually enjoy it.”

As part of her current work in the lab of professor of biology David Bartel, Nyeo investigates how viral infection affects RNA metabolism, and she often finds herself using her computational skills to help postdocs with their data analysis. In fact, one of the things Nyeo has most enjoyed about working as a student researcher is the opportunity to join a network of people who provide one another with support and guidance.

Nyeo’s willingness to help others is perhaps the aspect of her personality that best suits her to the study of RNA. Over the past few decades, researchers have discovered an increasingly large number of therapeutic uses for RNA, including cancer immunotherapy and vaccine development. In the summer of 2022, Nyeo worked as an intern at Eli Lilly and Company, where she helped identify potential targets for RNA therapeutics. She may continue to explore this area of research when she eventually enters the biotech industry. In the meantime, however, she’s finding ways to help people closer to home.

Since her first year, Nyeo has been a part of the MIT Biotech Group. When she first joined, the group had a fairly small undergraduate presence, and most events were geared toward graduate students and postdocs. Nyeo immediately dedicated herself to making the group more welcoming for undergraduates. As the director of the Undergraduate Initiative and later the undergraduate student president, she was a leading architect of a new seminar series in which MIT alumni came to campus to teach undergraduates about biotechnology. “There are a lot of technical terms associated with [biotech],” Nyeo explains. “If you just come in as an undergrad, not knowing what’s happening, that can be a bit daunting.”

Between her research in the Bartel lab and her work with NEET and the MIT Biotech Group, Nyeo doesn’t have a lot of free time, but she dedicates most of it to making MIT a friendlier environment for new students. She promotes research opportunities as a UROP panelist and has worked as an associate advisor since her junior year. She helps first-year students choose and register for classes, works with faculty advisors, and provides moral support to students who are feeling overwhelmed with options. “When I came [to MIT], I also didn’t know what I wanted to do,” Nyeo explains. “Upperclassmen helped me a lot with that process, and I want to pay it forward.”

Regina Barzilay, other MIT community members elected to the National Academy of Engineering for 2023

Seven MIT researchers are among the 106 new members and 18 international members elected to the National Academy of Engineering (NAE) this week. Fourteen additional MIT alumni, including one member of the MIT Corporation, were also elected as new members.

One of the highest professional distinctions for engineers, membership to the NAE is given to individuals who have made outstanding contributions to “engineering research, practice, or education, including, where appropriate, significant contributions to the engineering literature” and to “the pioneering of new and developing fields of technology, making major advancements in traditional fields of engineering, or developing/implementing innovative approaches to engineering education.”

The seven MIT researchers elected this year include:

Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in the Department of Electrical Engineering and Computer Science, principal investigator at the Computer Science and Artificial Intelligence Laboratory, and faculty lead for the MIT Abdul Latif Jameel Clinic for Machine Learning in Health, for machine learning models that understand structures in text, molecules, and medical images.

Markus J. Buehler, the Jerry McAfee (1940) Professor in Engineering from the Department of Civil and Environmental Engineering, for implementing the use of nanomechanics to model and design fracture-resistant bioinspired materials.

Elfatih A.B. Eltahir SM ’93, ScD ’93, the H.M. King Bhumibol Professor in the Department of Civil and Environmental Engineering, for advancing understanding of how climate and land use impact water availability, environmental and human health, and vector-borne diseases.

Neil Gershenfeld, director of the Center for Bits and Atoms, for eliminating boundaries between digital and physical worlds, from quantum computing to digital materials to the internet of things.

Roger D. Kamm SM ’73, PhD ’77, the Cecil and Ida Green Distinguished Professor of Biological and Mechanical Engineering, for contributions to the understanding of mechanics in biology and medicine, and leadership in biomechanics.

David W. Miller ’82, SM ’85, ScD ’88, the Jerome C. Hunsaker Professor in the Department of Aeronautics and Astronautics, for contributions in control technology for space-based telescope design, and leadership in cross-agency guidance of space technology.

David Simchi-Levi, professor of civil and environmental engineering, core faculty member in the Institute for Data, Systems, and Society, and principal investigator at the Laboratory for Information and Decision Systems, for contributions using optimization and stochastic modeling to enhance supply chain management and operations.

Fariborz Maseeh ScD ’90, life member of the MIT Corporation and member of the School of Engineering Dean’s Advisory Council, was also elected as a member for leadership and advances in efficient design, development, and manufacturing of microelectromechanical systems, and for empowering engineering talent through public service.

Thirteen additional alumni were elected to the National Academy of Engineering this year. They are: Mark George Allen SM ’86, PhD ’89; Shorya Awtar ScD ’04; Inderjit Chopra ScD ’77; David Huang ’85, SM ’89, PhD ’93; Eva Lerner-Lam SM ’78; David F. Merrion SM ’59; Virginia Norwood ’47; Martin Gerard Plys ’80, SM ’81, ScD ’84; Mark Prausnitz PhD ’94; Anil Kumar Sachdev ScD ’77; Christopher Scholz PhD ’67; Melody Ann Swartz PhD ’98; and Elias Towe ’80, SM ’81, PhD ’87.

“I am delighted that seven members of MIT’s faculty and many members of the wider MIT community were elected to the National Academy of Engineering this year,” says Anantha Chandrakasan, the dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science. “My warmest congratulations on this recognition of their many contributions to engineering research and education.”

Including this year’s inductees, 156 members of the National Academy of Engineering are current or retired members of the MIT faculty and staff, or members of the MIT Corporation.

Connor Coley, Dylan Hadfield-Menell named AI2050 Early Career Fellows

Department of EECS Assistant Professors Connor Coley and Dylan Hadfield-Menell have been named to the inaugural cohort of AI2050 Early Career Fellows by Schmidt Futures, a philanthropic initiative from Eric and Wendy Schmidt aimed at helping to solve hard problems in AI.

Connor W. Coley is an Assistant Professor at MIT in the Department of Chemical Engineering and the Department of Electrical Engineering and Computer Science. He received his B.S. and Ph.D. in Chemical Engineering from Caltech and MIT, respectively, and did his postdoctoral training at the Broad Institute. His research group at MIT develops new methods at the intersection of data science, chemistry, and laboratory automation to streamline discovery in the chemical sciences with an emphasis on therapeutic discovery. Key research areas in the group include the design of new neural models for representation learning on molecules, data-driven synthesis planning, in silico strategies for predicting the outcomes of organic reactions, model-guided Bayesian optimization, and de novo molecular generation. Besides the AI2050 Early Career Fellowship, Connor is a recipient of C&EN’s “Talented Twelve” award, Forbes Magazine’s “30 Under 30” for Healthcare, the NSF CAREER award, and the Bayer Early Excellence in Science Award.

Dylan Hadfield-Menell is an Assistant Professor in the Department of Electrical Engineering and Computer Science. His research focuses on the problem of agent alignment: the challenge of identifying algorithmic solutions to alignment problems that arise from groups of AI systems, principal-agent pairs (i.e., human-robot teams), and societal oversight of ML systems. He aims to develop frameworks that account for uncertainty about the objective being optimized. Dylan earned his PhD from the University of California, Berkeley, and his undergraduate degree from MIT; besides the AI2050 Early Career Fellowship, he is a recipient of the Berkeley Fellowship, the NSF Graduate Research Fellowship, and the C.V. Ramamoorthy Distinguished Research Award.

Conceived and co-chaired by Eric Schmidt and James Manyika, AI2050 stems in part from issues raised in the bestselling book, The Age of AI: And Our Human Future, co-authored by Schmidt, Henry Kissinger, and Schwarzman College of Computing Dean Dan Huttenlocher. The program will build upon and amplify Schmidt Futures’ efforts totaling $400 million to support outstanding talent working with AI and other modern tools to solve hard problems in science and society.

Making computer science research more accessible in India

Imagine that you are teaching a technical subject to children in a small village. They are eager to learn, but you face a problem: There are few resources to educate them in their mother tongue.

This is a common experience in India, where the quality of textbooks written in many local languages pales in comparison to those written in English. To address educational inequality, the Indian government launched an initiative in 2020 that would improve the quality of these resources for hundreds of millions of people, but its implementation remains a massive undertaking.

Siddhartha Jayanti, an MIT PhD student in electrical engineering and computer science (EECS) who is an affiliate of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Google Research, encountered this problem first-hand when teaching students in India about math, science, and English. During the summer after his first year as an undergraduate at Princeton University, Jayanti visited the town of Bhimavaram, volunteering as an organizer, teacher, and mentor at a five-week education camp. He worked with economically disadvantaged children from villages across the region. They spoke Telugu, Jayanti’s mother tongue, but faced linguistic barriers because of the complex English used in academic work.

According to the World Economic Forum and U.S. Census data, Telugu is the United States’ fastest-growing language, while Ethnologue estimates over 95 million speakers worldwide, further emphasizing the need for more academic materials in the vernacular.

As a distributed computing and AI researcher with a shared cultural background, Jayanti was in a unique position to help. With millions of Telugu speakers in mind, in 2018 Jayanti wrote the first original computer science paper composed entirely in Telugu. The paper, which became publicly accessible on arXiv in 2022, focuses on designing simple, fast, scalable, and reliable multiprocessor algorithms and on analyzing fundamental communication and coordination tasks between processors.

Processors are the electronic circuits that execute computer programs, and coordinating many of them introduces many moving parts. “Think about processors as people completing a task,” says Jayanti. “If you have one processor, that is like one person doing a task. If you have 200 people instead, then ideally your team will solve problems faster, but this is not always the case. Coordinating multiple processors to achieve speedups requires clever algorithmic design, and there are sometimes fundamental communication barriers that limit how fast we can solve problems.”
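The coordination limit Jayanti describes can be made concrete with Amdahl's law, a classic result in parallel computing (not a result from his paper): if some fraction of the work is inherently serial, adding processors yields sharply diminishing returns.

```python
# Amdahl's law: if a fraction s of a task is inherently serial
# (coordination, communication), then p processors give at most
#     speedup(p) = 1 / (s + (1 - s) / p)
def speedup(p, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

# Even a 5 percent coordination cost caps what 200 "people" can do:
for p in (1, 10, 200):
    print(p, round(speedup(p, 0.05), 1))

# As p grows, the speedup approaches 1/s = 20x and can never exceed
# it, no matter how many processors join the task.
```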

To solve computing problems, each processor in a multicore system follows a strict procedure, known as a multiprocessor algorithm. Still, there are certain limits on how quickly processors can interact with each other to compute solutions. Jayanti’s paper highlighted a key communication bottleneck for these algorithms, known as generalized wake-up (GWU), where a processor “wakes up” when it has executed its first line of code.

But the question remains: Can each processor figure out that the others have woken up? Jayanti indicates that the answer is yes, but due to the work each solution requires, there are certain mathematical limits to how quickly GWU can be resolved.

The issue is part of a larger trend: the multicore revolution, in which many chip manufacturers no longer prioritize raw processing speed. Instead, chips are now commonly designed with multiple cores, or smaller processors within larger CPUs. Multicore chips are commonplace in many phones and laptops.

“Modern technology requires simple, fast, and reliable multiprocessor algorithms,” says Jayanti. “Huge speedups and better coordination are the goal, but even using multiprocessor algorithms, we can prove that communication problems can only be solved so quickly.”

Overcoming significant linguistic barriers to communicating state-of-the-art research in Telugu, Jayanti invented new technical vocabulary for the paper using Sanskrit, the classical language of India, which heavily influences Telugu. For example, there was no word for technical terms like “shared-memory multiprocessor” in Telugu. Jayanti changed that, coining the word saṁvibhakta-smr̥ti bahusaṁsādhakamu (సంవిభక్తస్మృతి బహుసంసాధకము).

While the term may seem daunting and complex at first, Jayanti’s process was simple: Use Sanskrit root words to coin new words in Telugu. For instance, the Sanskrit root “vibhaj” means “to partition” while “smr̥” means “to remember, recollect, or memorize.” After modifying these words with prefixes and suffixes, the results are “saṁvibhakta” (“shared”) and “smr̥ti” (“memory”), or “saṁvibhakta-smr̥ti” (“shared-memory”) in Telugu.

Passionate about creating educational opportunities in India, Jayanti has visited schools in several states, including Telangana, Andhra Pradesh, and Karnataka. He travels to India yearly, occasionally making stops at universities like the International Centre for Theoretical Sciences and those within the Indian Institutes of Technology.

By creating new technical vocabulary, Jayanti sees his work as an opportunity to empower more people to pursue their dreams in science. His Telugu paper opens the doors for millions of native speakers to access STEM research.

“Knowledge is universal, brings joy, opens doors to new opportunities, and has the power to enlighten and bring people of diverse backgrounds closer together in pursuit of a better world,” says Jayanti. “My scientific learnings and discoveries have brought me in contact with great minds around the world, and I hope that some of my work can open up a gateway for more people worldwide.”

As part of his PhD thesis, Jayanti proposed the Samskrtam Technical Lexicon Project, which would bridge further education gaps by developing a dictionary of modern technical terms in STEM for speakers of local Indian languages and academics. “The project aims to forge a close collaboration between scholars of STEM, Sanskrit, and other vernaculars to expand science-availability in language communities that span over a billion people,” according to Jayanti.

Jayanti’s research also fueled further studies of multicore processing speeds. In 2019, he teamed up with Robert Tarjan, a professor of computer science at Princeton and Turing Award winner, as well as Enric Boix-Adserà, an MIT PhD student in EECS, to demonstrate lower-bound speed limits for data structures like union-find, in which algorithms take the “union” of disjoint sets while “finding” whether two items are currently in the same set.

The team leveraged Jayanti’s research on GWU to prove certain limits on how fast algorithms can be, even harnessing the power of multiple cores. Jayanti and Tarjan have designed some of the fastest algorithms for the concurrent union-find problem yet, making analysis of large graphs like the internet and road networks much more efficient. In fact, these algorithms are close to the mathematical speed barrier for solving union-find.
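For readers unfamiliar with the data structure, here is a textbook sequential union-find with path compression and union by rank; this is the classic single-threaded version, not the concurrent algorithms from Jayanti and Tarjan's work.

```python
class UnionFind:
    """Textbook union-find with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own set
        self.rank = [0] * n           # upper bound on tree height

    def find(self, x):
        # Locate the root, then compress: point every node on the
        # walked path directly at the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False              # already in the same set
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # attach shorter tree under taller
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

# Example: connected components of a small graph
uf = UnionFind(6)
for a, b in [(0, 1), (1, 2), (4, 5)]:
    uf.union(a, b)
print(uf.find(0) == uf.find(2))   # True: 0 and 2 are connected
print(uf.find(0) == uf.find(4))   # False: separate components
```

Both heuristics together make each operation nearly constant time on average, which is why union-find scales to graphs as large as the internet or road networks.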

Jayanti’s 2018 research paper in Telugu was presented along with an abstract in Sanskrit as one of the 14 chapters of his thesis last year, and his team’s 2019 paper was presented at the Symposium on Principles of Distributed Computing. His graduate studies were supported by the U.S. Department of Defense through the National Defense Science and Engineering Graduate Fellowship.

Computers that power self-driving cars could be a huge driver of global carbon emissions

In the future, the energy needed to run the powerful computers on board a global fleet of autonomous vehicles could generate as many greenhouse gas emissions as all the data centers in the world today.

That is one key finding of a new study from MIT researchers that explored the potential energy consumption and related carbon emissions if autonomous vehicles are widely adopted.

The data centers that house the physical computing infrastructure used for running applications are widely known for their large carbon footprint: They currently account for about 0.3 percent of global greenhouse gas emissions, or about as much carbon as the country of Argentina produces annually, according to the International Energy Agency. Realizing that less attention has been paid to the potential footprint of autonomous vehicles, the MIT researchers built a statistical model to study the problem. They determined that 1 billion autonomous vehicles, each driving for one hour per day with a computer consuming 840 watts, would consume enough energy to generate about the same amount of emissions as data centers currently do.

The researchers also found that in over 90 percent of modeled scenarios, to keep autonomous vehicle emissions from zooming past current data center emissions, each vehicle must use less than 1.2 kilowatts of power for computing, which would require more efficient hardware. In one scenario — where 95 percent of the global fleet of vehicles is autonomous in 2050, computational workloads double every three years, and the world continues to decarbonize at the current rate — they found that hardware efficiency would need to double faster than every 1.1 years to keep emissions under those levels.
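That scenario can be sanity-checked with a back-of-the-envelope sketch (the function and horizon below are illustrative, not the study's model): per-vehicle computing power is proportional to workload divided by hardware efficiency, and each of those grows exponentially at its own doubling rate.

```python
# Per-vehicle computing power over time: the workload grows while
# hardware efficiency (operations per watt) improves, each doubling
# on its own schedule. Starting power of 840 W matches the article;
# the 27-year horizon is an illustrative assumption.
def power_in_year(t, base_power_w, workload_double_yrs, efficiency_double_yrs):
    workload = 2 ** (t / workload_double_yrs)
    efficiency = 2 ** (t / efficiency_double_yrs)
    return base_power_w * workload / efficiency

# Workload doubling every 3 years, efficiency every 1.1 years:
# per-vehicle power shrinks over the horizon.
print(f"{power_in_year(27, 840, 3.0, 1.1):.2f} W")

# If efficiency doubled only every 3.5 years instead, power balloons:
print(f"{power_in_year(27, 840, 3.0, 3.5):.0f} W")
```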

“If we just keep the business-as-usual trends in decarbonization and the current rate of hardware efficiency improvements, it doesn’t seem like it is going to be enough to constrain the emissions from computing onboard autonomous vehicles. This has the potential to become an enormous problem. But if we get ahead of it, we could design more efficient autonomous vehicles that have a smaller carbon footprint from the start,” says first author Soumya Sudhakar, a graduate student in aeronautics and astronautics.

Sudhakar wrote the paper with her co-advisors Vivienne Sze, associate professor in the Department of Electrical Engineering and Computer Science (EECS) and a member of the Research Laboratory of Electronics (RLE); and Sertac Karaman, associate professor of aeronautics and astronautics and director of the Laboratory for Information and Decision Systems (LIDS). The research appears today in the January-February issue of IEEE Micro and was presented in a TEDx talk.

Modeling emissions

The researchers built a framework to explore the operational emissions from computers on board a global fleet of electric vehicles that are fully autonomous, meaning they don’t require a backup human driver.

The model is a function of the number of vehicles in the global fleet, the power of each computer on each vehicle, the hours driven by each vehicle, and the carbon intensity of the electricity powering each computer.

“On its own, that looks like a deceptively simple equation. But each of those variables contains a lot of uncertainty because we are considering an emerging application that is not here yet,” Sudhakar says.
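The equation Sudhakar describes can be written out directly. The carbon-intensity figure below is an illustrative assumption (roughly a recent global grid average), not a number from the study; the other values come from the article's headline scenario.

```python
# The four factors in the researchers' equation.
vehicles = 1_000_000_000       # global autonomous fleet
hours_per_day = 1.0            # driving time per vehicle
computer_power_kw = 0.840      # 840-watt on-board computer
g_co2_per_kwh = 475            # assumed grid carbon intensity

energy_kwh_per_year = vehicles * hours_per_day * computer_power_kw * 365
emissions_mt = energy_kwh_per_year * g_co2_per_kwh / 1e12  # megatonnes CO2

print(f"{energy_kwh_per_year:.2e} kWh/yr -> {emissions_mt:.0f} Mt CO2/yr")
```

Under these assumptions the fleet's on-board computing lands in the range of about 150 megatonnes of CO2 per year, which is the same order of magnitude as current data-center emissions; the study's uncertainty comes from the wide plausible ranges of each factor.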

For instance, some research suggests that the amount of time driven in autonomous vehicles might increase because people can multitask while driving and the young and the elderly could drive more. But other research suggests that time spent driving might decrease because algorithms could find optimal routes that get people to their destinations faster.

In addition to considering these uncertainties, the researchers also needed to model advanced computing hardware and software that doesn’t exist yet.

To accomplish that, they modeled the workload of a popular algorithm for autonomous vehicles, known as a multitask deep neural network because it can perform many tasks at once. They explored how much energy this deep neural network would consume if it were simultaneously processing many high-resolution inputs from many cameras at high frame rates.

When they used the probabilistic model to explore different scenarios, Sudhakar was surprised by how quickly the algorithms’ workload added up.

For example, if an autonomous vehicle has 10 deep neural networks processing images from 10 cameras, and that vehicle drives for one hour a day, it will make 21.6 million inferences each day. One billion vehicles would make 21.6 quadrillion inferences. To put that into perspective, all of Facebook’s data centers worldwide make a few trillion inferences each day (1 quadrillion is 1,000 trillion).
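The per-vehicle figure follows from simple arithmetic. The article does not state a frame rate, so the 60 frames per second used below is an assumption chosen to recover the quoted totals:

```python
# Reproducing the inference counts above. The frame rate is not stated in
# the article; 60 frames per second per camera is an assumption that
# recovers the quoted figures.
networks = 10           # deep neural networks per vehicle
cameras = 10            # cameras feeding each network
fps = 60                # assumed frames per second per camera
seconds_driven = 3600   # one hour of driving per day

per_vehicle_daily = networks * cameras * fps * seconds_driven
print(per_vehicle_daily)   # 21,600,000 inferences per vehicle per day

fleet_daily = per_vehicle_daily * 1_000_000_000
print(fleet_daily)         # 2.16e16, i.e., 21.6 quadrillion inferences
```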

“After seeing the results, this makes a lot of sense, but it is not something that is on a lot of people’s radar. These vehicles could actually be using a ton of computer power. They have a 360-degree view of the world, so while we have two eyes, they may have 20 eyes, looking all over the place and trying to understand all the things that are happening at the same time,” Karaman says.

Autonomous vehicles would be used for moving goods, as well as people, so there could be a massive amount of computing power distributed along global supply chains, he says. And their model only considers computing — it doesn’t take into account the energy consumed by vehicle sensors or the emissions generated during manufacturing.

Keeping emissions in check

To keep emissions from spiraling out of control, the researchers found that each autonomous vehicle needs to consume less than 1.2 kilowatts of power for computing. For that to be possible, computing hardware must become more efficient at a significantly faster pace, doubling in efficiency about every 1.1 years.
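Doubling every 1.1 years is exponential growth, so the required gains compound quickly. A minimal sketch (the 1.1-year pace is from the article; the time horizons are illustrative choices):

```python
# Cumulative hardware-efficiency improvement implied by a doubling every
# 1.1 years. The 1.1-year figure comes from the article; the horizons
# below are illustrative.
doubling_period_years = 1.1

def efficiency_gain(years: float) -> float:
    """Factor by which efficiency grows after `years` at the stated pace."""
    return 2 ** (years / doubling_period_years)

print(round(efficiency_gain(1.1)))  # 2: one doubling after 1.1 years
print(round(efficiency_gain(11)))   # 1024: ten doublings after 11 years
```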

One way to boost that efficiency could be to use more specialized hardware, which is designed to run specific driving algorithms. Because researchers know the navigation and perception tasks required for autonomous driving, it could be easier to design specialized hardware for those tasks, Sudhakar says. But vehicles tend to have 10- or 20-year lifespans, so one challenge in developing specialized hardware would be to “future-proof” it so it can run new algorithms.

In the future, researchers could also make the algorithms more efficient, so they would need less computing power. However, this is also challenging because trading off some accuracy for more efficiency could hamper vehicle safety.

Now that they have demonstrated this framework, the researchers want to continue exploring hardware efficiency and algorithm improvements. In addition, they say their model can be enhanced by characterizing embodied carbon from autonomous vehicles — the carbon emissions generated when a car is manufactured — and emissions from a vehicle’s sensors.

While there are still many scenarios to explore, the researchers hope that this work sheds light on a potential problem people may not have considered.

“We are hoping that people will think of emissions and carbon efficiency as important metrics to consider in their designs. The energy consumption of an autonomous vehicle is really critical, not just for extending the battery life, but also for sustainability,” says Sze.

This research was funded, in part, by the National Science Foundation and the MIT-Accenture Fellowship.

Sensing with purpose

Fadel Adib never expected that science would get him into the White House, but in August 2015 the MIT graduate student found himself demonstrating his research to the president of the United States.

Adib, fellow grad student Zachary Kabelac, and their advisor, Dina Katabi, showcased a wireless device that uses Wi-Fi signals to track an individual’s movements.

As President Barack Obama looked on, Adib walked back and forth across the floor of the Oval Office, collapsed onto the carpet to demonstrate the device’s ability to monitor falls, and then sat still so Katabi could explain to the president how the device was measuring his breathing and heart rate.

“Zach started laughing because he could see that my heart rate was 110 as I was demoing the device to the president. I was stressed about it, but it was so exciting. I had poured a lot of blood, sweat, and tears into that project,” Adib recalls.

For Adib, the White House demo was an unexpected — and unforgettable — culmination of a research project he had launched four years earlier when he began his graduate training at MIT. Now, as a newly tenured associate professor in the Department of Electrical Engineering and Computer Science and the Media Lab, he keeps building off that work. Adib, the Doherty Chair of Ocean Utilization, seeks to develop wireless technology that can sense the physical world in ways that were not possible before.

In his Signal Kinetics group, Adib and his students apply knowledge and creativity to global problems like climate change and access to health care. They are using wireless devices for contactless physiological sensing, such as measuring someone’s stress level using Wi-Fi signals. The team is also developing battery-free underwater cameras that could explore uncharted regions of the oceans, tracking pollution and the effects of climate change. And they are combining computer vision and radio frequency identification (RFID) technology to build robots that find hidden items, to streamline factory and warehouse operations and, ultimately, alleviate supply chain bottlenecks.

While these areas may seem quite different, each time they launch a new project, the researchers uncover common threads that tie the disciplines together, Adib says.

“When we operate in a new field, we get to learn. Every time you are at a new boundary, in a sense you are also like a kid, trying to understand these different languages, bring them together, and invent something,” he says.

A science-minded child

A love of learning has driven Adib since he was a young child growing up in Tripoli on the coast of Lebanon. He had been interested in math and science for as long as he could remember, and had boundless energy and insatiable curiosity as a child.

“When my mother wanted me to slow down, she would give me a puzzle to solve,” he recalls.

By the time Adib started college at the American University of Beirut, he knew he wanted to study computer engineering and had his sights set on MIT for graduate school.

Seeking to kick-start his future studies, Adib reached out to several MIT faculty members to ask about summer internships. He received a response from the first person he contacted. Katabi, the Thuan and Nicole Pham Professor in the Department of Electrical Engineering and Computer Science (EECS), and a principal investigator in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT Jameel Clinic, interviewed him and accepted him for a position. He immersed himself in the lab work and, as the end of summer approached, Katabi encouraged him to apply for grad school at MIT and join her lab.

“To me, that was a shock because I felt this imposter syndrome. I thought I was moving like a turtle with my research, but I did not realize that with research itself, because you are at the boundary of human knowledge, you are expected to progress iteratively and slowly,” he says.

As an MIT grad student, he began contributing to a number of projects. But his passion for invention pushed him to embark into unexplored territory. Adib had an idea: Could he use Wi-Fi to see through walls?

“It was a crazy idea at the time, but my advisor let me work on it, even though it was not something the group had been working on at all before. We both thought it was an exciting idea,” he says.

As Wi-Fi signals travel in space, a small part of the signal passes through walls — the same way light passes through windows — and is then reflected by whatever is on the other side. Adib wanted to use these signals to “see” what people on the other side of a wall were doing.

Discovering new applications

There were a lot of ups and downs (“I’d say many more downs than ups at the beginning”), but Adib made progress. First, he and his teammates were able to detect people on the other side of a wall, then they could determine their exact location. Almost by accident, he discovered that the device could be used to monitor someone’s breathing.

“I remember we were nearing a deadline and my friend Zach and I were working on the device, using it to track people on the other side of the wall. I asked him to hold still, and then I started to see him appearing and disappearing over and over again. I thought, could this be his breathing?” Adib says.

Eventually, they enabled their Wi-Fi device to monitor heart rate and other vital signs. The technology was spun out into a startup, which presented Adib with a conundrum once he finished his PhD — whether to join the startup or pursue a career in academia.

He decided to become a professor because he wanted to dig deeper into the realm of invention. But after living through the winter of 2014-2015, when nearly 109 inches of snow fell on Boston (a record), Adib was ready for a change of scenery and a warmer climate. He applied to universities all over the United States, and while he had some tempting offers, Adib ultimately realized he didn’t want to leave MIT. He joined the MIT faculty as an assistant professor in 2016 and was named associate professor in 2020.

“When I first came here as an intern, even though I was thousands of miles from Lebanon, I felt at home. And the reason for that was the people. This geekiness — this embrace of intellect — that is something I find to be beautiful about MIT,” he says.

He’s thrilled to work with brilliant people who are also passionate about problem-solving. The members of his research group are diverse, and they each bring unique perspectives to the table, which Adib says is vital to encourage the intellectual back-and-forth that drives their work.

Diving into a new project

For Adib, research is exploration. Take his work on oceans, for instance. He wanted to make an impact on climate change, and after exploring the problem, he and his students decided to build a battery-free underwater camera.

Adib learned that the ocean, which covers 70 percent of the planet, plays the single largest role in the Earth’s climate system. Yet more than 95 percent of it remains unexplored. That seemed like a problem the Signal Kinetics group could help solve, he says.

But diving into this research area was no easy task. Adib studies Wi-Fi systems, but Wi-Fi does not work underwater. And it is difficult to recharge a battery once it is deployed in the ocean, making it hard to build an autonomous underwater robot that can do large-scale sensing.

So, the team borrowed from other disciplines, building an underwater camera that uses acoustics to power its equipment and capture and transmit images.

“We had to use piezoelectric materials, which come from materials science, to develop transducers, which come from oceanography, and then on top of that we had to marry these things with technology from RF known as backscatter,” he says. “The biggest challenge becomes getting these things to gel together. How do you decode these languages across fields?”

It’s a challenge that continues to motivate Adib as he and his students tackle problems that are too big for one discipline.

He’s excited by the possibility of using his undersea wireless imaging technology to explore distant planets. These same tools could also enhance aquaculture, which could help eradicate food insecurity, or support other emerging industries.

To Adib, the possibilities seem endless.

“With each project, we discover something new, and that opens up a whole new world to explore. The biggest driver of our work in the future will be what we think is impossible, but that we could make possible,” he says.

Six With Ties to MIT Honored as ACM Fellows

On January 18th, the Association for Computing Machinery (ACM) announced its 2022 Fellows, those it recognizes “for significant contributions in areas including cybersecurity, human-computer interaction, mobile computing, and recommender systems among many other areas.” Included in the crop of new Fellows were six distinguished scientists with ties to MIT. 

Faculty

Constantinos Daskalakis, the Armen Avanessians (1982) Professor in the Department of Electrical Engineering and Computer Science (EECS) at MIT, was honored “for contributions to the foundations of algorithmic game theory, mechanism design, sublinear algorithms, and theoretical machine learning.” Daskalakis is a theoretical computer scientist who works at the interface of game theory, economics, probability theory, statistics, and machine learning. His current work focuses on multi-agent learning, learning from biased and dependent data, causal inference, and econometrics.

A native of Greece, Daskalakis joined the MIT faculty in 2009. He is a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and is affiliated with the Laboratory for Information and Decision Systems (LIDS) and the Operations Research Center (ORC). He is also an investigator in the Foundations of Data Science Institute. He has previously received such honors as the 2018 Nevanlinna Prize from the International Mathematical Union, the 2018 ACM Grace Murray Hopper Award, the Kalai Game Theory and Computer Science Prize from the Game Theory Society, and the 2008 ACM Doctoral Dissertation Award.

Hiroshi Ishii, the Jerome B. Wiesner Professor of Media Arts and Sciences and an Associate Director of the MIT Media Lab, was honored for “contributions to tangible user interfaces and to human-computer interaction.” Ishii joined the MIT Media Lab in 1995 and established the Tangible Media research group with the goal of making digital tangible by giving physical form to digital information and computation. He is recognized as a founder of “Tangible User Interfaces (TUI).” 

Ishii and his research team have presented their visions of “Tangible Bits” and “Radical Atoms” at a wide variety of academic, design, and artistic venues including ACM SIGGRAPH, Ars Electronica, ICC, Centre Pompidou, Cooper Hewitt Design Museum, and Milan Design Week. The exhibits have served to show that the design of engaging and inspiring tangible interactions requires the rigor of both scientific and artistic review, encapsulated by Ishii’s motto, “Be Artistic and Analytic. Be Poetic and Pragmatic.” Ishii was elected to the CHI Academy in 2006, and in 2019 received the SIGCHI Lifetime Research Award for his fundamental and influential research contributions to the field of human-computer interaction.

Alumni 

Kevin Fu ’98, MNG ’99, PhD ’05 (EECS), Professor of Electrical and Computer Engineering and Professor in the Khoury College of Computer Sciences at Northeastern University, was honored “for contributions to computer security, and especially to the secure engineering of medical devices.” Fu’s research interests include security as it relates to emerging sensor technology in biomedical engineering and cyberphysical systems; his work has important implications for medical devices, autonomous transportation, health care delivery, manufacturing, and the Internet of Things.

Prior to joining Northeastern in January 2023, Fu was an Associate Professor at the University of Michigan and at UMass Amherst; additionally, beginning in 2021, he was Acting Director of Medical Device Cybersecurity within the FDA Center for Devices and Radiological Health (CDRH) and Program Director for Cybersecurity within the FDA Digital Health Center of Excellence (DHCoE). His honors include a Sloan Research Fellowship, recognition as an MIT Technology Review TR35 Innovator of the Year, election as an IEEE Fellow, a Fed100 Award, and an NSF CAREER Award. He has received best paper awards from USENIX Security, IEEE S&P, and ACM SIGCOMM, and his work on pacemaker security received an inaugural Test of Time Award from IEEE Security and Privacy.

Jimmy Lin ’00, MNG ’01, PhD ’04 (EECS), Professor and David R. Cheriton Chair in the School of Computer Science at the University of Waterloo, was honored “for contributions to question answering, information retrieval, and natural language processing.”

Lin’s research centers on the challenge of connecting users with relevant information at scale. Over the years, he has worked on systems designed for diverse users, ranging from casual searchers on the web to intelligence analysts, medical doctors, historians, and data scientists. Prior to joining the University of Waterloo, Lin was at the University of Maryland; additionally, he has spent time at Twitter, Cloudera, Microsoft, and the National Library of Medicine (NLM). He is currently the Chief Technology Officer of Primal, a Waterloo-based knowledge graph and deep learning company; previously, he was the Chief Scientist of RSVP.ai, a Waterloo-based startup.

Rafael Pass PhD ’06 (EECS), Professor of Computer Science at Tel Aviv University and director of the Check Point Institute for Information Security, as well as a Professor at Cornell Tech/Cornell University, was honored “for contributions to the foundations of cryptography.” Pass’s work focuses on cryptography and its interplay with computational complexity and game theory, as well as the theoretical foundations of blockchains and connections between cryptography and Kolmogorov complexity.

His honors include winning the 9th NSA Best Scientific Cybersecurity Paper Competition (2022); the Best Paper Award at the 41st Annual International Cryptology Conference (CRYPTO 2021); a Wallenberg Academy Fellowship (awarded by the Royal Academy of Science in Sweden); an Alfred P. Sloan Fellowship; an AFOSR Young Investigator Award; a Microsoft Research Faculty Fellowship; and an NSF CAREER Award, among others. Before earning his PhD from MIT, he earned his bachelor’s in Engineering Physics and a master’s in Computer Science, both from the Royal Institute of Technology (KTH) in Sweden.

Jaime Teevan SM ’01, PhD ’07 (EECS), Chief Scientist and a Technical Fellow at Microsoft, was honored “for contributions to human-computer interaction, information retrieval, and productivity.” Teevan is responsible for driving research-backed innovation related to everything from AI to hybrid work in Microsoft’s core products. Previously, she was Technical Advisor to CEO Satya Nadella, and led the Productivity team at Microsoft Research. 

This year, in addition to becoming an ACM Fellow, Teevan was inducted into the ACM SIGIR and CHI Academies. She is also an Affiliate Professor at the University of Washington; before earning her master’s and PhD from MIT, she earned her BS from Yale.

Mohammad Alizadeh named new Industry Officer for EECS

Mohammad Alizadeh has been named the Industry Officer for MIT’s Department of Electrical Engineering and Computer Science. In this role, Alizadeh will oversee the EECS Alliance, the department’s industry outreach program, which gives EECS students access to internships, postgraduate employment, networking, and collaborations. He succeeds Tomás Palacios and Aude Oliva in the role.

Alizadeh, a member of CSAIL, conducts research in computer networks and systems. His current work focuses on machine learning for systems, network protocols, and resource management in a variety of settings, including datacenters and cloud computing, edge computing, Internet video delivery, and large-scale decentralized systems.

Before joining MIT in 2015, Alizadeh spent time at Microsoft Research, Insieme Networks, and Cisco Systems. He earned his MS and PhD in Electrical Engineering from Stanford University, and his BS from the Sharif University of Technology in Iran. Among his many honors, Alizadeh has received the Microsoft Research Faculty Fellowship, VMware Systems Research Award, SIGCOMM Rising Star Award, NSF CAREER Award, Alfred P. Sloan Research Fellowship, SIGCOMM Test of Time Award, and multiple best paper awards.

Subtle biases in AI can influence emergency decisions

It’s no secret that people harbor biases — some unconscious, perhaps, and others painfully overt. The average person might suppose that computers — machines typically made of plastic, steel, glass, silicon, and various metals — are free of prejudice. While that assumption may hold for computer hardware, the same is not always true for computer software, which is programmed by fallible humans and can be fed data that is, itself, compromised in certain respects.

Artificial intelligence (AI) systems — those based on machine learning, in particular — are seeing increased use in medicine for diagnosing specific diseases, for example, or evaluating X-rays. These systems are also being relied on to support decision-making in other areas of health care. Recent research has shown, however, that machine learning models can encode biases against minority subgroups, and the recommendations they make may consequently reflect those same biases.

A new study by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the MIT Jameel Clinic, which was published last month in Communications Medicine, assesses the impact that discriminatory AI models can have, especially for systems that are intended to provide advice in urgent situations. “We found that the manner in which the advice is framed can have significant repercussions,” explains the paper’s lead author, Hammaad Adam, a PhD student at MIT’s Institute for Data Systems and Society. “Fortunately, the harm caused by biased models can be limited (though not necessarily eliminated) when the advice is presented in a different way.” The other co-authors of the paper are Aparna Balagopalan and Emily Alsentzer, both PhD students, and the professors Fotini Christia and Marzyeh Ghassemi.

AI models used in medicine can suffer from inaccuracies and inconsistencies, in part because the data used to train the models are often not representative of real-world settings. Different kinds of X-ray machines, for instance, can record things differently and hence yield different results. Models trained predominantly on white people, moreover, may not be as accurate when applied to other groups. The Communications Medicine paper does not focus on issues of that sort; instead, it addresses problems that stem from biased models and ways to mitigate their adverse consequences.

A group of 954 people (438 clinicians and 516 nonexperts) took part in an experiment to see how AI biases can affect decision-making. The participants were presented with call summaries from a fictitious crisis hotline, each involving a male individual undergoing a mental health emergency. The summaries contained information as to whether the individual was Caucasian or African American and would also mention his religion if he happened to be Muslim. A typical call summary might describe a circumstance in which an African American man was found at home in a delirious state, indicating that “he has not consumed any drugs or alcohol, as he is a practicing Muslim.” Study participants were instructed to call the police if they thought the patient was likely to turn violent; otherwise, they were encouraged to seek medical help.

The participants were randomly divided into a control or “baseline” group plus four other groups designed to test responses under slightly different conditions. “We want to understand how biased models can influence decisions, but we first need to understand how human biases can affect the decision-making process,” Adam notes. What they found in their analysis of the baseline group was rather surprising: “In the setting we considered, human participants did not exhibit any biases. That doesn’t mean that humans are not biased, but the way we conveyed information about a person’s race and religion, evidently, was not strong enough to elicit their biases.”

The other four groups in the experiment were given advice that came from either a biased or an unbiased model, and that advice was presented in either a “prescriptive” or a “descriptive” form. A biased model would be more likely to recommend police help in a situation involving an African American or Muslim person than would an unbiased model. Participants in the study, however, did not know which kind of model their advice came from, or even that the models delivering the advice could be biased at all. Prescriptive advice spells out what a participant should do in unambiguous terms, telling them they should call the police in one instance or seek medical help in another. Descriptive advice is less direct: A flag is displayed to show that the AI system perceives a risk of violence associated with a particular call; no flag is shown if the threat of violence is deemed small.

A key takeaway of the experiment is that participants “were highly influenced by prescriptive recommendations from a biased AI system,” the authors wrote. But they also found that “using descriptive rather than prescriptive recommendations allowed participants to retain their original, unbiased decision-making.” In other words, the bias incorporated within an AI model can be diminished by appropriately framing the advice that’s rendered. Why the different outcomes, depending on how advice is posed? When someone is told to do something, like call the police, that leaves little room for doubt, Adam explains. However, when the situation is merely described — classified with or without the presence of a flag — “that leaves room for a participant’s own interpretation; it allows them to be more flexible and consider the situation for themselves.”

Second, the researchers found that the language models that are typically used to offer advice are easy to bias. Language models represent a class of machine learning systems that are trained on text, such as the entire contents of Wikipedia and other web material. When these models are “fine-tuned” by relying on a much smaller subset of data for training purposes — just 2,000 sentences, as opposed to 8 million web pages — the resultant models can be readily biased.  

Third, the MIT team discovered that decision-makers who are themselves unbiased can still be misled by the recommendations provided by biased models. Medical training (or the lack thereof) did not change responses in a discernible way. “Clinicians were influenced by biased models as much as non-experts were,” the authors stated.

“These findings could be applicable to other settings,” Adam says, and are not necessarily restricted to health care situations. When it comes to deciding which people should receive a job interview, a biased model could be more likely to turn down Black applicants. The results could be different, however, if instead of explicitly (and prescriptively) telling an employer to “reject this applicant,” a descriptive flag is attached to the file to indicate the applicant’s “possible lack of experience.”

The implications of this work are broader than just figuring out how to deal with individuals in the midst of mental health crises, Adam maintains.  “Our ultimate goal is to make sure that machine learning models are used in a fair, safe, and robust way.”