Self-powered sensor automatically harvests magnetic energy

MIT researchers have developed a battery-free, self-powered sensor that can harvest energy from its environment.

Because it requires no battery that must be recharged or replaced, and because it requires no special wiring, such a sensor could be embedded in a hard-to-reach place, like inside the inner workings of a ship’s engine. There, it could automatically gather data on the machine’s power consumption and operations for long periods of time.

The researchers built a temperature-sensing device that harvests energy from the magnetic field generated in the open air around a wire. One could simply clip the sensor around a wire that carries electricity — perhaps the wire that powers a motor — and it will automatically harvest and store energy which it uses to monitor the motor’s temperature.

“This is ambient power — energy that I don’t have to make a specific, soldered connection to get. And that makes this sensor very easy to install,” says Steve Leeb, the Emanuel E. Landsman Professor of Electrical Engineering and Computer Science (EECS) and professor of mechanical engineering, a member of the Research Laboratory of Electronics, and senior author of a paper on the energy-harvesting sensor.

In the paper, which appeared as the featured article in the January issue of the IEEE Sensors Journal, the researchers offer a design guide for an energy-harvesting sensor that lets an engineer balance the available energy in the environment with their sensing needs.

The paper lays out a roadmap for the key components of a device that can sense and control the flow of energy continually during operation.

The versatile design framework is not limited to sensors that harvest magnetic field energy, and can be applied to those that use other power sources, like vibrations or sunlight. It could be used to build networks of sensors for factories, warehouses, and commercial spaces that cost less to install and maintain.

“We have provided an example of a battery-less sensor that does something useful, and shown that it is a practically realizable solution. Now others will hopefully use our framework to get the ball rolling to design their own sensors,” says lead author Daniel Monagle, an EECS graduate student.

Monagle and Leeb are joined on the paper by EECS graduate student Eric Ponce.

John Donnal, an associate professor of weapons and controls engineering at the U.S. Naval Academy who was not involved with this work, studies techniques to monitor ship systems. Getting access to power on a ship can be difficult, he says, since there are very few outlets and strict restrictions as to what equipment can be plugged in.

“Persistently measuring the vibration of a pump, for example, could give the crew real-time information on the health of the bearings and mounts, but powering a retrofit sensor often requires so much additional infrastructure that the investment is not worthwhile,” Donnal adds. “Energy-harvesting systems like this could make it possible to retrofit a wide variety of diagnostic sensors on ships and significantly reduce the overall cost of maintenance.”

A how-to guide

The researchers had to meet three key challenges to develop an effective, battery-free, energy-harvesting sensor.

First, the system must be able to cold start, meaning it can fire up its electronics with no initial voltage. They accomplished this with a network of integrated circuits and transistors that allow the system to store energy until it reaches a certain threshold. The system will only turn on once it has stored enough power to fully operate.

Second, the system must store and convert the energy it harvests efficiently, and without a battery. While the researchers could have included a battery, that would add extra complexities to the system and could pose a fire risk.

“You might not even have the luxury of sending out a technician to replace a battery. Instead, our system is maintenance-free. It harvests energy and operates itself,” Monagle adds.

To avoid using a battery, they incorporate internal energy storage that can include a series of capacitors. Simpler than a battery, a capacitor stores energy in the electrical field between conductive plates. Capacitors can be made from a variety of materials, and their capabilities can be tuned to a range of operating conditions, safety requirements, and available space.

The team carefully designed the capacitors so they are big enough to store the energy the device needs to turn on and start harvesting power, but small enough that the charge-up phase doesn’t take too long.

In addition, since a sensor might go weeks or even months before turning on to take a measurement, they ensured the capacitors can hold enough energy even if some leaks out over time.

Finally, they developed a series of control algorithms that dynamically measure and budget the energy collected, stored, and used by the device. A microcontroller, the “brain” of the energy management interface, constantly checks how much energy is stored and infers whether to turn the sensor on or off, take a measurement, or kick the harvester into a higher gear so it can gather more energy for more complex sensing needs.

“Just like when you change gears on a bike, the energy management interface looks at how the harvester is doing, essentially seeing whether it is pedaling too hard or too soft, and then it varies the electronic load so it can maximize the amount of power it is harvesting and match the harvest to the needs of the sensor,” Monagle explains.
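
As a rough illustration of the kind of budgeting logic described above, the sketch below implements a threshold-based decision rule in Python. The thresholds, energy costs, and names are invented for illustration and are not taken from the team's firmware.

```python
# A minimal, hypothetical sketch of a threshold-based energy-budgeting loop.
# All thresholds, costs, and names are illustrative, not the authors' firmware.

COLD_START_J    = 0.020   # energy the capacitors must hold before the electronics can turn on
MEASURE_COST_J  = 0.002   # energy needed for one temperature measurement
TRANSMIT_COST_J = 0.010   # energy needed for one Bluetooth transmission

def next_action(stored_joules: float) -> str:
    """Pick what the node can afford to do with its currently stored energy."""
    if stored_joules < COLD_START_J:
        return "stay_off_and_charge"          # cold start: keep charging the capacitors
    if stored_joules >= COLD_START_J + MEASURE_COST_J + TRANSMIT_COST_J:
        return "measure_and_transmit"
    if stored_joules >= COLD_START_J + MEASURE_COST_J:
        return "measure_only"
    return "increase_harvest"                 # "shift gears" to gather more energy first

print(next_action(0.035))   # -> measure_and_transmit
```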

Self-powered sensor

Using this design framework, they built an energy management circuit for an off-the-shelf temperature sensor. The device harvests magnetic field energy and uses it to continually sample temperature data, which it sends to a smartphone interface using Bluetooth.

The researchers used super-low-power circuits to design the device, but quickly found that these circuits have tight restrictions on how much voltage they can withstand before breaking down. Harvesting too much power could cause the device to explode.

To avoid that, their energy harvester operating system in the microcontroller automatically adjusts or reduces the harvest if the amount of stored energy becomes excessive.

They also found that communication — transmitting data gathered by the temperature sensor — was by far the most power-hungry operation.

“Ensuring the sensor has enough stored energy to transmit data is a constant challenge that involves careful design,” Monagle says.

In the future, the researchers plan to explore less energy-intensive means of transmitting data, such as using optics or acoustics. They also want to more rigorously model and predict how much energy might be coming into a system, or how much energy a sensor might need to take measurements, so a device could effectively gather even more data.

“If you only make the measurements you think you need, you may miss something really valuable. With more information, you might be able to learn something you didn’t expect about a device’s operations. Our framework lets you balance those considerations,” Leeb says.  

“This paper is well-documented regarding what a practical self-powered sensor node should internally entail for realistic scenarios. The overall design guidelines, particularly on the cold-start issue, are very helpful,” says Jinyeong Moon, an assistant professor of electrical and computer engineering at Florida A&M University-Florida State University College of Engineering who was not involved with this work. “Engineers planning to design a self-powering module for a wireless sensor node will greatly benefit from these guidelines, easily ticking off traditionally cumbersome cold-start-related checklists.”

The work is supported, in part, by the Office of Naval Research and The Grainger Foundation.

Reasoning and reliability in AI

In order for natural language to be an effective form of communication, the parties involved need to be able to understand words and their context, assume that the content is largely shared in good faith and is trustworthy, reason about the information being shared, and then apply it to real-world scenarios. MIT PhD students interning with the MIT-IBM Watson AI Lab — Athul Paul Jacob SM ’22, Maohao Shen SM ’23, Victor Butoi, and Andi Peng SM ’23 — are working to attack each step of this process that’s baked into natural language models, so that the AI systems can be more dependable and accurate for users.

To achieve this, Jacob’s research strikes at the heart of existing natural language models to improve the output, using game theory. His interests, he says, are two-fold: “One is understanding how humans behave, using the lens of multi-agent systems and language understanding, and the second thing is, ‘How do you use that as an insight to build better AI systems?’” His work stems from the board game “Diplomacy,” where his research team developed a system that could learn and predict human behaviors and negotiate strategically to achieve a desired, optimal outcome.

“This was a game where you need to build trust; you need to communicate using language. You need to also play against six other players at the same time, which was very different from all the kinds of task domains people were tackling in the past,” says Jacob, referring to other games like poker and Go that researchers put to neural networks. “In doing so, there were a lot of research challenges. One was, ‘How do you model humans? How do you know when humans tend to act irrationally?’” Jacob and his research mentors — including Associate Professor Jacob Andreas and Assistant Professor Gabriele Farina of the MIT Department of Electrical Engineering and Computer Science (EECS), and the MIT-IBM Watson AI Lab’s Yikang Shen — recast the problem of language generation as a two-player game.

Using “generator” and “discriminator” models, Jacob’s team developed a natural language system that produces answers to questions, then observes the answers and determines whether they are correct. If they are, the AI system receives a point; if not, no point is awarded. Language models notoriously tend to hallucinate, making them less trustworthy; this no-regret learning algorithm collaboratively takes a natural language model and encourages the system’s answers to be more truthful and reliable, while keeping the solutions close to the pre-trained language model’s priors. Jacob says that pairing this technique with a smaller language model could likely make it competitive with a model many times its size.
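
As a loose illustration of the idea of rewarding agreement while staying close to pre-trained priors (not the authors' exact algorithm), the toy loop below runs multiplicative-weights, "no-regret" style updates over two candidate answers; all numbers and the update rule are invented for illustration.

```python
# Toy multiplicative-weights ("no-regret") loop: a generator and a discriminator
# are rewarded for agreeing on an answer while being pulled toward their initial
# priors. Illustrative only; not the authors' algorithm or hyperparameters.
import numpy as np

candidates = ["Paris", "Lyon"]            # candidate answers to "What is the capital of France?"
gen  = np.array([0.6, 0.4])               # generator's prior over candidate answers
disc = np.array([0.7, 0.3])               # discriminator's prior over which answer is correct
gen_prior, disc_prior = gen.copy(), disc.copy()
eta, lam = 0.5, 0.1                       # step size, strength of the pull toward the priors

for _ in range(100):
    gen_reward  = disc + lam * np.log(gen_prior / gen)     # agree with the other player...
    disc_reward = gen  + lam * np.log(disc_prior / disc)   # ...but stay near your own prior
    gen = gen * np.exp(eta * gen_reward)
    gen /= gen.sum()
    disc = disc * np.exp(eta * disc_reward)
    disc /= disc.sum()

print(candidates[int(np.argmax(gen * disc))])   # the answer both players settle on: "Paris"
```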

Once a language model generates a result, researchers ideally want its confidence in that generation to align with its accuracy, but this frequently isn’t the case. Hallucinations can occur with the model reporting high confidence when it should be low. Maohao Shen and his group, with mentors Gregory Wornell, the Sumitomo Professor of Engineering in EECS, and IBM Research scientists Subhro Das, Prasanna Sattigeri, and Soumya Ghosh, are looking to fix this through uncertainty quantification (UQ). “Our project aims to calibrate language models when they are poorly calibrated,” says Shen. Specifically, they’re looking at the classification problem. For this, Shen allows a language model to generate free text, which is then converted into a multiple-choice classification task. For instance, they might ask the model to solve a math problem and then ask whether the answer it generated is correct, responding “yes, no, or maybe.” This helps determine whether the model is over- or under-confident.

Automating this, the team developed a technique that helps tune the confidence output by a pre-trained language model. The researchers trained an auxiliary model using the ground-truth information in order for their system to be able to correct the language model. “If your model is over-confident in its prediction, we are able to detect it and make it less confident, and vice versa,” explains Shen. The team evaluated their technique on multiple popular benchmark datasets to show how well it generalizes to unseen tasks to realign the accuracy and confidence of language model predictions. “After training, you can just plug in and apply this technique to new tasks without any other supervision,” says Shen. “The only thing you need is the data for that new task.”
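
One simple way to picture this recalibration step is Platt-style scaling, shown below with invented numbers; it is a generic stand-in for an auxiliary calibration model, not necessarily the team's exact method.

```python
# Minimal sketch of confidence recalibration: fit a small auxiliary model that maps
# the language model's raw confidence to a calibrated probability, using ground-truth
# correctness labels. Platt-style scaling here is a stand-in, not the authors' method.
import numpy as np
from sklearn.linear_model import LogisticRegression

raw_conf = np.array([[0.95], [0.90], [0.80], [0.60], [0.55], [0.30]])  # model's stated confidence
correct  = np.array([1, 0, 1, 0, 1, 0])                                # was the answer actually right?

calibrator = LogisticRegression().fit(raw_conf, correct)
print(calibrator.predict_proba([[0.9]])[:, 1])  # calibrated probability for a new prediction
```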

Victor Butoi is also working to enhance model capability, but his lab team takes a different tack. The team — which includes John Guttag, the Dugald C. Jackson Professor of Computer Science and Electrical Engineering in EECS; lab researchers Leonid Karlinsky and Rogerio Feris of IBM Research; and lab affiliates Hilde Kühne of the University of Bonn and Wei Lin of Graz University of Technology — is creating techniques that allow vision-language models to reason about what they’re seeing, and is designing prompts to unlock new learning abilities and understand key phrases.

Compositional reasoning is just another aspect of the decision-making process that we ask machine-learning models to perform in order for them to be helpful in real-world situations, explains Butoi. “You need to be able to think about problems compositionally and solve subtasks,” says Butoi, “like, if you’re saying the chair is to the left of the person, you need to recognize both the chair and the person. You need to understand directions.” And then once the model understands “left,” the research team wants the model to be able to answer other questions involving “left.”

Surprisingly, vision-language models do not reason well about composition, Butoi explains, but they can be helped to, using a model that can “lead the witness”, if you will. The team developed a model that was tweaked using a technique called low-rank adaptation of large language models (LoRA) and trained on an annotated dataset called Visual Genome, which has objects in an image and arrows denoting relationships, like directions. In this case, the trained LoRA model would be guided to say something about “left” relationships, and this caption output would then be used to provide context and prompt the vision-language model, making it a “significantly easier task,” says Butoi.

In the world of robotics, AI systems also engage with their surroundings using computer vision and language. The settings may range from warehouses to the home. Andi Peng and her mentors, Julie Shah, MIT’s H.N. Slater Professor in Aeronautics and Astronautics, and Chuang Gan of the lab and the University of Massachusetts at Amherst, are focusing on assisting people with physical constraints, using virtual worlds. For this, Peng’s group is developing two embodied AI models — a “human” that needs support and a helper agent — in a simulated environment called ThreeDWorld. Focusing on human/robot interactions, the team leverages semantic priors captured by large language models to help the helper AI infer, from natural language, which abilities the “human” agent may lack and the motivation behind the “human” agent’s actions. The team is looking to strengthen the helper’s sequential decision-making, bidirectional communication, ability to understand the physical scene, and how best to contribute.

“A lot of people think that AI programs should be autonomous, but I think that an important part of the process is that we build robots and systems for humans, and we want to convey human knowledge,” says Peng. “We don’t want a system to do something in a weird way; we want them to do it in a human way that we can understand.”

New hope for early pancreatic cancer intervention via AI-based risk prediction

The first documented case of pancreatic cancer dates back to the 18th century. Since then, researchers have undertaken a protracted and challenging odyssey to understand the elusive and deadly disease. To date, there is no better cancer treatment than early intervention. Unfortunately, the pancreas, nestled deep within the abdomen, is particularly elusive for early detection. 

MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) scientists, alongside Limor Appelbaum, a staff scientist in the Department of Radiation Oncology at Beth Israel Deaconess Medical Center (BIDMC), were eager to better identify potential high-risk patients. They set out to develop two machine-learning models for early detection of pancreatic ductal adenocarcinoma (PDAC), the most common form of the cancer. To access a broad and diverse database, the team synced up with a federated network company, using electronic health record data from various institutions across the United States. This vast pool of data helped ensure the models’ reliability and generalizability, making them applicable across a wide range of populations, geographical locations, and demographic groups.

The two models, the “PRISM” neural network and a logistic regression model (a statistical technique for estimating probability), outperformed current methods. The team’s comparison showed that while standard screening criteria identify about 10 percent of PDAC cases using a five-times-higher relative risk threshold, PRISM can detect 35 percent of PDAC cases at the same threshold.

Using AI to detect cancer risk is not a new phenomenon: algorithms analyze mammograms, evaluate CT scans for lung cancer, and assist in the analysis of Pap smear tests and HPV testing, to name a few applications. “The PRISM models stand out for their development and validation on an extensive database of over 5 million patients, surpassing the scale of most prior research in the field,” says Kai Jia, an MIT PhD student in electrical engineering and computer science (EECS), MIT CSAIL affiliate, and first author on an open-access paper in eBioMedicine outlining the new work. “The model uses routine clinical and lab data to make its predictions, and the diversity of the U.S. population is a significant advancement over other PDAC models, which are usually confined to specific geographic regions, like a few health-care centers in the U.S. Additionally, using a unique regularization technique in the training process enhanced the models’ generalizability and interpretability.”

“This report outlines a powerful approach to use big data and artificial intelligence algorithms to refine our approach to identifying risk profiles for cancer,” says David Avigan, a Harvard Medical School professor and the cancer center director and chief of hematology and hematologic malignancies at BIDMC, who was not involved in the study. “This approach may lead to novel strategies to identify patients with high risk for malignancy that may benefit from focused screening with the potential for early intervention.” 

Prismatic perspectives

The journey toward the development of PRISM began over six years ago, fueled by firsthand experiences with the limitations of current diagnostic practices. “Approximately 80-85 percent of pancreatic cancer patients are diagnosed at advanced stages, where cure is no longer an option,” says senior author Appelbaum, who is also a Harvard Medical School instructor as well as a radiation oncologist. “This clinical frustration sparked the idea to delve into the wealth of data available in electronic health records (EHRs).”

The CSAIL group’s close collaboration with Appelbaum made it possible to understand the combined medical and machine learning aspects of the problem better, eventually leading to a much more accurate and transparent model. “The hypothesis was that these records contained hidden clues — subtle signs and symptoms that could act as early warning signals of pancreatic cancer,” she adds. “This guided our use of federated EHR networks in developing these models, for a scalable approach for deploying risk prediction tools in health care.”

Both PrismNN and PrismLR models analyze EHR data, including patient demographics, diagnoses, medications, and lab results, to assess PDAC risk. PrismNN uses artificial neural networks to detect intricate patterns in data features like age, medical history, and lab results, yielding a risk score for PDAC likelihood. PrismLR uses logistic regression for a simpler analysis, generating a probability score of PDAC based on these features. Together, the models offer a thorough evaluation of different approaches in predicting PDAC risk from the same EHR data.
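
As a loose illustration of the logistic-regression flavor of this approach, the sketch below fits a toy risk model on a handful of invented EHR-style features; the features, data, and labels are placeholders, not the study's actual inputs or coefficients.

```python
# Toy sketch of a PrismLR-style risk model: logistic regression over EHR-derived
# features. Data and feature choices are invented placeholders for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# columns: age, has_diabetes (0/1), physician visits in the past year
X = np.array([[72, 1, 9],
              [55, 0, 2],
              [68, 1, 6],
              [49, 0, 1],
              [77, 0, 8],
              [60, 1, 3]])
y = np.array([1, 0, 1, 0, 1, 0])   # 1 = later diagnosed with PDAC (toy labels)

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[70, 1, 7]])[:, 1])   # estimated PDAC risk for a new patient
```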

One paramount point for gaining the trust of physicians, the team notes, is better understanding how the models work, known in the field as interpretability. The scientists pointed out that while logistic regression models are inherently easier to interpret, recent advancements have made deep neural networks somewhat more transparent. This helped the team to refine the thousands of potentially predictive features derived from the EHR of a single patient to approximately 85 critical indicators. These indicators, which include patient age, diabetes diagnosis, and an increased frequency of visits to physicians, are automatically discovered by the model but match physicians’ understanding of risk factors associated with pancreatic cancer.

The path forward

Despite the promise of the PRISM models, as with all research, some parts are still a work in progress. The models are currently trained on U.S. data alone, necessitating testing and adaptation for global use. The path forward, the team notes, includes expanding the model’s applicability to international datasets and integrating additional biomarkers for more refined risk assessment.

“A subsequent aim for us is to facilitate the models’ implementation in routine health care settings. The vision is to have these models function seamlessly in the background of health care systems, automatically analyzing patient data and alerting physicians to high-risk cases without adding to their workload,” says Jia. “A machine-learning model integrated with the EHR system could empower physicians with early alerts for high-risk patients, potentially enabling interventions well before symptoms manifest. We are eager to deploy our techniques in the real world to help all individuals enjoy longer, healthier lives.” 

Jia wrote the paper alongside Appelbaum and MIT EECS Professor and CSAIL Principal Investigator Martin Rinard, who are both senior authors of the paper. Researchers on the paper were supported during their time at MIT CSAIL, in part, by the Defense Advanced Research Projects Agency, Boeing, the National Science Foundation, and Aarno Labs. TriNetX provided resources for the project, and the Prevent Cancer Foundation also supported the team.

Researchers 3D print components for a portable mass spectrometer

Mass spectrometers, devices that identify chemical substances, are widely used in applications like crime scene analysis, toxicology testing, and geological surveying. But these machines are bulky, expensive, and easy to damage, which limits where they can be effectively deployed.

Using additive manufacturing, MIT researchers produced a mass filter, which is the core component of a mass spectrometer, that is far lighter and cheaper than the same type of filter made with traditional techniques and materials.

Their miniaturized filter, known as a quadrupole, can be completely fabricated in a matter of hours for a few dollars. The 3D-printed device is as precise as some commercial-grade mass filters that can cost more than $100,000 and take weeks to manufacture.

Built from durable and heat-resistant glass-ceramic resin, the filter is 3D printed in one step, so no assembly is required. Assembly often introduces defects that can hamper the performance of quadrupoles.

This lightweight, cheap, yet precise quadrupole is one important step in Luis Fernando Velásquez-García’s 20-year quest to produce a 3D-printed, portable mass spectrometer.

“We are not the first ones to try to do this. But we are the first ones who succeeded at doing this. There are other miniaturized quadrupole filters, but they are not comparable with professional-grade mass filters. There are a lot of possibilities for this hardware if the size and cost could be smaller without adversely affecting the performance,” says Velásquez-García, a principal research scientist in MIT’s Microsystems Technology Laboratories (MTL) and senior author of a paper detailing the miniaturized quadrupole.

For instance, a scientist could bring a portable mass spectrometer to remote areas of the rainforest, using it to rapidly analyze potential pollutants without shipping samples back to a lab. And a lightweight device would be cheaper and easier to send into space, where it could monitor chemicals in Earth’s atmosphere or on those of distant planets.

Velásquez-García is joined on the paper by lead author Colin Eckhoff, an MIT graduate student in electrical engineering and computer science (EECS); Nicholas Lubinsky, a former MIT postdoc; and Luke Metzler and Randall Pedder of Ardara Technologies. The research is published in Advanced Science.

Size matters

At the heart of a mass spectrometer is the mass filter. This component uses electric or magnetic fields to sort charged particles based on their mass-to-charge ratio. In this way, the device can measure the chemical components in a sample to identify an unknown substance.

A quadrupole, a common type of mass filter, is composed of four metallic rods surrounding an axis. Voltages are applied to the rods, which produce an electromagnetic field. Depending on the properties of the electromagnetic field, ions with a specific mass-to-charge ratio will swirl around through the middle of the filter, while other particles escape out the sides. By varying the mix of voltages, one can target ions with different mass-to-charge ratios.
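
For readers who want the standard physics behind this sorting, the textbook Mathieu stability parameters for a singly charged ion of mass m are given below; these are classical relations from quadrupole theory, not equations taken from the MIT paper.

```latex
a = \frac{8 e U}{m r_0^{2} \Omega^{2}}, \qquad q = \frac{4 e V}{m r_0^{2} \Omega^{2}}
```

Here U is the DC voltage, V the RF amplitude, Ω the drive frequency, r0 the distance from the central axis to the rods, and e the elementary charge. Only ions whose (a, q) values land inside the first stability region (its apex sits near a ≈ 0.237, q ≈ 0.706) travel the full length of the filter; scanning U and V at a fixed ratio sweeps the selected mass-to-charge ratio.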

While fairly simple in design, a typical stainless-steel quadrupole might weigh several kilograms. But miniaturizing a quadrupole is no easy task. Making the filter smaller usually introduces errors during the manufacturing process. Plus, smaller filters collect fewer ions, which makes chemical analysis less sensitive.

“You can’t make quadrupoles arbitrarily smaller — there is a tradeoff,” Velásquez-García adds.

His team balanced this tradeoff by leveraging additive manufacturing to make miniaturized quadrupoles with the ideal size and shape to maximize precision and sensitivity.

They fabricate the filter from a glass-ceramic resin, which is a relatively new printable material that can withstand temperatures up to 900 degrees Celsius and performs well in a vacuum.

The device is produced using vat photopolymerization, a process where a piston pushes into a vat of liquid resin until it nearly touches an array of LEDs at the bottom. These illuminate, curing the resin that remains in the minuscule gap between the piston and the LEDs. A tiny layer of cured polymer is then stuck to the piston, which rises up and repeats the cycle, building the device one tiny layer at a time.

“This is a relatively new technology for printing ceramics that allows you to make very precise 3D objects. And one key advantage of additive manufacturing is that you can aggressively iterate the designs,” Velásquez-García says.

Since the 3D printer can form practically any shape, the researchers designed a quadrupole with hyperbolic rods. This shape is ideal for mass filtering but difficult to make with conventional methods. Many commercial filters employ rounded rods instead, which can reduce performance.

They also printed an intricate network of triangular lattices surrounding the rods, which provides durability while ensuring the rods remain positioned correctly if the device is moved or shaken.

To finish the quadrupole, the researchers used a technique called electroless plating to coat the rods with a thin metal film, which makes them electrically conductive. They cover everything but the rods with a masking chemical and then submerge the quadrupole in a chemical bath heated to a precise temperature with controlled stirring. This deposits a thin metal film uniformly on the rods without damaging the rest of the device or shorting the rods.

“In the end, we made quadrupoles that were the most compact but also the most precise that could be made, given the constraints of our 3D printer,” Velásquez-García says.

Maximizing performance

To test their 3D-printed quadrupoles, the team swapped them into a commercial system and found that they could attain higher resolutions than other types of miniature filters. Their quadrupoles, which are about 12 centimeters in length, are one-quarter the density of comparable stainless-steel filters.

In addition, further experiments suggest that their 3D-printed quadrupoles could achieve precision that is on par with that of large-scale commercial filters.

“Mass spectrometry is one of the most important of all scientific tools, and Velásquez-García and co-workers describe the design, construction, and performance of a quadrupole mass filter that has several advantages over earlier devices,” says Graham Cooks, the Henry Bohn Hass Distinguished Professor of Chemistry in the Aston Laboratories for Mass Spectrometry at Purdue University, who was not involved with this work. “The advantages derive from these facts: It is much smaller and lighter than most commercial counterparts and it is fabricated monolithically, using additive construction. … It is an open question as to how well the performance will compare with that of quadrupole ion traps, which depend on the same electric fields for mass measurement but which do not have the stringent geometrical requirements of quadrupole mass filters.”

“This paper represents a real advance in the manufacture of quadrupole mass filters (QMF). The authors bring together their knowledge of manufacture using advanced materials, QMF drive electronics, and mass spectrometry to produce a novel system with good performance at low cost,” adds Steve Taylor, professor of electrical engineering and electronics at the University of Liverpool, who was also not involved with this paper. “Since QMFs are at the heart of the ‘analytical engine’ in many other types of mass spectrometry systems, the paper has an important significance across the whole mass spectrometry field, which worldwide represents a multibillion-dollar industry.”

In the future, the researchers plan to boost the quadrupole’s performance by making the filters longer. A longer filter can enable more precise measurements, since more of the ions that are supposed to be filtered out will escape as they travel along its length. They also intend to explore different ceramic materials that could better transfer heat.

“Our vision is to make a mass spectrometer where all the key components can be 3D printed, contributing to a device with much less weight and cost without sacrificing performance. There is still a lot of work to do, but this is a great start,” Velásquez-García adds.

This work was funded by Empiriko Corporation.

Automated system teaches users when to collaborate with an AI assistant

Artificial intelligence models that pick out patterns in images can often do so better than human eyes — but not always. If a radiologist is using an AI model to help her determine whether a patient’s X-rays show signs of pneumonia, when should she trust the model’s advice and when should she ignore it?

A customized onboarding process could help this radiologist answer that question, according to researchers at MIT and the MIT-IBM Watson AI Lab. They designed a system that teaches a user when to collaborate with an AI assistant.

In this case, the training method might find situations where the radiologist trusts the model’s advice — except she shouldn’t because the model is wrong. The system automatically learns rules for how she should collaborate with the AI, and describes them with natural language.

During onboarding, the radiologist practices collaborating with the AI using training exercises based on these rules, receiving feedback about her performance and the AI’s performance.

The researchers found that this onboarding procedure led to about a 5 percent improvement in accuracy when humans and AI collaborated on an image prediction task. Their results also show that just telling the user when to trust the AI, without training, led to worse performance.

Importantly, the researchers’ system is fully automated, so it learns to create the onboarding process based on data from the human and AI performing a specific task. It can also adapt to different tasks, so it can be scaled up and used in many situations where humans and AI models work together, such as in social media content moderation, writing, and programming.

“So often, people are given these AI tools to use without any training to help them figure out when it is going to be helpful. That’s not what we do with nearly every other tool that people use — there is almost always some kind of tutorial that comes with it. But for AI, this seems to be missing. We are trying to tackle this problem from a methodological and behavioral perspective,” says Hussein Mozannar, a graduate student in the Social and Engineering Systems doctoral program within the Institute for Data, Systems, and Society (IDSS) and lead author of a paper about this training process.

The researchers envision that such onboarding will be a crucial part of training for medical professionals.

“One could imagine, for example, that doctors making treatment decisions with the help of AI will first have to do training similar to what we propose. We may need to rethink everything from continuing medical education to the way clinical trials are designed,” says senior author David Sontag, a professor of EECS, a member of the MIT-IBM Watson AI Lab and the MIT Jameel Clinic, and the leader of the Clinical Machine Learning Group of the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Mozannar, who is also a researcher with the Clinical Machine Learning Group, is joined on the paper by Jimin J. Lee, an undergraduate in electrical engineering and computer science; Dennis Wei, a senior research scientist at IBM Research; and Prasanna Sattigeri and Subhro Das, research staff members at the MIT-IBM Watson AI Lab. The paper will be presented at the Conference on Neural Information Processing Systems.

Training that evolves

Existing onboarding methods for human-AI collaboration are often composed of training materials produced by human experts for specific use cases, making them difficult to scale up. Some related techniques rely on explanations, where the AI tells the user its confidence in each decision, but research has shown that explanations are rarely helpful, Mozannar says.

“The AI model’s capabilities are constantly evolving, so the use cases where the human could potentially benefit from it are growing over time. At the same time, the user’s perception of the model continues changing. So, we need a training procedure that also evolves over time,” he adds.

To accomplish this, their onboarding method is automatically learned from data. It is built from a dataset that contains many instances of a task, such as detecting the presence of a traffic light from a blurry image.

The system’s first step is to collect data on the human and AI performing this task. In this case, the human would try to predict, with the help of AI, whether blurry images contain traffic lights.

The system embeds these data points onto a latent space, which is a representation of data in which similar data points are closer together. It uses an algorithm to discover regions of this space where the human collaborates incorrectly with the AI. These regions capture instances where the human trusted the AI’s prediction but the prediction was wrong, and vice versa.

Perhaps the human mistakenly trusts the AI when images show a highway at night.

After discovering the regions, a second algorithm utilizes a large language model to describe each region as a rule, using natural language. The algorithm iteratively fine-tunes that rule by finding contrasting examples. It might describe this region as “ignore AI when it is a highway during the night.”
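
A rough sketch of the region-discovery step might look like the following; the embeddings and outcomes are randomly generated stand-ins, the clustering method is a generic choice rather than the authors' algorithm, and the language-model step that writes each natural-language rule is omitted.

```python
# Rough sketch (not the authors' pipeline): embed task instances, cluster them,
# and flag clusters where the human most often trusted the AI when it was wrong.
import numpy as np
from sklearn.cluster import KMeans

embeddings = np.random.rand(200, 16)            # stand-in latent embeddings of task instances
human_followed_ai = np.random.rand(200) < 0.5   # did the human go with the AI's prediction?
ai_correct = np.random.rand(200) < 0.7          # was the AI's prediction right?

mistrust_error = human_followed_ai & ~ai_correct          # trusted the AI when it was wrong
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(embeddings)

for k in range(5):
    rate = mistrust_error[labels == k].mean()
    if rate > 0.3:                               # arbitrary threshold for a "problem region"
        print(f"region {k}: human wrongly trusted the AI on {rate:.0%} of examples")
```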

These rules are used to build training exercises. The onboarding system shows an example to the human, in this case a blurry highway scene at night, as well as the AI’s prediction, and asks the user if the image shows traffic lights. The user can answer yes, no, or use the AI’s prediction.

If the human is wrong, they are shown the correct answer and performance statistics for the human and AI on these instances of the task. The system does this for each region, and at the end of the training process, repeats the exercises the human got wrong.

“After that, the human has learned something about these regions that we hope they will take away in the future to make more accurate predictions,” Mozannar says.

Onboarding boosts accuracy

The researchers tested this system with users on two tasks — detecting traffic lights in blurry images and answering multiple-choice questions from many domains, such as biology, philosophy, and computer science.

They first showed users a card with information about the AI model, how it was trained, and a breakdown of its performance on broad categories. Users were split into five groups: Some were only shown the card, some went through the researchers’ onboarding procedure, some went through a baseline onboarding procedure, some went through the researchers’ onboarding procedure and were given recommendations of when they should or should not trust the AI, and others were only given the recommendations.

Only the researchers’ onboarding procedure without recommendations improved users’ accuracy significantly, boosting their performance on the traffic light prediction task by about 5 percent without slowing them down. However, onboarding was not as effective for the question-answering task. The researchers believe this is because the AI model, ChatGPT, provided explanations with each answer that conveyed whether it should be trusted.

But providing recommendations without onboarding had the opposite effect — users not only performed worse, they took more time to make predictions.

“When you only give someone recommendations, it seems like they get confused and don’t know what to do. It derails their process. People also don’t like being told what to do, so that is a factor as well,” Mozannar says.

Providing recommendations alone could harm the user if those recommendations are wrong, he adds. With onboarding, on the other hand, the biggest limitation is the amount of available data. If there aren’t enough data, the onboarding stage won’t be as effective, he says.

In the future, he and his collaborators want to conduct larger studies to evaluate the short- and long-term effects of onboarding. They also want to leverage unlabeled data for the onboarding process, and find methods to effectively reduce the number of regions without omitting important examples.

“People are adopting AI systems willy-nilly, and indeed AI offers great potential, but these AI agents still sometimes make mistakes. Thus, it’s crucial for AI developers to devise methods that help humans know when it’s safe to rely on the AI’s suggestions,” says Dan Weld, professor emeritus at the Paul G. Allen School of Computer Science and Engineering at the University of Washington, who was not involved with this research. “Mozannar et al. have created an innovative method for identifying situations where the AI is trustworthy, and (importantly) to describe them to people in a way that leads to better human-AI team interactions.”

This work is funded, in part, by the MIT-IBM Watson AI Lab.

A flexible solution to help artists improve animation

Artists who bring to life heroes and villains in animated movies and video games could have more control over their animations, thanks to a new technique introduced by MIT researchers.

Their method generates mathematical functions known as barycentric coordinates, which define how 2D and 3D shapes can bend, stretch, and move through space. For example, an artist using their tool could choose functions that make the motions of a 3D cat’s tail fit their vision for the “look” of the animated feline.

Many other techniques for this problem are inflexible, providing only a single option for the barycentric coordinate functions for a certain animated character. Each function may or may not be the best one for a particular animation. The artist would have to start from scratch with a new approach each time they want to try for a slightly different look.

“As researchers, we can sometimes get stuck in a loop of solving artistic problems without consulting with artists. What artists care about is flexibility and the ‘look’ of their final product. They don’t care about the partial differential equations your algorithm solves behind the scenes,” says Ana Dodik, lead author of a paper on this technique.

Beyond its artistic applications, this technique could be used in areas such as medical imaging, architecture, virtual reality, and even in computer vision as a tool to help robots figure out how objects move in the real world.

Dodik, an electrical engineering and computer science (EECS) graduate student, wrote the paper with Oded Stein, assistant professor at the University of Southern California’s Viterbi School of Engineering; Vincent Sitzmann, assistant professor of EECS who leads the Scene Representation Group in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author Justin Solomon, an associate professor of EECS and leader of the CSAIL Geometric Data Processing Group. The research was recently presented at SIGGRAPH Asia.

A generalized approach

When an artist animates a 2D or 3D character, one common technique is to surround the complex shape of the character with a simpler set of points connected by line segments or triangles, called a cage. The animator drags these points to move and deform the character inside the cage. The key technical problem is to determine how the character moves when the cage is modified; this motion is determined by the design of a particular barycentric coordinate function.

Traditional approaches use complicated equations to find cage-based motions that are extremely smooth, avoiding kinks that could develop in a shape when it is stretched or bent to the extreme. But there are many notions of how the artistic idea of “smoothness” translates into math, each of which leads to a different set of barycentric coordinate functions.

The MIT researchers sought a general approach that allows artists to have a say in designing or choosing among smoothness energies for any shape. Then the artist could preview the deformation and choose the smoothness energy that looks the best to their taste.

Although flexible design of barycentric coordinates is a modern idea, the basic mathematical construction of barycentric coordinates dates back centuries. Introduced by the German mathematician August Möbius in 1827, barycentric coordinates dictate how each corner of a shape exerts influence over the shape’s interior.

In a triangle, which is the shape Möbius used in his calculations, barycentric coordinates are easy to design — but when the cage isn’t a triangle, the calculations become messy. Making barycentric coordinates for a complicated cage is especially difficult because, for complex shapes, each barycentric coordinate must meet a set of constraints while being as smooth as possible.
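
To make the triangular case concrete, here is the classical computation for a single triangle in Python; this is standard geometry rather than code from the paper.

```python
# Classical barycentric coordinates of a 2D point with respect to one triangle.
# Standard textbook computation; not code from the MIT paper.
import numpy as np

def triangle_barycentric(p, a, b, c):
    v0, v1, v2 = b - a, c - a, p - a
    d00, d01, d11 = v0 @ v0, v0 @ v1, v1 @ v1
    d20, d21 = v2 @ v0, v2 @ v1
    denom = d00 * d11 - d01 * d01
    v = (d11 * d20 - d01 * d21) / denom      # weight of corner b
    w = (d00 * d21 - d01 * d20) / denom      # weight of corner c
    u = 1.0 - v - w                          # weight of corner a; the weights sum to 1
    return u, v, w

a, b, c = np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(triangle_barycentric(np.array([0.25, 0.25]), a, b, c))   # (0.5, 0.25, 0.25)
```

Deforming a shape then amounts to moving each interior point to the weighted combination of the new corner positions, reusing these same weights.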

Diverging from past work, the team used a special type of neural network to model the unknown barycentric coordinate functions. A neural network, loosely based on the human brain, processes an input using many layers of interconnected nodes.

While neural networks are often applied in AI systems that mimic human thought, in this project they are used for a mathematical reason. The researchers’ network architecture knows how to output barycentric coordinate functions that satisfy all the constraints exactly. They build the constraints directly into the network, so when it generates solutions, they are always valid. This construction helps artists design interesting barycentric coordinates without having to worry about the mathematical aspects of the problem.

“The tricky part was building in the constraints. Standard tools didn’t get us all the way there, so we really had to think outside the box,” Dodik says.

Virtual triangles

The researchers drew on the triangular barycentric coordinates Möbius introduced nearly 200 years ago. These triangular coordinates are simple to compute and satisfy all the necessary constraints, but modern cages are much more complex than triangles.

To bridge the gap, the researchers’ method covers a shape with overlapping virtual triangles that connect triplets of points on the outside of the cage.

“Each virtual triangle defines a valid barycentric coordinate function. We just need a way of combining them,” she says.

That is where the neural network comes in. It predicts how to combine the virtual triangles’ barycentric coordinates to make a more complicated, but smooth function.
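
Conceptually, that combination step can be pictured as a convex blend of the per-triangle weights; the toy example below illustrates the idea and is not the authors' network architecture.

```python
# Toy illustration of combining per-virtual-triangle barycentric weights.
# A convex combination (alpha >= 0, summing to 1) of valid coordinate functions
# remains a valid set of coordinates. Not the authors' architecture.
import numpy as np

per_triangle = np.array([
    [0.5, 0.5, 0.0, 0.0],    # weights over 4 cage vertices from virtual triangle 1
    [0.0, 0.2, 0.3, 0.5],    # weights over the same vertices from virtual triangle 2
])
alpha = np.array([0.6, 0.4]) # blend weights a network might predict for this point

combined = alpha @ per_triangle
print(combined, combined.sum())   # still sums to 1
```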

Using their method, an artist could try one function, look at the final animation, and then tweak the coordinates to generate different motions until they arrive at an animation that looks the way they want.

“From a practical perspective, I think the biggest impact is that neural networks give you a lot of flexibility that you didn’t previously have,” Dodik says.

The researchers demonstrated how their method could generate more natural-looking animations than other approaches, like a cat’s tail that curves smoothly when it moves instead of folding rigidly near the vertices of the cage.

In the future, they want to try different strategies to accelerate the neural network. They also want to build this method into an interactive interface that would enable an artist to easily iterate on animations in real time.

This research was funded, in part, by the U.S. Army Research Office, the U.S. Air Force Office of Scientific Research, the U.S. National Science Foundation, the CSAIL Systems that Learn Program, the MIT-IBM Watson AI Lab, the Toyota-CSAIL Joint Research Center, Adobe Systems, a Google Research Award, the Singapore Defense Science and Technology Agency, and the Amazon Science Hub.

Leveraging language to understand machines

Natural language conveys ideas, actions, information, and intent through context and syntax; further, there are volumes of it contained in databases. This makes it an excellent source of data to train machine-learning systems on. Two master of engineering students in the 6A MEng Thesis Program at MIT, Irene Terpstra ’23 and Rujul Gandhi ’22, are working with mentors in the MIT-IBM Watson AI Lab to use this power of natural language to build AI systems.

As computing becomes more advanced, researchers are looking to improve the hardware that it runs on; this means innovating to create new computer chips. And, since there is literature already available on modifications that can be made to achieve certain parameters and performance, Terpstra and her mentors and advisors, Anantha Chandrakasan, MIT School of Engineering dean and the Vannevar Bush Professor of Electrical Engineering and Computer Science, and IBM researcher Xin Zhang, are developing an AI algorithm that assists in chip design.

“I’m creating a workflow to systematically analyze how these language models can help the circuit design process. What reasoning powers do they have, and how can it be integrated into the chip design process?” says Terpstra. “And then on the other side, if that proves to be useful enough, [we’ll] see if they can automatically design the chips themselves, attaching it to a reinforcement learning algorithm.”

To do this, Terpstra’s team is creating an AI system that can iterate on different designs. This means experimenting with various pre-trained large language models (like ChatGPT, Llama 2, and Bard), an open-source circuit simulator called NGspice, which holds the parameters of the chip in code form, and a reinforcement learning algorithm. With text prompts, researchers will be able to ask the language model how the physical chip should be modified to achieve a certain goal and receive guidance for the adjustments. This guidance is then transferred into a reinforcement learning algorithm that updates the circuit design and outputs new physical parameters of the chip.
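
As a rough, hypothetical sketch of what such a loop might look like: the language-model call below is a stub, the netlist name and scoring are invented, and running it assumes the ngspice binary is installed. It stands in for the workflow rather than reproducing the team's system.

```python
# Hypothetical sketch of an LLM-in-the-loop circuit-tuning workflow.
# query_llm is a placeholder stub; "amplifier.cir" is an invented netlist name;
# the loop requires the ngspice binary. This is not the team's actual system.
import random
import subprocess

def query_llm(prompt: str) -> str:
    """Stand-in for a call to a pre-trained language model."""
    return "Increase transistor W1 to raise the gain."

def simulate(netlist: str) -> float:
    """Run ngspice in batch mode; a real system would parse a figure of merit from the output."""
    subprocess.run(["ngspice", "-b", netlist], capture_output=True)
    return random.random()        # placeholder performance score

best_score = 0.0
for step in range(5):
    advice = query_llm("How should the amplifier be modified to increase gain?")
    # ...apply the suggested parameter change to the netlist here...
    score = simulate("amplifier.cir")
    best_score = max(best_score, score)   # an RL agent would update its policy from this reward
```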

“The final goal would be to combine the reasoning powers and the knowledge base that is baked into these large language models and combine that with the optimization power of the reinforcement learning algorithms and have that design the chip itself,” says Terpstra.

Rujul Gandhi works with the raw language itself. As an undergraduate at MIT, Gandhi explored linguistics and computer science, putting them together in her MEng work. “I’ve been interested in communication, both between just humans and between humans and computers,” Gandhi says.

Robots or other interactive AI systems are one area where communication needs to be understood by both humans and machines. Researchers often write instructions for robots using formal logic. This helps ensure that commands are being followed safely and as intended, but formal logic can be difficult for users to understand, while natural language comes easily. To ensure this smooth communication, Gandhi and her advisors Yang Zhang of IBM and MIT assistant professor Chuchu Fan are building a parser that converts natural language instructions into a machine-friendly form. Leveraging the linguistic structure encoded by the pre-trained encoder-decoder model T5, and a dataset of annotated, basic English commands for performing certain tasks, Gandhi’s system identifies the smallest logical units, or atomic propositions, which are present in a given instruction.

“Once you’ve given your instruction, the model identifies all the smaller sub-tasks you want it to carry out,” Gandhi says. “Then, using a large language model, each sub-task can be compared against the available actions and objects in the robot’s world, and if any sub-task can’t be carried out because a certain object is not recognized, or an action is not possible, the system can stop right there to ask the user for help.”

This approach of breaking instructions into sub-tasks also allows her system to understand logical dependencies expressed in English, like, “do task X until event Y happens.” Gandhi uses a dataset of step-by-step instructions across robot task domains like navigation and manipulation, with a focus on household tasks. Using data that are written just the way humans would talk to each other has many advantages, she says, because it means a user can be more flexible about how they phrase their instructions.
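
As a toy illustration of that last idea (not the T5-based system itself), the sketch below splits a simple "until" command into atomic propositions, checks each against a set of known propositions, and emits a linear temporal logic (LTL) style formula; all names and vocabulary here are invented.

```python
# Toy sketch: ground an "until" instruction into atomic propositions and an
# LTL-style formula, asking for help if a proposition is unknown. Illustrative only.
KNOWN_PROPOSITIONS = {"stir the pot", "the timer rings"}

def parse(instruction: str) -> str:
    task, event = [p.strip() for p in instruction.lower().split(" until ")]
    for prop in (task, event):
        if prop not in KNOWN_PROPOSITIONS:
            return f"Cannot ground '{prop}' -- ask the user for help."
    return f"({task}) U ({event})"        # LTL "until": keep doing the task until the event holds

print(parse("Stir the pot until the timer rings"))   # -> (stir the pot) U (the timer rings)
```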

Another of Gandhi’s projects involves developing speech models. In the context of speech recognition, some languages are considered “low resource” since they might not have a lot of transcribed speech available, or might not have a written form at all. “One of the reasons I applied to this internship at the MIT-IBM Watson AI Lab was an interest in language processing for low-resource languages,” she says. “A lot of language models today are very data-driven, and when it’s not that easy to acquire all of that data, that’s when you need to use the limited data efficiently.” 

Speech is just a stream of sound waves, but humans having a conversation can easily figure out where words and thoughts start and end. In speech processing, both humans and language models use their existing vocabulary to recognize word boundaries and understand the meaning. In low- or no-resource languages, a written vocabulary might not exist at all, so researchers can’t provide one to the model. Instead, the model can make note of what sound sequences occur together more frequently than others, and infer that those might be individual words or concepts. In Gandhi’s research group, these inferred words are then collected into a pseudo-vocabulary that serves as a labeling method for the low-resource language, creating labeled data for further applications.
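
A toy sketch of that frequency-based idea follows; real systems operate on acoustic units rather than letters, and everything here is an invented example.

```python
# Toy sketch: with no written vocabulary, count which sound-unit sequences recur
# most often and treat frequent ones as candidate "words" for a pseudo-vocabulary.
from collections import Counter

phone_stream = "kofikofiabakofiaba"          # stand-in for a stream of phone-like units
counts = Counter(phone_stream[i:i + 4] for i in range(len(phone_stream) - 3))
pseudo_vocab = [seq for seq, n in counts.most_common(3) if n > 1]
print(pseudo_vocab)                          # the recurring chunk "kofi" surfaces as a candidate word
```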

The applications for language technology are “pretty much everywhere,” Gandhi says. “You could imagine people being able to interact with software and devices in their native language, their native dialect. You could imagine improving all the voice assistants that we use. You could imagine it being used for translation or interpretation.”

Building technology that empowers city residents

Kwesi Afrifa came to MIT from his hometown of Accra, Ghana, in 2020 to pursue an interdisciplinary major in urban planning and computer science. Growing up amid the many moving parts of a large, densely populated city, he had often observed aspects of urban life that could be made more efficient. He decided to apply his interest in computing and coding to address these problems by creating software tools for city planners.

Now a senior, Afrifa works at the City Form Lab led by Andres Sevtsuk, collaborating on an open-source, Python-based tool that allows researchers and policymakers to analyze pedestrians’ behaviors. The package, which launches next month, will make it more feasible for researchers and city planners to investigate how changes to a city’s structural characteristics impact walkability and the pedestrian experience.

During his first two years at MIT, Afrifa worked in the Civic Data Design Lab led by Associate Professor Sarah Williams, where he helped build sensing tools and created an online portal for people living in Kibera, Nairobi, to access the internet and participate in survey research.

After graduation, he will go on to work as a software engineer at a startup in New York. After several years, he hopes to start his own company, building urban data tools for integration into mapping and location-based software applications.

“I see it as my duty to make city systems more efficient, deepen the connection between residents and their communities, and make existing in them better for everyone, including groups which have often been marginalized,” he says.

“Cities are special places”

Afrifa believes that in urban settings, technology has a unique power to both accelerate development and empower citizens.

He witnessed such unifying power in high school, when he created the website ghanabills.com, which aggregated bills of parliament in Ghana, providing easy access to this information as well as a place for people to engage in discussion on the bills. He describes the effect of this technology as a “democratizing force.”

Afrifa also explored the connection between cities and community as an executive member of Code for Good, a program that connects MIT students interested in software with nonprofits throughout the Boston area. He served as a mentor for students and worked on finding nonprofits to match them up with.

Language and visibility

Sharing African languages and cultures is also important to Afrifa. In his first two years at MIT, he and other African students across the country started the Mandla app, which he describes as a Duolingo for African languages. It had gamified lessons, voice translations, and other interactive features for learning. “We wanted to solve the problem of language revitalization and bring African languages to the broader diaspora,” he says. At its peak a year ago, the app had 50,000 daily active users.

Although the Mandla app was discontinued due to lack of funding, Afrifa has found other ways to promote African culture at MIT. He is currently collaborating with architecture graduate students TJ Bayowa and Courage Kpodo on “A Tale of Two Coasts,” an upcoming short film and multimedia installation that delves into the intricate connections between perceptions of African art and identity spanning two coasts of the Atlantic Ocean. This ongoing collaboration, which Afrifa says is still taking shape, is something he hopes to expand beyond MIT.

Discovering arts

As a child, Afrifa enjoyed writing poetry. Growing up with parents who loved literature, Afrifa was encouraged to become involved with the theater and art scene of Accra. He didn’t expect to continue this interest at MIT, but then he discovered the Black Theater Guild (BTG).

The theater group had been active at MIT from the 1990s to around 2005. Afrifa revived it in his sophomore year, after Professor Jay Scheib, head of Music and Theater Arts at MIT, encouraged him to write, direct, and produce more of his work following his final project for 21M.710 (Script Analysis), a dramaturgy class taught by Scheib.

Since then, the BTG has held two productions in the past two years: “Nkrumah’s Last Day,” in spring 2022, and “Shooting the Sheriff,” in spring 2023, both of which were written and directed by Afrifa. “It’s been very rewarding to conceptualize ideas, write stories and have this amazing community of people come together and produce it,” he says.

When asked if he will continue to pursue theater post-grad, Afrifa says: “That’s 100 percent the goal.”

Researchers safely integrate fragile 2D materials into devices

Two-dimensional materials, which are only a few atoms thick, can exhibit some incredible properties, such as the ability to carry electric charge extremely efficiently, which could boost the performance of next-generation electronic devices.

But integrating 2D materials into devices and systems like computer chips is notoriously difficult. These ultrathin structures can be damaged by conventional fabrication techniques, which often rely on the use of chemicals, high temperatures, or destructive processes like etching.

To overcome this challenge, researchers from MIT and elsewhere have developed a new technique to integrate 2D materials into devices in a single step while keeping the surfaces of the materials and the resulting interfaces pristine and free from defects.

Their method relies on engineering surface forces available at the nanoscale to allow the 2D material to be physically stacked onto other prebuilt device layers. Because the 2D material remains undamaged, the researchers can take full advantage of its unique optical and electrical properties.

They used this approach to fabricate arrays of 2D transistors that achieved new functionalities compared to devices produced using conventional fabrication techniques. Their method, which is versatile enough to be used with many materials, could have diverse applications in high-performance computing, sensing, and flexible electronics.

Core to unlocking these new functionalities is the ability to form clean interfaces, held together by special forces that exist between all matter, called van der Waals forces.

However, such van der Waals integration of materials into fully functional devices is not always easy, says Farnaz Niroui, assistant professor of electrical engineering and computer science (EECS), a member of the Research Laboratory of Electronics (RLE), and senior author of a new paper describing the work.

“Van der Waals integration has a fundamental limit,” she explains. “Since these forces depend on the intrinsic properties of the materials, they cannot be readily tuned. As a result, there are some materials that cannot be directly integrated with each other using their van der Waals interactions alone. We have come up with a platform to address this limit to help make van der Waals integration more versatile, to promote the development of 2D-materials-based devices with new and improved functionalities.”

Niroui wrote the paper with lead author Peter Satterthwaite, an electrical engineering and computer science graduate student; Jing Kong, professor of EECS and a member of RLE; and others at MIT, Boston University, National Tsing Hua University in Taiwan, the National Science and Technology Council of Taiwan, and National Cheng Kung University in Taiwan. The research is published today in Nature Electronics.  

The developed platform leverages industry-compatible toolsets, allowing for the process to be scaled. Here, lead author Peter Satterthwaite uses a modified alignment tool in MIT.nano to do a patterned, aligned integration. Image: Courtesy of Weikun Zhu

Advantageous attraction

Making complex systems such as a computer chip with conventional fabrication techniques can get messy. Typically, a rigid material like silicon is chiseled down to the nanoscale, then interfaced with other components like metal electrodes and insulating layers to form an active device. Such processing can cause damage to the materials.

Recently, researchers have focused on building devices and systems from the bottom up, using 2D materials and a process that requires sequential physical stacking. In this approach, rather than using chemical glues or high temperatures to bond a fragile 2D material to a conventional surface like silicon, researchers leverage van der Waals forces to physically integrate a layer of 2D material onto a device.

Van der Waals forces are natural forces of attraction that exist between all matter. For example, a gecko’s feet can stick to the wall temporarily due to van der Waals forces. Though all materials exhibit a van der Waals interaction, depending on the material, the forces are not always strong enough to hold them together. For instance, a popular semiconducting 2D material known as molybdenum disulfide will stick to gold, a metal, but won’t directly transfer to insulators like silicon dioxide by just coming into physical contact with that surface.

However, heterostructures made by integrating semiconductor and insulating layers are key building blocks of an electronic device. Previously, this integration has been enabled by bonding the 2D material to an intermediate layer like gold, then using this intermediate layer to transfer the 2D material onto the insulator, before removing the intermediate layer using chemicals or high temperatures.

Instead of using this sacrificial layer, the MIT researchers embed the low-adhesion insulator in a high-adhesion matrix. This adhesive matrix is what makes the 2D material stick to the embedded low-adhesion surface, providing the forces needed to create a van der Waals interface between the 2D material and the insulator.

Making the matrix

To make electronic devices, they form a hybrid surface of metals and insulators on a carrier substrate. This surface is then peeled off and flipped over to reveal a completely smooth top surface that contains the building blocks of the desired device.

This smoothness is important, since gaps between the surface and 2D material can hamper van der Waals interactions. Then, the researchers prepare the 2D material separately, in a completely clean environment, and bring it into direct contact with the prepared device stack.

“Once the hybrid surface is brought into contact with the 2D layer, without needing any high temperatures, solvents, or sacrificial layers, it can pick up the 2D layer and integrate it with the surface. This way, we are allowing a van der Waals integration that would be traditionally forbidden, but now is possible and allows formation of fully functioning devices in a single step,” Satterthwaite explains.

This single-step process keeps the 2D material interface completely clean, which enables the material to reach its fundamental limits of performance without being held back by defects or contamination.

And because the surfaces also remain pristine, researchers can engineer the surface of the 2D material to form features or connections to other components. For example, they used this technique to create p-type transistors, which are generally challenging to make with 2D materials. Their transistors improve on those from previous studies, and can provide a platform toward studying and achieving the performance needed for practical electronics.

Their approach can be done at scale to make larger arrays of devices. The adhesive matrix technique can also be used with a range of materials, and even with other forces to enhance the versatility of this platform. For instance, the researchers integrated graphene onto a device, forming the desired van der Waals interfaces using a matrix made with a polymer. In this case, adhesion relies on chemical interactions rather than van der Waals forces alone.

In the future, the researchers want to build on this platform to enable integration of a diverse library of 2D materials to study their intrinsic properties without the influence of processing damage, and develop new device platforms that leverage these superior functionalities.  

This research is funded, in part, by the U.S. National Science Foundation, the U.S. Department of Energy, the BUnano Cross-Disciplinary Fellowship at Boston University, and the U.S. Army Research Office. The fabrication and characterization procedures were carried out, largely, in the MIT.nano shared facilities.

MIT group releases white papers on governance of AI

Providing a resource for U.S. policymakers, a committee of MIT leaders and scholars has released a set of policy briefs that outlines a framework for the governance of artificial intelligence. The approach includes extending current regulatory and liability approaches in pursuit of a practical way to oversee AI.

The aim of the papers is to help enhance U.S. leadership in the area of artificial intelligence broadly, while limiting harm that could result from the new technologies and encouraging exploration of how AI deployment could be beneficial to society.

The main policy paper, “A Framework for U.S. AI Governance: Creating a Safe and Thriving AI Sector,” suggests AI tools can often be regulated by existing U.S. government entities that already oversee the relevant domains. The recommendations also underscore the importance of identifying the purpose of AI tools, which would enable regulations to fit those applications.

“As a country we’re already regulating a lot of relatively high-risk things and providing governance there,” says Dan Huttenlocher, dean of the MIT Schwarzman College of Computing, who helped steer the project, which stemmed from the work of an ad hoc MIT committee. “We’re not saying that’s sufficient, but let’s start with things where human activity is already being regulated, and which society, over time, has decided are high risk. Looking at AI that way is the practical approach.”

“The framework we put together gives a concrete way of thinking about these things,” says Asu Ozdaglar, the deputy dean of academics in the MIT Schwarzman College of Computing and head of MIT’s Department of Electrical Engineering and Computer Science (EECS), who also helped oversee the effort.

The project includes multiple additional policy papers and comes amid heightened interest in AI over the past year, as well as considerable new industry investment in the field. The European Union is currently trying to finalize AI regulations using its own approach, one that assigns broad levels of risk to certain types of applications. In that process, general-purpose AI technologies such as language models have become a new sticking point. Any governance effort faces the challenges of regulating both general and specific AI tools, as well as an array of potential problems including misinformation, deepfakes, surveillance, and more.

“We felt it was important for MIT to get involved in this because we have expertise,” says David Goldston, director of the MIT Washington Office. “MIT is one of the leaders in AI research, one of the places where AI first got started. Since we are among those creating technology that is raising these important issues, we feel an obligation to help address them.”

Purpose, intent, and guardrails

The main policy brief outlines how current policy could be extended to cover AI, using existing regulatory agencies and legal liability frameworks where possible. The U.S. has strict licensing laws in the field of medicine, for example. It is already illegal to impersonate a doctor; if AI were used to prescribe medicine or make a diagnosis under the guise of being a doctor, it should be clear that this would violate the law just as purely human malfeasance would. As the policy brief notes, this is not just a theoretical approach; autonomous vehicles, which deploy AI systems, are subject to regulation in the same manner as other vehicles.

An important step in making these regulatory and liability regimes work, the policy brief emphasizes, is having AI providers define the purpose and intent of AI applications in advance. Examining new technologies on this basis would then make clear which existing sets of regulations, and regulators, are germane to any given AI tool.

However, it is also the case that AI systems may exist at multiple levels, in what technologists call a “stack” of systems that together deliver a particular service. For example, a general-purpose language model may underlie a specific new tool. In general, the brief notes, the provider of a specific service might be primarily liable for problems with it. However, “when a component system of a stack does not perform as promised, it may be reasonable for the provider of that component to share responsibility,” as the first brief states. The builders of general-purpose tools should thus also be accountable should their technologies be implicated in specific problems.

“That makes governance more challenging to think about, but the foundation models should not be completely left out of consideration,” Ozdaglar says. “In a lot of cases, the models are from providers, and you develop an application on top, but they are part of the stack. What is the responsibility there? If systems are not on top of the stack, it doesn’t mean they should not be considered.”

Having AI providers clearly define the purpose and intent of AI tools, and requiring guardrails to prevent misuse, could also help determine the extent to which either companies or end users are accountable for specific problems. The policy brief states that a good regulatory regime should be able to identify what it calls a “fork in the toaster” situation — when an end user could reasonably be held responsible for knowing the problems that misuse of a tool could produce.

Responsive and flexible

While the policy framework involves existing agencies, it includes the addition of some new oversight capacity as well. For one thing, the policy brief calls for advances in auditing of new AI tools, which could move forward along a variety of paths, whether government-initiated, user-driven, or deriving from legal liability proceedings. There would need to be public standards for auditing, the paper notes, whether established by a nonprofit entity along the lines of the Public Company Accounting Oversight Board (PCAOB), or through a federal entity similar to the National Institute of Standards and Technology (NIST).

The paper also calls for considering the creation of a new, government-approved “self-regulatory organization” (SRO) agency along the functional lines of FINRA, the government-created Financial Industry Regulatory Authority. Such an agency, focused on AI, could accumulate domain-specific knowledge that would allow it to be responsive and flexible when engaging with a rapidly changing AI industry.

“These things are very complex, the interactions of humans and machines, so you need responsiveness,” says Huttenlocher, who is also the Henry Ellis Warren Professor in Computer Science and Artificial Intelligence and Decision-Making in EECS. “We think that if government considers new agencies, it should really look at this SRO structure. They are not handing over the keys to the store, as it’s still something that’s government-chartered and overseen.”

As the policy papers make clear, there are several additional legal matters that will need to be addressed in the realm of AI. Copyright and other intellectual property issues related to AI are already the subject of litigation.

And then there are what Ozdaglar calls “human plus” legal issues, where AI has capacities that go beyond what humans are capable of doing. These include things like mass-surveillance tools, and the committee recognizes they may require special legal consideration.

“AI enables things humans cannot do, such as surveillance or fake news at scale, which may need special consideration beyond what is applicable for humans,” Ozdaglar says. “But our starting point still enables you to think about the risks, and then how that risk gets amplified because of the tools.”

The set of policy papers addresses a number of regulatory issues in detail. For instance, one paper, “Labeling AI-Generated Content: Promises, Perils, and Future Directions,” by Chloe Wittenberg, Ziv Epstein, Adam J. Berinsky, and David G. Rand, builds on prior research experiments about media and audience engagement to assess specific approaches for denoting AI-produced material. Another paper, “Large Language Models,” by Yoon Kim, Jacob Andreas, and Dylan Hadfield-Menell, examines general-purpose language-based AI innovations.

“Part of doing this properly”

As the policy briefs make clear, another element of effective government engagement on the subject involves encouraging more research about how to make AI beneficial to society in general.

For instance, the policy paper, “Can We Have a Pro-Worker AI? Choosing a path of machines in service of minds,” by Daron Acemoglu, David Autor, and Simon Johnson, explores the possibility that AI might augment and aid workers, rather than being deployed to replace them — a scenario that would provide better long-term economic growth distributed throughout society.

This range of analyses, from a variety of disciplinary perspectives, is something the ad hoc committee wanted to bring to bear on the issue of AI regulation from the start — broadening the lens that can be brought to policymaking, rather than narrowing it to a few technical questions.

“We do think academic institutions have an important role to play both in terms of expertise about technology, and the interplay of technology and society,” says Huttenlocher. “It reflects what’s going to be important to governing this well, policymakers who think about social systems and technology together. That’s what the nation’s going to need.”

Indeed, Goldston notes, the committee is attempting to bridge a gap between those excited and those concerned about AI, by working to advocate that adequate regulation accompanies advances in the technology.

As Goldston puts it, the committee releasing these papers “is not a group that is antitechnology or trying to stifle AI. But it is, nonetheless, a group that is saying AI needs governance and oversight. That’s part of doing this properly. These are people who know this technology, and they’re saying that AI needs oversight.”

Huttenlocher adds, “Working in service of the nation and the world is something MIT has taken seriously for many, many decades. This is a very important moment for that.”

In addition to Huttenlocher, Ozdaglar, and Goldston, the ad hoc committee members are: Daron Acemoglu, Institute Professor and the Elizabeth and James Killian Professor of Economics in the School of Humanities, Arts, and Social Sciences; Jacob Andreas, associate professor in EECS; David Autor, the Ford Professor of Economics; Adam Berinsky, the Mitsui Professor of Political Science; Cynthia Breazeal, dean for Digital Learning and professor of media arts and sciences; Dylan Hadfield-Menell, the Tennenbaum Career Development Assistant Professor of Artificial Intelligence and Decision-Making; Simon Johnson, the Kurtz Professor of Entrepreneurship in the MIT Sloan School of Management; Yoon Kim, the NBX Career Development Assistant Professor in EECS; Sendhil Mullainathan, the Roman Family University Professor of Computation and Behavioral Science at the University of Chicago Booth School of Business; Manish Raghavan, assistant professor of information technology at MIT Sloan; David Rand, the Erwin H. Schell Professor at MIT Sloan and a professor of brain and cognitive sciences; Antonio Torralba, the Delta Electronics Professor of Electrical Engineering and Computer Science; and Luis Videgaray, a senior lecturer at MIT Sloan.