Toward a code-breaking quantum computer

The most recent email you sent was likely encrypted using a tried-and-true method that relies on the idea that even the fastest computer would be unable to efficiently break a gigantic number into factors.

Quantum computers, on the other hand, promise to rapidly crack complex cryptographic systems that a classical computer might never be able to unravel. This promise is based on a quantum factoring algorithm proposed in 1994 by Peter Shor, who is now a professor at MIT.

But while researchers have taken great strides in the last 30 years, scientists have yet to build a quantum computer powerful enough to run Shor’s algorithm.

As some researchers work to build larger quantum computers, others have been trying to improve Shor’s algorithm so it could run on a smaller quantum circuit. About a year ago, New York University computer scientist Oded Regev proposed a major theoretical improvement. His algorithm could run faster, but the circuit would require more memory.

Building off those results, MIT researchers have proposed a best-of-both-worlds approach that combines the speed of Regev’s algorithm with the memory-efficiency of Shor’s. This new algorithm is as fast as Regev’s, requires fewer quantum building blocks known as qubits, and has a higher tolerance to quantum noise, which could make it more feasible to implement in practice.

In the long run, this new algorithm could inform the development of novel encryption methods that can withstand the code-breaking power of quantum computers.

“If large-scale quantum computers ever get built, then factoring is toast and we have to find something else to use for cryptography. But how real is this threat? Can we make quantum factoring practical? Our work could potentially bring us one step closer to a practical implementation,” says Vinod Vaikuntanathan, the Ford Foundation Professor of Engineering, a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL), and senior author of a paper describing the algorithm.

The paper’s lead author is Seyoon Ragavan, a graduate student in the MIT Department of Electrical Engineering and Computer Science. The research will be presented at the 2024 International Cryptology Conference.

Cracking cryptography

To securely transmit messages over the internet, service providers like email clients and messaging apps typically rely on RSA, an encryption scheme invented by MIT researchers Ron Rivest, Adi Shamir, and Leonard Adleman in the 1970s (hence the name “RSA”). The system is based on the idea that factoring a 2,048-bit integer (a number with 617 digits) is too hard for a computer to do in a reasonable amount of time.
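To make the dependence on factoring concrete, here is a minimal toy sketch in Python. The tiny primes (61 and 53), the exponent 17, and the helper names are illustrative assumptions, not anything from the researchers' paper, and real RSA uses padding and 2,048-bit moduli. It shows that anyone who can factor the public modulus can rebuild the private key.

```python
from math import gcd

def toy_rsa_keypair(p, q, e=17):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1) * (q - 1)")
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (n, e), d

def factor_by_trial_division(n):
    """Factor n = p * q by brute force; only feasible for tiny n."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factor")

public_key, private_d = toy_rsa_keypair(61, 53)
n, e = public_key
message = 65
ciphertext = pow(message, e, n)

# An attacker who sees only (n, e) and the ciphertext factors n,
# then recomputes the private exponent from the factors.
p, q = factor_by_trial_division(n)
recovered_d = pow(e, -1, (p - 1) * (q - 1))
recovered_message = pow(ciphertext, recovered_d, n)
```

Trial division takes time that grows exponentially with the bit length of n, which is why the same attack is hopeless against a 2,048-bit modulus on a classical machine.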

That idea was flipped on its head in 1994 when Shor, then working at Bell Labs, introduced an algorithm which proved that a quantum computer could factor quickly enough to break RSA cryptography.

“That was a turning point. But in 1994, nobody knew how to build a large enough quantum computer. And we’re still pretty far from there. Some people wonder if they will ever be built,” says Vaikuntanathan.

It is estimated that a quantum computer would need about 20 million qubits to run Shor’s algorithm. Right now, the largest quantum computers have around 1,100 qubits.

A quantum computer performs computations using quantum circuits, just like a classical computer uses classical circuits. Each quantum circuit is composed of a series of operations known as quantum gates. These quantum gates utilize qubits, which are the smallest building blocks of a quantum computer, to perform calculations.

But quantum gates introduce noise, so having fewer gates would improve a machine’s performance. Researchers have been striving to enhance Shor’s algorithm so it could be run on a smaller circuit with fewer quantum gates.

That is precisely what Regev did with the circuit he proposed a year ago.

“That was big news because it was the first real improvement to Shor’s circuit from 1994,” Vaikuntanathan says.

The quantum circuit Shor proposed has a size proportional to the square of the size of the number being factored, that is, its length in bits. That means if one were to factor a 2,048-bit integer, the circuit would need millions of gates.
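As a rough check on that figure, take n to be the bit length of the integer and ignore constant and logarithmic factors (an assumption made here for illustration):

```latex
n = 2048, \qquad n^2 = 2048^2 = 4{,}194{,}304 \approx 4.2 \times 10^{6}
```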

Regev’s circuit requires significantly fewer quantum gates, but it needs many more qubits to provide enough memory. This presents a new problem.

“In a sense, some types of qubits are like apples or oranges. If you keep them around, they decay over time. You want to minimize the number of qubits you need to keep around,” explains Vaikuntanathan.

He heard Regev speak about his results at a workshop last August. At the end of his talk, Regev posed a question: Could someone improve his circuit so it needs fewer qubits? Vaikuntanathan and Ragavan took up that question.

Quantum ping-pong

To factor a very large number, a quantum circuit would need to run many times, performing operations that involve computing powers, like 2 to the power of 100.

But computing such large powers is costly and difficult to perform on a quantum computer, since quantum computers can only perform reversible operations. Squaring a number is not a reversible operation, so each time a number is squared, more quantum memory must be added to compute the next square.

The MIT researchers found a clever way to compute exponents using a series of Fibonacci numbers that requires simple multiplication, which is reversible, rather than squaring. Their method needs just two quantum memory units to compute any exponent.

“It is kind of like a ping-pong game, where we start with a number and then bounce back and forth, multiplying between two quantum memory registers,” Vaikuntanathan adds.
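The ping-pong idea can be imitated classically. The sketch below is an assumption-laden illustration in Python, not the researchers' quantum circuit: two ordinary integer variables stand in for the two quantum registers, and the function names are hypothetical. It first shows why squaring modulo N loses information while multiplying by an invertible number does not, then raises a base to a Fibonacci-number power using only alternating multiplications between the two registers.

```python
from math import gcd

def fibonacci(k):
    """Return F_k with F_1 = F_2 = 1."""
    a, b = 1, 1
    for _ in range(k - 2):
        a, b = b, a + b
    return b if k >= 2 else a

def fib_power_pingpong(base, k, modulus):
    """Return base**F_k mod modulus using two registers and no squaring."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if gcd(base, modulus) != 1:
        raise ValueError("base must be invertible modulo modulus")
    x = base % modulus  # holds base**F_1
    y = base % modulus  # holds base**F_2
    for step in range(k - 2):
        if step % 2 == 0:
            x = (x * y) % modulus  # exponents add: F_i + F_(i+1) = F_(i+2)
        else:
            y = (y * x) % modulus
    # The register updated last holds base**F_k.
    return x if (k - 2) % 2 == 1 else y

# Squaring mod 15 is not reversible: several inputs collide on the output 4.
squaring_collisions = sorted(x for x in range(15) if (x * x) % 15 == 4)

# Multiplying by 7 mod 15 is a bijection, because gcd(7, 15) == 1.
multiplication_is_bijection = sorted((x * 7) % 15 for x in range(15)) == list(range(15))
```

Each update multiplies one register by the other, which can be undone by multiplying by the inverse, so no extra memory is needed to keep the step reversible. This sketch only reaches exponents that are Fibonacci numbers.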

They also tackled the challenge of error correction. The circuits proposed by Shor and Regev require every quantum operation to be correct for their algorithm to work, Vaikuntanathan says. But error-free quantum gates would be infeasible on a real machine.

They overcame this problem using a technique to filter out corrupt results and only process the right ones.

The end result is a circuit that is significantly more memory-efficient. Plus, their error correction technique would make the algorithm more practical to deploy.

“The authors resolve the two most important bottlenecks in the earlier quantum factoring algorithm. Although still not immediately practical, their work brings quantum factoring algorithms closer to reality,” adds Regev.

In the future, the researchers hope to make their algorithm even more efficient and, someday, use it to test factoring on a real quantum circuit.

“The elephant-in-the-room question after this work is: Does it actually bring us closer to breaking RSA cryptography? That is not clear just yet; these improvements currently only kick in when the integers are much larger than 2,048 bits. Can we push this algorithm and make it more feasible than Shor’s even for 2,048-bit integers?” says Ragavan.

This work is funded by an Akamai Presidential Fellowship, the U.S. Defense Advanced Research Projects Agency, the National Science Foundation, the MIT-IBM Watson AI Lab, a Thornton Family Faculty Research Innovation Fellowship, and a Simons Investigator Award.

A framework for solving parabolic partial differential equations

Computer graphics and geometry processing research provide the tools needed to simulate physical phenomena like fire and flames, aiding the creation of visual effects in video games and movies as well as the fabrication of complex geometric shapes using tools like 3D printing.

Under the hood, mathematical problems called partial differential equations (PDEs) model these natural processes. Among the many PDEs used in physics and computer graphics, a class called second-order parabolic PDEs explains how phenomena can become smooth over time. The most famous example in this class is the heat equation, which predicts how heat diffuses along a surface or in a volume over time.

Researchers in geometry processing have designed numerous algorithms to solve these problems on curved surfaces, but their methods often apply only to linear problems or to a single PDE. An approach by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) tackles a broad class of these potentially nonlinear problems.

In a paper recently published in the Transactions on Graphics journal and presented at the SIGGRAPH conference, they describe an algorithm that solves different nonlinear parabolic PDEs on triangle meshes by splitting them into three simpler equations that can be solved with techniques graphics researchers already have in their software toolkit. This framework can help better analyze shapes and model complex dynamical processes.

“We provide a recipe: If you want to numerically solve a second-order parabolic PDE, you can follow a set of three steps,” says lead author Leticia Mattos Da Silva SM ’23, an MIT PhD student in electrical engineering and computer science (EECS) and CSAIL affiliate. “For each of the steps in this approach, you’re solving a simpler problem using simpler tools from geometry processing, but at the end, you get a solution to the more challenging second-order parabolic PDE.”

To accomplish this, Mattos Da Silva and her coauthors used Strang splitting, a technique that allows geometry processing researchers to break the PDE down into problems they know how to solve efficiently.

First, their algorithm advances a solution forward in time by solving the heat equation (also called the “diffusion equation”), which models how heat from a source spreads over a shape. Picture using a blowtorch to warm up a metal plate: this equation describes how heat from that spot would diffuse over it. This step can be completed easily with linear algebra.

Now, imagine that the parabolic PDE has additional nonlinear behaviors that are not described by the spread of heat. This is where the second step of the algorithm comes in: it accounts for the nonlinear piece by solving a Hamilton-Jacobi (HJ) equation, a first-order nonlinear PDE. 

While generic HJ equations can be hard to solve, Mattos Da Silva and coauthors prove that their splitting method applied to many important PDEs yields an HJ equation that can be solved via convex optimization algorithms. Convex optimization is a standard tool for which researchers in geometry processing already have efficient and reliable software. In the final step, the algorithm solves the heat equation once more, which completes the advance of the more complex second-order parabolic PDE forward in time.
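The three-step pattern can be seen on a much simpler problem. The Python sketch below is a toy under stated assumptions, not the researchers' mesh-based solver: it applies Strang splitting to the scalar equation du/dt = -a*u - u^2. Here the linear decay term stands in for the heat step, the quadratic term stands in for the nonlinear step, and both pieces are solved exactly. Standard Strang splitting takes a half step of the first piece, a full step of the second, and another half step of the first. All function names are hypothetical.

```python
import math

def linear_flow(u, h, a=1.0):
    """Exact solution of du/dt = -a * u over a time span h."""
    return u * math.exp(-a * h)

def nonlinear_flow(u, h):
    """Exact solution of du/dt = -u**2 over a time span h (for u >= 0)."""
    return u / (1.0 + h * u)

def strang_step(u, h, a=1.0):
    """One Strang step: half linear, full nonlinear, half linear."""
    u = linear_flow(u, h / 2.0, a)
    u = nonlinear_flow(u, h)
    u = linear_flow(u, h / 2.0, a)
    return u

def integrate(u0, t_end, n_steps, a=1.0):
    """Advance u0 to time t_end using n_steps Strang steps."""
    h = t_end / n_steps
    u = u0
    for _ in range(n_steps):
        u = strang_step(u, h, a)
    return u

def exact_solution(u0, t, a=1.0):
    """Closed-form solution of du/dt = -a * u - u**2."""
    return a * u0 / ((a + u0) * math.exp(a * t) - u0)

reference = exact_solution(1.0, 1.0)
error_coarse = abs(integrate(1.0, 1.0, 10) - reference)
error_fine = abs(integrate(1.0, 1.0, 20) - reference)
```

Neither substep solves the full equation, yet composing them approximates it, and the error shrinks as the step size is reduced.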


Among other applications, the framework could help simulate fire and flames more efficiently. “There’s a huge pipeline that creates a video with flames being simulated, but at the heart of it is a PDE solver,” says Mattos Da Silva. For these pipelines, an essential step is solving the G-equation, a nonlinear parabolic PDE that models the front propagation of the flame and can be solved using the researchers’ framework.

The team’s algorithm can also solve the diffusion equation in the logarithmic domain, where it becomes nonlinear. Senior author Justin Solomon, associate professor of EECS and leader of the CSAIL Geometric Data Processing Group, previously developed a state-of-the-art technique for optimal transport that requires taking the logarithm of the result of heat diffusion. Mattos Da Silva’s framework provided more reliable computations by doing diffusion directly in the logarithmic domain. This enabled a more stable way to, for example, find a geometric notion of average among distributions on surface meshes like a model of a koala.
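To see why diffusion becomes nonlinear in the logarithmic domain, here is a short derivation. The symbols u and v are introduced here for illustration, with u a positive solution of the heat equation and v its logarithm:

```latex
u = e^{v}, \qquad \partial_t u = \Delta u
\;\Longrightarrow\;
e^{v}\,\partial_t v = e^{v}\left(\Delta v + |\nabla v|^{2}\right)
\;\Longrightarrow\;
\partial_t v = \Delta v + |\nabla v|^{2}
```

The Laplacian term is a heat step, and the squared-gradient term is a first-order nonlinear piece of the kind the Hamilton-Jacobi step is described as handling.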

Even though their framework focuses on general, nonlinear problems, it can also be used to solve linear PDEs. For instance, the method solves the Fokker-Planck equation, where heat diffuses in a linear way, but there are additional terms that drift in the same direction heat is spreading. In a straightforward application, the approach modeled how swirls would evolve over the surface of a triangulated sphere. The result resembles purple-and-brown latte art.

The researchers note that this project is a starting point for tackling the nonlinearity in other PDEs that appear in graphics and geometry processing head-on. For example, they focused on static surfaces but would like to apply their work to moving ones, too. Moreover, their framework solves problems involving a single parabolic PDE, but the team would also like to tackle problems involving coupled parabolic PDEs. These types of problems arise in biology and chemistry, where the equation describing the evolution of each agent in a mixture, for example, is linked to the others’ equations.

Mattos Da Silva and Solomon wrote the paper with Oded Stein, assistant professor at the University of Southern California’s Viterbi School of Engineering. Their work was supported, in part, by an MIT Schwarzman College of Computing Fellowship funded by Google, a MathWorks Fellowship, the Swiss National Science Foundation, the U.S. Army Research Office, the U.S. Air Force Office of Scientific Research, the U.S. National Science Foundation, MIT-IBM Watson AI Lab, the Toyota-CSAIL Joint Research Center, Adobe Systems, and Google Research.

3 Questions: How to prove humanity online

As artificial intelligence agents become more advanced, it could become increasingly difficult to distinguish between AI-powered users and real humans on the internet. In a new white paper, researchers from MIT, OpenAI, Microsoft, and other tech companies and academic institutions propose the use of personhood credentials, a verification technique that enables someone to prove they are a real human online, while preserving their privacy.

MIT News spoke with two co-authors of the paper, Nouran Soliman, an electrical engineering and computer science graduate student, and Tobin South, a graduate student in the Media Lab, about the need for such credentials, the risks associated with them, and how they could be implemented in a safe and equitable way.

Q: Why do we need personhood credentials?

Tobin South: AI capabilities are rapidly improving. While a lot of the public discourse has been about how chatbots keep getting better, sophisticated AI enables far more capabilities than just a better ChatGPT, like the ability of AI to interact online autonomously. AI could have the ability to create accounts, post content, generate fake content, pretend to be human online, or algorithmically amplify content at a massive scale. This unlocks a lot of risks. You can think of this as a “digital imposter” problem, where it is getting harder to distinguish between sophisticated AI and humans. Personhood credentials are one potential solution to that problem.

Nouran Soliman: Such advanced AI capabilities could help bad actors run large-scale attacks or spread misinformation. The internet could be filled with AIs that are resharing content from real humans to run disinformation campaigns. It is going to become harder to navigate the internet, and social media specifically. You could imagine using personhood credentials to filter out certain content and moderate content on your social media feed or determine the trust level of information you receive online.

Q: What is a personhood credential, and how can you ensure such a credential is secure?

South: Personhood credentials allow you to prove you are human without revealing anything else about your identity. These credentials let you take information from an entity like the government, who can guarantee you are human, and then through privacy technology, allow you to prove that fact without sharing any sensitive information about your identity. To get a personhood credential, you are going to have to show up in person or have a relationship with the government, like a tax ID number. There is an offline component. You are going to have to do something that only humans can do. AIs can’t turn up at the DMV, for instance. And even the most sophisticated AIs can’t fake or break cryptography. So, we combine two ideas — the security that we have through cryptography and the fact that humans still have some capabilities that AIs don’t have — to make really robust guarantees that you are human.

Soliman: But personhood credentials can be optional. Service providers can let people choose whether they want to use one or not. Right now, if people only want to interact with real, verified people online, there is no reasonable way to do it. And beyond just creating content and talking to people, at some point AI agents are also going to take actions on behalf of people. If I am going to buy something online, or negotiate a deal, then maybe in that case I want to be certain I am interacting with entities that have personhood credentials to ensure they are trustworthy.

South: Personhood credentials build on top of an infrastructure and a set of security technologies we’ve had for decades, such as the use of identifiers like an email account to sign into online services, and they can complement those existing methods.

Q: What are some of the risks associated with personhood credentials, and how could you reduce those risks?

Soliman: One risk comes from how personhood credentials could be implemented. There is a concern about concentration of power. Let’s say one specific entity is the only issuer, or the system is designed in such a way that all the power is given to one entity. This could raise a lot of concerns for a part of the population — maybe they don’t trust that entity and don’t feel it is safe to engage with them. We need to implement personhood credentials in such a way that people trust the issuers and ensure that people’s identities remain completely isolated from their personhood credentials to preserve privacy.

South: If the only way to get a personhood credential is to physically go somewhere to prove you are human, then that could be scary if you are in a sociopolitical environment where it is difficult or dangerous to go to that physical location. That could prevent some people from having the ability to share their messages online in an unfettered way, possibly stifling free expression. That’s why it is important to have a variety of issuers of personhood credentials, and an open protocol to make sure that freedom of expression is maintained.

Soliman: Our paper is trying to encourage governments, policymakers, leaders, and researchers to invest more resources in personhood credentials. We are suggesting that researchers study different implementation directions and explore the broader impacts personhood credentials could have on the community. We need to make sure we create the right policies and rules about how personhood credentials should be implemented.

South: AI is moving very fast, certainly much faster than the speed at which governments adapt. It is time for governments and big companies to start thinking about how they can adapt their digital systems to be ready to prove that someone is human, but in a way that is privacy-preserving and safe, so we can be ready when we reach a future where AI has these advanced capabilities. 

Four Recipients Announced for new Transformative Research Funds

The Department of EECS is pleased to announce the four inaugural recipients of the Transformative Research Fund, an exciting new funding opportunity designed to facilitate bold and pivotal research, especially research that applies recent breakthrough technologies (such as generative AI) to important problems with broad societal impact. The Transformative Research Funds, which were made possible through the generous support of The SPC Foundation and Dick Thornton, exist to “place necessary bets on novel ideas and explore exciting, untested directions,” a priority that has become particularly pressing with the breakneck pace of advances in AI and digital technologies. These new technologies have the potential for transformative change in fields ranging from healthcare to communications, commerce, the arts, sciences, and education.

As Department Head Asu Ozdaglar explains, “We need a roadmap for the future that combines frontier knowledge in AI and computing technologies with domain knowledge in several fields, including engineering, sciences, social sciences, and humanities. Transformative Research Funds will support groundbreaking projects that will apply the highest level of expertise in computer science, AI, and electrical engineering to some of humanity’s greatest problems.”

The initial call for proposals this year yielded a dozen responses; the following four proposals were selected to receive up to $200K in funding through the initiative, which will reopen for a new round of proposals next year. 

Laura Lewis, the Athinoula A. Martinos Associate Professor, proposes “Technology for measuring brain fluid clearance in the home sleep environment”, a study relating to Alzheimer’s Disease in which machine learning strategies will be used to help analyze biometric data collected by EEG. 

Mina Konaković Luković, Assistant Professor, proposes “Vision-Activated Directed Evolution”, in which researchers will combine robotics and a vision-based AI system to meet multidimensional objectives and solve complex problems in directed evolution. 

Stefanie Mueller, TIBCO Career Development Associate Professor, proposes “Real-time Radiation Feedback for Breast Cancer Treatment via AI-enabled Electrochemical Impedance Sensing” in collaboration with Dana-Farber Cancer Institute of Brigham and Women’s Hospital. The project is aimed at giving medical technicians real-time feedback on the effects of radiation on their patients’ tissue, allowing them to adjust radiation locations and dosages for greater cancer treatment efficacy. 

Abigail Bodner (EECS+EAPS) and Tess Smidt have teamed to present their proposal, “Multi-Scale Climate Turbulence with Euclidean Neural Networks”, which aims to apply formal Euclidean symmetry-equivariant neural networks (ENNs) to the challenge of learning and representing mappings between 2D and 3D ocean turbulence, a representation problem within climate change modeling.

Duane Boning named vice provost for international activities

Duane Boning ’84, SM ’86, PhD ’91 has been named the next MIT vice provost for international activities (VPIA), effective Sept. 1. Boning, the Clarence J. LeBel Professor in Electrical Engineering and Computer Science (EECS) at MIT, succeeds Japan Steel Industry Professor Richard Lester, who has served as VPIA since 2015.

The VPIA provides intellectual leadership, guidance, and oversight of MIT’s international policies and engagements. In this role, Boning will conduct strategic reviews of the portfolio of international activities, advise the administration on global strategic priorities, and work with academic unit leaders and researchers to develop major new global programs and projects. Boning will also help coordinate faculty and administrative reviews of certain international projects to identify and manage U.S. national security, human rights, and economic and other risks.

“Duane has an exceptional record of accomplishment and will provide the forward-looking and collaborative leadership needed to guide the Institute’s international engagements and policies,” says Provost Cynthia Barnhart. “I am thrilled to welcome him to the role.”

Boning’s ties to MIT are long and lasting: He received his SB, SM, and PhD degrees in EECS at the Institute in 1984, 1986, and 1991, respectively. His tenure includes several campus leadership positions, including as associate department head of EECS from 2004 to 2011, and associate chair of the faculty from 2019 to 2021. He is the associate director for computation and CAD for the Microsystems Technology Laboratories, where he leads the MTL Statistical Metrology Group.

In 2016, Boning became the engineering faculty co-director of the MIT Leaders for Global Operations (LGO) program. With LGO Sloan faculty co-director Retsef Levi, Boning led the formation of MIT’s Machine Intelligence for Manufacturing & Operations (MIMO), which extends LGO activities in machine intelligence through additional industrial research projects, seminars, and workshops.

His experiences as a researcher and an educator have helped him appreciate the benefits of MIT’s international collaboration efforts, Boning says. “Taking on the VPIA role is about me wanting to continue and amplify that appreciation into the future, where I think it’s going to become even more important for MIT to remain and be engaged in the world.”

Among his previous leadership roles in international collaborations, Boning served as faculty director of the MIT/Masdar Institute Cooperative Program from 2011 to 2018, and director/faculty lead of the MIT Skoltech Initiative from 2011 to 2013.

Boning says the office of the VPIA can act as a driver and initiator of international engagement, but he looks forward to being “a facilitator or convener, a coalescing point to find out where there are international opportunities and to bring people to them.”

“Finding ways to support higher MIT institutional priorities through international activities will be important,” he adds, citing as an example of these priorities the Climate Project at MIT launched by President Sally Kornbluth in 2023. “We will be puzzling out how our international components can best contribute to that and other initiatives.”

Lester will step into the role of interim vice president of climate (VPC), reporting to Kornbluth, while the search for a permanent VPC continues. Lester expects to complete his interim role and return to his MIT research activities at the end of the calendar year.

Formative experiences

Boning’s participation in the Cambridge-MIT Institute was one of his first experiences in international research and education. “It was eye-opening, seeing, ‘oh, you mean they don’t have weekly problem sets here?’” he jokes. “It showed me very different approaches to education that can also work, and how I might try some of those ideas in my own context.”

He looks back on the Cambridge experience and later work in manufacturing research with the Singapore-MIT Alliance for Research and Technology “with fondness in my heart,” he says. “It enabled me to see how international activities can benefit my own research and the research of my colleagues around me.”

His leadership in larger programs such as LGO and the MIT/Masdar program taught him the importance of creating and recruiting for MIT’s international collaborations, “by finding appropriate ways to connect with the passions of MIT faculty,” Boning says.

Boning says he will also draw on his experiences in departmental and faculty-level governance to guide him in his new role. “I recognize how broad MIT is and how widespread the different practices and cultures are in different schools and departments and programs across MIT,” he explains. “It’s given me a broader appreciation of faculty, staff, administration — everybody across all corners of the Institute and how they contribute to MIT’s mission.”

Future goals

Barnhart praised Lester, the outgoing VPIA, saying that “Richard’s body of work as vice provost for international activities is impressive and impactful. He has applied his commendable leadership skills, sharp intellect, and broad vision to transforming the ways MIT engages and collaborates with partners across the globe.”

She noted that Lester had expanded the reach of MIT’s research and education missions through numerous international collaborations, especially in Africa and Asia. As convenor and co-chair of the MIT China Strategy Group, Lester led the preparation and implementation of an influential November 2022 report on how MIT should approach its interactions and collaborations with China.

Boning cites the China report as an excellent example of how the VPIA can identify best practices and address head-on the values and complexities of international collaboration. “We have to live up to the reputation of the mission of MIT in intellectual development and freedom, while also recognizing that there are risks that need to be managed and choices that need to be made,” he says.

Boning’s field of expertise — semiconductor and photonics manufacturing and design — has become a topic of intense interest and attention in innovation and economic circles, and he intends to stay engaged fully in research as a result. As VPIA, he may have to step back from some of his teaching, however, “and that is the piece I will miss the most. I will miss any semester when I am not in the classroom with students,” he says.

“But I’m curious about what the future is going to bring — boundless new opportunities, new technologies, AI — and how MIT can best facilitate the wise application of these for the world’s problems,” Boning adds. “I’m looking forward to lots of conversations with faculty colleagues and the whole community around what MIT can be doing, what we should be doing, and how we can best do it to support MIT’s mission through international activities.”

Student Spotlight: Ryn Moore ’24

Ryn Moore spins fire in a simple loop, resulting in a bright circle as the camera captures several rotations of the quickly moving torch.

Ryn Moore graduated this spring, majoring in 6-1 Electrical Science and Engineering and minoring in Biomedical Engineering. Despite a challenging courseload, Ryn took full advantage of MIT’s extensive range of quirky activities and clubs, including one where participants literally get to play with fire. Ryn chose the following questions from a long list designed to showcase their personality and experience at MIT.

Tell us about one interest or hobby you’ve discovered since you came to MIT.

Oh my goodness, I’ve picked up so many hobbies, it’s really hard to choose just one. I think the one that is most unique/something that’s really hard to come across outside of MIT is fire spinning, though. Getting involved with Spinning Arts has been really rewarding because the club is full of really kind, encouraging people. I was a dancer growing up, so fire spinning has been a way to continue that with a new twist. Plus the fire makes really cool noises as it whooshes past your ears.

Moore spins in a complex pattern. (Photo courtesy Ryn Moore)

Who’s your favorite artist?

I really love listening to The Mountain Goats. They have over 600 songs in shockingly diverse styles and genres, so I just never get bored of their discography. Some of my favorite songs include “Amy (Spent Gladiator 1)”, “Rain in Soho”, and “Oceanographer’s Choice”.

Are you a re-reader or a re-watcher—and if so, what are your comfort books, shows, or movies?

I’m definitely a re-reader/re-watcher, but only when I’m feeling down. My friends always know that I need hugs and support if they see me watching Dr. Horrible’s Sing-Along Blog.

What’s your favorite room or building within MIT, and what’s special about it to you?

36-112 holds a really special place in my heart. This is one of the rooms that Tech Squares – MIT’s square and round dancing club – meets in, so I’ve spent a lot of time there over the years engaging in one of my favorite hobbies. I’ve spent so much time dancing in that room, it actually feels weird to have classes in there and be learning instead of dancing!

Tell me about one teacher from your past — here at MIT, at your high school, or even earlier — who had an influence on the person you’ve become.

I think the professor who has been most influential for me here at MIT is Kyle Keane. He taught Principles and Practices of Assistive Technology, and his outlook on disability, justice, and purpose in life was extremely inspiring. He is so passionate about everything he does, and he helped me see that there are people who care about the things I care about and careers out there that will let me pursue things I’m passionate about in a professional capacity.

Tell us about your favorite game.

I am part of an ILG called ET (which I cannot speak highly enough of), and whenever we are just hanging out, we like to try to spoonerize or portmanteau as many phrases as possible. So if someone says ‘baby duck’ someone else will respond with ‘buck.’ Or if someone says ‘laser maze’ someone else will say ‘mazer lase.’ The thing that makes this really funny is that it’s not an explicit game we play, it just comes up in conversation, so you never know when someone is going to spoonerize something you say.

Helping robots practice skills independently to adapt to unfamiliar environments

The phrase “practice makes perfect” is usually reserved for humans, but it’s also a great maxim for robots newly deployed in unfamiliar environments.

Picture a robot arriving in a warehouse. It comes packaged with the skills it was trained on, like placing an object, and now it needs to pick items from a shelf it’s not familiar with. At first, the machine struggles with this, since it needs to get acquainted with its new surroundings. To improve, the robot will need to understand which skills within an overall task it needs improvement on, then specialize (or parameterize) that action.

A human onsite could program the robot to optimize its performance, but researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and The AI Institute have developed a more effective alternative. Presented at the Robotics: Science and Systems Conference last month, their “Estimate, Extrapolate, and Situate” (EES) algorithm enables these machines to practice on their own, potentially helping them improve at useful tasks in factories, households, and hospitals. 

Sizing up the situation

To help robots get better at activities like sweeping floors, EES works with a vision system that locates and tracks the machine’s surroundings. Then, the algorithm estimates how reliably the robot executes an action (like sweeping) and whether it would be worthwhile to practice more. EES forecasts how well the robot could perform the overall task if it refines that particular skill, and finally, it practices. The vision system subsequently checks whether that skill was done correctly after each attempt.
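The estimate, extrapolate, and situate loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' implementation: the `SkillRecord` structure, the smoothed success rate, the fixed `gain` from one round of practice, and the assumption that the overall task succeeds only if every skill succeeds are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class SkillRecord:
    """Practice history for one skill (hypothetical structure)."""
    name: str
    successes: int
    attempts: int

    def estimated_competence(self) -> float:
        # Estimate: smoothed success rate (Laplace smoothing is an assumption).
        return (self.successes + 1) / (self.attempts + 2)


def extrapolated_competence(skill: SkillRecord, gain: float = 0.1) -> float:
    # Extrapolate: assume practice closes a fixed fraction of the gap to 1.0.
    current = skill.estimated_competence()
    return current + gain * (1.0 - current)


def task_success(competences: dict) -> float:
    # Situate: assume the task needs every skill to succeed in sequence.
    probability = 1.0
    for value in competences.values():
        probability *= value
    return probability


def choose_skill_to_practice(skills: list) -> str:
    """Pick the skill whose forecast improvement helps the whole task most."""
    current = {s.name: s.estimated_competence() for s in skills}
    best_name, best_value = skills[0].name, -1.0
    for s in skills:
        trial = dict(current)
        trial[s.name] = extrapolated_competence(s)
        value = task_success(trial)
        if value > best_value:
            best_name, best_value = s.name, value
    return best_name


skills = [
    SkillRecord("pick_brush", successes=9, attempts=10),
    SkillRecord("sweep", successes=3, attempts=10),
    SkillRecord("place_bin", successes=8, attempts=10),
]
print(choose_skill_to_practice(skills))  # prints "sweep"
```

Under these assumptions the weakest skill wins, because the same fractional gain raises the product of success rates most when applied to the smallest factor. A real system would replace the fixed gain with a learned forecast.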

EES could come in handy in places like a hospital, factory, house, or coffee shop. For example, if you wanted a robot to clean up your living room, it would need help practicing skills like sweeping. According to Nishanth Kumar SM ’24 and his colleagues, though, EES could help that robot improve without human intervention, using only a few practice trials.

“Going into this project, we wondered if this specialization would be possible in a reasonable amount of samples on a real robot,” says Kumar, co-lead author of a paper describing the work, PhD student in electrical engineering and computer science, and a CSAIL affiliate. “Now, we have an algorithm that enables robots to get meaningfully better at specific skills in a reasonable amount of time with tens or hundreds of data points, an upgrade from the thousands or millions of samples that a standard reinforcement learning algorithm requires.”

See Spot sweep

EES’s knack for efficient learning was evident when implemented on Boston Dynamics’ Spot quadruped during research trials at The AI Institute. The robot, which has an arm attached to its back, completed manipulation tasks after practicing for a few hours. In one demonstration, the robot learned how to securely place a ball and ring on a slanted table in roughly three hours. In another, the algorithm guided the machine to improve at sweeping toys into a bin within about two hours. Both results appear to be an upgrade from previous frameworks, which would have likely taken more than 10 hours per task.

“We aimed to have the robot collect its own experience so it can better choose which strategies will work well in its deployment,” says co-lead author Tom Silver SM ’20, PhD ’24, an electrical engineering and computer science (EECS) alumnus and CSAIL affiliate who is now an assistant professor at Princeton University. “By focusing on what the robot knows, we sought to answer a key question: In the library of skills that the robot has, which is the one that would be most useful to practice right now?”

EES could eventually help streamline autonomous practice for robots in new deployment environments, but for now, it comes with a few limitations. For starters, the researchers used tables that were low to the ground, which made it easier for the robot to see its objects. Kumar and Silver also 3D printed an attachable handle that made the brush easier for Spot to grab. The robot didn’t detect some items and identified objects in the wrong places, so the researchers counted those errors as failures.

Giving robots homework

The researchers note that the practice speeds from the physical experiments could be accelerated further with the help of a simulator. Instead of physically working at each skill autonomously, the robot could eventually combine real and virtual practice. They hope to make their system faster with less latency, engineering EES to overcome the imaging delays the researchers experienced. In the future, they may investigate an algorithm that reasons over sequences of practice attempts instead of planning which skills to refine.

“Enabling robots to learn on their own is both incredibly useful and extremely challenging,” says Danfei Xu, an assistant professor in the School of Interactive Computing at Georgia Tech and a research scientist at NVIDIA AI, who was not involved with this work. “In the future, home robots will be sold to all sorts of households and expected to perform a wide range of tasks. We can’t possibly program everything they need to know beforehand, so it’s essential that they can learn on the job. However, letting robots loose to explore and learn without guidance can be very slow and might lead to unintended consequences. The research by Silver and his colleagues introduces an algorithm that allows robots to practice their skills autonomously in a structured way. This is a big step towards creating home robots that can continuously evolve and improve on their own.”

Silver and Kumar’s co-authors are The AI Institute researchers Stephen Proulx and Jennifer Barry, plus four CSAIL members: Northeastern University PhD student and visiting researcher Linfeng Zhao, MIT EECS PhD student Willie McClinton, and MIT EECS professors Leslie Pack Kaelbling and Tomás Lozano-Pérez. Their work was supported, in part, by The AI Institute, the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, the U.S. Office of Naval Research, the U.S. Army Research Office, and MIT Quest for Intelligence, with high-performance computing resources from the MIT SuperCloud and Lincoln Laboratory Supercomputing Center.

A new approach to fine-tuning quantum materials

Quantum materials (those whose electronic properties are governed by the principles of quantum mechanics, such as correlation and entanglement) can exhibit exotic behaviors under certain conditions, such as the ability to transmit electricity without resistance, known as superconductivity. However, to get the best performance out of these materials, they need to be properly tuned, much as race cars do. A team led by Mingda Li, an associate professor in MIT’s Department of Nuclear Science and Engineering (NSE), has demonstrated a new, ultra-precise way to tweak the characteristics of quantum materials, using a particular class of these materials, Weyl semimetals, as an example.

The new technique is not limited to Weyl semimetals. “We can use this method for any inorganic bulk material, and for thin films as well,” maintains NSE postdoc Manasi Mandal, one of two lead authors of an open-access paper — published recently in Applied Physics Reviews — that reported on the group’s findings.

The experiment described in the paper focused on a specific type of Weyl semimetal, a tantalum phosphide (TaP) crystal. Materials can be classified by their electrical properties: metals conduct electricity readily, whereas insulators impede the free flow of electrons. A semimetal lies somewhere in between. It can conduct electricity, but only in a narrow frequency band or channel. Weyl semimetals are part of a wider category of so-called topological materials that have certain distinctive features. For instance, they possess curious electronic structures — kinks or “singularities” called Weyl nodes, which are swirling patterns around a single point (configured in either a clockwise or counterclockwise direction) that resemble hair whorls or, more generally, vortices. The presence of Weyl nodes confers unusual, as well as useful, electrical properties. And a key advantage of topological materials is that their sought-after qualities can be preserved, or “topologically protected,” even when the material is disturbed.

“That’s a nice feature to have,” explains Abhijatmedhi Chotrattanapituk, a PhD student in MIT’s Department of Electrical Engineering and Computer Science and the other lead author of the paper. “When you try to fabricate this kind of material, you don’t have to be exact. You can tolerate some imperfections, some level of uncertainty, and the material will still behave as expected.”

Like water in a dam

The “tuning” that needs to happen relates primarily to the Fermi level, which is the highest energy level occupied by electrons in a given physical system or material. Mandal and Chotrattanapituk suggest the following analogy: Consider a dam that can be filled with varying levels of water. One can raise that level by adding water or lower it by removing water. In the same way, one can adjust the Fermi level of a given material simply by adding or subtracting electrons.

To fine-tune the Fermi level of the Weyl semimetal, Li’s team did something similar, but instead of adding actual electrons, they added negative hydrogen ions (each consisting of a proton and two electrons) to the sample. The process of introducing a foreign particle, or defect, into the TaP crystal — in this case by substituting a hydrogen ion for a tantalum atom — is called doping. And when optimal doping is achieved, the Fermi level will coincide with the energy level of the Weyl nodes. That’s when the material’s desired quantum properties will be most fully realized.

For Weyl semimetals, the Fermi level is especially sensitive to doping. Unless that level is set close to the Weyl nodes, the material’s properties can diverge significantly from the ideal. This extreme sensitivity stems from the peculiar geometry of the Weyl node. If one were to think of the Fermi level as the water level in a reservoir, the reservoir in a Weyl semimetal is not shaped like a cylinder; it’s shaped like an hourglass, and the Weyl node is located at the narrowest point, or neck, of that hourglass. Adding too much or too little water would miss the neck entirely, just as adding too many or too few electrons to the semimetal would miss the node altogether.

Fire up the hydrogen

To reach the necessary precision, the researchers utilized MIT’s two-stage “Tandem” ion accelerator — located at the Center for Science and Technology with Accelerators and Radiation (CSTAR) — and buffeted the TaP sample with high-energy ions coming out of the powerful (1.7 million volt) accelerator beam. Hydrogen ions were chosen for this purpose because they are the smallest negative ions available and thus alter the material less than a much larger dopant would. “The use of advanced accelerator techniques allows for greater precision than was ever before possible, setting the Fermi level to milli-electron volt [thousandths of an electron volt] accuracy,” says Kevin Woller, the principal research scientist who leads the CSTAR lab. “Additionally, high-energy beams allow for the doping of bulk crystals beyond the limitations of thin films only a few tens of nanometers thick.”

The procedure, in other words, involves bombarding the sample with hydrogen ions until a sufficient number of electrons are taken in to make the Fermi level just right. The question is: how long do you run the accelerator, and how do you know when enough is enough? The goal is to tune the material until the Fermi level is neither too low nor too high.

“The longer you run the machine, the higher the Fermi level gets,” Chotrattanapituk says. “The difficulty is that we cannot measure the Fermi level while the sample is in the accelerator chamber.” The normal way to handle that would be to irradiate the sample for a certain amount of time, take it out, measure it, and then put it back in if the Fermi level is not high enough. “That can be practically impossible,” Mandal adds.

To streamline the protocol, the team has devised a theoretical model that first predicts how many electrons are needed to increase the Fermi level to the preferred level and translates that to the number of negative hydrogen ions that must be added to the sample. The model can then tell them how long the sample ought to be kept in the accelerator chamber.

The good news, Chotrattanapituk says, is that their simple model agrees within a factor of 2 with trusted conventional models that are much more computationally intensive and may require access to a supercomputer. The group’s main contributions are two-fold, he notes: offering a new, accelerator-based technique for precision doping and providing a theoretical model that can guide the experiment, telling researchers how much hydrogen should be added to the sample depending on the energy of the ion beam, the exposure time, and the size and thickness of the sample.
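The chain of estimates described above (electrons needed, then ions, then exposure time) can be illustrated with a back-of-envelope sketch. This is not the group's model. It assumes an idealized Weyl cone with linear dispersion, a Fermi level that starts below the node, a hypothetical number of electrons donated per implanted ion, and a singly charged beam; it ignores implantation depth and beam energy entirely. Every numerical input is a placeholder, not a value from the paper.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J*s
E_CHARGE = 1.602176634e-19  # elementary charge, C (also joules per eV)


def carriers_per_node(offset_ev: float, fermi_velocity: float) -> float:
    """Carrier density (m^-3) of one ideal Weyl cone, E = hbar * v * k.

    offset_ev is the distance of the Fermi level from the node, in eV.
    """
    k_f = abs(offset_ev) * E_CHARGE / (HBAR * fermi_velocity)
    return k_f ** 3 / (6.0 * math.pi ** 2)


def ions_needed(offset_ev, fermi_velocity, n_nodes, volume_m3,
                electrons_per_ion=1.0):
    """Ions required to move the Fermi level onto the node (toy estimate)."""
    density = n_nodes * carriers_per_node(offset_ev, fermi_velocity)
    return density * volume_m3 / electrons_per_ion


def exposure_time_s(n_ions: float, beam_current_a: float) -> float:
    """Beam time in seconds; a singly charged beam delivers I / e ions per second."""
    return n_ions * E_CHARGE / beam_current_a


# Placeholder inputs: 10 meV offset, assumed Fermi velocity, one node,
# and a 1 mm x 1 mm x 0.1 mm sample.
ions = ions_needed(offset_ev=0.010, fermi_velocity=3.0e5, n_nodes=1,
                   volume_m3=1.0e-10)
print(f"ions needed: {ions:.3g}")
print(f"exposure at 1 nA: {exposure_time_s(ions, 1.0e-9):.3g} s")
```

In this idealized picture the carrier density grows with the cube of the energy offset, which is one way to read the hourglass analogy: near the neck, a small change in electron count moves the Fermi level a long way.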

The band structure of pristine (left) and irradiated TaP (right). H⁻ ion implantation is depicted on the TaP crystal. The red horizontal plane indicates the Fermi level. The illustration shows the perturbation in Weyl nodes’ positions from the pristine to irradiated TaP as new Weyl nodes were introduced from band crossing due to H⁻ ion irradiation. Image: Ella Maru Studio

Fine things to come with fine-tuning

This could pave the way to a major practical advance, Mandal notes, because their approach can potentially bring the Fermi level of a sample to the requisite value in a matter of minutes — a task that, by conventional methods, has sometimes taken weeks without ever reaching the required degree of milli-eV precision.

Li believes that an accurate and convenient method for fine-tuning the Fermi level could have broad applicability. “When it comes to quantum materials, the Fermi level is practically everything,” he says. “Many of the effects and behaviors that we seek only manifest themselves when the Fermi level is at the right location.” With a well-adjusted Fermi level, for example, one could raise the critical temperature at which materials become superconducting. Thermoelectric materials, which convert temperature differences into an electrical voltage, similarly become more efficient when the Fermi level is set just right. Precision tuning might also play a helpful role in quantum computing.

Thomas Zac Ward, a senior scientist at the Oak Ridge National Laboratory, offered a bullish assessment: “This work provides a new route for the experimental exploration of the critical, yet still poorly understood, behaviors of emerging materials. The ability to precisely control the Fermi level of a topological material is an important milestone that can help bring new quantum information and microelectronics device architectures to fruition.”

LLMs develop their own understanding of reality as their language abilities improve

Ask a large language model (LLM) like GPT-4 to smell a rain-soaked campsite, and it’ll politely decline. Ask the same system to describe that scent to you, and it’ll wax poetic about “an air thick with anticipation” and “a scent that is both fresh and earthy,” despite having neither prior experience with rain nor a nose to help it make such observations. One possible explanation for this phenomenon is that the LLM is simply mimicking the text present in its vast training data, rather than working with any real understanding of rain or smell.

But does the lack of eyes mean that language models can’t ever “understand” that a lion is “larger” than a house cat? Philosophers and scientists alike have long considered the ability to assign meaning to language a hallmark of human intelligence — and pondered what essential ingredients enable us to do so.

Peering into this enigma, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have uncovered intriguing results suggesting that language models may develop their own understanding of reality as a way to improve their generative abilities. The team first developed a set of small Karel puzzles, which consisted of coming up with instructions to control a robot in a simulated environment. They then trained an LLM on the solutions, but without demonstrating how the solutions actually worked. Finally, using a machine learning technique called “probing,” they looked inside the model’s “thought process” as it generated new solutions.

After training on over 1 million random puzzles, they found that the model spontaneously developed its own conception of the underlying simulation, despite never being exposed to this reality during training. Such findings call into question our intuitions about what types of information are necessary for learning linguistic meaning — and whether LLMs may someday understand language at a deeper level than they do today.

“At the start of these experiments, the language model generated random instructions that didn’t work. By the time we completed training, our language model generated correct instructions at a rate of 92.4 percent,” says MIT electrical engineering and computer science (EECS) PhD student and CSAIL affiliate Charles Jin, who is the lead author of a new paper on the work. “This was a very exciting moment for us because we thought that if your language model could complete a task with that level of accuracy, we might expect it to understand the meanings within the language as well. This gave us a starting point to explore whether LLMs do in fact understand text, and now we see that they’re capable of much more than just blindly stitching words together.”

Inside the mind of an LLM

The probe helped Jin witness this progress firsthand. Its role was to interpret what the LLM thought the instructions meant, unveiling that the LLM developed its own internal simulation of how the robot moves in response to each instruction. As the model’s ability to solve puzzles improved, these conceptions also became more accurate, indicating that the LLM was starting to understand the instructions. Before long, the model was consistently putting the pieces together correctly to form working instructions.
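To make the idea of a probe concrete, here is a minimal sketch of the general technique: a small classifier is trained to read some property (here, which way a robot is facing) out of a model's internal states. Nothing below comes from the paper. The "hidden states" are synthetic vectors generated by the hypothetical `fake_hidden_state` function, and the nearest-centroid classifier is a deliberately simple stand-in for whatever probing classifier the researchers used.

```python
import random

random.seed(0)
DIRECTIONS = ["north", "east", "south", "west"]
DIM = 8

# Hypothetical stand-in for LLM hidden states: each facing direction has a
# random "signature" vector, and every observation adds a little noise.
signatures = {d: [random.gauss(0, 1) for _ in range(DIM)] for d in DIRECTIONS}


def fake_hidden_state(direction, noise=0.1):
    return [s + random.gauss(0, noise) for s in signatures[direction]]


def make_dataset(n):
    data = []
    for _ in range(n):
        direction = random.choice(DIRECTIONS)
        data.append((fake_hidden_state(direction), direction))
    return data


def train_probe(data):
    # Nearest-centroid probe: average the hidden states seen for each label.
    sums = {d: [0.0] * DIM for d in DIRECTIONS}
    counts = {d: 0 for d in DIRECTIONS}
    for hidden, direction in data:
        counts[direction] += 1
        for i, x in enumerate(hidden):
            sums[direction][i] += x
    return {d: [x / counts[d] for x in sums[d]]
            for d in DIRECTIONS if counts[d] > 0}


def probe_predict(centroids, hidden):
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(hidden, centroids[label]))
    return min(centroids, key=squared_distance)


train, test = make_dataset(400), make_dataset(100)
probe = train_probe(train)
accuracy = sum(probe_predict(probe, h) == d for h, d in test) / len(test)
print(f"probe accuracy: {accuracy:.2f}")
```

High probe accuracy on held-out states suggests the property is encoded in the states in an easily readable form. On its own, it does not show whether the model or the probe is doing the interpretive work.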

Jin notes that the LLM’s understanding of language develops in phases, much like how a child learns speech in multiple steps. Starting off, it’s like a baby babbling: repetitive and mostly unintelligible. Then, the language model acquires syntax, or the rules of the language. This enables it to generate instructions that might look like genuine solutions, but they still don’t work.

The LLM’s instructions gradually improve, though. Once the model acquires meaning, it starts to churn out instructions that correctly implement the requested specifications, like a child forming coherent sentences.

Separating the method from the model: A “Bizarro World”

The probe was only intended to “go inside the brain of an LLM” as Jin characterizes it, but there was a remote possibility that it also did some of the thinking for the model. The researchers wanted to ensure that their model understood the instructions independently of the probe, instead of the probe inferring the robot’s movements from the LLM’s grasp of syntax.

“Imagine you have a pile of data that encodes the LM’s thought process,” suggests Jin. “The probe is like a forensics analyst: You hand this pile of data to the analyst and say, ‘Here’s how the robot moves, now try and find the robot’s movements in the pile of data.’ The analyst later tells you that they know what’s going on with the robot in the pile of data. But what if the pile of data actually just encodes the raw instructions, and the analyst has figured out some clever way to extract the instructions and follow them accordingly? Then the language model hasn’t really learned what the instructions mean at all.”

To disentangle their roles, the researchers flipped the meanings of the instructions for a new probe. In this “Bizarro World,” as Jin calls it, directions like “up” now meant “down” within the instructions moving the robot across its grid. 
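The flipped semantics can be pictured with a toy interpreter. The sketch below is a simplification based only on the description above, not on the actual Karel language or the researchers' code: the instruction names, the grid, and the choice to swap left and right as well as up and down are assumptions made for illustration.

```python
# Original meaning of each instruction as a (dx, dy) step on a grid.
ORIGINAL = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

# "Bizarro" meanings: each instruction now moves the robot the opposite way.
BIZARRO = {
    "up": ORIGINAL["down"],
    "down": ORIGINAL["up"],
    "left": ORIGINAL["right"],
    "right": ORIGINAL["left"],
}


def run(program, semantics, start=(0, 0)):
    """Return every position the robot visits while executing the program."""
    x, y = start
    trace = [(x, y)]
    for instruction in program:
        dx, dy = semantics[instruction]
        x, y = x + dx, y + dy
        trace.append((x, y))
    return trace


program = ["up", "up", "right", "down"]
print(run(program, ORIGINAL))  # [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1)]
print(run(program, BIZARRO))   # [(0, 0), (0, -1), (0, -2), (-1, -2), (-1, -1)]
```

The instruction text is identical in both runs; only the positions a probe would be asked to recover differ. That is what lets the control experiment separate a probe that merely re-executes instructions from one that reads positions the model has already encoded.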

“If the probe is translating instructions to robot positions, it should be able to translate the instructions according to the bizarro meanings equally well,” says Jin. “But if the probe is actually finding encodings of the original robot movements in the language model’s thought process, then it should struggle to extract the bizarro robot movements from the original thought process.”

As it turned out, the new probe experienced translation errors, unable to interpret a language model that had different meanings of the instructions. This meant the original semantics were embedded within the language model, indicating that the LLM understood what instructions were needed independently of the original probing classifier.

“This research directly targets a central question in modern artificial intelligence: are the surprising capabilities of large language models due simply to statistical correlations at scale, or do large language models develop a meaningful understanding of the reality that they are asked to work with? This research indicates that the LLM develops an internal model of the simulated reality, even though it was never trained to develop this model,” says Martin Rinard, an MIT professor in EECS, CSAIL member, and senior author on the paper.

This experiment further supported the team’s analysis that language models can develop a deeper understanding of language. Still, Jin acknowledges a few limitations to their paper: They used a very simple programming language and a relatively small model to glean their insights. In an upcoming work, they’ll look to use a more general setting. While Jin’s latest research doesn’t outline how to make the language model learn meaning faster, he believes future work can build on these insights to improve how language models are trained.

“An intriguing open question is whether the LLM is actually using its internal model of reality to reason about that reality as it solves the robot navigation problem,” says Rinard. “While our results are consistent with the LLM using the model in this way, our experiments are not designed to answer this next question.”

“There is a lot of debate these days about whether LLMs are actually ‘understanding’ language or rather if their success can be attributed to what is essentially tricks and heuristics that come from slurping up large volumes of text,” says Ellie Pavlick, assistant professor of computer science and linguistics at Brown University, who was not involved in the paper. “These questions lie at the heart of how we build AI and what we expect to be inherent possibilities or limitations of our technology. This is a nice paper that looks at this question in a controlled way — the authors exploit the fact that computer code, like natural language, has both syntax and semantics, but unlike natural language, the semantics can be directly observed and manipulated for experimental purposes. The experimental design is elegant, and their findings are optimistic, suggesting that maybe LLMs can learn something deeper about what language ‘means.’”

Jin and Rinard’s paper was supported, in part, by grants from the U.S. Defense Advanced Research Projects Agency (DARPA). 

MIT researchers use large language models to flag problems in complex systems

Identifying one faulty turbine in a wind farm, which can involve looking at hundreds of signals and millions of data points, is akin to finding a needle in a haystack.

Engineers often streamline this complex problem using deep-learning models that can detect anomalies in measurements taken repeatedly over time by each turbine, known as time-series data.

But with hundreds of wind turbines recording dozens of signals each hour, training a deep-learning model to analyze time-series data is costly and cumbersome. This is compounded by the fact that the model may need to be retrained after deployment, and wind farm operators may lack the necessary machine-learning expertise.

In a new study, MIT researchers found that large language models (LLMs) hold the potential to be more efficient anomaly detectors for time-series data. Importantly, these pretrained models can be deployed right out of the box.

The researchers developed a framework, called SigLLM, which includes a component that converts time-series data into text-based inputs an LLM can process. A user can feed these prepared data to the model and ask it to start identifying anomalies. The LLM can also be used to forecast future time-series data points as part of an anomaly detection pipeline.

While LLMs could not beat state-of-the-art deep learning models at anomaly detection, they did perform as well as some other AI approaches. If researchers can improve the performance of LLMs, this framework could help technicians flag potential problems in equipment like heavy machinery or satellites before they occur, without the need to train an expensive deep-learning model.

“Since this is just the first iteration, we didn’t expect to get there from the first go, but these results show that there’s an opportunity here to leverage LLMs for complex anomaly detection tasks,” says Sarah Alnegheimish, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on SigLLM.

Her co-authors include Linh Nguyen, an EECS graduate student; Laure Berti-Equille, a research director at the French National Research Institute for Sustainable Development; and senior author Kalyan Veeramachaneni, a principal research scientist in the Laboratory for Information and Decision Systems. The research will be presented at the IEEE Conference on Data Science and Advanced Analytics.

An off-the-shelf solution

Large language models are autoregressive, which means they can understand that the newest values in sequential data depend on previous values. For instance, models like GPT-4 can predict the next word in a sentence using the words that precede it.

Since time-series data are sequential, the researchers thought the autoregressive nature of LLMs might make them well-suited for detecting anomalies in this type of data.

However, they wanted to develop a technique that avoids fine-tuning, a process in which engineers retrain a general-purpose LLM on a small amount of task-specific data to make it an expert at one task. Instead, the researchers deploy an LLM off the shelf, with no additional training steps.

But before they could deploy it, they had to convert time-series data into text-based inputs the language model could handle.

They accomplished this through a sequence of transformations that capture the most important parts of the time series while representing data with the fewest number of tokens. Tokens are the basic inputs for an LLM, and more tokens require more computation.

“If you don’t handle these steps very carefully, you might end up chopping off some part of your data that does matter, losing that information,” Alnegheimish says.
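As a rough illustration of what turning a time series into compact text might involve, the sketch below shifts the values so they are non-negative, rounds them to a fixed precision, and joins the resulting integers with commas. These steps and the function names `series_to_text` and `text_to_series` are assumptions for illustration; they are not taken from the SigLLM code.

```python
def series_to_text(values, decimals=1, sep=","):
    """Encode floats as a short string of non-negative integers."""
    offset = min(values)                        # shift so no minus signs are needed
    scale = 10 ** decimals                      # keep a fixed number of decimals
    ints = [int(round((v - offset) * scale)) for v in values]
    return sep.join(str(i) for i in ints)


def text_to_series(text, offset, decimals=1, sep=","):
    """Invert series_to_text, up to the rounding that was applied."""
    scale = 10 ** decimals
    return [int(token) / scale + offset for token in text.split(sep)]


readings = [20.5, 21.0, 19.8, 35.2]
encoded = series_to_text(readings)
print(encoded)  # 7,12,0,154
print(text_to_series(encoded, offset=min(readings)))
```

The trade-off the quote points to shows up in `decimals`: a smaller value means fewer digits, and therefore fewer tokens, but it also discards detail that might matter for spotting an anomaly.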

Once they had figured out how to transform time-series data, the researchers developed two anomaly detection approaches.

Approaches for anomaly detection

For the first, which they call Prompter, they feed the prepared data into the model and prompt it to locate anomalous values.

“We had to iterate a number of times to figure out the right prompts for one specific time series. It is not easy to understand how these LLMs ingest and process the data,” Alnegheimish adds.

For the second approach, called Detector, they use the LLM as a forecaster to predict the next value from a time series. The researchers compare the predicted value to the actual value. A large discrepancy suggests that the real value is likely an anomaly.
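The forecast-and-compare idea can be sketched without an LLM at all. In the hypothetical example below, `window_median_forecast` stands in for the language model's forecast, and a point is flagged when its forecast error exceeds a multiple of the typical error. The window size and threshold are arbitrary illustrative choices, not settings from the study.

```python
from statistics import median


def window_median_forecast(history):
    # Stand-in for the LLM forecaster: predict the median of recent values.
    return median(history)


def detect_anomalies(series, window=3, threshold=5.0,
                     forecaster=window_median_forecast):
    """Return indices whose actual value is far from the forecast value."""
    errors = []
    for t in range(window, len(series)):
        predicted = forecaster(series[t - window:t])
        errors.append((t, abs(series[t] - predicted)))
    # Typical error level; fall back to a tiny value if it is exactly zero.
    typical = median([e for _, e in errors]) or 1e-9
    return [t for t, e in errors if e > threshold * typical]


series = [10.0, 10.2, 9.9, 10.1, 10.0, 25.0, 10.1, 9.8, 10.0, 10.2]
print(detect_anomalies(series))  # [5]
```

Any forecaster with the same call signature, including one that wraps an LLM, could be passed in through the `forecaster` argument.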

With Detector, the LLM would be part of an anomaly detection pipeline, while Prompter would complete the task on its own. In practice, Detector performed better than Prompter, which generated many false positives.

“I think, with the Prompter approach, we were asking the LLM to jump through too many hoops. We were giving it a harder problem to solve,” says Veeramachaneni.

When they compared both approaches to current techniques, Detector outperformed transformer-based AI models on seven of the 11 datasets they evaluated, even though the LLM required no training or fine-tuning.

In the future, an LLM may also be able to provide plain language explanations with its predictions, so an operator could be better able to understand why an LLM identified a certain data point as anomalous.

However, state-of-the-art deep learning models outperformed LLMs by a wide margin, showing that there is still work to do before an LLM could be used for anomaly detection.

“What will it take to get to the point where it is doing as well as these state-of-the-art models? That is the million-dollar question staring at us right now. An LLM-based anomaly detector needs to be a game-changer for us to justify this sort of effort,” Veeramachaneni says.

Moving forward, the researchers want to see if fine-tuning can improve performance, though that would require additional time, cost, and expertise for training.

Their LLM approaches also take between 30 minutes and two hours to produce results, so increasing the speed is a key area of future work. The researchers also want to probe LLMs to understand how they perform anomaly detection, in the hopes of finding a way to boost their performance.

“When it comes to complex tasks like anomaly detection in time series, LLMs really are a contender. Maybe other complex tasks can be addressed with LLMs, as well?” says Alnegheimish.

This research was supported by SES S.A., Iberdrola and ScottishPower Renewables, and Hyundai Motor Company.