Leading quantum at an inflection point

Danna Freedman is seeking the early adopters.

She is the faculty director of the nascent MIT Quantum Initiative, or QMIT. In this new role, Freedman is giving shape to an ambitious, Institute-wide effort to apply quantum breakthroughs to the most consequential challenges in science, technology, industry, and national security.

The interdisciplinary endeavor, the newest of MIT President Sally Kornbluth’s strategic initiatives, will bring together MIT researchers and domain experts from a range of industries to identify and tackle practical challenges wherever quantum solutions could achieve the greatest impact.

“We’ve already seen how the breadth of progress in quantum has created opportunities to rethink the future of security and encryption, imagine new modes of navigation, and even measure gravitational waves more precisely to observe the cosmos in an entirely new way,” says Freedman, the Frederick George Keyes Professor of Chemistry. “What can we do next? We’re investing in the promise of quantum, and where the legacy will be in 20 years.”

QMIT — the name is a nod to the “qubit,” the basic unit of quantum information — will formally launch on Dec. 8 with an all-day event on campus. Over time, the initiative plans to establish a physical home in the heart of campus for academic, public, and corporate engagement with state-of-the-art integrated quantum systems. Beyond MIT’s campus, QMIT will also work closely with the U.S. government and MIT Lincoln Laboratory, applying the lab’s capabilities in quantum hardware development, systems engineering, and rapid prototyping to national security priorities.

“The MIT Quantum Initiative seizes a timely opportunity in service to the nation’s scientific, economic, and technological competitiveness,” says Ian A. Waitz, MIT’s vice president for research. “With quantum capabilities approaching an inflection point, QMIT will engage students and researchers across all our schools and the college, as well as companies around the world, in thinking about what a step change in sensing and computational power will mean for a wide range of fields. Incredible opportunities exist in health and life sciences, fundamental physics research, cybersecurity, materials science, sensing the world around us, and more.”

Identifying the right questions

Quantum phenomena are as foundational to our world as light or gravity. At an extremely small scale, the interactions of atoms and subatomic particles are controlled by a different set of rules than the physical laws of the macro-sized world. These rules are called quantum mechanics.

“Quantum, in a sense, is what underlies everything,” says Freedman.

By leveraging quantum properties, quantum devices can process information at incredible speed to solve complex problems that aren’t feasible on classical supercomputers, and to enable ultraprecise sensing and measurement. Those improvements in speed and precision will become most powerful when optimized in relation to specific use cases, and as part of a complete quantum system. QMIT will focus on collaboration across domains to co-develop quantum tools, such as computers, sensors, networks, simulations, and algorithms, alongside the intended users of these systems.

As it develops, QMIT will be organized into programmatic pillars led by top researchers in quantum including Paola Cappellaro, Ford Professor of Engineering and professor of nuclear science and engineering and of physics; Isaac Chuang, Julius A. Stratton Professor in Electrical Engineering and Physics; Pablo Jarillo-Herrero, Cecil and Ida Green Professor of Physics; William Oliver, Henry Ellis Warren (1894) Professor of Electrical Engineering and Computer Science and professor of physics; Vladan Vuletić, Lester Wolfe Professor of Physics; and Jonilyn Yoder, associate leader of the Quantum-Enabled Computation Group at MIT Lincoln Laboratory.

While supporting the core of quantum research in physics, engineering, mathematics, and computer science, QMIT promises to expand the community at its frontiers, into astronomy, biology, chemistry, materials science, and medicine.

“If you provide a foundation that somebody can integrate with, that accelerates progress a lot,” says Freedman. “Perhaps we want to figure out how a quantum simulator we’ve built can model photosynthesis, if that’s the right question — or maybe the right question is to study 10 failed catalysts to see why they failed.”

“We are going to figure out what real problems exist that we could approach with quantum tools, and work toward them in the next five years,” she adds. “We are going to change the forward momentum of quantum in a way that supports impact.”

The MIT Quantum Initiative will be administratively housed in the Research Laboratory of Electronics (RLE), with support from the Office of the Vice President for Research (VPR) and the Office of Innovation and Strategy.

QMIT is a natural expansion of MIT’s Center for Quantum Engineering (CQE), a research powerhouse that engages more than 80 principal investigators across the MIT campus and Lincoln Laboratory to accelerate the practical application of quantum technologies.

“CQE has cultivated a tremendously strong ecosystem of students and researchers, engaging with U.S. government sponsors and industry collaborators, including through the popular Quantum Annual Research Conference (QuARC) and professional development classes,” says Marc Baldo, the Dugald C. Jackson Professor in Electrical Engineering and director of RLE.

“With the backing of former vice president for research Maria Zuber, former Lincoln Lab director Eric Evans, and Marc Baldo, we launched CQE and its industry membership group in 2019 to help bridge MIT’s research efforts in quantum science and engineering,” says Oliver, CQE’s director, who also spent 20 years at Lincoln Laboratory, most recently as a Laboratory Fellow. “We have an important opportunity now to deepen our commitment to quantum research and education, and especially in engaging students from across the Institute in thinking about how to leverage quantum science and engineering to solve hard problems.”

Two years ago, Peter Fisher, the Thomas A. Frank (1977) Professor of Physics, in his role as associate vice president for research computing and data, assembled a faculty group led by Cappellaro and involving Baldo, Oliver, Freedman, and others, to begin to build an initiative that would span the entire Institute. Now, capitalizing on CQE’s success, Oliver will lead the new MIT Quantum Initiative’s quantum computing pillar, which will broaden the work of CQE into a larger effort that focuses on quantum computing, industry engagement, and connecting with end users.

The “MIT-hard” problem

QMIT will build upon the Institute’s historic leadership in quantum science and engineering. In the spring of 1981, MIT hosted the first Physics of Computation Conference at the Endicott House, bringing together nearly 50 physics and computing researchers to consider the practical promise of quantum — an intellectual moment that is now widely regarded as the kickoff of the second quantum revolution. (The first was the fundamental articulation of quantum mechanics 100 years ago.)

Today, research in quantum science and engineering produces a steady stream of “firsts” in the lab and a growing number of startup companies.

In collaboration with partners in industry and government, MIT researchers develop advances in areas like quantum sensing, which involves the use of atomic-scale systems to measure certain properties, like distance and acceleration, with extreme precision. Quantum sensing could be used in applications like brain imaging devices that capture more detail, or air traffic control systems with greater positional accuracy.

Another key area of research is quantum simulation, which uses the power of quantum computers to accurately emulate complex systems. This could fuel the discovery of new materials for energy-efficient electronics or streamline the identification of promising molecules for drug development.

“Historically, when we think about the most well-articulated challenges that quantum will solve,” Freedman says, “the best ones have come from inside of MIT. We’re open to technological solutions to problems, and nontraditional approaches to science. In many respects, we are the early adopters.”

But she also draws a sharp distinction between blue-sky thinking about what quantum might do, and the deeply technical, deeply collaborative work of actually drawing the roadmap. “That’s the ‘MIT-hard’ problem,” she says.

The QMIT launch event on Dec. 8 will present talks and discussions featuring MIT faculty, including Nobel laureates and industry leaders.

MIT researchers propose a new model for legible, modular software

Coding with large language models (LLMs) holds huge promise, but it also exposes some long-standing flaws in software: code that’s messy, hard to change safely, and often opaque about what’s really happening under the hood. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are charting a more “modular” path ahead. 

Their new approach breaks systems into “concepts,” separate pieces of a system, each designed to do one job well, and “synchronizations,” explicit rules that describe exactly how those pieces fit together. The result is software that’s more modular, transparent, and easier to understand. A small domain-specific language (DSL) makes it possible to express synchronizations simply, in a form that LLMs can reliably generate. In a real-world case study, the team showed how this method can bring together features that would otherwise be scattered across multiple services.

The team, including Daniel Jackson, an MIT professor of electrical engineering and computer science (EECS) and CSAIL associate director, and Eagon Meng, an EECS PhD student, CSAIL affiliate, and designer of the new synchronization DSL, explore this approach in their paper “What You See Is What It Does: A Structural Pattern for Legible Software,” which they presented at the Splash Conference in Singapore in October. The challenge, they explain, is that in most modern systems, a single feature is never fully self-contained. Adding a “share” button to a social platform like Instagram, for example, doesn’t live in just one service. Its functionality is split across code that handles posting, notification, authenticating users, and more. All these pieces, despite being scattered across the code, must be carefully aligned, and any change risks unintended side effects elsewhere.

Jackson calls this “feature fragmentation,” a central obstacle to software reliability. “The way we build software today, the functionality is not localized. You want to understand how ‘sharing’ works, but you have to hunt for it in three or four different places, and when you find it, the connections are buried in low-level code,” says Jackson.

Concepts and synchronizations are meant to tackle this problem. A concept bundles up a single, coherent piece of functionality, like sharing, liking, or following, along with its state and the actions it can take. Synchronizations, on the other hand, describe at a higher level how those concepts interact. Rather than writing messy low-level integration code, developers can use a small domain-specific language to spell out these connections directly. In this DSL, the rules are simple and clear: one concept’s action can trigger another, so that a change in one piece of state can be kept in sync with another.
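The idea can be made concrete with a minimal sketch in plain Python. This is not the researchers' DSL, whose syntax the article does not show; the class names (Sharing, Notification, SyncEngine) and the rule format are hypothetical, chosen only to illustrate two independent concepts connected by one explicit synchronization.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Sharing:
    """Concept: records who shared which post with whom. Knows nothing else."""
    shares: list = field(default_factory=list)

    def share(self, post_id: str, from_user: str, to_user: str) -> dict:
        record = {"post_id": post_id, "from_user": from_user, "to_user": to_user}
        self.shares.append(record)
        return record


@dataclass
class Notification:
    """Concept: keeps a per-user inbox. Knows nothing about sharing."""
    inbox: dict = field(default_factory=dict)

    def notify(self, user: str, message: str) -> None:
        self.inbox.setdefault(user, []).append(message)


class SyncEngine:
    """Hypothetical stand-in for a synchronization layer: 'when X, then Y' rules."""

    def __init__(self) -> None:
        self.rules: dict = {}

    def when(self, action: str, then: Callable[[dict], None]) -> None:
        self.rules.setdefault(action, []).append(then)

    def run(self, action: str, result: dict) -> None:
        # Apply every rule registered for this action to the action's result.
        for then in self.rules.get(action, []):
            then(result)


sharing, notifications, engine = Sharing(), Notification(), SyncEngine()

# The only place where the two concepts are connected.
engine.when(
    "Sharing.share",
    lambda r: notifications.notify(
        r["to_user"], f"{r['from_user']} shared post {r['post_id']}"
    ),
)

engine.run("Sharing.share", sharing.share("p1", "alice", "bob"))
print(notifications.inbox)
```

Neither concept imports or calls the other. To see how sharing leads to a notification, a reader only has to look at the single rule registered on the engine.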

“Think of concepts as modules that are completely clean and independent. Synchronizations then act like contracts — they say exactly how concepts are supposed to interact. That’s powerful because it makes the system both easier for humans to understand and easier for tools like LLMs to generate correctly,” says Jackson. “Why can’t we read code like a book? We believe that software should be legible and written in terms of our understanding: our hope is that concepts map to familiar phenomena, and synchronizations represent our intuition about what happens when they come together,” says Meng. 

The benefits extend beyond clarity. Because synchronizations are explicit and declarative, they can be analyzed, verified, and of course generated by an LLM. This opens the door to safer, more automated software development, where AI assistants can propose new features without introducing hidden side effects.

In their case study, the researchers assigned features like liking, commenting, and sharing each to a single concept — like a microservices architecture, but more modular. Without this pattern, these features were spread across many services, making them hard to locate and test. Using the concepts-and-synchronizations approach, each feature became centralized and legible, while the synchronizations spelled out exactly how the concepts interacted.

The study also showed how synchronizations can factor out common concerns like error handling, response formatting, or persistent storage. Instead of embedding these details in every service, synchronizations can handle them once, ensuring consistency across the system.

More advanced directions are also possible. Synchronizations could coordinate distributed systems, keeping replicas on different servers in step, or allow shared databases to interact cleanly. Weakening synchronization semantics could enable eventual consistency while still preserving clarity at the architectural level.

Jackson sees potential for a broader cultural shift in software development. One idea is the creation of “concept catalogs,” shared libraries of well-tested, domain-specific concepts. Application development could then become less about stitching code together from scratch and more about selecting the right concepts and writing the synchronizations between them. “Concepts could become a new kind of high-level programming language, with synchronizations as the programs written in that language.”

“It’s a way of making the connections in software visible,” says Jackson. “Today, we hide those connections in code. But if you can see them explicitly, you can reason about the software at a much higher level. You still have to deal with the inherent complexity of features interacting. But now it’s out in the open, not scattered and obscured.”

“Building software for human use on abstractions from underlying computing machines has burdened the world with software that is all too often costly, frustrating, even dangerous, to understand and use,” says University of Virginia Associate Professor Kevin Sullivan, who wasn’t involved in the research. “The impacts (such as in health care) have been devastating. Meng and Jackson flip the script and insist on building interactive software on abstractions from human understanding, which they call ‘concepts.’ They combine expressive mathematical logic and natural language to specify such purposeful abstractions, providing a basis for verifying their meanings, composing them into systems, and refining them into programs fit for human use. It’s a new and important direction in the theory and practice of software design that bears watching.”

“It’s been clear for many years that we need better ways to describe and specify what we want software to do,” adds Thomas Ball, Lancaster University honorary professor and University of Washington affiliate faculty, who also wasn’t involved in the research. “LLMs’ ability to generate code has only added fuel to the specification fire. Meng and Jackson’s work on concept design provides a promising way to describe what we want from software in a modular manner. Their concepts and specifications are well-suited to be paired with LLMs to achieve the designer’s intent.”

Looking ahead, the researchers hope their work can influence how both industry and academia think about software architecture in the age of AI. “If software is to become more trustworthy, we need ways of writing it that make its intentions transparent,” says Jackson. “Concepts and synchronizations are one step toward that goal.”

This work was partially funded by the Machine Learning Applications (MLA) Initiative of CSAIL Alliances. At the time of funding, the initiative board was British Telecom, Cisco, and Ernst and Young. 

3 Questions: How AI is helping us monitor and support vulnerable ecosystems

A recent study from Oregon State University estimated that more than 3,500 animal species are at risk of extinction because of factors including habitat alterations, natural resources being overexploited, and climate change.

To better understand these changes and protect vulnerable wildlife, conservationists like MIT EECS PhD student and Computer Science and Artificial Intelligence Laboratory (CSAIL) researcher Justin Kay are developing computer vision algorithms that carefully monitor animal populations. A member of the lab of MIT Department of Electrical Engineering and Computer Science assistant professor and CSAIL principal investigator Sara Beery, Kay is currently working on tracking salmon in the Pacific Northwest, where they provide crucial nutrients to predators like birds and bears, while managing the population of prey, like bugs.

With all that wildlife data, though, researchers have lots of information to sort through and many AI models to choose from to analyze it all. Kay and his colleagues at CSAIL and the University of Massachusetts Amherst are developing AI methods that make this data-crunching process much more efficient, including a new approach called “consensus-driven active model selection” (or “CODA”) that helps conservationists choose which AI model to use. Their work was named a Highlight Paper at the International Conference on Computer Vision (ICCV) in October.

That research was supported, in part, by the National Science Foundation, Natural Sciences and Engineering Research Council of Canada, and Abdul Latif Jameel Water and Food Systems Lab (J-WAFS). Here, Kay discusses this project, among other conservation efforts.

Q: In your paper, you pose the question of which AI models will perform the best on a particular dataset. With as many as 1.9 million pre-trained models available in the HuggingFace Models repository alone, how does CODA help us address that challenge?

A: Until recently, using AI for data analysis has typically meant training your own model. This requires significant effort to collect and annotate a representative training dataset, as well as iteratively train and validate models. You also need a certain technical skill set to run and modify AI training code. The way people interact with AI is changing, though — in particular, there are now millions of publicly available pre-trained models that can perform a variety of predictive tasks very well. This potentially enables people to use AI to analyze their data without developing their own model, simply by downloading an existing model with the capabilities they need. But this poses a new challenge: Which model, of the millions available, should they use to analyze their data? 

Typically, answering this model selection question also requires you to spend a lot of time collecting and annotating a large dataset, albeit for testing models rather than training them. This is especially true for real applications where user needs are specific, data distributions are imbalanced and constantly changing, and model performance may be inconsistent across samples. Our goal with CODA was to substantially reduce this effort. We do this by making the data annotation process “active.” Instead of requiring users to bulk-annotate a large test dataset all at once, in active model selection we make the process interactive, guiding users to annotate the most informative data points in their raw data. This is remarkably effective, often requiring users to annotate as few as 25 examples to identify the best model from their set of candidates. 

We’re very excited about CODA offering a new perspective on how to best utilize human effort in the development and deployment of machine-learning (ML) systems. As AI models become more commonplace, our work emphasizes the value of focusing effort on robust evaluation pipelines, rather than solely on training.

Q: You applied the CODA method to classifying wildlife in images. Why did it perform so well, and what role can systems like this have in monitoring ecosystems in the future?

A: One key insight was that when considering a collection of candidate AI models, the consensus of all of their predictions is more informative than any individual model’s predictions. This can be seen as a sort of “wisdom of the crowd”: On average, pooling the votes of all models gives you a decent prior over what the labels of individual data points in your raw dataset should be. Our approach with CODA is based on estimating a “confusion matrix” for each AI model: Given that the true label for some data point is class X, what is the probability that an individual model predicts class X, Y, or Z? This creates informative dependencies between all of the candidate models, the categories you want to label, and the unlabeled points in your dataset.
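As a rough illustration of these two ingredients, the sketch below pools model votes into a consensus label for each data point, then estimates one model's confusion matrix against that consensus. It is a simplification written for this example, not CODA's probabilistic model; the function names and the toy predictions are assumptions.

```python
from collections import Counter


def consensus_labels(predictions):
    """Majority vote across models for each data point."""
    n = len(next(iter(predictions.values())))
    labels = []
    for i in range(n):
        votes = Counter(preds[i] for preds in predictions.values())
        labels.append(votes.most_common(1)[0][0])
    return labels


def confusion_matrix(reference, predicted, classes):
    """Row-normalized estimate of P(model predicts p | reference label is t)."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(reference, predicted):
        counts[t][p] += 1
    matrix = {}
    for t in classes:
        total = sum(counts[t].values())
        matrix[t] = {p: (counts[t][p] / total if total else 0.0) for p in classes}
    return matrix


# Toy predictions from three hypothetical classifiers on four images.
predictions = {
    "model_a": ["deer", "deer", "fox", "fox"],
    "model_b": ["deer", "fox", "fox", "fox"],
    "model_c": ["deer", "deer", "fox", "deer"],
}

consensus = consensus_labels(predictions)
cm_b = confusion_matrix(consensus, predictions["model_b"], ["deer", "fox"])
print(consensus)
print(cm_b)
```

Here the consensus stands in for the unknown true labels, which is why it acts only as a prior: a few human annotations are still needed to correct it.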

Consider an example application where you are a wildlife ecologist who has just collected a dataset containing potentially hundreds of thousands of images from cameras deployed in the wild. You want to know what species are in these images, a time-consuming task that computer vision classifiers can help automate. You are trying to decide which species classification model to run on your data. If you have labeled 50 images of tigers so far, and some model has performed well on those 50 images, you can be pretty confident it will perform well on the remainder of the (currently unlabeled) images of tigers in your raw dataset as well. You also know that when that model predicts some image contains a tiger, it is likely to be correct, and therefore that any model that predicts a different label for that image is more likely to be wrong. You can use all these interdependencies to construct probabilistic estimates of each model’s confusion matrix, as well as a probability distribution over which model has the highest accuracy on the overall dataset. These design choices allow us to make more informed choices over which data points to label and ultimately are the reason why CODA performs model selection much more efficiently than past work.
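The interactive loop described above can be sketched in a heavily simplified form. The version below is an assumption-laden stand-in, not CODA itself: it keeps an independent Beta posterior over each model's overall accuracy (rather than per-class confusion matrices) and asks for labels where the models disagree most. The function names, the `oracle` callback, and the toy data are all hypothetical.

```python
from collections import Counter


def disagreement(votes):
    """1 minus the share of the most common vote; 0 means all models agree."""
    top = Counter(votes).most_common(1)[0][1]
    return 1 - top / len(votes)


def select_model(predictions, oracle, budget):
    """Label up to `budget` points, chosen by disagreement, and rank the models."""
    names = list(predictions)
    n = len(predictions[names[0]])
    # Beta(1, 1) prior on accuracy, stored as [correct + 1, wrong + 1].
    counts = {m: [1, 1] for m in names}
    unlabeled = set(range(n))
    for _ in range(min(budget, n)):
        # Ask the human about the point the models disagree on most.
        i = max(
            sorted(unlabeled),
            key=lambda j: disagreement([predictions[m][j] for m in names]),
        )
        unlabeled.remove(i)
        truth = oracle(i)
        for m in names:
            counts[m][0 if predictions[m][i] == truth else 1] += 1
    means = {m: a / (a + b) for m, (a, b) in counts.items()}
    return max(names, key=means.get), means


truth = ["tiger", "tiger", "deer", "deer", "fox", "fox"]
predictions = {
    "model_a": ["tiger", "tiger", "deer", "deer", "fox", "fox"],
    "model_b": ["tiger", "deer", "deer", "deer", "fox", "deer"],
    "model_c": ["tiger", "tiger", "deer", "fox", "deer", "fox"],
}

# The oracle plays the role of the ecologist annotating one image at a time.
best, means = select_model(predictions, oracle=lambda i: truth[i], budget=3)
print(best, means)
```

Points on which every model agrees are never sent for annotation here, since labeling them cannot change the ranking. That is the intuition behind spending a small labeling budget on the most informative examples.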

There are also a lot of exciting possibilities for building on top of our work. We think there may be even better ways of constructing informative priors for model selection based on domain expertise — for instance, if it is already known that one model performs exceptionally well on some subset of classes or poorly on others. There are also opportunities to extend the framework to support more complex machine-learning tasks and more sophisticated probabilistic models of performance. We hope our work can provide inspiration and a starting point for other researchers to keep pushing the state of the art.

Q: You work in the Beerylab, led by Sara Beery, where researchers are combining the pattern-recognition capabilities of machine-learning algorithms with computer vision technology to monitor wildlife. What are some other ways your team is tracking and analyzing the natural world, beyond CODA?

A: The lab is a really exciting place to work, and new projects are emerging all the time. We have ongoing projects monitoring coral reefs with drones, re-identifying individual elephants over time, and fusing multi-modal Earth observation data from satellites and in-situ cameras, just to name a few. Broadly, we look at emerging technologies for biodiversity monitoring and try to understand where the data analysis bottlenecks are, and develop new computer vision and machine-learning approaches that address those problems in a widely applicable way. It’s an exciting way of approaching problems that sort of targets the “meta-questions” underlying particular data challenges we face. 

The computer vision algorithms I’ve worked on that count migrating salmon in underwater sonar video are examples of that work. We often deal with shifting data distributions, even as we try to construct the most diverse training datasets we can. We always encounter something new when we deploy a new camera, and this tends to degrade the performance of computer vision algorithms. This is one instance of a general problem in machine learning called domain adaptation, but when we tried to apply existing domain adaptation algorithms to our fisheries data we realized there were serious limitations in how existing algorithms were trained and evaluated. We were able to develop a new domain adaptation framework, published earlier this year in Transactions on Machine Learning Research, that addressed these limitations and led to advancements in fish counting, and even self-driving and spacecraft analysis.

One line of work that I’m particularly excited about is understanding how to better develop and analyze the performance of predictive ML algorithms in the context of what they are actually used for. Usually, the outputs from some computer vision algorithm — say, bounding boxes around animals in images — are not actually the thing that people care about, but rather a means to an end to answer a larger problem — say, what species live here, and how is that changing over time? We have been working on methods to analyze predictive performance in this context and reconsider the ways that we input human expertise into ML systems with this in mind. CODA was one example of this, where we showed that we could actually consider the ML models themselves as fixed and build a statistical framework to understand their performance very efficiently. We have been working recently on similar integrated analyses combining ML predictions with multi-stage prediction pipelines, as well as ecological statistical models. 

The natural world is changing at unprecedented rates and scales, and being able to quickly move from scientific hypotheses or management questions to data-driven answers is more important than ever for protecting ecosystems and the communities that depend on them. Advancements in AI can play an important role, but we need to think critically about the ways that we design, train, and evaluate algorithms in the context of these very real challenges.

Charting the future of AI, from safer answers to faster thinking

Adoption of new tools and technologies occurs when users largely perceive them as reliable, accessible, and an improvement over the available methods and workflows for the cost. Five PhD students from the inaugural class of the MIT-IBM Watson AI Lab Summer Program are utilizing state-of-the-art resources, alleviating AI pain points, and creating new features and capabilities to promote AI usefulness and deployment — from learning when to trust a model that predicts another’s accuracy to more effectively reasoning over knowledge bases. Together, the efforts from the students and their mentors form a through-line, where practical and technically rigorous research leads to more dependable and valuable models across domains.

Building probes, routers, new attention mechanisms, synthetic datasets, and program-synthesis pipelines, the students’ work spans safety, inference efficiency, multimodal data, and knowledge-grounded reasoning. Their techniques emphasize scaling and integration, with impact always in sight.

Learning to trust, and when

MIT math graduate student Andrey Bryutkin’s research prioritizes the trustworthiness of models. He seeks out internal structures within problems, such as equations governing a system and conservation laws, to understand how to leverage them to produce more dependable and robust solutions. Armed with this and working with the lab, Bryutkin developed a method to peer into the behavior of large language models (LLMs). Together with the lab’s Veronika Thost of IBM Research and Marzyeh Ghassemi, associate professor and the Germeshausen Career Development Professor in the MIT Department of Electrical Engineering and Computer Science (EECS) and a member of the Institute for Medical Engineering and Science and the Laboratory for Information and Decision Systems, Bryutkin explored the “uncertainty of uncertainty” of LLMs.

Classically, tiny feed-forward neural networks two-to-three layers deep, called probes, are trained alongside LLMs and employed to flag untrustworthy answers from the larger model to developers; however, these classifiers can also produce false negatives and only provide point estimates, which don’t offer much information about when the LLM is failing. Investigating safe/unsafe prompts and question-answer tasks, the MIT-IBM team used prompt-label pairs, as well as the hidden states like activation vectors and last tokens from an LLM, to measure gradient scores, sensitivity to prompts, and out-of-distribution data to determine how reliable the probe was and learn areas of data that are difficult to predict. Their method also helps identify potential labeling noise. This is a critical function, as the trustworthiness of AI systems depends entirely on the quality and accuracy of the labeled data they are built upon. More accurate and consistent probes are especially important for domains with critical data in applications like IBM’s Granite Guardian family of models.

Another way to ensure trustworthy responses to queries from an LLM is to augment them with external, trusted knowledge bases to eliminate hallucinations. For structured data, such as social media connections, financial transactions, or corporate databases, knowledge graphs (KGs) are natural fits; however, communications between the LLM and KGs often use fixed, multi-agent pipelines that are computationally inefficient and expensive. Addressing this, physics graduate student Jinyeop Song, along with lab researchers Yada Zhu of IBM Research and EECS Associate Professor Julian Shun, created a single-agent, multi-turn, reinforcement learning framework that streamlines this process. Here, the group designed an API server hosting Freebase and Wikidata KGs, which consist of general web-based knowledge data, and an LLM agent that issues targeted retrieval actions to fetch pertinent information from the server. Then, through continuous back-and-forth, the agent appends the gathered data from the KGs to the context and responds to the query. Crucially, the system uses reinforcement learning to train itself to deliver answers that strike a balance between accuracy and completeness. The framework pairs an API server with a single reinforcement learning agent to orchestrate data-grounded reasoning with improved accuracy, transparency, efficiency, and transferability.
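The retrieve-append-answer loop can be sketched with a toy example. Everything here is a stand-in: the in-memory `KG` dictionary replaces the API server, a fixed rule replaces the LLM agent, and no reinforcement learning is involved. The names `kg_server` and `answer` are hypothetical and do not come from the team's framework.

```python
# Toy knowledge graph: (subject, relation) -> list of objects.
KG = {
    ("MIT", "located_in"): ["Cambridge"],
    ("Cambridge", "located_in"): ["Massachusetts"],
}


def kg_server(subject, relation):
    """Stand-in for the KG API server: look up one (subject, relation) pair."""
    return KG.get((subject, relation), [])


def answer(start_entity, relation, max_turns=5):
    """Single agent, multiple turns: retrieve, append to context, repeat."""
    context = []  # retrieved (subject, relation, object) triples
    entity = start_entity
    for _ in range(max_turns):
        results = kg_server(entity, relation)  # targeted retrieval action
        if not results:
            break  # nothing more to fetch, so respond
        context.append((entity, relation, results[0]))
        entity = results[0]
    return entity, context


# "In which state is MIT located?" needs two hops through the graph.
final, context = answer("MIT", "located_in")
print(final, context)
```

In the framework described above, a trained LLM would decide at each turn which retrieval action to issue and when to stop. The fixed "follow the same relation until it runs out" rule only illustrates the control flow.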

Spending computation wisely

The timeliness and completeness of a model’s response matter about as much as its accuracy. This is especially true for long input texts and for those in which elements, like the subject of a story, evolve over time, so EECS graduate student Songlin Yang is re-engineering what models can handle at each step of inference. Focusing on transformer limitations, like those in LLMs, the lab’s Rameswar Panda of IBM Research and Yoon Kim, the NBX Professor and associate professor in EECS, joined Yang to develop next-generation language model architectures beyond transformers.

Transformers face two key limitations: high computational complexity in long-sequence modeling due to the softmax attention mechanism, and limited expressivity resulting from the weak inductive bias of RoPE (rotary positional encoding). Because the cost of softmax attention grows quadratically with sequence length, the computational cost quadruples as the input length doubles. RoPE allows transformers to understand the sequence order of tokens (i.e., words); however, it does not do a good job of capturing internal state changes over time, like variable values, and is limited to the sequence lengths seen during training.
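The doubling-and-quadrupling relationship follows from a one-line calculation, assuming the usual cost model in which attention over n tokens with hidden width d takes on the order of n²d operations (the symbols C, n, and d are introduced here for illustration):

```latex
\[
C(n) \propto n^{2} d
\quad\Longrightarrow\quad
\frac{C(2n)}{C(n)} = \frac{(2n)^{2}\, d}{n^{2}\, d} = 4 .
\]
```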

To address this, the MIT-IBM team explored theoretically grounded yet hardware-efficient algorithms. As an alternative to softmax attention, they adopted linear attention, reducing the quadratic complexity that limits the feasible sequence length. They also investigated hybrid architectures that combine softmax and linear attention to strike a better balance between computational efficiency and performance.
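To see where the savings come from, here is a minimal Python sketch of the general linear-attention idea, not the team's architecture. It assumes a non-causal setting and the feature map elu(x) + 1, a common choice in the linear-attention literature. The first function scores every query against every key, so its cost grows with the square of the sequence length. The second summarizes the keys and values once and reuses that summary for every query, so its cost grows linearly.

```python
import math

def phi(x):
    """Positive feature map elu(x) + 1 (an assumed, commonly used choice)."""
    return x + 1.0 if x > 0 else math.exp(x)

def quadratic_form(Q, K, V):
    """Scores every query against every key: O(n^2 * d) time."""
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = [phi(x) for x in q]
        scores = [sum(a * phi(b) for a, b in zip(fq, k)) for k in K]
        total = sum(scores)
        out.append([sum(s * v[c] for s, v in zip(scores, V)) / total
                    for c in range(d_v)])
    return out

def linear_form(Q, K, V):
    """Summarizes keys and values once, then reuses it: O(n * d^2) time."""
    d, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d)]  # running sum of phi(k_j) v_j^T
    z = [0.0] * d                        # running sum of phi(k_j)
    for k, v in zip(K, V):
        fk = [phi(x) for x in k]
        for a in range(d):
            z[a] += fk[a]
            for c in range(d_v):
                S[a][c] += fk[a] * v[c]
    out = []
    for q in Q:
        fq = [phi(x) for x in q]
        denom = sum(fq[a] * z[a] for a in range(d))
        out.append([sum(fq[a] * S[a][c] for a in range(d)) / denom
                    for c in range(d_v)])
    return out

# Small illustrative inputs: three tokens, two dimensions each.
Q = [[0.1, -0.2], [0.5, 0.3], [-0.4, 0.9]]
K = [[0.3, 0.1], [-0.6, 0.2], [0.7, -0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

# Both orderings give the same result up to floating-point rounding.
same = all(
    math.isclose(a, b, rel_tol=1e-9)
    for row_a, row_b in zip(quadratic_form(Q, K, V), linear_form(Q, K, V))
    for a, b in zip(row_a, row_b)
)
```

The two functions compute the same quantity; only the order of the multiplications differs. Softmax attention cannot be regrouped this way because the exponential is applied to each query-key score individually, which is why replacing it changes the scaling.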

Increasing expressivity, they replaced RoPE with a dynamic reflective positional encoding based on the Householder transform. This approach enables richer positional interactions for deeper understanding of sequential information, while maintaining fast and efficient computation. The MIT-IBM team’s advancement reduces the need for transformers to break problems into many steps, instead enabling them to handle more complex subproblems with fewer inference tokens.
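For readers unfamiliar with the Householder transform, the short Python sketch below shows only the textbook reflection, H = I - 2vvᵀ/(vᵀv), applied to a vector. How the team turns such reflections into a positional encoding is not described here, and the vectors `v` and `x` are arbitrary illustrative values.

```python
import math

def householder_apply(v, x):
    """Return H x for H = I - 2 v v^T / (v^T v), without building H."""
    scale = 2.0 * sum(a * b for a, b in zip(v, x)) / sum(a * a for a in v)
    return [xi - scale * vi for xi, vi in zip(x, v)]

v = [1.0, 2.0, 2.0]    # direction that defines the mirror plane
x = [3.0, -1.0, 4.0]   # vector to reflect

reflected = householder_apply(v, x)      # mirror image of x
twice = householder_apply(v, reflected)  # reflecting again restores x

norm_before = math.sqrt(sum(a * a for a in x))
norm_after = math.sqrt(sum(a * a for a in reflected))  # length is preserved
```

Two properties make the reflection cheap and well behaved: it never requires building the full matrix, and it preserves vector length, so repeated application does not blow up or shrink the values it touches.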

Visions anew

Visual data contain multitudes that the human brain can quickly parse, internalize, and then imitate. Using vision-language models (VLMs), two graduate students are exploring ways to do this through code.

Over the past two summers and under the advisement of Aude Oliva, MIT director of the MIT-IBM Watson AI Lab and a senior research scientist in the Computer Science and Artificial Intelligence Laboratory; and IBM Research’s Rogerio Feris, Dan Gutfreund, and Leonid Karlinsky (now at Xero), Jovana Kondic of EECS has explored visual document understanding, specifically charts. These contain elements, such as data points, legends, and axis labels, that require optical character recognition and numerical reasoning, which models still struggle with. To improve performance on such tasks, Kondic’s group set out to create a large, open-source, synthetic chart dataset from code that could be used for training and benchmarking.

With their prototype, ChartGen, the researchers created a pipeline that passes seed chart images through a VLM, which is prompted to read the chart and generate a Python script that was likely used to create the chart in the first place. The LLM component of the framework then iteratively augments the code from many charts to ultimately produce over 200,000 unique pairs of charts and their code, spanning nearly 30 chart types, as well as supporting data and annotation like descriptions and question-answer pairs about the charts. The team is further expanding their dataset, helping to enable critical multimodal understanding of data visualizations for enterprise applications like financial and scientific reports, blogs, and more.

Instead of charts, EECS graduate student Leonardo Hernandez Cano has his eyes on digital design, specifically visual texture generation for CAD applications and the goal of discovering efficient ways to enable these capabilities in VLMs. Teaming up with the lab groups led by Armando Solar-Lezama, EECS professor and Distinguished Professor of Computing in the MIT Schwarzman College of Computing, and IBM Research’s Nathan Fulton, Hernandez Cano created a program synthesis system that learns to refine code on its own. The system starts with a texture description given by a user in the form of an image. It then generates an initial Python program, which produces visual textures, and iteratively refines the code with the goal of finding a program that produces a texture that matches the target description, learning to search for new programs from the data that the system itself produces. Through these refinements, the novel program can create visualizations with the desired luminosity, color, iridescence, etc., mimicking real materials.

When viewed together, these projects, and the people behind them, are making a cohesive push toward more robust and practical artificial intelligence. By tackling the core challenges of reliability, efficiency, and multimodal reasoning, the work paves the way for AI systems that are not only more powerful, but also more dependable and cost-effective, for real-world enterprise and scientific applications.

Tess Smidt named to 2025 cohort of AI2050 Early Career Fellows

Associate Professor Tess Smidt was among those named to the 2025 cohort of AI2050 Early Career Fellows. The honor is announced annually by Schmidt Sciences, a nonprofit organization founded in 2024 by Eric and Wendy Schmidt that works to accelerate scientific knowledge and breakthroughs with the most promising, advanced tools to support a thriving planet. The organization prioritizes research in areas poised for impact including AI and advanced computing, astrophysics, biosciences, climate, and space—as well as supporting researchers in a variety of disciplines through its science systems program. 

Smidt is the principal investigator of the Atomic Architects group at the Research Laboratory of Electronics (RLE), where she works at the intersection of physics, geometry, and machine learning to design algorithms that aid in the understanding of physical systems under physical and geometric constraints, with applications to the design of both new materials and new molecules. She has a particular focus on symmetries present in 3D physical systems, such as rotation, translation, and reflection.

Smidt earned her SB in Physics from MIT in 2012 and her PhD in Physics from the University of California, Berkeley in 2018. Prior to joining the MIT EECS faculty in 2021, she was the 2018 Alvarez Postdoctoral Fellow in Computing Sciences at Lawrence Berkeley National Laboratory and a Software Engineering Intern on the Google Accelerated Sciences team, where she developed Euclidean symmetry equivariant neural networks which naturally handle 3D geometry and geometric tensor data. Besides the AI2050 fellowship, she has received an Air Force Office of Scientific Research Young Investigator Program (AFOSR YIP) award, the EECS Outstanding Educator Award, and a Transformative Research Fund award.

Conceived and co-chaired by Eric Schmidt and James Manyika, AI2050 is a philanthropic initiative aimed at helping to solve hard problems in AI. Within their research, each fellow will contend with the central motivating question of AI2050: “It’s 2050. AI has turned out to be hugely beneficial to society. What happened? What are the most important problems we solved and the opportunities and possibilities we realized to ensure this outcome?”

A faster problem-solving tool that guarantees feasibility

Managing a power grid is like trying to solve an enormous puzzle.

Grid operators must ensure the proper amount of power is flowing to the right areas at the exact time when it is needed, and they must do this in a way that minimizes costs without overloading physical infrastructure. Even more, they must solve this complicated problem repeatedly, as rapidly as possible, to meet constantly changing demand.

To help crack this consistent conundrum, MIT researchers developed a problem-solving tool that finds the optimal solution much faster than traditional approaches while ensuring the solution doesn’t violate any of the system’s constraints. In a power grid, constraints could be things like generator and line capacity.

This new tool incorporates a feasibility-seeking step into a powerful machine-learning model trained to solve the problem. The feasibility-seeking step uses the model’s prediction as a starting point, iteratively refining the solution until it finds the best achievable answer.

The MIT system can unravel complex problems several times faster than traditional solvers, while providing strong guarantees of success. For some extremely complex problems, it could find better solutions than tried-and-true tools. The technique also outperformed pure machine learning approaches, which are fast but can’t always find feasible solutions.

In addition to helping schedule power production in an electric grid, this new tool could be applied to many types of complicated problems, such as designing new products, managing investment portfolios, or planning production to meet consumer demand.

“Solving these especially thorny problems well requires us to combine tools from machine learning, optimization, and electrical engineering to develop methods that hit the right tradeoffs in terms of providing value to the domain, while also meeting its requirements. You have to look at the needs of the application and design methods in a way that actually fulfills those needs,” says Priya Donti, the Silverman Family Career Development Professor in the Department of Electrical Engineering and Computer Science (EECS) and a principal investigator at the Laboratory for Information and Decision Systems (LIDS).

Donti, senior author of an open-access paper on this new tool, called FSNet, is joined by lead author Hoang Nguyen, an EECS graduate student. The paper will be presented at the Conference on Neural Information Processing Systems.

Combining approaches

Ensuring optimal power flow in an electric grid is an extremely hard problem that is becoming more difficult for operators to solve quickly.

“As we try to integrate more renewables into the grid, operators must deal with the fact that the amount of power generation is going to vary moment to moment. At the same time, there are many more distributed devices to coordinate,” Donti explains.

Grid operators often rely on traditional solvers, which provide mathematical guarantees that the optimal solution doesn’t violate any problem constraints. But these tools can take hours or even days to arrive at that solution if the problem is especially convoluted.

On the other hand, deep-learning models can solve even very hard problems in a fraction of the time, but the solution might ignore some important constraints. For a power grid operator, this could result in issues like unsafe voltage levels or even grid outages.

“Machine-learning models struggle to satisfy all the constraints due to the many errors that occur during the training process,” Nguyen explains.

For FSNet, the researchers combined the best of both approaches into a two-step problem-solving framework.

Focusing on feasibility

In the first step, a neural network predicts a solution to the optimization problem. Very loosely inspired by neurons in the human brain, neural networks are deep learning models that excel at recognizing patterns in data.

Next, a traditional solver that has been incorporated into FSNet performs a feasibility-seeking step. This optimization algorithm iteratively refines the initial prediction while ensuring the solution does not violate any constraints.

Because the feasibility-seeking step is based on a mathematical model of the problem, it can guarantee the solution is deployable.
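A minimal Python sketch can make the idea of a feasibility-seeking step concrete. It is not FSNet itself: the two-variable problem, the constraint values, the step size, and the use of plain gradient descent are all assumptions made for illustration, and the hard-coded `x_pred` stands in for a neural network's output. The sketch also only drives constraint violation toward zero, while the published method additionally cares about solution quality.

```python
def violation(x):
    """Squared violation of one toy equality and one toy inequality."""
    balance = x[0] + x[1] - 1.0        # equality: x0 + x1 = 1 (supply meets demand)
    overload = max(x[0] - 0.6, 0.0)    # inequality: x0 <= 0.6 (capacity limit)
    return balance ** 2 + overload ** 2

def violation_grad(x):
    """Gradient of the violation with respect to x."""
    balance = x[0] + x[1] - 1.0
    overload = max(x[0] - 0.6, 0.0)
    return [2.0 * balance + 2.0 * overload, 2.0 * balance]

def seek_feasibility(x_start, step=0.1, iters=500):
    """Refine a predicted point by gradient descent on the violation."""
    x = list(x_start)
    for _ in range(iters):
        grad = violation_grad(x)
        x = [xi - step * gi for xi, gi in zip(x, grad)]
    return x

x_pred = [0.9, 0.3]                     # hypothetical network prediction (infeasible)
x_feasible = seek_feasibility(x_pred)   # refined point with near-zero violation
```

Because the refinement starts from the prediction rather than from scratch, a good prediction leaves little work for this step, which is the intuition behind pairing a fast learned model with a constraint-aware procedure.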

“This step is very important. In FSNet, we can have the rigorous guarantees that we need in practice,” Nguyen says.

The researchers designed FSNet to address both main types of constraints (equality and inequality) at the same time. This makes it easier to use than other approaches that may require customizing the neural network or solving for each type of constraint separately.

“Here, you can just plug and play with different optimization solvers,” Donti says.

By thinking differently about how the neural network solves complex optimization problems, the researchers were able to unlock a new technique that works better, she adds.

They compared FSNet to traditional solvers and pure machine-learning approaches on a range of challenging problems, including power grid optimization. Their system cut solving times by orders of magnitude compared to the baseline approaches, while respecting all problem constraints.

FSNet also found better solutions to some of the trickiest problems.

“While this was surprising to us, it does make sense. Our neural network can figure out by itself some additional structure in the data that the original optimization solver was not designed to exploit,” Donti explains.

In the future, the researchers want to make FSNet less memory-intensive, incorporate more efficient optimization algorithms, and scale it up to tackle more realistic problems.

“Finding solutions to challenging optimization problems that are feasible is paramount to finding ones that are close to optimal. Especially for physical systems like power grids, close to optimal means nothing without feasibility. This work provides an important step toward ensuring that deep-learning models can produce predictions that satisfy constraints, with explicit guarantees on constraint enforcement,” says Kyri Baker, an associate professor at the University of Colorado Boulder, who was not involved with this work.

“A persistent challenge for machine learning-based optimization is feasibility. This work elegantly couples end-to-end learning with an unrolled feasibility-seeking procedure that minimizes equality and inequality violations. The results are very promising and I look forward to see where this research will head,” adds Ferdinando Fioretto, an assistant professor at the University of Virginia, who was not involved with this work.

Two MIT presidential initiatives partner with SuperUROP to expand student opportunities

A view over the annual SuperUROP poster session, held in MIT's Stata Center.

Beginning in the fall of 2025, MIT students will have the opportunity to participate in projects funded directly by two major presidential initiatives: MIT’s Health and Life Sciences Collaborative (MIT HEALS) and MIT’s Generative AI Impact Consortium (MGAIC).

Both MIT HEALS and MGAIC are now collaborating with SuperUROP, a two-semester supervised research experience that takes undergraduates through the complete research cycle, from selecting a topic and designing their experiment to writing a technical paper and presenting their results at conferences. While MGAIC’s funding will focus on research in generative AI, responsible AI, and applications that drive societal benefits, MIT HEALS’s funding will support research at the intersection of health, well-being, equity, and technology.

“The support from these presidential initiatives reflects an institutional commitment to undergraduate research and innovation, and aligns well with MIT’s broader vision of a hub for interdisciplinary cooperation and cross-disciplinary impact,” says Asu Ozdaglar, head of the Department of EECS. 

While the program was initially born in, and continues to be administered by, EECS, participation in the program is open to students across the School of Engineering and the School of Science, with leading faculty from several departments offering mentorship and a wide variety of research projects. That variety now grows even broader with the participation of MGAIC and MIT HEALS, and the program administrators hope the new research projects available will appeal to a broad group of MIT student researchers. The new SuperUROP projects also offer the opportunity for research groups and students alike to advance MIT’s strategic priorities, whether developing open, collaborative AI solutions, or catalyzing discovery, innovation, and impact for human health.

A sampling of the projects planned includes: “Animate biosensors for toxic chemical detection,” “Human-AI Collaborative Music Generation,” “Optimization of Natural Fiber Prosthetic Socket Manufacturing Method for Sierra Leone,” and “Enhancing Biodiversity Image Datasets with Generative AI for Improved Species Classification.”

“SuperUROP provides a truly unique experience at the undergraduate level,” says Dina Katabi, Thuan (1990) and Nicole Pham Professor in EECS at MIT, who has supervised several SuperUROP scholars since the program’s inception. “With the full experience of a research cycle, students get a real sense of how a research career could suit them, plus experience in meaningfully communicating their results, all before they decide whether to pursue a graduate degree.”

“SuperUROP students in this cohort will be working on socially impactful, high-priority research topics,” says Joanne Luciano, the SuperUROP program administrator. “These opportunities elevate the undergraduate research experience, provide interdisciplinary exposure, and help with career readiness.” 

For more information, please visit the SuperUROP website. To learn more about giving to SuperUROP, please visit our supporter page.

This is your brain without sleep

Nearly everyone has experienced it: After a night of poor sleep, you don’t feel as alert as you should. Your brain might seem foggy, and your mind drifts off when you should be paying attention.

A new study from MIT reveals what happens inside the brain as these momentary failures of attention occur. The scientists found that during these lapses, a wave of cerebrospinal fluid (CSF) flows out of the brain — a process that typically occurs during sleep and helps to wash away waste products that have built up during the day. This flushing is believed to be necessary for maintaining a healthy, normally functioning brain.

When a person is sleep-deprived, it appears that their body attempts to catch up on this cleansing process by initiating pulses of CSF flow. However, this comes at a cost of dramatically impaired attention.

“If you don’t sleep, the CSF waves start to intrude into wakefulness where normally you wouldn’t see them. However, they come with an attentional tradeoff, where attention fails during the moments that you have this wave of fluid flow,” says Laura Lewis, the Athinoula A. Martinos Associate Professor of Electrical Engineering and Computer Science, a member of MIT’s Institute for Medical Engineering and Science and the Research Laboratory of Electronics, and an associate member of the Picower Institute for Learning and Memory.

Lewis is the senior author of the study, which appears today in Nature Neuroscience. MIT visiting graduate student Zinong Yang is the lead author of the paper.

Flushing the brain

Although sleep is a critical biological process, it’s not known exactly why it is so important. It appears to be essential for maintaining alertness, and it has been well-documented that sleep deprivation leads to impairments of attention and other cognitive functions.

During sleep, the cerebrospinal fluid that cushions the brain helps to remove waste that has built up during the day. In a 2019 study, Lewis and colleagues showed that CSF flow during sleep follows a rhythmic pattern in and out of the brain, and that these flows are linked to changes in brain waves during sleep.

That finding led Lewis to wonder what might happen to CSF flow after sleep deprivation. To explore that question, she and her colleagues recruited 26 volunteers who were tested twice — once following a night of sleep deprivation in the lab, and once when they were well-rested.

In the morning, the researchers monitored several different measures of brain and body function as the participants performed a task that is commonly used to evaluate the effects of sleep deprivation.

During the task, each participant wore an electroencephalogram (EEG) cap that could record brain waves while they were also in a functional magnetic resonance imaging (fMRI) scanner. The researchers used a modified version of fMRI that allowed them to measure not only blood oxygenation in the brain, but also the flow of CSF in and out of the brain. They also measured each subject’s heart rate, breathing rate, and pupil diameter.

The participants performed two attentional tasks while in the fMRI scanner, one visual and one auditory. For the visual task, they had to look at a screen that had a fixed cross. At random intervals, the cross would turn into a square, and the participants were told to press a button whenever they saw this happen. For the auditory task, they would hear a beep instead of seeing a visual transformation.

Sleep-deprived participants performed much worse than well-rested participants on these tasks, as expected. Their response times were slower, and for some of the stimuli, the participants never registered the change at all.

During these momentary lapses of attention, the researchers identified several physiological changes that occurred at the same time. Most significantly, they found a flux of CSF out of the brain just as those lapses occurred. After each lapse, CSF flowed back into the brain.

“The results are suggesting that at the moment that attention fails, this fluid is actually being expelled outward away from the brain. And when attention recovers, it’s drawn back in,” Lewis says.

The researchers hypothesize that when the brain is sleep-deprived, it begins to compensate for the loss of the cleansing that normally occurs during sleep, even though these pulses of CSF flow come with the cost of attention loss.

“One way to think about those events is because your brain is so in need of sleep, it tries its best to enter into a sleep-like state to restore some cognitive functions,” Yang says. “Your brain’s fluid system is trying to restore function by pushing the brain to iterate between high-attention and high-flow states.”

A unified circuit

The researchers also found several other physiological events linked to attentional lapses, including decreases in breathing and heart rate, along with constriction of the pupils. They found that pupil constriction began about 12 seconds before CSF flowed out of the brain, and pupils dilated again after the attentional lapse.

“What’s interesting is it seems like this isn’t just a phenomenon in the brain, it’s also a body-wide event. It suggests that there’s a tight coordination of these systems, where when your attention fails, you might feel it perceptually and psychologically, but it’s also reflecting an event that’s happening throughout the brain and body,” Lewis says.

This close linkage between disparate events may indicate that there is a single circuit that controls both attention and bodily functions such as fluid flow, heart rate, and arousal, according to the researchers.

“These results suggest to us that there’s a unified circuit that’s governing both what we think of as very high-level functions of the brain — our attention, our ability to perceive and respond to the world — and then also really basic fundamental physiological processes like fluid dynamics of the brain, brain-wide blood flow, and blood vessel constriction,” Lewis says.

In this study, the researchers did not explore what circuit might be controlling this switching, but one good candidate, they say, is the noradrenergic system. Recent research has shown that this system, which regulates many cognitive and bodily functions through the neurotransmitter norepinephrine, oscillates during normal sleep.

The research was funded by the National Institutes of Health, a National Defense Science and Engineering Graduate Research Fellowship, a NAWA Fellowship, a McKnight Scholar Award, a Sloan Fellowship, a Pew Biomedical Scholar Award, a One Mind Rising Star Award, and the Simons Collaboration on Plasticity in the Aging Brain.

Kevin Chen wins IROS Toshio Fukuda Young Professional Award

Kevin Y. Chen, Associate Professor in the Department of Electrical Engineering and Computer Science at MIT and Principal Investigator of the Soft and Micro Robotics Laboratory at RLE, has received the Toshio Fukuda Young Professional Award at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). The award recognizes Chen’s pioneering contributions to multimodal dynamic insect-scale robots and soft-actuated aerial robots.

At MIT, Chen leads his lab with the goal of developing microscale robotic systems that demonstrate insect-like locomotive capabilities in aerial, aquatic and terrestrial settings. The lab’s research pursues three main thrusts: (1) the development of high-bandwidth and robust soft actuators and active materials for locomotion and manipulation; (2) the investigation and application of physical phenomena at the millimeter scale (such as surface tension, fluid-structure interaction, friction and electrostatics) in designing multifunctional robots; and (3) the creation of design and fabrication tools suitable for rapid prototyping of hybrid soft-rigid robots.

Chen’s group has achieved several notable research milestones that underscore the scope and impact of their work:

  • The development of a sub-gram hopping and flying microrobot that can traverse challenging terrains including grass, ice and even a floating lotus leaf — achieved by combining flapping-wing lift with a passive elastic leg for hopping.
  • The creation of a new generation of flapping-wing micro-aerial robots driven by soft artificial muscles — capable of hover times exceeding 1000 seconds, performing somersaults in 0.16 s, and precisely tracking challenging trajectories.
  • The demonstration of damage-resilient soft actuators for aerial robots — where actuators can be pierced repeatedly and then “repaired” via laser-assisted recovery — opening the door to operation in harsh environments.
  • Recognition of their work at the institutional and media level: for example, their insect-scale aerial robots were featured in MIT’s “Top 10 Research Stories of 2022” and in broader media outlets like CNN Tech of Good.

By leveraging soft materials, advanced actuation methods and microscale design principles, Chen’s research is laying the groundwork for small-scale robotic systems that could one day perform inspection, surgical operations or environmental monitoring in places inaccessible to larger machines.

The IROS Young Professional Award highlights Chen’s growing prominence in the robotics community and strengthens MIT’s position at the forefront of next-generation robotic locomotion and manipulation research.

EECS 2025 Awards

The Department would like to celebrate the accomplishments and contributions of our incredible EECS community by sharing some of the awards given by the department this year. Congratulations to all the winners!

Louis D. Smullin Award – Negar Reiskarimian


Jerome H. Saltzer Award – Manya Ghobadi


Burgess (1952) & Elizabeth Jamieson Award – Justin Solomon


Burgess (1952) & Elizabeth Jamieson Award – Tamara Broderick


EECS Digital Innovation Award – Max Goldman


Ruth and Joel Spira Award for Excellence in Teaching – Jelena Notaros


Ruth and Joel Spira Award for Excellence in Teaching – Jonathan Ragan-Kelley


EECS Outstanding Educator Award – Jacob Andreas


Kolokotrones Education Award – Shen Shen


Department Head Special Recognition Award

Myung-Hee Vabulas


Richard J. Caloggero Award

Zhi Xuan Tan


Carlton E. Tucker Award for teaching excellence

William A. Brandon
Peter F. Satterthwaite
Josiah J. McMenamy
Julian M. Zanders


Harold L. Hazen Award for teaching excellence

Luke A. Wagner
William Liu
Thana Somsirivattana
Ashar Farooq


Frederick C. Hennie III Award for teaching excellence

Keshav Gupta
Lydia J. Patterson
Pleng Chomphoochan
Lasya A. Balachandran
Emma Y. Jung
Nathan A. Shwatal


Undergraduate Teaching Award for teaching excellence

Jason Li


Jeremy Gerstle UROP Award (in AI)

Andy Li (supervised by Stefanie Mueller)
Alex C. Luchianov (supervised by Stefanie Mueller)


Morais (1986) and Rosenblum (1986) UROP Award

Nikitha Thoduguli (supervised by Manolis Kellis)


Anna Pogosyants UROP Prize

Sarah A. Zhao (supervised by Manolis Kellis)


Robert M. Fano UROP Award

Pascal J.H. Passigan (supervised by Manolis Kellis)


George C. Newton UG Lab Prize

Tori A. Hagenlocker
Cathy Y. Hu
Heba I. Hussein


Northern Telecom/BNR Project Award

Noah Wiley
Luc Gaitskell


David A. Chanen Writing Award

Charles A. Harmon
Alaina Kolli
Drew S. Geoly

Rose N. Alsalman
Zach D. Marinov
Dakota Goldberg


David Adler Memorial EE MEng Thesis Award

Isabelle A. Quaye (supervised by Ronitt Rubinfeld)
Cassia B. Wang (supervised by Manolis Kellis)


Charles & Jennifer Johnson Computer Science MEng Thesis Award

Anna Arpaci-Dusseau (supervised by Xuhao Chen / C. Leiserson)
Emma P. Tysinger (supervised by Manolis Kellis)


Charles & Jennifer Johnson Artificial Intelligence and Decision-Making MEng Thesis Award

Ruowang Zhang (supervised by Stefanie Mueller)
Irene Yu-Lin Huang (supervised by Aude Oliva)


J. Francis Reintjes Excellence in 6-A Industrial Practice Award

Vlada Petrusenko

Michael Lu


Jin Au Kong PhD Thesis in EE

Yiyue Luo (supervised by Wojciech Matusik, Tomas Palacios)


George M. Sprowls PhD Thesis Award in Computer Science

Anish R. Athalye (supervised by Nickolai Zeldovich / M. Frans Kaashoek)
Shyam Narayanan (supervised by Piotr Indyk)


George M. Sprowls PhD Thesis Award in Artificial Intelligence and Decision Making

Andrew Ilyas (supervised by Aleksander Mądry / Constantinos Daskalakis)

Tobia Marcucci (supervised by Russell Tedrake / Pablo Parrilo)


William A. Martin SM Thesis Award in Computer Science

Yuheng Yang (supervised by Mengjia Yan)
Seyoon Ragavan (supervised by Vinod Vaikuntanathan)


Ernst A. Guillemin SM Thesis Award in Electrical Engineering

Laurentiu Lucian Anton (supervised by Marija Ilic)
Emma Wawrzynek (supervised by Jeffrey Lang)


Ernst A. Guillemin SM Thesis Award in Artificial Intelligence and Decision Making

Kiril A. Bangachev (supervised by Guy Bresler)

Haike Xu (supervised by Piotr Indyk)


Behring Foundation Prize

Lucas Shoji