Novel AI model inspired by neural dynamics from the brain

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a novel artificial intelligence model inspired by neural oscillations in the brain, with the goal of significantly advancing how machine learning algorithms handle long sequences of data.

AI often struggles with analyzing complex information that unfolds over long periods of time, such as climate trends, biological signals, or financial data. One newer class of AI models, called “state-space models,” has been designed specifically to understand these sequential patterns more effectively. However, existing state-space models often face challenges — they can become unstable or require significant computational resources when processing long data sequences.

To address these issues, CSAIL researchers T. Konstantin Rusch and Daniela Rus have developed what they call “linear oscillatory state-space models” (LinOSS), which leverage principles of forced harmonic oscillators — a concept deeply rooted in physics and observed in biological neural networks. This approach provides stable, expressive, and computationally efficient predictions without overly restrictive conditions on the model parameters.
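
In rough symbolic form, each hidden unit of such a model behaves like a forced harmonic oscillator. The block below is a paraphrase of that idea under the paper's reported setup (a linear second-order system with a diagonal, nonnegative state matrix), not the exact discretized formulation used in LinOSS:

```latex
% Illustrative continuous-time form: x(t) is the hidden state, u(t) the
% input sequence, y(t) the output; A is diagonal with nonnegative
% entries -- the mild condition associated with stability.
\begin{aligned}
x''(t) &= -A\,x(t) + B\,u(t) \\
y(t)   &= C\,x(t) + D\,u(t)
\end{aligned}
```

Because each unit oscillates rather than decaying to zero or blowing up, the hidden state stays bounded over very long sequences, which is where the stability property described below comes from.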

“Our goal was to capture the stability and efficiency seen in biological neural systems and translate these principles into a machine learning framework,” explains Rusch. “With LinOSS, we can now reliably learn long-range interactions, even in sequences spanning hundreds of thousands of data points or more.”

The LinOSS model is unique in ensuring stable predictions while imposing far less restrictive conditions on the model parameters than previous methods. Moreover, the researchers rigorously proved the model’s universal approximation capability, meaning it can approximate any continuous, causal function relating input and output sequences.

Empirical testing demonstrated that LinOSS consistently outperformed existing state-of-the-art models across various demanding sequence classification and forecasting tasks. Notably, LinOSS outperformed the widely-used Mamba model by nearly two times in tasks involving sequences of extreme length.

Recognized for its significance, the research was selected for an oral presentation at ICLR 2025 — an honor awarded to only the top 1 percent of submissions. The MIT researchers anticipate that the LinOSS model could significantly impact any fields that would benefit from accurate and efficient long-horizon forecasting and classification, including health-care analytics, climate science, autonomous driving, and financial forecasting.

“This work exemplifies how mathematical rigor can lead to performance breakthroughs and broad applications,” Rus says. “With LinOSS, we’re providing the scientific community with a powerful tool for understanding and predicting complex systems, bridging the gap between biological inspiration and computational innovation.”

The team expects that the emergence of a new paradigm like LinOSS will give machine-learning practitioners a foundation to build upon. Looking ahead, the researchers plan to apply their model to an even wider range of data modalities. Moreover, they suggest that LinOSS could provide valuable insights into neuroscience, potentially deepening our understanding of the brain itself.

Their work was supported by the Swiss National Science Foundation, the Schmidt AI2050 program, and the U.S. Department of the Air Force Artificial Intelligence Accelerator.

Student Spotlight: Aria Eppinger

This interview is part of a series of short interviews from the Department of EECS, called Student Spotlights. Each Spotlight features a student answering their choice of questions about themselves and life at MIT. Today’s interviewee, Aria Eppinger, graduated with her undergraduate degree in 6-7 Computer Science and Molecular Biology in spring of 2024. This spring, she will complete her MEng in 6-7. Her thesis, supervised by Ford Professor of Engineering Doug Lauffenburger in the Department of Biological Engineering, investigates the biological underpinnings of adverse pregnancy outcomes, including preterm birth and pre-eclampsia, by applying polytope-fitting algorithms.

Tell me about one teacher from your past — here at MIT, at your high school, or even earlier — who had an influence on the person you’ve become.

There are many teachers who had a large impact on my trajectory. I would first like to thank my elementary and middle school teachers for imbuing in me a love of learning. I would also like to thank my high school teachers for not only teaching me the foundations of writing strong arguments, programming, and designing experiments, but also instilling in me the importance of being a balanced person. It can be tempting to be ruled by studies or work, especially when learning and working are so fun. My high school teachers encouraged me to pursue my hobbies, make memories with friends, and spend time with family. As life continues to be hectic, I’m so grateful for this lesson (even if I’m still working on mastering it).

Tell me about one conversation that changed the trajectory of your life.

A number of years ago, I had the opportunity to chat with Warren Buffett. I was nervous at first but soon put at ease by his descriptions of his favorite foods – hamburgers, French fries, and ice cream – and his hitchhiking stories. His kindness impressed and inspired me, and it’s something I carry with me and aim to emulate all these years later.

Do you have any pets? Tell us about them—and if you have pictures, please share!

I have one dog who lives at home with my parents. Dodger, named after “Artful Dodger” in Oliver Twist, is as mischievous as beagles tend to be. We adopted him from a rescue shelter when I was in elementary school.

Dodger (left) and the late Patch (right) shared a doghouse built as a group project by Aria, her brother, father, and grandfather. Photo credit: Francesmary Modugno

Are you a re-reader or a re-watcher—and if so, what are your comfort books, shows, or movies?

I don’t re-read many books or re-watch many movies, but I never tire of Jane Austen’s Pride and Prejudice. I bought myself an ornately bound copy when I was interning in NYC last summer. Austen’s other novels, especially Sense and Sensibility, Persuasion, and Emma, are also favorites, and I’ve seen a fair number of their movie and mini-series adaptations. My favorite adaptation is the 1995 BBC production of Pride and Prejudice because of its faithfulness to the original book and the casting of the leads, as well as the touches and plot deviations added by the producer and director to bring the work to modern audiences. The adaptation is quite long, but I have fond memories of re-watching it with some fellow Austenites at MIT.

Speaking of swimming scenes, Eppinger just finished her final season as a member of the MIT Varsity Swimming and Diving Team, where she competed in distance freestyle and breaststroke events. Photo credit: Sydney Chun

If you had to teach a really in-depth class about one niche topic, what would you pick?

There are two types of people in the world – those who eat to live and those who live to eat. As one of the latter, I would have to teach some sort of in-depth class on food. Perhaps I would teach the science behind baking chocolate cake or churning the perfect ice cream. Or maybe I would teach the biochemistry of digestion. In any case, I would have to have lots of hands-on demos and reserve plenty for taste-testing!

What was the last thing you changed your mind about?

Brisket! I never was a big fan of brisket until I went to a Texas BBQ restaurant near campus, The Smoke Shop BBQ. Growing up, I had never had true BBQ, so I was quite skeptical. However, I enjoyed not only the brisket but also the other dishes. The Brussels sprouts with caramelized onions are probably my favorite dish, but it feels like a crime to say that about a BBQ place!

What are you looking forward to about life after graduation? What do you think you’ll miss about MIT?

I’m looking forward to new adventures after graduation, including working in NYC and traveling to new places. I cross-registered to take Intensive Italian at Harvard this semester and am planning a trip to Italy to practice my Italian, see the historic sites, visit the Vatican, and taste the food. Non vedo l’ora di viaggiare in Italia! (I can’t wait to travel to Italy!)

Eppinger has relished her time at MIT. “College is a special time to live with friends in close proximity and to stay up late working on psets in the Baker lounges.” Photo credit: Karla Ravin

While I’m excited for what lies ahead, I will miss MIT. What a joy it is to spend most of the day learning information from a fire hose, taking a class on a foreign topic because the course catalog description looked fun, talking to people whose viewpoint is very similar or very different from my own, and making friends that will last a lifetime.

Merging design and computer science in creative ways

The speed with which new technologies hit the market is nothing compared to the speed with which talented researchers find creative ways to use them, train them, even turn them into things we can’t live without. One such researcher is MIT MAD Fellow Alexander Htet Kyaw, a graduate student pursuing dual master’s degrees in architectural studies in computation and in electrical engineering and computer science.

Kyaw takes technologies like artificial intelligence, augmented reality, and robotics, and combines them with gesture, speech, and object recognition to create human-AI workflows that have the potential to interact with our built environment, change how we shop, design complex structures, and make physical things.

One of his latest innovations is Curator AI, for which he and his MIT graduate student partners took first prize — $26,000 in OpenAI products and cash — at the MIT AI Conference’s AI Build: Generative Voice AI Solutions, a weeklong hackathon at MIT with final presentations held last fall in New York City. Working with Kyaw were Richa Gupta (architecture) and Bradley Bunch, Nidhish Sagar, and Michael Won — all from the MIT Department of Electrical Engineering and Computer Science (EECS).

Curator AI is designed to streamline online furniture shopping by providing context-aware product recommendations using AI and AR. The platform uses AR to take the dimensions of a room with locations of windows, doors, and existing furniture. Users can then speak to the software to describe what new furnishings they want, and the system will use a vision-language AI model to search for and display various options that match both the user’s prompts and the room’s visual characteristics.
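
As a concrete illustration of that flow, here is a minimal Python sketch; every type, field, and the `vlm_score` callable are assumptions made for illustration, not the team’s actual code or APIs:

```python
from dataclasses import dataclass

@dataclass
class RoomScan:
    width_m: float
    depth_m: float
    photo: object          # room image captured during the AR scan
    free_floor_m2: float   # floor area not blocked by doors, windows, furniture

@dataclass
class Product:
    name: str
    image: object
    footprint_m2: float

def recommend(scan: RoomScan, request: str, catalog: list, vlm_score) -> list:
    """Rank catalog items against the spoken request and the scanned room.

    vlm_score(room_photo, product_image, text) -> float is a placeholder
    for a vision-language model call, not the team's actual API.
    """
    query = f"{request}. Room is {scan.width_m} x {scan.depth_m} m."
    # Keep only items that physically fit the free floor space from the scan.
    fits = [p for p in catalog if p.footprint_m2 <= scan.free_floor_m2]
    # Score each remaining item jointly on the text query and the room photo.
    ranked = sorted(fits, key=lambda p: vlm_score(scan.photo, p.image, query),
                    reverse=True)
    return ranked[:5]  # top suggestions to visualize in AR
```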

“Shoppers can choose from the suggested options, visualize products in AR, and use natural language to ask for modifications to the search, making the furniture selection process more intuitive, efficient, and personalized,” Kyaw says. “The problem we’re trying to solve is that most people don’t know where to start when furnishing a room, so we developed Curator AI to provide smart, contextual recommendations based on what your room looks like.” Although Curator AI was developed for furniture shopping, it could be expanded for use in other markets.

Another example of Kyaw’s work is Estimate, a product that he and three other graduate students created during the MIT Sloan Product Tech Conference’s hackathon in March 2024. The focus of that competition was to help small businesses; Kyaw and team decided to base their work on a painting company in Cambridge that employs 10 people. Estimate uses AR and an object-recognition AI technology to take the exact measurements of a room and generate a detailed cost estimate for a renovation and/or paint job. It also leverages generative AI to display images of the room or rooms as they might look like after painting or renovating, and generates an invoice once the project is complete.

The team won that hackathon and $5,000 in cash. Kyaw’s teammates were Guillaume Allegre, May Khine, and Anna Mathy, all of whom graduated from MIT in 2024 with master’s degrees in business analytics.

In April, Kyaw will give a TEDx talk at his alma mater, Cornell University, in which he’ll describe Curator AI, Estimate, and other projects that use AI, AR, and robotics to design and build things.

One of these projects is Unlog, for which Kyaw combined AR with gesture recognition to build software that takes input from the touch of a fingertip on the surface of a material, or even in the air, to map the dimensions of building components. That’s how Unlog — a towering art sculpture made from ash logs that stands on the Cornell campus — came about.

Gesture Recognition for Feedback-Based Mixed Reality and Robotic Fabrication of the Unlog Tower. Video: Alexander Htet Kyaw

Unlog represents the possibility that structures can be built directly from a whole log, rather than having the log travel to a lumber mill to be turned into planks or two-by-fours, then shipped to a wholesaler or retailer. It’s a good representation of Kyaw’s desire to use building materials in a more sustainable way. A paper on this work, “Gestural Recognition for Feedback-Based Mixed Reality Fabrication: A Case Study of the UnLog Tower,” was published by Kyaw, Leslie Lok, Lawson Spencer, and Sasa Zivkovic in the Proceedings of the 5th International Conference on Computational Design and Robotic Fabrication, January 2024.

Another system Kyaw developed integrates physics simulation, gesture recognition, and AR to design active bending structures built with bamboo poles. Gesture recognition allows users to manipulate digital bamboo modules in AR, and the physics simulation is integrated to visualize how the bamboo bends and where to attach the bamboo poles in ways that create a stable structure. This work appeared in the Proceedings of the 41st Education and Research in Computer Aided Architectural Design in Europe, August 2023, as “Active Bending in Physics-Based Mixed Reality: The Design and Fabrication of a Reconfigurable Modular Bamboo System.”

Last year, Kyaw pitched a similar idea — using bamboo modules to create deployable structures — to MITdesignX, an MIT MAD program that selects promising startups and provides coaching and funding to launch them. Kyaw has since founded BendShelters to build prefabricated, modular bamboo shelters and community spaces for refugees and displaced persons in Myanmar, his home country.

“Where I grew up, in Myanmar, I’ve seen a lot of day-to-day effects of climate change and extreme poverty,” Kyaw says. “There’s a huge refugee crisis in the country, and I want to think about how I can contribute back to my community.”

His work with BendShelters has been recognized by MIT Sandbox, the PKG Social Innovation Challenge, and the Amazon Robotics Prize for Social Good.

At MIT, Kyaw is collaborating with Professor Neil Gershenfeld, director of the Center for Bits and Atoms, and PhD student Miana Smith to use speech recognition, 3D generative AI, and robotic arms to create a workflow that can build objects in an accessible, on-demand, and sustainable way. Kyaw holds bachelor’s degrees in architecture and computer science from Cornell. Last year, he was awarded an SJA Fellowship from the Steve Jobs Archive, which provides funding for projects at the intersection of technology and the arts. 

“I enjoy exploring different kinds of technologies to design and make things,” Kyaw says. “Being part of MAD has made me think about how all my work connects, and helped clarify my intentions. My research vision is to design and develop systems and products that enable natural interactions between humans, machines, and the world around us.” 

3D modeling you can feel

Essential for many industries ranging from Hollywood computer-generated imagery to product design, 3D modeling tools often use text or image prompts to dictate different aspects of visual appearance, like color and form. As much as this makes sense as a first point of contact, these systems are still limited in their realism due to their neglect of something central to the human experience: touch.

Fundamental to the uniqueness of physical objects are their tactile properties, such as roughness, bumpiness, or the feel of materials like wood or stone. Existing modeling methods often require advanced computer-aided design expertise and rarely support tactile feedback that can be crucial for how we perceive and interact with the physical world.

With that in mind, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have created a new system for stylizing 3D models using image prompts, effectively replicating both visual appearance and tactile properties.

The CSAIL team’s “TactStyle” tool allows creators to stylize 3D models based on images while also incorporating the expected tactile properties of the textures. TactStyle separates visual and geometric stylization, enabling the replication of both visual and tactile properties from a single image input.

EECS PhD student Faraz Faruqi, lead author of a new paper on the project, says that TactStyle could have far-reaching applications, extending from home decor and personal accessories to tactile learning tools. TactStyle enables users to download a base design — such as a headphone stand from Thingiverse — and customize it with the styles and textures they desire. In education, learners can explore diverse textures from around the world without leaving the classroom, while in product design, rapid prototyping becomes easier as designers quickly print multiple iterations to refine tactile qualities.

“You could imagine using this sort of system for common objects, such as phone stands and earbud cases, to enable more complex textures and enhance tactile feedback in a variety of ways,” says Faruqi, who co-wrote the paper alongside MIT Associate Professor Stefanie Mueller, leader of the Human-Computer Interaction (HCI) Engineering Group at CSAIL. “You can create tactile educational tools to demonstrate a range of different concepts in fields such as biology, geometry, and topography.”

Traditional methods for replicating textures involve using specialized tactile sensors — such as GelSight, developed at MIT — that physically touch an object to capture its surface microgeometry as a “heightfield.” But this requires having a physical object or its recorded surface for replication. TactStyle allows users to replicate the surface microgeometry by leveraging generative AI to generate a heightfield directly from an image of the texture.

On top of that, for platforms like the 3D printing repository Thingiverse, it’s difficult to take individual designs and customize them. Indeed, if a user lacks sufficient technical background, changing a design manually runs the risk of actually “breaking” it so that it can’t be printed anymore. All of these factors spurred Faruqi to wonder about building a tool that enables customization of downloadable models on a high level, but that also preserves functionality.

In experiments, TactStyle showed significant improvements over traditional stylization methods by generating accurate correlations between a texture’s visual image and its heightfield. This enables the replication of tactile properties directly from an image. One psychophysical experiment showed that users perceive TactStyle’s generated textures as similar to both the expected tactile properties from visual input and the tactile features of the original texture, leading to a unified tactile and visual experience.

TactStyle leverages a preexisting method, called “Style2Fab,” to modify the model’s color channels to match the input image’s visual style. Users first provide an image of the desired texture, and then a fine-tuned variational autoencoder is used to translate the input image into a corresponding heightfield. This heightfield is then applied to modify the model’s geometry to create the tactile properties.
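
A rough sketch of that two-branch pipeline appears below; the `heightfield_model.predict` interface and all names are illustrative assumptions, not the released implementation:

```python
import numpy as np

def stylize(vertices, normals, uv_coords, texture_image,
            heightfield_model, scale=0.5):
    """Apply visual color and tactile geometry from a single texture image."""
    # 1) Visual branch: the image itself supplies the model's color texture
    #    (the paper builds this branch on the earlier Style2Fab method).
    color_texture = texture_image

    # 2) Tactile branch: a generative model predicts a heightfield (surface
    #    microgeometry) from the same image; `predict` is an assumed
    #    interface returning an H x W array of heights.
    heightfield = heightfield_model.predict(texture_image)

    # 3) Sample the heightfield at each vertex's UV coordinate and displace
    #    the vertex along its normal to imprint the tactile texture.
    h, w = heightfield.shape
    px = (uv_coords[:, 0] * (w - 1)).astype(int)
    py = (uv_coords[:, 1] * (h - 1)).astype(int)
    displacement = heightfield[py, px] * scale
    new_vertices = vertices + normals * displacement[:, None]
    return new_vertices, color_texture
```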

The color and geometry stylization modules work in tandem, stylizing both the visual and tactile properties of the 3D model from a single image input. Faruqi says that the core innovation lies in the geometry stylization module, which uses a fine-tuned diffusion model to generate heightfields from texture images — something previous stylization frameworks do not accurately replicate.

Looking ahead, Faruqi says the team aims to extend TactStyle to generate novel 3D models using generative AI with embedded textures. This requires exploring exactly the sort of pipeline needed to replicate both the form and function of the 3D models being fabricated. They also plan to investigate “visuo-haptic mismatches” to create novel experiences with materials that defy conventional expectations, like something that appears to be made of marble but feels like it’s made of wood.

Faruqi and Mueller co-authored the new paper alongside PhD students Maxine Perroni-Scharf and Yunyi Zhu, visiting undergraduate student Jaskaran Singh Walia, visiting master’s student Shuyue Feng, and assistant professor Donald Degraen of the Human Interface Technology (HIT) Lab NZ in New Zealand.

“Periodic table of machine learning” could fuel AI discovery

MIT researchers have created a periodic table that shows how more than 20 classical machine-learning algorithms are connected. The new framework sheds light on how scientists could fuse strategies from different methods to improve existing AI models or come up with new ones.

For instance, the researchers used their framework to combine elements of two different algorithms to create a new image-classification algorithm that performed 8 percent better than current state-of-the-art approaches.

The periodic table stems from one key idea: All these algorithms learn a specific kind of relationship between data points. While each algorithm may accomplish that in a slightly different way, the core mathematics behind each approach is the same.

Building on these insights, the researchers identified a unifying equation that underlies many classical AI algorithms. They used that equation to reframe popular methods and arrange them into a table, categorizing each based on the approximate relationships it learns.

Just like the periodic table of chemical elements, which initially contained blank squares that were later filled in by scientists, the periodic table of machine learning also has empty spaces. These spaces predict where algorithms should exist but have yet to be discovered.

The table gives researchers a toolkit to design new algorithms without the need to rediscover ideas from prior approaches, says Shaden Alshammari, an MIT graduate student and lead author of a paper on this new framework.

“It’s not just a metaphor,” adds Alshammari. “We’re starting to see machine learning as a system with structure that is a space we can explore rather than just guess our way through.”

She is joined on the paper by John Hershey, a researcher at Google AI Perception; Axel Feldmann, an MIT graduate student; William Freeman, the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and senior author Mark Hamilton, an MIT graduate student and senior engineering manager at Microsoft. The research will be presented at the International Conference on Learning Representations.

An accidental equation

The researchers didn’t set out to create a periodic table of machine learning.

After joining the Freeman Lab, Alshammari began studying clustering, a machine-learning technique that classifies images by learning to organize similar images into nearby clusters.

She realized the clustering algorithm she was studying was similar to another classical machine-learning algorithm, called contrastive learning, and began digging deeper into the mathematics. Alshammari found that these two disparate algorithms could be reframed using the same underlying equation.

“We almost got to this unifying equation by accident. Once Shaden discovered that it connects two methods, we just started dreaming up new methods to bring into this framework. Almost every single one we tried could be added in,” Hamilton says.

The framework they created, information contrastive learning (I-Con), shows how a variety of algorithms can be viewed through the lens of this unifying equation. It includes everything from classification algorithms that can detect spam to the deep learning algorithms that power LLMs.

The equation describes how such algorithms find connections between real data points and then approximate those connections internally.

Each algorithm aims to minimize the amount of deviation between the connections it learns to approximate and the real connections in its training data.
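
Schematically, that objective can be written as an average divergence between two conditional “neighborhood” distributions. The notation below is chosen here for illustration and paraphrases the framework rather than quoting it:

```latex
% p(j|i): how strongly point j is connected to point i in the real data
% (labels, augmentations, nearest neighbors, ... depending on the method).
% q_theta(j|i): the algorithm's learned approximation of that connection.
\mathcal{L}(\theta) = \frac{1}{n} \sum_{i=1}^{n}
  D_{\mathrm{KL}}\big(\, p(\cdot \mid i) \;\|\; q_\theta(\cdot \mid i) \,\big)
```

Different choices of the two distributions then recover different classical methods, which is what makes a table-style organization possible.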

They decided to organize I-Con into a periodic table to categorize algorithms based on how points are connected in real datasets and the primary ways algorithms can approximate those connections.

“The work went gradually, but once we had identified the general structure of this equation, it was easier to add more methods to our framework,” Alshammari says.

A tool for discovery

As they arranged the table, the researchers began to see gaps where algorithms could exist, but which hadn’t been invented yet.

The researchers filled in one gap by borrowing ideas from a machine-learning technique called contrastive learning and applying them to image clustering. This resulted in a new algorithm that could classify unlabeled images 8 percent better than another state-of-the-art approach.

They also used I-Con to show how a data debiasing technique developed for contrastive learning could be used to boost the accuracy of clustering algorithms.

In addition, the flexible periodic table allows researchers to add new rows and columns to represent additional types of datapoint connections.

Ultimately, having I-Con as a guide could help machine learning scientists think outside the box, encouraging them to combine ideas in ways they wouldn’t necessarily have thought of otherwise, says Hamilton.

“We’ve shown that just one very elegant equation, rooted in the science of information, gives you rich algorithms spanning 100 years of research in machine learning. This opens up many new avenues for discovery,” he adds.

“Perhaps the most challenging aspect of being a machine-learning researcher these days is the seemingly unlimited number of papers that appear each year. In this context, papers that unify and connect existing algorithms are of great importance, yet they are extremely rare. I-Con provides an excellent example of such a unifying approach and will hopefully inspire others to apply a similar approach to other domains of machine learning,” says Yair Weiss, a professor in the School of Computer Science and Engineering at the Hebrew University of Jerusalem, who was not involved in this research.

This research was funded, in part, by the Air Force Artificial Intelligence Accelerator, the National Science Foundation AI Institute for Artificial Intelligence and Fundamental Interactions, and Quanta Computer.

Training LLMs to self-detoxify their language

As we mature from childhood, our vocabulary — as well as the ways we use it — grows, and our experiences become richer, allowing us to think, reason, and interact with others with specificity and intention. Accordingly, our word choices evolve to align with our personal values, ethics, cultural norms, and views. Over time, most of us develop an internal “guide” that enables us to learn context behind conversation; it also frequently directs us away from sharing information and sentiments that are, or could be, harmful or inappropriate. As it turns out, large language models (LLMs) — which are trained on extensive, public datasets and therefore often have biases and toxic language baked in — can gain a similar capacity to moderate their own language.

A new method from MIT, the MIT-IBM Watson AI Lab, and IBM Research, called self-disciplined autoregressive sampling (SASA), allows LLMs to detoxify their own outputs, without sacrificing fluency. 

Unlike other detoxifying methods, this decoding algorithm learns a boundary between toxic and nontoxic subspaces within the LLM’s own internal representation, without altering the model’s parameters, retraining it, or using an external reward model. During inference, the algorithm assesses the toxicity of the partially generated phrase — the tokens (words) already generated and accepted, along with each potential new token that could reasonably be chosen — by its proximity to the classifier boundary. It then selects a word option that places the phrase in the nontoxic space, ultimately offering a fast and efficient way to generate less-toxic language.

“We wanted to find out a way with any existing language model [that], during the generation process, the decoding can be subject to some human values; the example here we are taking is toxicity,” says the study’s lead author Ching-Yun “Irene” Ko PhD ’24, a former graduate intern with the MIT-IBM Watson AI Lab and a current research scientist at IBM’s Thomas J. Watson Research Center in New York.

Ko’s co-authors include Luca Daniel, professor in the MIT Department of Electrical Engineering and Computer Science (EECS), a member of the MIT-IBM Watson AI Lab, and Ko’s graduate advisor; and several members of the MIT-IBM Watson AI Lab and/or IBM Research — Pin-Yu Chen, Payel Das, Youssef Mroueh, Soham Dan, Georgios Kollias, Subhajit Chaudhury, and Tejaswini Pedapati. The work will be presented at the International Conference on Learning Representations.

Finding the “guardrails”

The training resources behind LLMs almost always include content collected from public spaces like the internet and other readily available datasets. As such, curse words and bullying or unpalatable language are part of the mix, although some of it appears in the context of literary works. It then follows that LLMs can innately produce — or be tricked into generating — dangerous and/or biased content, which often contains disagreeable words or hateful language, even from innocuous prompts. Further, it’s been found that they can learn and amplify language that’s not preferred, or is even detrimental, for many applications and downstream tasks — leading to the need for mitigation or correction strategies.

There are many ways to achieve robust language generation that’s fair and value-aligned. Some methods retrain the LLM with a sanitized dataset, which is costly, takes time, and may alter the LLM’s performance; others steer the decoding with external reward models, using techniques like sampling or beam search, which take longer to run and require more memory. In the case of SASA, Ko, Daniel, and the IBM Research team developed a method that leverages the autoregressive nature of LLMs and, using a decoding-based strategy during the LLM’s inference, gradually steers the generation — one token at a time — away from unsavory or undesired outputs and toward better language.

The research group achieved this by building a linear classifier that operates on the learned subspace from the LLM’s embedding. When LLMs are trained, words with similar meanings are placed close together in vector space and farther away from dissimilar words; the researchers hypothesized that an LLM’s embedding would therefore also capture contextual information, which could be used for detoxification. The researchers used datasets that contained sets of a prompt (the first half of a sentence or thought), a response (the completion of that sentence), and a human-attributed annotation, like toxic or nontoxic, or preferred or not preferred, with continuous labels from 0 to 1 denoting increasing toxicity. A Bayes-optimal classifier was then applied to learn and figuratively draw a line between the binary subspaces within the sentence embeddings, represented by positive values (nontoxic space) and negative values (toxic space).

The SASA system then works by re-weighting the sampling probability of each potential new token, based on its value and the partially generated phrase’s distance to the classifier boundary, with the goal of remaining close to the original sampling distribution.

To illustrate, suppose the LLM is generating potential token #12 in a sentence. It will look over its full vocabulary for a reasonable word based on the 11 words that came before it, and, using techniques like top-k and top-p, it will filter and produce roughly 10 tokens to select from. SASA then evaluates each of those tokens in the partially completed sentence for its proximity to the classifier boundary (i.e., the value of tokens 1-11, plus each potential token 12). Tokens that produce sentences in the positive space are encouraged, while those in the negative space are penalized. Additionally, the farther a token sits from the classifier boundary, the stronger the impact.
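
A minimal sketch of this re-weighting step might look like the following, assuming an `embed` function that maps a token sequence into the LLM’s sentence-embedding space and a learned linear boundary with weights `w` and bias `b` (all names illustrative, not the released implementation):

```python
import numpy as np

def sasa_step(logits, candidate_ids, context_ids, embed, w, b, beta=5.0):
    """Return re-weighted sampling probabilities over the top-k candidates."""
    scores = []
    for tok in candidate_ids:
        x = embed(context_ids + [tok])    # embedding of the extended phrase
        margin = float(np.dot(w, x) + b)  # >0: nontoxic side, <0: toxic side
        # Candidates on the nontoxic side are boosted; the farther a
        # candidate sits from the boundary, the stronger the adjustment.
        scores.append(logits[tok] + beta * margin)
    scores = np.asarray(scores)
    probs = np.exp(scores - scores.max())  # softmax over the candidates
    return probs / probs.sum()             # sampling distribution over top-k
```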

“The goal is to change the autoregressive sampling process by re-weighting the probability of good tokens. If the next token is likely to be toxic given the context, then we are going to reduce the sampling probability for those prone to be toxic tokens,” says Ko. The researchers chose to do it this way “because the things we say, whether it’s benign or not, is subject to the context.”

Tamping down toxicity for value matching

The researchers evaluated their method against several baseline interventions with three LLMs of increasing size, all of them autoregressive transformers: GPT2-Large, Llama2-7b, and Llama 3.1-8b-Instruct, with 762 million, 7 billion, and 8 billion parameters, respectively. For each prompt, the LLM was tasked with completing the sentence/phrase 25 times, and Perspective API scored them from 0 to 1, with anything over 0.5 being toxic. The team looked at two metrics: the average maximum toxicity score over the 25 generations for all the prompts, and the toxic rate, which was the probability of producing at least one toxic phrase over 25 generations. Reduced fluency (and therefore increased perplexity) was also analyzed. SASA was tested on completing the RealToxicityPrompts (RTP), BOLD, and AttaQ datasets, which contain naturally occurring English sentence prompts.

The researchers ramped up the complexity of their trials for detoxification by SASA, beginning with nontoxic prompts from the RTP dataset and looking for harmful sentence completions. Then they escalated to more challenging prompts from RTP that were more likely to produce concerning results, and also applied SASA to the instruction-tuned model to assess whether their technique could further reduce unwanted outputs. They also used the BOLD and AttaQ benchmarks to examine the general applicability of SASA in detoxification. With the BOLD dataset, the researchers further looked for gender bias in language generations and tried to achieve a balanced toxic rate between the genders. Lastly, the team looked at runtime, memory usage, and how SASA could be combined with word filtering to achieve healthy and/or helpful language generation.

“If we think about how human beings think and react in the world, we do see bad things, so it’s not about allowing the language model to see only the good things. It’s about understanding the full spectrum — both good and bad,” says Ko, “and choosing to uphold our values when we speak and act.”

Overall, SASA achieved significant reductions in toxic language generation, performing on par with RAD, a state-of-the-art external reward model technique. However, it was universally observed that stronger detoxification accompanied a decrease in fluency. Before intervention, the LLMs produced more toxic responses for female-labeled prompts than for male-labeled ones; SASA, however, was able to significantly cut down harmful responses, making the rates more equal. Similarly, word filtering on top of SASA did markedly lower toxicity levels, but it also hindered the ability of the LLM to respond coherently.

A great aspect of this work is that it’s a well-defined, constrained optimization problem, says Ko, meaning that balance between open language generation that sounds natural and the need to reduce unwanted language can be achieved and tuned.

Further, Ko says, SASA could work well for multiple attributes in the future: “For human beings, we have multiple human values. We don’t want to say toxic things, but we also want to be truthful, helpful, and loyal … If you were to fine-tune a model for all of these values, it would require more computational resources and, of course, additional training.” On account of the lightweight manner of SASA, it could easily be applied in these circumstances: “If you want to work with multiple values, it’s simply checking the generation’s position in multiple subspaces. It only adds marginal overhead in terms of the compute and parameters,” says Ko, leading to more positive, fair, and principle-aligned language.

This work was supported, in part, by the MIT-IBM Watson AI Lab and the National Science Foundation.

Could LLMs help design our next medicines and materials?

The process of discovering molecules that have the properties needed to create new medicines and materials is cumbersome and expensive, consuming vast computational resources and months of human labor to narrow down the enormous space of potential candidates.

Large language models (LLMs) like ChatGPT could streamline this process, but enabling an LLM to understand and reason about the atoms and bonds that form a molecule, the same way it does with words that form sentences, has presented a scientific stumbling block.

Researchers from MIT and the MIT-IBM Watson AI Lab created a promising approach that augments an LLM with other machine-learning models known as graph-based models, which are specifically designed for generating and predicting molecular structures.

Their method employs a base LLM to interpret natural language queries specifying desired molecular properties. It automatically switches between the base LLM and graph-based AI modules to design the molecule, explain the rationale, and generate a step-by-step plan to synthesize it. It interleaves text, graph, and synthesis step generation, combining words, graphs, and reactions into a common vocabulary for the LLM to consume.

When compared to existing LLM-based approaches, this multimodal technique generated molecules that better matched user specifications and were more likely to have a valid synthesis plan, improving the success ratio from 5 percent to 35 percent.

It also outperformed LLMs that are more than 10 times its size and that design molecules and synthesis routes only with text-based representations, suggesting multimodality is key to the new system’s success.

“This could hopefully be an end-to-end solution where, from start to finish, we would automate the entire process of designing and making a molecule. If an LLM could just give you the answer in a few seconds, it would be a huge time-saver for pharmaceutical companies,” says Michael Sun, an MIT graduate student and co-author of a paper on this technique.

Sun’s co-authors include lead author Gang Liu, a graduate student at the University of Notre Dame; Wojciech Matusik, a professor of electrical engineering and computer science at MIT who leads the Computational Design and Fabrication Group within the Computer Science and Artificial Intelligence Laboratory (CSAIL); Meng Jiang, associate professor at the University of Notre Dame; and senior author Jie Chen, a senior research scientist and manager in the MIT-IBM Watson AI Lab. The research will be presented at the International Conference on Learning Representations.

Best of both worlds

Large language models aren’t built to understand the nuances of chemistry, which is one reason they struggle with inverse molecular design, a process of identifying molecular structures that have certain functions or properties.

LLMs convert text into representations called tokens, which they use to sequentially predict the next word in a sentence. But molecules are “graph structures,” composed of atoms and bonds with no particular ordering, making them difficult to encode as sequential text.

On the other hand, powerful graph-based AI models represent atoms and molecular bonds as interconnected nodes and edges in a graph. While these models are popular for inverse molecular design, they require complex inputs, can’t understand natural language, and yield results that can be difficult to interpret.

The MIT researchers combined an LLM with graph-based AI models into a unified framework that gets the best of both worlds.

Llamole, which stands for large language model for molecular discovery, uses a base LLM as a gatekeeper to understand a user’s query — a plain-language request for a molecule with certain properties.

For instance, perhaps a user seeks a molecule that can penetrate the blood-brain barrier and inhibit HIV, given that it has a molecular weight of 209 and certain bond characteristics.

As the LLM predicts text in response to the query, it switches between graph modules.

One module uses a graph diffusion model to generate the molecular structure conditioned on the input requirements. A second module uses a graph neural network to encode the generated molecular structure back into tokens for the LLM to consume. The final graph module is a graph reaction predictor, which takes an intermediate molecular structure as input and predicts a reaction step, searching for the exact set of steps needed to make the molecule from basic building blocks.

The researchers created a new type of trigger token that tells the LLM when to activate each module. When the LLM predicts a “design” trigger token, it switches to the module that sketches a molecular structure, and when it predicts a “retro” trigger token, it switches to the retrosynthetic planning module that predicts the next reaction step.

“The beauty of this is that everything the LLM generates before activating a particular module gets fed into that module itself. The module is learning to operate in a way that is consistent with what came before,” Sun says.

In the same manner, the output of each module is encoded and fed back into the generation process of the LLM, so it understands what each module did and will continue predicting tokens based on those data.
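
Put together, the control flow might look like the following sketch; the module names, trigger strings, and LLM interface here are illustrative assumptions, not the released code:

```python
def generate(llm, modules, prompt, max_steps=256):
    """Interleave LLM token prediction with graph-module calls."""
    tokens = llm.tokenize(prompt)
    for _ in range(max_steps):
        tok = llm.next_token(tokens)  # ordinary autoregressive prediction
        tokens.append(tok)
        if tok == "<design>":
            # Graph diffusion module sketches a molecular structure,
            # conditioned on everything the LLM has generated so far.
            mol = modules["diffusion"].generate(tokens)
            # A graph neural network encodes the structure back into
            # tokens so the LLM can keep reasoning about it.
            tokens += modules["gnn_encoder"].encode(mol)
        elif tok == "<retro>":
            # The reaction predictor proposes the next retrosynthesis step.
            step = modules["reaction"].predict(tokens)
            tokens += modules["gnn_encoder"].encode(step)
        elif tok == llm.eos:
            break
    return llm.detokenize(tokens)
```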

Better, simpler molecular structures

In the end, Llamole outputs an image of the molecular structure, a textual description of the molecule, and a step-by-step synthesis plan that provides the details of how to make it, down to individual chemical reactions.

In experiments involving designing molecules that matched user specifications, Llamole outperformed 10 standard LLMs, four fine-tuned LLMs, and a state-of-the-art domain-specific method. At the same time, it boosted the retrosynthetic planning success rate from 5 percent to 35 percent by generating molecules that are higher-quality, which means they had simpler structures and lower-cost building blocks.

“On their own, LLMs struggle to figure out how to synthesize molecules because it requires a lot of multistep planning. Our method can generate better molecular structures that are also easier to synthesize,” Liu says.

To train and evaluate Llamole, the researchers built two datasets from scratch since existing datasets of molecular structures didn’t contain enough details. They augmented hundreds of thousands of patented molecules with AI-generated natural language descriptions and customized description templates.

The dataset they built to fine-tune the LLM includes templates related to 10 molecular properties, so one limitation of Llamole is that it is trained to design molecules considering only those 10 numerical properties.

In future work, the researchers want to generalize Llamole so it can incorporate any molecular property. In addition, they plan to improve the graph modules to boost Llamole’s retrosynthesis success rate.

And in the long run, they hope to use this approach to go beyond molecules, creating multimodal LLMs that can handle other types of graph-based data, such as interconnected sensors in a power grid or transactions in a financial market.

“Llamole demonstrates the feasibility of using large language models as an interface to complex data beyond textual description, and we anticipate them to be a foundation that interacts with other AI algorithms to solve any graph problems,” says Chen.

This research is funded, in part, by the MIT-IBM Watson AI Lab, the National Science Foundation, and the Office of Naval Research.

Hopping gives this tiny robot a leg up

Insect-scale robots can squeeze into places their larger counterparts can’t, like deep into a collapsed building to search for survivors after an earthquake.

However, as they move through the rubble, tiny crawling robots might encounter tall obstacles they can’t climb over or slanted surfaces they will slide down. While aerial robots could avoid these hazards, the amount of energy required for flight would severely limit how far the robot can travel into the wreckage before it needs to return to base and recharge.

To get the best of both locomotion methods, MIT researchers developed a hopping robot that can leap over tall obstacles and jump across slanted or uneven surfaces, while using far less energy than an aerial robot.

The hopping robot, which is smaller than a human thumb and weighs less than a paperclip, has a springy leg that propels it off the ground, and four flapping-wing modules that give it lift and control its orientation.

The robot can jump about 20 centimeters into the air, or four times its height, at a lateral speed of about 30 centimeters per second, and has no trouble hopping across ice, wet surfaces, and uneven soil, or even onto a hovering drone. All the while, the hopping robot consumes about 60 percent less energy than its flying cousin.

Due to its light weight and durability, and the energy efficiency of the hopping process, the robot could carry about 10 times more payload than a similar-sized aerial robot, opening the door to many new applications.

“Being able to put batteries, circuits, and sensors on board has become much more feasible with a hopping robot than a flying one. Our hope is that one day this robot could go out of the lab and be useful in real-world scenarios,” says Yi-Hsuan (Nemo) Hsiao, an MIT graduate student and co-lead author of a paper on the hopping robot.

Hsiao is joined on the paper by co-lead authors Songnan Bai, a research assistant professor at The University of Hong Kong; and Zhongtao Guan, an incoming MIT graduate student who completed this work as a visiting undergraduate; as well as Suhan Kim and Zhijian Ren of MIT; and senior authors Pakpong Chirarattananon, an associate professor at the City University of Hong Kong; and Kevin Chen, an associate professor in the MIT Department of Electrical Engineering and Computer Science and head of the Soft and Micro Robotics Laboratory within the Research Laboratory of Electronics. The research appears today in Science Advances.

Maximizing efficiency

Jumping is common among insects, from fleas that leap onto new hosts to grasshoppers that bound around a meadow. While jumping is less common among insect-scale robots, which usually fly or crawl, hopping affords many advantages for energy efficiency.

When a robot hops, it converts potential energy, which comes from its height off the ground, into kinetic energy as it falls. When it strikes the ground, that kinetic energy is stored as elastic potential energy in its leg, then converted back into kinetic energy as the robot rises, and so on.

To maximize efficiency of this process, the MIT robot is fitted with an elastic leg made from a compression spring, which is akin to the spring on a click-top pen. This spring converts the robot’s downward velocity to upward velocity when it strikes the ground.

“If you have an ideal spring, your robot can just hop along without losing any energy. But since our spring is not quite ideal, we use the flapping modules to compensate for the small amount of energy it loses when it makes contact with the ground,” Hsiao explains.

As the robot bounces back up into the air, the flapping wings provide lift, while ensuring the robot remains upright and has the correct orientation for its next jump. Its four flapping-wing mechanisms are powered by soft actuators, or artificial muscles, that are durable enough to endure repeated impacts with the ground without being damaged.

“We have been using the same robot for this entire series of experiments, and we never needed to stop and fix it,” Hsiao adds.

Key to the robot’s performance is a fast control mechanism that determines how the robot should be oriented for its next jump. Sensing is performed using an external motion-tracking system, and an observer algorithm computes the necessary control information using sensor measurements.

As the robot hops, it follows a ballistic trajectory, arcing through the air. At the peak of that trajectory, it estimates its landing position. Then, based on its target landing point, the controller calculates the desired takeoff velocity for the next jump. While airborne, the robot flaps its wings to adjust its orientation so it strikes the ground with the correct angle and axis to move in the proper direction and at the right speed.
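
As a back-of-envelope sketch of that hop-to-hop logic (simplified, with all symbols assumed for illustration; the real controller relies on an observer fed by external motion tracking):

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def plan_next_hop(pos, vel, target):
    """At the trajectory peak, predict the landing spot and pick a takeoff velocity."""
    pos, vel = np.asarray(pos, float), np.asarray(vel, float)
    # Ballistic prediction: time to fall from the peak height to the ground.
    t_fall = np.sqrt(2 * max(pos[2], 0.0) / G)
    landing = pos[:2] + vel[:2] * t_fall  # lateral drift while falling

    # Choose the next takeoff velocity from the landing-to-target error.
    err = np.asarray(target, float) - landing
    v_up = 2.0               # m/s; apex height = v_up^2 / (2g), about 0.2 m
    t_hop = 2 * v_up / G     # time aloft for the next (symmetric) hop
    v_lateral = err / t_hop  # lateral velocity that closes the error
    return np.array([v_lateral[0], v_lateral[1], v_up])
```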

Durability and flexibility

The researchers put the hopping robot, and its control mechanism, to the test on a variety of surfaces, including grass, ice, wet glass, and uneven soil — it successfully traversed all surfaces. The robot could even hop on a surface that was dynamically tilting.

“The robot doesn’t really care about the angle of the surface it is landing on. As long as it doesn’t slip when it strikes the ground, it will be fine,” Hsiao says.

Since the controller can handle multiple terrains, the robot can easily transition from one surface to another without missing a beat.

For instance, hopping across grass requires more thrust than hopping across glass, since blades of grass have a damping effect that reduces the robot’s jump height. The controller can pump more energy into the robot’s wings during its aerial phase to compensate.

Because of its small size and light weight, the robot has a very small moment of inertia, which makes it more agile than larger robots and better able to withstand collisions.

The researchers showcased its agility by demonstrating acrobatic flips. The featherweight robot could also hop onto an airborne drone without damaging either device, which could be useful in collaborative tasks.

In addition, while the team demonstrated a hopping robot that carried twice its weight, the maximum payload may be much higher. Adding more weight doesn’t hurt the robot’s efficiency. Rather, the efficiency of the spring is the most significant factor that limits how much the robot can carry.

Moving forward, the researchers plan to leverage its ability to carry heavy loads by installing batteries, sensors, and other circuits onto the robot, in the hopes of enabling it to hop autonomously outside the lab.

“Multimodal robots (those combining multiple movement strategies) are generally challenging and particularly impressive at such a tiny scale. The versatility of this tiny multimodal robot — flipping, jumping on rough or moving terrain, and even another robot — makes it even more impressive,” says Justin Yim, assistant professor at the University of Illinois at Urbana-Champaign, who was not involved with this work. “Continuous hopping shown in this research enables agile and efficient locomotion in environments with many large obstacles.”

This research is funded, in part, by the U.S. National Science Foundation and the MIT MISTI program. Chirarattananon was supported by the Research Grants Council of the Hong Kong Special Administrative Region of China. Hsiao is supported by a MathWorks Fellowship, and Kim is supported by a Zakhartchenko Fellowship.

New method efficiently safeguards sensitive AI training data

Data privacy comes with a cost. There are security techniques that protect sensitive user data, like customer addresses, from attackers who may attempt to extract them from AI models — but they often make those models less accurate.

MIT researchers recently developed a framework, based on a new privacy metric called PAC Privacy, that could maintain the performance of an AI model while ensuring sensitive data, such as medical images or financial records, remain safe from attackers. Now, they’ve taken this work a step further by making their technique more computationally efficient, improving the tradeoff between accuracy and privacy, and creating a formal template that can be used to privatize virtually any algorithm without needing access to that algorithm’s inner workings.

The team utilized their new version of PAC Privacy to privatize several classic algorithms for data analysis and machine-learning tasks.

They also demonstrated that more “stable” algorithms are easier to privatize with their method. A stable algorithm’s predictions remain consistent even when its training data are slightly modified. Greater stability helps an algorithm make more accurate predictions on previously unseen data.

The researchers say the increased efficiency of the new PAC Privacy framework, and the four-step template one can follow to implement it, would make the technique easier to deploy in real-world situations.

“We tend to consider robustness and privacy as unrelated to, or perhaps even in conflict with, constructing a high-performance algorithm. First, we make a working algorithm, then we make it robust, and then private. We’ve shown that is not always the right framing. If you make your algorithm perform better in a variety of settings, you can essentially get privacy for free,” says Mayuri Sridhar, an MIT graduate student and lead author of a paper on this privacy framework.

She is joined in the paper by Hanshen Xiao PhD ’24, who will start as an assistant professor at Purdue University in the fall; and senior author Srini Devadas, the Edwin Sibley Webster Professor of Electrical Engineering at MIT. The research will be presented at the IEEE Symposium on Security and Privacy.

Estimating noise

To protect sensitive data that were used to train an AI model, engineers often add noise, or generic randomness, to the model so it becomes harder for an adversary to guess the original training data. This noise reduces a model’s accuracy, so the less noise one can add, the better.

PAC Privacy automatically estimates the smallest amount of noise one needs to add to an algorithm to achieve a desired level of privacy.

The original PAC Privacy algorithm runs a user’s AI model many times on different samples of a dataset. It measures the variance as well as correlations among these many outputs and uses this information to estimate how much noise needs to be added to protect the data.

This new variant of PAC Privacy works the same way but does not need to represent the entire matrix of data correlations across the outputs; it just needs the output variances.

“Because the thing you are estimating is much, much smaller than the entire covariance matrix, you can do it much, much faster,” Sridhar explains. This means that one can scale up to much larger datasets.

Adding noise can hurt the utility of the results, and it is important to minimize utility loss. Due to computational cost, the original PAC Privacy algorithm was limited to adding isotropic noise, which is added uniformly in all directions. Because the new variant estimates anisotropic noise, which is tailored to specific characteristics of the training data, a user could add less overall noise to achieve the same level of privacy, boosting the accuracy of the privatized algorithm.
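
A toy sketch of that variance-based, anisotropic noise estimate might look like this (illustrative only; the subsampling scheme, constants, and guarantees in the actual PAC Privacy analysis are more involved):

```python
import numpy as np

def privatize(algorithm, data, n_trials=64, subsample=0.5, noise_scale=1.0):
    """Run the algorithm on random subsamples, then add per-coordinate noise.

    `algorithm` maps a NumPy array of records to a fixed-length output vector.
    """
    rng = np.random.default_rng(0)
    outputs = []
    for _ in range(n_trials):
        idx = rng.choice(len(data), int(subsample * len(data)), replace=False)
        outputs.append(np.asarray(algorithm(data[idx])))
    outputs = np.stack(outputs)

    # Only per-coordinate variances are needed, not the full covariance
    # matrix across outputs -- which is what makes this variant cheap.
    var = outputs.var(axis=0)

    # Anisotropic Gaussian noise: coordinates whose outputs are already
    # stable receive little noise; unstable coordinates receive more.
    result = np.asarray(algorithm(data))
    return result + rng.normal(0.0, noise_scale * np.sqrt(var))
```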

Privacy and stability

As she studied PAC Privacy, Sridhar hypothesized that more stable algorithms would be easier to privatize with this technique. She used the more efficient variant of PAC Privacy to test this theory on several classical algorithms.

Algorithms that are more stable have less variance in their outputs when their training data change slightly. PAC Privacy breaks a dataset into chunks, runs the algorithm on each chunk of data, and measures the variance among outputs. The greater the variance, the more noise must be added to privatize the algorithm.

Employing stability techniques to decrease the variance in an algorithm’s outputs would also reduce the amount of noise that needs to be added to privatize it, she explains.
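A toy example of this chunk-and-measure idea, using synthetic data rather than anything from the study: the sample mean is a stable statistic, while the sample maximum is highly sensitive to individual records, so the maximum's outputs vary far more across chunks and would need correspondingly more privatizing noise.

```python
import numpy as np

# Synthetic dataset, split into chunks as PAC Privacy does.
rng = np.random.default_rng(0)
data = rng.normal(size=10_000)
chunks = np.split(data, 100)  # 100 chunks of 100 records each

mean_outputs = [chunk.mean() for chunk in chunks]  # stable statistic
max_outputs = [chunk.max() for chunk in chunks]    # unstable statistic

print("variance of the mean outputs:", np.var(mean_outputs))  # small
print("variance of the max outputs: ", np.var(max_outputs))   # much larger
```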

“In the best cases, we can get these win-win scenarios,” she says.

The team showed that these privacy guarantees remained strong regardless of which algorithm they tested, and that the new variant of PAC Privacy required an order of magnitude fewer trials to estimate the noise. They also tested the method in attack simulations, demonstrating that its privacy guarantees could withstand state-of-the-art attacks.

“We want to explore how algorithms could be co-designed with PAC Privacy, so the algorithm is more stable, secure, and robust from the beginning,” Devadas says. The researchers also want to test their method with more complex algorithms and further explore the privacy-utility tradeoff.

“The question now is: When do these win-win situations happen, and how can we make them happen more often?” Sridhar says.

“I think the key advantage PAC Privacy has in this setting over other privacy definitions is that it is a black box — you don’t need to manually analyze each individual query to privatize the results. It can be done completely automatically. We are actively building a PAC-enabled database by extending existing SQL engines to support practical, automated, and efficient private data analytics,” says Xiangyao Yu, an assistant professor in the computer sciences department at the University of Wisconsin at Madison, who was not involved with this study.

This research is supported, in part, by Cisco Systems, Capital One, the U.S. Department of Defense, and a MathWorks Fellowship.

Student Spotlight: YongYan (Crystal) Liang

This interview is part of a series of short interviews from the Department of EECS, called Student Spotlights. Each Spotlight features a student answering their choice of questions about themselves and life at MIT. Today’s interviewee, YongYan (Crystal) Liang, is a senior majoring in 6-2, Electrical Engineering and Computer Science. Liang has a particular interest in bioengineering and medical devices, which led her to join the Living Machines track as part of NEET. A SuperUROP scholar, Liang was supported by the Nadar Foundation Undergraduate Research and Innovation Scholar award for her project, which focused on steering systems for intravascular drug delivery devices. A world traveler, Liang has also taught robotics to students in MISTI GTL (Global Teaching Labs) programs in Korea and Germany, and is involved with the Terrascope and MedLinks communities. She took time out of her busy schedule to answer a selection of questions about her experiences at MIT!

Do you have a bucket list? If so, share one or two of the items on it!

I’d like to be proficient in at least five languages in a conversational sense (though probably not at a working proficiency level). Currently, I’m fluent in English and can speak Cantonese and Mandarin. I also have a 1600+ day Duolingo streak where I’m trying to learn the foundations of a few languages, including German, Korean, Japanese, and Russian.

Liang in Genoa, Italy.

Another bucket list item I have is to try every martial art/combat sport there is, even if it’s just an introduction class. So far, I’ve practiced Taekwondo for a few years, taken a few lessons in Boxing/Kickboxing, and dabbled in beginners’ classes for Karate, Krav Maga, and Brazilian Jiujitsu. I’ll probably try to take Judo, Aikido, and other classes this upcoming year! It would also be pretty epic to be a 4th dan black belt one day, though that may take a decade or two…

Liang in Pisa, Italy.

If you had to teach a really in-depth class about one niche topic, what would you pick?

Personally, I think artificial organs are pretty awesome! I would probably talk about the fusion of engineering with our bodies, and organ enhancement. This might include adding functionalities and possible organ regeneration, so that those waiting for organ donations can be helped without being morally conflicted by waiting for another person’s downfall. I’ve previously done research in several BioEECS-related labs that I’d love to talk about as well. This includes the Traverso lab at Pappalardo, briefly in the Edelman lab at IMES (the Institute for Medical Engineering and Science), the Langer Lab at the Koch Institute for Integrative Cancer Research, as well as in the MIT Media Lab with the Conformable Decoders and Biomechatronics groups! I also contributed to a recently published paper related to gastrointestinal devices: OSIRIS.

If you suddenly won the lottery, what would you spend some of the money on?

I would make sure my mom got most of the money. The first thing we’d do is probably go house shopping around the world and buy properties in great travel destinations, then go around and live in said properties. We would do this on rotation with our friends until we ran out of money, then put the properties up for rent and use the money to open a restaurant with my mom’s recipes as the menu. Then I’d get to eat her food forever 🙂

Liang shares a special moment with her mom in front of the Great Dome.

What do you believe is an underrated invention or technology? Why’s it so important?

I feel like many people wear glasses or put on contacts nowadays and don’t really think twice about it, glossing over how cool it is that we can fix bad sight and how critical sight is for our survival. If a zombie apocalypse happened and my glasses broke, it would be over for me 🙁 And don’t get me started on the invention of the indoor toilet and plumbing systems…

Are you a re-reader or a re-watcher—and if so, what are your comfort books, shows, or movies?

I’m both a re-reader and a re-watcher! I have a lot of fun binging webtoons and dramas. I’m also a huge Marvel fan, although recently it’s been hit or miss. Action and rom-coms are my kinda vibes, and occasionally I do watch some anime. If I’m bored, I usually rewatch some MCU movies or Fairy Tail, or read some isekai genre stories.

Crystal hangs out with Iron Man and the Hulk in Changwon, Korea.

It’s time to get on the shuttle to the first Mars colony, and you can only bring one personal item. What are you going to bring along with you?

My first thought was my phone, but I feel like that may be too standard of an answer. If we were talking about the fantasy realm, I might ask Stephen Strange to borrow his sling ring to open more portals to link the Earth and Mars. As to why he wouldn’t have just come with us in the first place, I don’t know, maybe he’s too busy fighting aliens or something?

What are you looking forward to about life after graduation? What do you think you’ll miss about MIT?

I won’t be missing dining hall food very much, that’s for sure. (Except for the amazing oatmeal from one of the Maseeh dining hall chefs, Sum!) I am, however, excited to live the 9-5 life for a few years and have my weekends back. I’ll miss my friends dearly since everyone will be so spread out across the States and abroad. I’ll miss the nights we spent watching movies, playing games, cooking, eating, and yapping away. I’m excited to see everyone grow and take another step closer to their dreams. It will be fun visiting them and being able to explore the world at the same time! For more immediate plans, I’ll be going back to Apple this summer to intern again and will finish my MEng with the 6A program at Cadence. Afterwards, I shall see where life takes me!

Liang in Berlin, Germany.