
Unpacking the bias of large language models
In a new study, researchers discover the root cause of a type of bias in LLMs, paving the way for more accurate and reliable AI systems.

Words like “no” and “not” can cause a popular class of AI models to fail unexpectedly in high-stakes settings, such as medical diagnosis.

The CausVid generative AI tool uses a diffusion model to teach an autoregressive (frame-by-frame) system to rapidly produce stable, high-resolution videos.

Training LLMs to self-detoxify their language
A new method from the MIT-IBM Watson AI Lab helps large language models to steer their own responses toward safer, more ethical, value-aligned outputs.

Could LLMs help design our next medicines and materials?
A new method lets users ask, in plain language, for a new molecule with certain properties, and receive a detailed description of how to synthesize it.

Like human brains, large language models reason about diverse data in a general way
A new study shows LLMs represent different data types based on their underlying meaning and reason about data in their dominant language.

Despite its impressive output, generative AI doesn’t have a coherent understanding of the world
Researchers show that even the best-performing large language models don’t form a true model of the world and its rules, and can thus fail unexpectedly on closely related tasks.

Making it easier to verify an AI model’s responses
By allowing users to clearly see the data referenced by a large language model, this tool speeds up manual validation and helps them spot AI errors.

Enhancing LLM collaboration for smarter, more efficient solutions
“Co-LLM” algorithm helps a general-purpose AI model collaborate with an expert large language model by combining the best parts of both answers, leading to more factual responses.
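
As an illustration only, here is a minimal Python sketch of token-level collaboration in the spirit of Co-LLM: a general-purpose model proposes each next token, and the system borrows the token from an expert model whenever the general model’s confidence drops below a threshold. The stub models, their outputs, and the defer_threshold heuristic are assumptions for this example; the actual Co-LLM algorithm learns when to defer rather than applying a fixed confidence cutoff.

```python
from typing import Callable, List, Tuple

# Toy stand-ins for the two models; the names and outputs are hypothetical.
# Each takes the tokens generated so far and returns (next_token, confidence).
def general_model(prefix: List[str]) -> Tuple[str, float]:
    domain_terms = {"COX", "enzymes", "aspirin", "ibuprofen"}
    if set(prefix) & domain_terms:
        return "drug", 0.30      # unsure on specialized biomedical content
    return "the", 0.90           # confident on everyday language

def expert_model(prefix: List[str]) -> Tuple[str, float]:
    return "ibuprofen", 0.85     # a domain-tuned model's suggestion

def collaborative_decode(prompt: List[str],
                         general: Callable[[List[str]], Tuple[str, float]],
                         expert: Callable[[List[str]], Tuple[str, float]],
                         defer_threshold: float = 0.5,
                         max_new_tokens: int = 3) -> List[str]:
    """Greedy decoding that hands a token over to the expert model whenever
    the general model's confidence falls below defer_threshold (a simplified
    stand-in for Co-LLM's learned per-token deferral decision)."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token, confidence = general(tokens)
        if confidence < defer_threshold:
            token, _ = expert(tokens)   # borrow this token from the expert
        tokens.append(token)
    return tokens

print(collaborative_decode(
    ["Which", "drug", "inhibits", "COX", "enzymes", "?"],
    general_model, expert_model, max_new_tokens=1))
```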

Method prevents an AI model from being overconfident about wrong answers
More efficient than other approaches, the “Thermometer” technique could help users know when to trust a large language model.
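
As background, the “Thermometer” technique builds on temperature scaling, in which a model’s logits are divided by a single fitted scalar so its reported confidence better matches its accuracy. The NumPy sketch below shows plain temperature scaling fitted by grid search on synthetic data, not the Thermometer method itself; the nll and fit_temperature helpers, the grid range, and the toy data are all assumptions for illustration.

```python
import numpy as np

def nll(logits: np.ndarray, labels: np.ndarray, temperature: float) -> float:
    """Average negative log-likelihood of the true labels after dividing
    the logits by a candidate temperature."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def fit_temperature(logits: np.ndarray, labels: np.ndarray) -> float:
    """Grid-search the single temperature that minimizes validation NLL."""
    grid = np.linspace(0.5, 5.0, 200)
    return float(min(grid, key=lambda t: nll(logits, labels, t)))

# Toy validation set: random logits and labels stand in for an overconfident
# model whose raw confidence scores are not trustworthy.
rng = np.random.default_rng(0)
logits = rng.normal(scale=4.0, size=(500, 3))
labels = rng.integers(0, 3, size=500)
temperature = fit_temperature(logits, labels)
print(f"fitted temperature: {temperature:.2f}")   # > 1 softens overconfident outputs
```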