
Unpacking the bias of large language models
In a new study, researchers discover the root cause of a type of bias in LLMs, paving the way for more accurate and reliable AI systems.

Words like “no” and “not” can cause a popular class of AI models to fail unexpectedly in high-stakes settings, such as medical diagnosis.

The CausVid generative AI tool uses a diffusion model to teach an autoregressive (frame-by-frame) system to rapidly produce stable, high-resolution videos.

Training LLMs to self-detoxify their language
A new method from the MIT-IBM Watson AI Lab helps large language models to steer their own responses toward safer, more ethical, value-aligned outputs.

Could LLMs help design our next medicines and materials?
A new method lets users ask, in plain language, for a new molecule with certain properties, and receive a detailed description of how to synthesize it.

Like human brains, large language models reason about diverse data in a general way
A new study shows LLMs represent different data types based on their underlying meaning and reason about data in their dominant language.

Despite its impressive output, generative AI doesn’t have a coherent understanding of the world
Researchers show that even the best-performing large language models don’t form a true model of the world and its rules, and can thus fail unexpectedly on closely related tasks.

Making it easier to verify an AI model’s responses
By allowing users to clearly see the data referenced by a large language model, this tool speeds up manual validation and helps them spot AI errors.

Enhancing LLM collaboration for smarter, more efficient solutions
“Co-LLM” algorithm helps a general-purpose AI model collaborate with an expert large language model by combining the best parts of both answers, leading to more factual responses.
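
As an illustration only, here is a minimal Python sketch of token-level collaboration in the spirit of Co-LLM: a general-purpose model proposes each next token, and the system borrows the token from an expert model whenever the general model’s confidence drops below a threshold. The stub models, their outputs, and the defer_threshold heuristic are assumptions for this example; the actual Co-LLM algorithm learns when to defer rather than applying a fixed confidence cutoff.

```python
from typing import Callable, List, Tuple

# Toy stand-ins for the two models; the names and outputs are hypothetical.
# Each takes the tokens generated so far and returns (next_token, confidence).
def general_model(prefix: List[str]) -> Tuple[str, float]:
    domain_terms = {"COX", "enzymes", "aspirin", "ibuprofen"}
    if set(prefix) & domain_terms:
        return "drug", 0.30      # unsure on specialized biomedical content
    return "the", 0.90           # confident on everyday language

def expert_model(prefix: List[str]) -> Tuple[str, float]:
    return "ibuprofen", 0.85     # a domain-tuned model's suggestion

def collaborative_decode(prompt: List[str],
                         general: Callable[[List[str]], Tuple[str, float]],
                         expert: Callable[[List[str]], Tuple[str, float]],
                         defer_threshold: float = 0.5,
                         max_new_tokens: int = 3) -> List[str]:
    """Greedy decoding that hands a token over to the expert model whenever
    the general model's confidence falls below defer_threshold (a simplified
    stand-in for Co-LLM's learned per-token deferral decision)."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token, confidence = general(tokens)
        if confidence < defer_threshold:
            token, _ = expert(tokens)   # borrow this token from the expert
        tokens.append(token)
    return tokens

print(collaborative_decode(
    ["Which", "drug", "inhibits", "COX", "enzymes", "?"],
    general_model, expert_model, max_new_tokens=1))
```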

Method prevents an AI model from being overconfident about wrong answers
More efficient than other approaches, the “Thermometer” technique could help users know when to trust a large language model.
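
As background, the “Thermometer” technique builds on temperature scaling, in which a model’s logits are divided by a single fitted scalar so its reported confidence better matches its accuracy. The NumPy sketch below shows plain temperature scaling fitted by grid search on synthetic data, not the Thermometer method itself; the nll and fit_temperature helpers, the grid range, and the toy data are all assumptions for illustration.

```python
import numpy as np

def nll(logits: np.ndarray, labels: np.ndarray, temperature: float) -> float:
    """Average negative log-likelihood of the true labels after dividing
    the logits by a candidate temperature."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def fit_temperature(logits: np.ndarray, labels: np.ndarray) -> float:
    """Grid-search the single temperature that minimizes validation NLL."""
    grid = np.linspace(0.5, 5.0, 200)
    return float(min(grid, key=lambda t: nll(logits, labels, t)))

# Toy validation set: random logits and labels stand in for an overconfident
# model whose raw confidence scores are not trustworthy.
rng = np.random.default_rng(0)
logits = rng.normal(scale=4.0, size=(500, 3))
labels = rng.integers(0, 3, size=500)
temperature = fit_temperature(logits, labels)
print(f"fitted temperature: {temperature:.2f}")   # > 1 softens overconfident outputs
```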