A new training method improves the reliability of AI confidence estimates without sacrificing performance, addressing a root cause of hallucination in reasoning models.
MIT-IBM Watson AI Lab researchers have developed a universal guide for predicting how large language models will perform, based on the performance of smaller models in the same family.
Language models track changing situations using clever arithmetic shortcuts rather than sequential, step-by-step updates. By controlling when each approach is used, engineers could improve the systems’ capabilities.