LLMs | Lamini-1: Translation and Interpretation of "Banishing LLM Hallucinations Requires Rethinking Generalization"

Overview: From an experimental standpoint, the paper re-examines the "hallucination" phenomenon that current large language models (LLMs) exhibit during inference and proposes a new method, "Lamini Memory Tuning," to eliminate it. The main points are summarized below:

>> Background and pain points

Current LLM training methods optimize for generalization error, which cannot eliminate "hallucinations" about key facts.

Generalization error does not distinguish models that hallucinate from models that do not.

Eliminating hallucinations requires far more compute than scaling laws call for.

>> Solution

Randomization experiments show that LLMs easily fit random labels without harming generalization (a toy illustration follows this list).

Proposes the "Lamini Memory Tuning" training method, which further trains the model until key facts are recalled exactly.

Develops the Lamini-1 model architecture, built on a massive Mixture of Memory Experts (MoME).
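A minimal toy sketch of such a randomization experiment, under assumptions of my own (the tiny embedding-plus-linear model, the sizes, and the hyperparameters are illustrative, not the paper's setup): a small next-token classifier is trained on purely random "answers" and memorizes them, driving the training loss toward zero.

```python
# Toy sketch of fitting random labels (illustrative only; model and sizes are assumptions).
import torch
import torch.nn as nn

torch.manual_seed(0)
n_facts, vocab, d = 512, 1000, 128               # tiny scale for a quick demo
questions = torch.arange(n_facts)                # one token id per "question"
answers = torch.randint(0, vocab, (n_facts,))    # random-number "answers" used as labels

# A minimal next-token predictor: per-question embedding followed by a linear head.
model = nn.Sequential(nn.Embedding(n_facts, d), nn.Linear(d, vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    loss = nn.functional.cross_entropy(model(questions), answers)
    opt.zero_grad()
    loss.backward()
    opt.step()

# The random labels end up memorized: loss is near zero and recall is (nearly) exact.
acc = (model(questions).argmax(dim=-1) == answers).float().mean()
print(f"final loss {loss.item():.4f}, exact-recall accuracy {acc:.2%}")
```

The point mirrors the paper's claim: an over-parameterized next-token model can memorize arbitrary labels, so low loss on specific facts is a matter of capacity and training rather than a creativity-versus-factuality trade-off.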

>> Core idea

Facts are stored in the MoME experts, and the relevant experts are selected via cross-attention.

The backbone network and attention mechanism are frozen; only the selected experts are updated to solve for each fact exactly (see the sketch after this list).

This greatly reduces the compute cost of training on each fact.
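To make the mechanism concrete, here is a minimal sketch assuming a PyTorch-style implementation (the class name MoMELayer, the key/value expert slots, the top-k routing, and all sizes are my assumptions rather than the Lamini-1 code): the frozen backbone's hidden states cross-attend over a bank of memory experts, and because only the top-k selected slots contribute to the output, only those slots receive nonzero gradients during memory tuning.

```python
# Sketch of a Mixture of Memory Experts (MoME) layer (illustrative, not the Lamini-1 code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoMELayer(nn.Module):
    """A bank of memory experts selected by cross-attention from the hidden states."""
    def __init__(self, d_model: int = 256, n_experts: int = 10_000, top_k: int = 32):
        # n_experts is kept small here; the paper describes millions of memory experts.
        super().__init__()
        self.keys = nn.Embedding(n_experts, d_model)    # expert "addresses"
        self.values = nn.Embedding(n_experts, d_model)  # memorized content
        self.top_k = top_k

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, d_model) hidden states from the frozen backbone.
        scores = h @ self.keys.weight.T                       # cross-attention scores over all experts
        topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)              # attend only to the selected experts
        memory = (weights.unsqueeze(-1) * self.values(topk_idx)).sum(dim=-2)
        return h + memory                                     # add retrieved memory to the residual stream

# Memory tuning: freeze the backbone and its attention, update only the expert slots,
# so the loss on each stored fact can be driven toward zero cheaply.
backbone = nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True)  # stand-in backbone
for p in backbone.parameters():
    p.requires_grad_(False)

mome = MoMELayer()
optimizer = torch.optim.Adam(mome.parameters(), lr=1e-3)

h = backbone(torch.randn(1, 16, 256))   # hidden states for a 16-token toy sequence
out = mome(h)                           # (1, 16, 256) with retrieved memories mixed in
```

Because the non-selected experts never enter the output, their gradients are zero, which is what keeps per-fact updates cheap compared with full fine-tuning.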

>> Key advantages

In theory, the number of facts that can be stored exactly scales with the parameter count.

Systems-level optimization substantially reduces the compute needed to eliminate hallucinations.

A first-generation pilot model demonstrates the framework's effectiveness at factual recall.

Table of Contents

Translation and Interpretation of "Banishing LLM Hallucinations Requires Rethinking Generalization"

Abstract

11 Conclusion


Translation and Interpretation of "Banishing LLM Hallucinations Requires Rethinking Generalization"

Paper: https://arxiv.org/abs/2406.17642

Date: June 25, 2024

Authors: Lamini Team

Abstract

Despite their powerful chat, coding, and reasoning abilities, Large Language Models (LLMs) frequently hallucinate. Conventional wisdom suggests that hallucinations are a consequence of a balance between creativity and factuality, which can be mitigated, but not eliminated, by grounding the LLM in external knowledge sources. Through extensive systematic experiments, we show that these traditional approaches fail to explain why LLMs hallucinate in practice. Specifically, we show that LLMs augmented with a massive Mixture of Memory Experts (MoME) can easily memorize large datasets of random numbers. We corroborate these experimental findings with a theoretical construction showing that simple neural networks trained to predict the next token hallucinate when the training loss is above a threshold as it usually does in practice when training on internet scale data. We interpret our findings by comparing against traditional retrieval methods for mitigating hallucinations. We use our findings to design a first generation model for removing hallucinations - Lamini-1 - that stores facts in a massive mixture of millions of memory experts that are retrieved dynamically.

11 Conclusion

This paper presents a groundbreaking study that challenges the conventional wisdom on large language models (LLMs) and their ability to generalize without hallucinations. We demonstrate that LLMs can easily memorize random labels without increasing their generalization error, contradicting the notion that hallucinations are a consequence of a balance between creativity and factuality. Furthermore, we show that generalization error does not discriminate between models that hallucinate and those that don’t, and that training long enough to remove hallucinations is computationally intensive and may not be feasible on existing systems in 2024. Our study highlights the need for new metrics and approaches to evaluate the ability of LLMs to memorize and recall facts precisely, and suggests that LLMs have sufficient capacity to store large datasets of facts precisely, even when the training data is noisy or random. The findings have significant implications for the development of LLMs, their applications, and related deep neural networks trained with SGD. Our results underscore the importance of rethinking the design and training of these models to mitigate hallucinations and improve factual recall.
