Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models

Qitan Lv, Jie Wang, Hanzhu Chen, Bin Li, Yongdong Zhang, Feng Wu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:33594-33623, 2024.

Abstract

Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language models (RALMs), which enhance models with up-to-date knowledge, have emerged as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when the retrieved contexts are lengthy. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method that focuses on key texts at different levels of granularity, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: a recaller, a scorer, and a selector. First, the recaller applies a knowledge graph to extract potential key entities from a given context. Second, the scorer measures the importance of each entity by calculating its contextual weight. Finally, the selector selects high-contextual-weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, yielding an improvement of over 30% in the F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering.
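
To make the recaller–scorer–selector pipeline concrete, below is a minimal Python sketch assembled from the abstract alone. Everything in it is a hypothetical stand-in: the toy knowledge-graph entity set, the frequency-based contextual weight, the mean-plus-one-standard-deviation cut-off used as the "dynamic threshold", and the `**...**` highlight markers are illustrative choices, not the authors' actual method or released code.

```python
import re
from statistics import mean, stdev

# Toy knowledge-graph entity vocabulary (a stand-in for a real KG such as Wikidata).
KG_ENTITIES = {"Alan Turing", "Turing machine", "Enigma", "Bletchley Park"}


def recaller(context: str) -> list[str]:
    """Recall candidate key entities: KG entities that appear in the context."""
    return [e for e in KG_ENTITIES if e.lower() in context.lower()]


def scorer(entities: list[str], context: str, query: str) -> dict[str, float]:
    """Assign each entity a contextual weight.

    Hypothetical proxy: raw frequency in the context, doubled when the
    entity also appears in the query. The paper's actual weighting differs.
    """
    ctx = context.lower()
    return {
        e: ctx.count(e.lower()) * (2.0 if e.lower() in query.lower() else 1.0)
        for e in entities
    }


def selector(weights: dict[str, float], context: str) -> str:
    """Keep entities above a dynamic threshold and highlight their sentences."""
    if not weights:
        return context
    vals = list(weights.values())
    # Dynamic threshold: mean + one standard deviation of the observed weights.
    threshold = mean(vals) + (stdev(vals) if len(vals) > 1 else 0.0)
    keep = {e for e, w in weights.items() if w >= threshold}
    sentences = re.split(r"(?<=[.!?])\s+", context)
    return " ".join(
        f"**{s}**" if any(e.lower() in s.lower() for e in keep) else s
        for s in sentences
    )


if __name__ == "__main__":
    query = "Where did Alan Turing work during the war?"
    context = (
        "Alan Turing worked at Bletchley Park. Alan Turing helped break the "
        "Enigma cipher. The weather in 1940 was unremarkable."
    )
    print(selector(scorer(recaller(context), context, query), context))
```

On this toy example, only the sentences mentioning the highest-weight entity get highlighted; the sketch highlights at sentence granularity only, whereas COFT additionally varies the highlighting level (paragraph, sentence, or word) in its coarse-to-fine scheme.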

Cite this Paper
BibTeX
@InProceedings{pmlr-v235-lv24c,
  title     = {Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models},
  author    = {Lv, Qitan and Wang, Jie and Chen, Hanzhu and Li, Bin and Zhang, Yongdong and Wu, Feng},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {33594--33623},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/lv24c/lv24c.pdf},
  url       = {https://proceedings.mlr.press/v235/lv24c.html},
  abstract  = {Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language model (RALM)—which enhances models with up-to-date knowledge—emerges as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when retrieving lengthy contexts. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method to focus on different granularity-level key texts, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: recaller, scorer, and selector. First, recaller applies a knowledge graph to extract potential key entities in a given context. Second, scorer measures the importance of each entity by calculating its contextual weight. Finally, selector selects high contextual weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on knowledge hallucination benchmark demonstrate the effectiveness of COFT, leading to a superior performance over 30% in F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering.}
}
Endnote
%0 Conference Paper
%T Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
%A Qitan Lv
%A Jie Wang
%A Hanzhu Chen
%A Bin Li
%A Yongdong Zhang
%A Feng Wu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-lv24c
%I PMLR
%P 33594--33623
%U https://proceedings.mlr.press/v235/lv24c.html
%V 235
%X Generation of plausible but incorrect factual information, often termed hallucination, has attracted significant research interest. Retrieval-augmented language model (RALM)—which enhances models with up-to-date knowledge—emerges as a promising method to reduce hallucination. However, existing RALMs may instead exacerbate hallucination when retrieving lengthy contexts. To address this challenge, we propose COFT, a novel COarse-to-Fine highlighTing method to focus on different granularity-level key texts, thereby avoiding getting lost in lengthy contexts. Specifically, COFT consists of three components: recaller, scorer, and selector. First, recaller applies a knowledge graph to extract potential key entities in a given context. Second, scorer measures the importance of each entity by calculating its contextual weight. Finally, selector selects high contextual weight entities with a dynamic threshold algorithm and highlights the corresponding paragraphs, sentences, or words in a coarse-to-fine manner. Extensive experiments on knowledge hallucination benchmark demonstrate the effectiveness of COFT, leading to a superior performance over 30% in F1 score metric. Moreover, COFT also exhibits remarkable versatility across various long-form tasks, such as reading comprehension and question answering.
APA
Lv, Q., Wang, J., Chen, H., Li, B., Zhang, Y. & Wu, F. (2024). Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:33594-33623. Available from https://proceedings.mlr.press/v235/lv24c.html.
