Open-Domain Question Answering with Context
About the Evaluation Set
Introduction
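When a large language model (LLM) uses retrieval-augmented generation (RAG) for open-domain question answering, the retrieved external contextual knowledge and the parametric knowledge stored inside the LLM are sometimes inconsistent. This inconsistency is defined as a knowledge conflict. The main causes of knowledge conflicts are: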
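- Temporal misalignment: the LLM's parametric knowledge is not updated as world knowledge evolves, so the LLM is unaware of the latest facts, for example when asked who won this year's Nobel Prize;
- Misinformation pollution: the retrieved contextual knowledge contains errors that mislead the LLM. For example, if the context states that Obama was born in New York (the correct answer is Hawaii), the LLM may give a wrong answer because it relied on the context.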
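These two kinds of conflict often occur together in RAG scenarios, but resolving the first requires the LLM to attend to external knowledge, while resolving the second requires it to attend to its internal knowledge, which makes the task difficult. Southeast University, together with the OpenKG open knowledge graph community, therefore proposed a knowledge conflict evaluation task for open-domain question answering. The task is close to real-world scenarios: the dataset covers a variety of knowledge-conflict and non-conflict cases, and the LLM must use both its parametric knowledge and the contextual knowledge to answer the questions.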
Meta Data
{
  "question": "Original question.",
  "answers": "The answer list.",
  "document": "External Document",
  "conflict_type": "The annotation of conflict type"
}
Example
{
  "question": "What is the derivative work of iOS 17?",
  "answers": [
    "iPadOS 17",
    "watchOS 10",
    "macOS Sonoma"
  ],
  "document": "iOS 17 is the seventeenth and current major release of Apple's iOS operating system for the iPhone. It is the direct successor to iOS 16, which was released one year earlier. It was announced on June 5, 2023, at Apple's annual Worldwide Developers Conference alongside watchOS 10, iPadOS 17, and macOS Sonoma. It was made publicly available on September 18, 2023, as a free software update for supported iOS devices (see the supported devices section). iOS 17 has received security and bug-fix updates multiple times a month, and feature updates every few months. Beta builds are sent weekly or biweekly to members of the Apple Developer Program and public beta testers. As with every release since iOS 4, these updates are free to users.",
  "conflict_type": [
    "temporal_misalignment",
    "exist_over_time",
    2023
  ]
}
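A record in this layout can be read with a few lines of Python. The following is a minimal sketch, not the task's official loader: the `Sample` class, the `parse_sample` helper, and the shortened `document` text are hypothetical names and values introduced here for illustration; only the four field names come from the Meta Data description.

```python
import json
from dataclasses import dataclass
from typing import Any, List

# Field names taken from the Meta Data description.
REQUIRED_FIELDS = ("question", "answers", "document", "conflict_type")


@dataclass
class Sample:
    """Hypothetical container for one evaluation record."""
    question: str
    answers: List[str]
    document: str
    conflict_type: List[Any]  # in the example: [type, subtype, year]


def parse_sample(raw: str) -> Sample:
    """Parse one JSON record and check that every expected field is present."""
    record = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Sample(**{name: record[name] for name in REQUIRED_FIELDS})


# Shortened copy of the example record (document text truncated for brevity).
raw = json.dumps({
    "question": "What is the derivative work of iOS 17?",
    "answers": ["iPadOS 17", "watchOS 10", "macOS Sonoma"],
    "document": "iOS 17 is the seventeenth and current major release of iOS.",
    "conflict_type": ["temporal_misalignment", "exist_over_time", 2023],
})
sample = parse_sample(raw)
print(sample.question, sample.conflict_type[0])
```

Rejecting records with missing fields early keeps later scoring code from failing on a partially filled sample.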
Metric
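For each sample, this task evaluates the predicted answer with Exact Match (EM), a metric commonly used in open-domain question answering. The higher the EM score, the more accurate the predictions.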
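The task description does not spell out how EM is computed, so the following is only a sketch under stated assumptions: it applies the normalization commonly used in open-domain QA scoring (lowercasing, removing ASCII punctuation and English articles, collapsing whitespace) and counts a prediction as correct when it equals any one of the gold answers. The function names `normalize_answer`, `exact_match`, and `em_score` are hypothetical, and the official scorer may normalize or match differently.

```python
import re
import string
from typing import List, Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation and English articles, squeeze spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> int:
    """Return 1 if the normalized prediction equals any normalized gold answer.

    Assumption: matching a single gold answer is enough to score the sample.
    """
    pred = normalize_answer(prediction)
    return int(any(pred == normalize_answer(gold) for gold in gold_answers))


def em_score(predictions: Sequence[str], references: Sequence[List[str]]) -> float:
    """Average exact match over all samples (0.0 for an empty input)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return hits / len(predictions)


gold = ["iPadOS 17", "watchOS 10", "macOS Sonoma"]
print(exact_match("The iPadOS 17.", gold))
print(em_score(["iPadOS 17", "Hawaii is"], [gold, ["Hawaii"]]))
```

Under this strict variant, a prediction must reduce to exactly one gold answer after normalization, so a full-sentence answer would not count as a match.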
Submission Format
{
  "predict_answer": "According to the document, the derivative work of iOS 17 is iPadOS 17, watchOS 10, macOS Sonoma."
}
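Objects in this shape can be produced with Python's standard `json` module. The sketch below is illustrative only: the helper name `to_submission_lines` is hypothetical, and writing one object per line is an assumption, since the task description shows a single object and does not say how multiple predictions are packaged.

```python
import json
from typing import Iterable, List


def to_submission_lines(answers: Iterable[str]) -> List[str]:
    """Serialize each predicted answer as one {"predict_answer": ...} object.

    ensure_ascii=False keeps non-ASCII answers readable in the output.
    """
    return [json.dumps({"predict_answer": answer}, ensure_ascii=False)
            for answer in answers]


lines = to_submission_lines([
    "According to the document, the derivative work of iOS 17 is "
    "iPadOS 17, watchOS 10, macOS Sonoma.",
    "Hawaii",
])
for line in lines:
    print(line)
```

Letting `json.dumps` handle quoting and escaping avoids hand-written strings with unbalanced quotes.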
