HybridQA
Introduction
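HybridQA is a large-scale question answering dataset that requires reasoning over heterogeneous information. Each question is associated with a Wikipedia table and multiple free-form text corpora linked to the entities in that table. The questions are designed to combine tabular and textual information: if either form is missing, the question cannot be answered. This makes HybridQA a challenging dataset, because a model must understand both the table and the text linked to it, and combine the two to produce an accurate answer.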
Meta Data
{
"id": A string representing the question id,
"question": A string representing the question,
"source": A string representing the source data,
"answer": A list of strings representing the answers // The test file does not contain this
}
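The field list above can be checked mechanically before training or evaluation. The following is a minimal sketch, not part of the dataset tooling: the function name `validate_record` and the `has_answer` flag are hypothetical, and the check assumes each record has already been parsed into a Python dict.

```python
from typing import Any, Dict, List

# Fields that every record is expected to carry, per the Meta Data description.
REQUIRED_FIELDS = {"id": str, "question": str, "source": str}


def validate_record(record: Dict[str, Any], has_answer: bool = True) -> List[str]:
    """Return a list of problems found in one record (empty if none were found).

    `has_answer` is a hypothetical flag: pass False for test-file records,
    which do not contain the "answer" field.
    """
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            problems.append(f"field {name} should be {expected.__name__}")
    if has_answer:
        answer = record.get("answer")
        if not isinstance(answer, list) or not all(isinstance(a, str) for a in answer):
            problems.append("field answer should be a list of strings")
    return problems
```

Calling `validate_record(record, has_answer=False)` skips the answer check, which matches the note that the test file omits that field.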
Example
{
"id": "00009b9649d0dd0a",
"question": "Who were the builders of the mosque in Herat with fire temples ?",
"source": "List_of_mosques_in_Afghanistan_0",
"answer": [
"Ghurids"
]
}
Submission Format
[
{
"WebQSP": [
{"q001": ["answer1", "answer2", …]},
{"q002": ["answer1", "answer2", …]}
]
},
{
"WTQ": [
{"q0000001": ["answer1", "answer2", …]},
{"q0000002": ["answer1", "answer2", …]}
]
}
]
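The nested layout above (a list of single-key objects per dataset, each holding a list of single-key objects per question) can be produced from a plain mapping of predictions. This is a rough sketch under assumptions: `build_submission` is a hypothetical helper, and the dataset names and question ids are the placeholders from the template above, not real entries.

```python
import json
from typing import Dict, List


def build_submission(predictions: Dict[str, Dict[str, List[str]]]) -> list:
    """Convert {dataset: {question_id: [answers]}} into the nested list layout."""
    return [
        {dataset: [{qid: answers} for qid, answers in by_question.items()]}
        for dataset, by_question in predictions.items()
    ]


# Placeholder predictions that mirror the template; replace with real model output.
predictions = {
    "WebQSP": {"q001": ["answer1", "answer2"]},
    "WTQ": {"q0000001": ["answer1"]},
}
submission = build_submission(predictions)
print(json.dumps(submission, ensure_ascii=False, indent=2))
```

Keeping predictions in a dict keyed by question id until the last step makes duplicate ids impossible and leaves the list-of-objects shape to a single conversion function.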
