RAG Datasets and Evaluation Criteria
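1. Use an LLM to generate questions and answers
2. Use an annotated dataset

1. LightRAG

Dataset

LightRAG uses the benchmark proposed by MemoRAG.

UltraDomain contains data from multiple domains, and each domain comprises multiple books. Taking cs as an example, it contains 100 books and 100 corresponding questions. This domain focuses on computer science and covers key areas of data science and software engineering. It places particular emphasis on machine learning and big-data processing, with content on recommender systems, classification algorithms, and real-time analytics with Spark.

An example record:

{
  input: How does Spark Streaming enable real-time data processing?
  answers: ['Spark Streaming extends ...... ']
  context: "Whole Book......"
  length: 131651
  context_id: 7bcef8714a477fd61fc8fb0d499b2cc3
  _id: b2fd8d9c6d1499d521d778ce3d6d06fa
  label: cs
  meta: {'title': 'Machine Learning With Spark', 'authors': 'Nick Pentreath'}
}

Dataset location: TommyChien/UltraDomain on Hugging Face

Question generation

The question-generation method comes from "From Local to Global: A Graph RAG Approach to Query-Focused Summarization". The LLM is given the text and asked to generate K user personas who would use the dataset (for example, if the dataset is financial news, a user might be a financial journalist tracking financial market trends). For each user it then generates N tasks, and for each user-task pair it poses M high-level questions, that is, questions that require understanding the whole dataset rather than extracting specific facts.

User: A tech journalist looking for insights and trends in the tech industry
Task: Understanding how tech leaders view the role of policy and regulation
Questions:
1. Which episodes deal primarily with tech policy and government regulation?
2. How do guests perceive the impact of privacy laws on technology development?
3. Do any guests discuss the balance between innovation and ethical considerations?
4. What are the suggested changes to current policies mentioned by the guests?
5. Are collaborations between tech companies and governments discussed and how?

Evaluation criteria

No gold-standard answers are used; an LLM performs the evaluation. The criteria include ...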