How to Build a Terrible RAG System: A Checklist of the Mistakes Developers Make Most Often
Original article: link
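This article is a lot of fun!
Titled "How to build a terrible RAG system", it offers a series of tongue-in-cheek reverse recommendations that walk you, step by step, toward building a terrible RAG system 🤣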
1. Ignore latency / Don't worry about latency
2. Hide intermediate results / Don't show intermediate results
3. Don't show source documents / Don't Show Them the Source Document
4. Ignore user churn / We Should Not Worry About Churn
5. Use a generic search index / We Should Use a Generic Search Index
6. Don't build a custom UI / We Should Not Develop Custom UI
7. Don't fine-tune embeddings / We Should Not Fine-Tune Our Embeddings
8. Train a large language model (LLM) / We Should Train an LLM
9. Don't manually curate the inventory / We Should Not Manually Curate Our Inventory
10. Don't analyze user queries / We Should Not Analyze Inbound Queries
11. Keep machine learning engineers out of data ingestion / Machine Learning Engineers Should Not Be Involved in Ingestion
12. Use a knowledge graph / We Should Use a Knowledge Graph
13. Treat all inventory the same / We should treat all inbound inventory the same
14. Build no special ingestion pipelines / We should not have to build special ingestion pipelines
15. Never ask data providers for clean data / We should never have to ask the data provider for clean data
16. Don't cluster the inventory / We should never have to cluster our inventory
17. Focus on local evals instead of A/B tests / We should focus on local evals and not A/B tests
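
To make a couple of these concrete: flipping items 2 and 3 around means returning the intermediate retrieval results and the source documents together with the answer, so users can see where a response came from. Below is a minimal sketch of that idea. It is not taken from the original article: the token-overlap scoring is a toy stand-in for a real retriever, and every name in it (`Hit`, `retrieve`, `answer_with_sources`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    text: str
    score: float


def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[Hit]:
    """Rank documents by Jaccard overlap of lowercase tokens (toy scoring, an assumption)."""
    q = set(query.lower().split())
    hits = []
    for doc_id, text in docs.items():
        d = set(text.lower().split())
        union = q | d
        score = len(q & d) / len(union) if union else 0.0
        hits.append(Hit(doc_id, text, score))
    # Highest score first; ties keep their original order.
    hits.sort(key=lambda h: h.score, reverse=True)
    return hits[:k]


def answer_with_sources(query: str, docs: dict[str, str], k: int = 2) -> dict:
    """Return the intermediate ranking and the matching source texts, not just an answer."""
    hits = retrieve(query, docs, k)
    return {
        "query": query,
        # Intermediate result: which documents were considered and how they scored.
        "retrieved": [(h.doc_id, round(h.score, 3)) for h in hits],
        # Source documents: only those that actually matched the query.
        "sources": [h.text for h in hits if h.score > 0],
    }


docs = {
    "a": "latency matters for search",
    "b": "fine tune your embeddings",
    "c": "cats and dogs",
}
result = answer_with_sources("fine tune embeddings", docs)
print(result["retrieved"])
print(result["sources"])
```

In a real system the `retrieved` list would come from the actual search index, and the UI would render `sources` next to the generated answer.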
