Notes on RAG from the creator of agentset-ai/agentset (GitHub), an open-source platform to build, evaluate, and ship production-ready RAG and agentic applications. It provides end-to-end tooling: ingestion, vector indexing, evaluation/benchmarks, chat playground, hosting, and a clean API with first-class developer experience.
What made the difference
- Query Generation: have an LLM review the thread and generate a number of semantic and keyword queries. We ran all of those queries in parallel and passed the combined results to a reranker. This let us cover a larger surface area without depending on a computed score for hybrid search.
- Reranking: “the highest value 5 lines of code you’ll add. The chunk ranking shifted a lot. More than you’d expect. Reranking can many times make up for a bad setup if you pass in enough chunks.”
- Chunking Strategy: this takes a lot of effort; you’ll probably spend most of your time on it.
- Metadata to LLM: injecting metadata (title, author, etc.) alongside each chunk improves context and answers by a lot.
- Query Routing: many users asked questions that RAG can’t answer (e.g. “summarize the article”, “who wrote this”). We built a small router that detects these questions and answers them with an API call + LLM instead of the full-blown RAG setup.
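
The points above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: `Chunk`, `route`, `retrieve`, `format_context`, and the toy `toy_search` / `toy_rerank` functions are all hypothetical names. In a real setup, `search` would call the vector database, `rerank` would call a reranking model, and the router would more likely be an LLM or classifier than the keyword check used here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Chunk:
    id: str
    text: str
    title: str = ""
    author: str = ""


# Hypothetical stand-ins for the real search backend and reranker.
SearchFn = Callable[[str], list[Chunk]]
RerankFn = Callable[[str, list[Chunk]], list[float]]


def route(question: str) -> str:
    """Toy keyword router (assumption): 'metadata' for whole-document questions, else 'rag'."""
    q = question.lower()
    hints = ("summarize", "summarise", "who wrote")
    return "metadata" if any(h in q for h in hints) else "rag"


def retrieve(question: str, queries: list[str], search: SearchFn,
             rerank: RerankFn, top_k: int = 5) -> list[Chunk]:
    """Run every generated query in parallel, merge the hits, and let the reranker order them."""
    with ThreadPoolExecutor(max_workers=max(1, len(queries))) as pool:
        result_lists = list(pool.map(search, queries))

    # Deduplicate by chunk id, keeping first-seen order.
    merged: dict[str, Chunk] = {}
    for results in result_lists:
        for chunk in results:
            merged.setdefault(chunk.id, chunk)
    candidates = list(merged.values())
    if not candidates:
        return []

    # The reranker scores every candidate against the original question,
    # so no per-query hybrid score has to be computed or compared.
    scores = rerank(question, candidates)
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]


def format_context(chunks: list[Chunk]) -> str:
    """Prefix each chunk with its metadata before it goes into the LLM prompt."""
    return "\n\n".join(
        f"[title: {c.title} | author: {c.author}]\n{c.text}" for c in chunks
    )


# Toy data and toy backends, for illustration only.
CORPUS = [
    Chunk("c1", "Reranking reorders retrieved chunks.", "RAG Notes", "A. Author"),
    Chunk("c2", "Chunking splits documents into passages.", "RAG Notes", "A. Author"),
    Chunk("c3", "Vector databases store embeddings.", "Infra Guide", "B. Writer"),
]


def toy_search(query: str) -> list[Chunk]:
    words = query.lower().split()
    return [c for c in CORPUS if any(w in c.text.lower() for w in words)]


def toy_rerank(question: str, chunks: list[Chunk]) -> list[float]:
    q_words = set(question.lower().split())
    return [float(len(q_words & set(c.text.lower().rstrip(".").split()))) for c in chunks]
```

A caller would first check `route(question)`; only when it returns `"rag"` would it generate queries, call `retrieve(...)`, and pass `format_context(...)` to the LLM. Passing more candidates to the reranker than you finally keep (`top_k`) is what lets reranking compensate for a weaker retrieval step.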
Stack
- turbopuffer for the vector database
- UnstructuredIO for chunking
- Zerank by ZeroEntropy for reranking
- GPT-4.1