RAG / Context Engineering Complete Guide

Combine LLMs with retrieval to produce grounded answers and leverage your own knowledge base.
Long context, MCP, Agentic Search, and code exploration now sit beside RAG, so the design question is what to retrieve, what to place in context, and what to expose as tools.

What this hub covers

Before you dive into papers, OSS, or implementation posts, this hub frames RAG, long context, MCP, and Agentic Search as different ways to provide evidence to a model. The focus is not only on what each technology is, but also on when to choose it.


What Is RAG?

RAG, or Retrieval-Augmented Generation, is a design pattern where the system retrieves relevant documents or data at answer time and passes them to the model as context. The core design question goes beyond whether to use a vector database: it also covers retrieval scope, permissions, freshness, evidence display, and evaluation.
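To make the retrieve-then-pass-as-context pattern concrete, here is a minimal sketch in Python. It assumes a tiny in-memory document list and a bag-of-words cosine score standing in for a real retriever. The names `DOCS`, `retrieve`, and `build_prompt` are hypothetical, and the call to an actual model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory knowledge base; a real system would use an index.
DOCS = [
    {"id": "hr-1", "text": "Employees accrue vacation days monthly."},
    {"id": "it-1", "text": "Reset your VPN password in the IT portal."},
    {"id": "hr-2", "text": "Unused vacation days expire at year end."},
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, docs: list[dict], k: int = 2) -> list[dict]:
    """Return the k documents most similar to the query (toy scoring)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda d: cosine(q, Counter(tokenize(d["text"]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, evidence: list[dict]) -> str:
    """Place retrieved evidence in the context, tagged with ids for citation."""
    lines = [f"[{d['id']}] {d['text']}" for d in evidence]
    return (
        "Answer using only the evidence below and cite the ids.\n\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    question = "How do vacation days expire?"
    print(build_prompt(question, retrieve(question, DOCS)))
```

Tagging each evidence line with its id keeps evidence display possible later, since the answer can point back to the documents it used.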

Start Here

Design Questions

Lens | Question
Retrieval scope | Which documents, databases, or code should the system reference?
Context | At what granularity should retrieved evidence be passed to the model?
Permissions | How should access differ by user or role?
Freshness | How often should documents or indexes be updated?
Evaluation | How should retrieval quality and answer quality be measured separately?
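Two of these questions, permissions and evaluation, can be sketched in a few lines of Python. The example below is one possible approach, not a prescribed one: it assumes each document carries a hypothetical `roles` list, filters documents before retrieval, and scores retrieval (recall at k) separately from a rough answer-side proxy (whether every cited id was actually retrieved). All function and field names are illustrative.

```python
def filter_by_permission(docs: list[dict], user_roles: set[str]) -> list[dict]:
    """Keep only documents that share at least one role with the user."""
    return [d for d in docs if set(d["roles"]) & user_roles]


def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Retrieval quality: share of relevant documents found in the top k."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def citation_support(cited_ids: list[str], retrieved_ids: list[str]) -> float:
    """Answer-side proxy: share of cited ids that were actually retrieved."""
    if not cited_ids:
        return 0.0
    supported = sum(1 for c in cited_ids if c in retrieved_ids)
    return supported / len(cited_ids)


if __name__ == "__main__":
    # Hypothetical documents with role-based access labels.
    docs = [
        {"id": "hr-1", "roles": ["hr", "manager"]},
        {"id": "fin-1", "roles": ["finance"]},
        {"id": "pub-1", "roles": ["hr", "finance", "engineer"]},
    ]
    visible = filter_by_permission(docs, {"engineer"})
    print([d["id"] for d in visible])

    retrieved = ["pub-1", "hr-1", "fin-1"]
    print(recall_at_k(retrieved, {"pub-1", "fin-1"}, k=2))
    print(citation_support(["pub-1", "zzz-9"], retrieved))
```

Keeping the two scores apart helps locate a failure: low recall points at the retriever or index, while low citation support with good recall points at how the model uses the evidence.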

Implementation Archive

These are older, implementation-oriented articles that assume SageMaker. Read the design articles above before using them.