LlamaIndex Primer
Build RAG pipelines that ingest, index, and query your data with production-grade retrieval.
What is LlamaIndex?
LlamaIndex is a data framework for LLM applications. It connects your custom data sources — PDFs, SQL databases, APIs, Notion pages — to large language models through a unified query interface. Think of it as the indexing and retrieval layer that sits between raw documents and your LLM.
Core Concepts
- Documents — raw data objects (text, tables, images) that enter the pipeline.
- Nodes — chunked, metadata-enriched pieces of a Document. The atomic unit of retrieval.
- Index — a data structure over Nodes enabling fast retrieval (vector, keyword, tree, graph).
- Query Engine — the end-to-end pipeline that retrieves Nodes and synthesizes an answer via the LLM.
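To make the four terms concrete, here is a minimal sketch of how they relate, written in plain Python with only the standard library. It is not LlamaIndex's implementation or API: the `Document`, `Node`, `KeywordIndex`, and `query_engine` names are hypothetical stand-ins, the index is a toy keyword lookup rather than a vector store, and the "synthesis" step just concatenates the retrieved text where a real query engine would prompt an LLM.

```python
import re
from dataclasses import dataclass, field


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"\w+", text.lower())


@dataclass
class Document:
    """Raw text plus metadata, as it enters the pipeline."""
    text: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Node:
    """One chunk of a Document, carrying a copy of the parent's metadata."""
    text: str
    metadata: dict


def split_into_nodes(doc: Document, chunk_size: int = 8) -> list[Node]:
    """Chunk a Document into Nodes of at most `chunk_size` words."""
    words = doc.text.split()
    return [
        Node(" ".join(words[i:i + chunk_size]), dict(doc.metadata))
        for i in range(0, len(words), chunk_size)
    ]


class KeywordIndex:
    """Toy inverted index: word -> positions of the Nodes that contain it."""

    def __init__(self, nodes: list[Node]):
        self.nodes = nodes
        self.postings: dict[str, set[int]] = {}
        for pos, node in enumerate(nodes):
            for word in tokenize(node.text):
                self.postings.setdefault(word, set()).add(pos)

    def retrieve(self, query: str, top_k: int = 2) -> list[Node]:
        """Return up to `top_k` Nodes, ranked by distinct query words matched."""
        hits: dict[int, int] = {}
        for word in set(tokenize(query)):
            for pos in self.postings.get(word, ()):
                hits[pos] = hits.get(pos, 0) + 1
        ranked = sorted(hits, key=lambda pos: (-hits[pos], pos))
        return [self.nodes[pos] for pos in ranked[:top_k]]


def query_engine(index: KeywordIndex, question: str) -> str:
    """Retrieve Nodes, then 'synthesize'; a real engine would call an LLM here."""
    context = " | ".join(node.text for node in index.retrieve(question))
    return f"Q: {question}\nContext: {context}"


doc = Document(
    "LlamaIndex connects data to language models. "
    "Nodes are chunks of documents. An index enables fast retrieval.",
    metadata={"source": "notes.txt"},  # hypothetical file name
)
nodes = split_into_nodes(doc)
index = KeywordIndex(nodes)
print(query_engine(index, "fast retrieval"))
```

The point of the sketch is the division of labour: chunking happens once at ingest time, the index only ever sees Nodes, and the query engine is the only piece that touches both retrieval and answer generation.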
Quick Start Pattern
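# Assumes `pip install llama-index`. The defaults call OpenAI for embeddings
# and generation, so OPENAI_API_KEY must be set in the environment.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader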
# 1. Load documents
docs = SimpleDirectoryReader("./data").load_data()
# 2. Build index
index = VectorStoreIndex.from_documents(docs)
# 3. Query
engine = index.as_query_engine()
response = engine.query("What is the summary?")
print(response)
When to Use LlamaIndex
Choose LlamaIndex when you need retrieval over heterogeneous data: multi-document Q&A, chat over docs, structured data extraction, or agent-driven tool use. It is strongest when the indexing strategy goes beyond a single vector lookup (recursive retrieval, hybrid search), and it offers integrations for LangChain, OpenAI, and local models via Ollama.
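As an illustration of what "hybrid search" means, the sketch below fuses a keyword ranking and a similarity ranking with reciprocal rank fusion (RRF). This is one common fusion technique, not necessarily what LlamaIndex does internally. Everything here is an assumption made for the example: the function names are hypothetical, character-trigram cosine similarity stands in for a real embedding model, and `k=60` is simply the conventional RRF constant.

```python
import math
import re
from collections import Counter


def keyword_rank(query: str, chunks: list[str]) -> list[int]:
    """Rank chunk positions by the number of distinct words shared with the query."""
    q_words = set(re.findall(r"\w+", query.lower()))
    scores = [len(q_words & set(re.findall(r"\w+", c.lower()))) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: (-scores[i], i))


def trigram_vector(text: str) -> Counter:
    """Bag of character trigrams; a crude stand-in for an embedding."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_rank(query: str, chunks: list[str]) -> list[int]:
    """Rank chunk positions by trigram cosine similarity to the query."""
    q_vec = trigram_vector(query)
    scores = [cosine(q_vec, trigram_vector(c)) for c in chunks]
    return sorted(range(len(chunks)), key=lambda i: (-scores[i], i))


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Fuse rankings: each item scores sum(1 / (k + rank)), with rank starting at 1."""
    fused: dict[int, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            fused[item] = fused.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=lambda item: (-fused[item], item))


def hybrid_search(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Combine the keyword and similarity rankings, then return the top chunks."""
    order = reciprocal_rank_fusion(
        [keyword_rank(query, chunks), vector_rank(query, chunks)]
    )
    return [chunks[i] for i in order[:top_k]]


# Hypothetical sample chunks, invented for this example.
chunks = [
    "Invoices are due within 30 days.",
    "The refund policy covers 30 days.",
    "Office hours are 9 to 5.",
]
print(hybrid_search("refund policy", chunks))
```

RRF works on ranks rather than raw scores, so the two retrievers do not need comparable score scales. That is the main reason it is a popular default for combining sparse and dense results.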
Next step: Advanced indexing strategies — recursive retrieval, sub-question decomposition, and agentic RAG patterns.