Build LlamaIndex RAG System

ai_agents
Python
architecture
mentor

Create a production-ready RAG system with vector indexing, chat interface, and persistent storage.

12/8/2025

Prompt

Build a production-ready LlamaIndex RAG system for [PROJECT_NAME] with the following specifications:

Requirements

  • Project name: [PROJECT_NAME]
  • Document sources: [DIRECTORY_PATH or FILE_PATHS]
  • Document types: [PDF/TXT/DOCX/MD]
  • LLM provider: [OpenAI/Anthropic/Local]
  • Model name: [gpt-4/claude-3/llama-2]
  • Embedding model: [text-embedding-ada-002/custom]
  • Vector store: [Default/Pinecone/Weaviate/Chroma]
  • Metadata fields: [FIELD_1], [FIELD_2], [FIELD_3]
  • Query modes: [Simple Q&A/Chat/Streaming]
  • Persistence needed: [YES/NO]
  • Retrieval strategy: [Similarity/MMR/Rerank]

Deliverables

Generate a complete LlamaIndex RAG system with:

Document loading and processing:

  • SimpleDirectoryReader or custom loader for [DOCUMENT_TYPES]
  • Document metadata extraction for [METADATA_FIELDS]
  • Text splitter configuration for optimal chunk sizes
  • Document preprocessing and cleaning

Vector index creation:

  • VectorStoreIndex setup with [VECTOR_STORE]
  • Custom Settings configuration (formerly ServiceContext) with [LLM_PROVIDER] and [MODEL_NAME]
  • Embedding configuration for [EMBEDDING_MODEL]
  • Index persistence to [STORAGE_PATH] if needed

Query engine setup:

  • Query engine with [RETRIEVAL_STRATEGY]
  • Response synthesis mode configuration
  • Custom prompts for [PROJECT_NAME] use case
  • Similarity threshold and top-k configuration
  • Metadata filtering for [METADATA_FIELDS]

Chat engine if needed:

  • Conversational interface with context retention
  • Chat mode selection (condense_question/context/condense_plus_context)
  • Chat history management
  • Streaming responses if required

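The conversational interface can be sketched like this; the chat mode and the example question are placeholders:

```python
# Sketch of a chat engine with built-in history and optional token streaming.
def make_chat_engine(index):
    # condense_question rewrites each turn into a standalone query using the
    # chat history; "context" and "condense_plus_context" are alternatives.
    return index.as_chat_engine(chat_mode="condense_question")


# Usage (illustrative):
#   engine = make_chat_engine(index)
#   response = engine.chat("What does the onboarding doc say?")
#   streamed = engine.stream_chat("Summarize it.")   # streaming variant
#   for token in streamed.response_gen:
#       print(token, end="")
#   engine.reset()  # clear chat history
```
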
Advanced features:

  • Custom prompt templates for domain-specific responses
  • Response evaluation and feedback loop
  • Query transformation for better retrieval
  • Sub-question query engine for complex questions

Storage and loading:

  • Index persistence implementation
  • Loading from storage for fast startup
  • Incremental index updates

Output a complete, production-ready RAG application with all configuration and usage examples.

Tags

llamaindex
rag
retrieval
ai

Tested Models

gpt-4-turbo
gpt-4
