// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

RAG

An AI technique in which a model first searches a knowledge base for relevant information and then uses that information to generate a more accurate and better-informed response.

TECHNICAL DEFINITION

Retrieval-Augmented Generation (RAG) is an architectural pattern for LLMs that first retrieves relevant documents or data snippets from an external knowledge base (e.g., a vector database) and then conditions the LLM's response on that retrieved context. Grounding the output in retrieved sources helps mitigate hallucination.
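
The retrieve-then-generate flow can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three-sentence knowledge base, the bag-of-words embed() function standing in for a real embedding model, and the helper names retrieve() and build_prompt(). A production system would typically use a learned embedding model, a vector database, and an actual LLM call on the resulting prompt.

```python
import math
import re
from collections import Counter

# Toy knowledge base (illustrative data only); real systems would store
# document chunks and their embeddings in a vector database.
DOCS = [
    "The Eiffel Tower is located in Paris.",
    "Python is a programming language created by Guido van Rossum.",
    "Mount Everest is the highest mountain above sea level.",
]

def embed(text):
    """Stand-in embedding: a bag-of-words count vector, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    """Retrieval step: return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, context):
    """Augmentation step: place the retrieved snippets ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )

query = "Who created Python?"
context = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, context)
# The generation step would pass `prompt` to an LLM; that call is omitted here.
print(prompt)
```

The sketch stops before generation on purpose: the part that distinguishes RAG from plain prompting is that the prompt is assembled from retrieved context at query time.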

BACKGROUND

Prompt engineering is the process of structuring natural language inputs to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens.

SYNONYMS & ALIASES

  • Retrieval-Augmented LLM
  • Grounded Generation
  • External Knowledge Integration

USAGE NOTE

RAG is widely used to provide LLMs with up-to-date, domain-specific, and factual information beyond their training data.

DEVELOPERS

Organizations developing technology related to RAG.

  • LlamaIndex

    A data framework for LLM applications that specializes in connecting custom data sources to large language models, making it a primary tool for building RAG systems.

  • LangChain

    An open-source framework for developing applications powered by language models, providing extensive modules and integrations for building and orchestrating RAG pipelines.

  • Pinecone

    A vector database that provides fast, scalable similarity search over embedding vectors, a critical component of the retrieval step in RAG systems.

  • Weaviate

    An open-source vector database that allows developers to store data objects and vector embeddings for efficient search and retrieval, making it ideal for RAG applications.

  • Hugging Face

    Provides a platform, libraries (like Transformers and Datasets), and pre-trained models that are widely used to build and experiment with various components of RAG systems, including embedding models and language models.

  • Cohere

    Offers language models and embedding models that are used in RAG workflows, supporting both the retrieval of relevant information and the generation of contextually aware responses.

  • Zilliz

    The company behind Milvus, an open-source vector database, offering cloud solutions for vector search that are crucial for scaling the retrieval component of RAG applications.

  • Microsoft (Azure AI)

    Through Azure AI services, Microsoft offers tools like Azure AI Search and Azure OpenAI Service, enabling enterprises to build and deploy RAG solutions by integrating LLMs with their proprietary data.

RELATED TERMS IN PROMPTING & LOGIC