// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM
RAG
An AI technique in which a model first searches a knowledge base for relevant information and then uses that information to generate a more accurate and better-informed response.
TECHNICAL DEFINITION
Retrieval-Augmented Generation (RAG) is an architectural pattern for LLMs that first retrieves relevant documents or data snippets from an external knowledge base (e.g., a vector database) and then conditions the LLM's response generation on that retrieved context. Grounding the response in retrieved material mitigates hallucination.
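The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the knowledge base, the function names (`tokenize`, `score`, `retrieve`, `build_prompt`), and the prompt wording are all hypothetical, and a simple word-overlap score stands in for the embedding similarity a real system would likely use. The resulting prompt would then be passed to whichever LLM the application uses.

```python
import re
from collections import Counter

# Hypothetical toy knowledge base; a real system would query an external store.
KNOWLEDGE_BASE = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Python is a programming language created by Guido van Rossum.",
    "The Great Wall of China is visible across northern China.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Count shared word occurrences (a crude stand-in for embedding similarity)."""
    query_counts = Counter(tokenize(query))
    doc_counts = Counter(tokenize(document))
    return sum((query_counts & doc_counts).values())


def retrieve(query, documents, k=2):
    """Retrieval step: return the k documents that score highest for the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Augmentation step: place the retrieved context ahead of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


# Generation step (not shown): send this prompt to the LLM of your choice.
prompt = build_prompt("When was the Eiffel Tower completed?", KNOWLEDGE_BASE, k=1)
print(prompt)
```

Because the model is instructed to answer from the supplied context, the response can draw on facts that were never part of its training data.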
BACKGROUND
Prompt engineering is the process of structuring natural language inputs to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens.
SYNONYMS & ALIASES
- Retrieval-Augmented LLM
- Grounded Generation
- External Knowledge Integration
USAGE NOTE
RAG is widely used to provide LLMs with up-to-date, domain-specific, and factual information beyond their training data.
DEVELOPERS
Organizations developing technology related to RAG.
A data framework for LLM applications that specializes in connecting custom data sources to large language models, making it a primary tool for building RAG systems.
An open-source framework for developing applications powered by language models, providing extensive modules and integrations for building and orchestrating RAG pipelines.
A leading vector database that provides fast, scalable similarity search for embedding vectors, which is a critical component for the retrieval step in RAG systems.
An open-source vector database that allows developers to store data objects and vector embeddings for efficient search and retrieval, making it ideal for RAG applications.
Provides a platform, libraries (like Transformers and Datasets), and pre-trained models that are widely used to build and experiment with various components of RAG systems, including embedding models and language models.
Offers powerful language models and embedding models that are integral to RAG workflows, enabling both the retrieval of relevant information and the generation of contextually aware responses.
The company behind Milvus, an open-source vector database, offering cloud solutions for vector search that are crucial for scaling the retrieval component of RAG applications.
Through Azure AI services, Microsoft offers tools like Azure AI Search and Azure OpenAI Service, enabling enterprises to build and deploy RAG solutions by integrating LLMs with their proprietary data.
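Several of the entries above describe vector databases that store data objects alongside their embeddings and answer similarity queries. The core idea can be sketched as follows. `InMemoryVectorStore` is a hypothetical class written for illustration and does not reflect the API of any product listed here. The three-dimensional embeddings are made up, and the search is a brute-force cosine-similarity scan, whereas production systems typically rely on approximate nearest-neighbor indexes to scale.

```python
import math


class InMemoryVectorStore:
    """Hypothetical minimal vector store; not the API of any real database."""

    def __init__(self):
        self._items = []  # list of (stored object, unit-length embedding)

    @staticmethod
    def _normalize(vector):
        """Scale a vector to unit length so a dot product equals cosine similarity."""
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot use a zero vector")
        return [x / norm for x in vector]

    def add(self, obj, embedding):
        """Store an object together with its normalized embedding."""
        self._items.append((obj, self._normalize(embedding)))

    def search(self, query_embedding, k=1):
        """Return the k (similarity, object) pairs most similar to the query."""
        query = self._normalize(query_embedding)
        scored = [
            (sum(a * b for a, b in zip(query, vector)), obj)
            for obj, vector in self._items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Made-up embeddings for illustration; a real system would get them from an embedding model.
store = InMemoryVectorStore()
store.add("doc about cats", [1.0, 0.0, 0.0])
store.add("doc about dogs", [0.0, 1.0, 0.0])
store.add("doc about pets", [0.7, 0.7, 0.0])

results = store.search([0.9, 0.1, 0.0], k=2)
for similarity, obj in results:
    print(f"{similarity:.3f}  {obj}")
```

In a RAG pipeline, the objects returned by such a search are the snippets placed into the LLM's prompt as context.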