// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM
One-Hot Encoding
A technique for converting categorical data into a numerical format that machine learning models can process, by creating one new binary column for each category.
TECHNICAL DEFINITION
A feature encoding scheme that transforms a nominal categorical variable with K unique categories into a K-dimensional binary vector. One binary column is created per category; for each observation, the column for its category is set to 1 and all other columns are set to 0.
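The definition above can be illustrated with a minimal sketch in plain Python. The function name `one_hot_encode` and the sample `colors` list are hypothetical, chosen only for illustration; sorting the categories is an assumption made here to get a stable column order, not part of the definition.

```python
def one_hot_encode(values):
    """Return (categories, rows), where each row is a one-hot vector."""
    # Assumption: sorted unique values define the column order.
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)   # start with all columns at 0
        row[column_of[value]] = 1     # mark the column of this category
        rows.append(row)
    return categories, rows


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot_encode(colors)

print(categories)  # ['blue', 'green', 'red']
for value, row in zip(colors, encoded):
    print(value, row)
# red [0, 0, 1]
# green [0, 1, 0]
# blue [1, 0, 0]
# green [0, 1, 0]
```

Every row sums to 1, which is the property that gives the scheme its name. In practice a library routine would normally be used instead of a hand-written loop.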
BACKGROUND
Many machine learning algorithms operate on numeric vectors and cannot consume category labels such as "red" or "Paris" directly. Mapping labels to arbitrary integers (label encoding) implies an order and a spacing between categories that nominal data does not have. One-hot encoding avoids this by giving each category its own binary column, so no category is numerically larger than another. The term comes from digital circuit design, where a one-hot code is a group of bits in which exactly one bit is high (1) and all others are low (0). The cost is dimensionality: a feature with K categories becomes K columns, most of them zero for any given row.
SYNONYMS & ALIASES
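- Dummy encoding (closely related; in strict usage, dummy coding drops one reference category and produces K-1 columns)
- Binary encoding (informal usage only; in strict usage, binary encoding writes each category's index in base 2 and is a different scheme)
- One-of-K encoding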
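These names are sometimes used interchangeably, but in stricter usage they describe different schemes. The sketch below contrasts them in plain Python; the helper names `one_hot`, `dummy`, and `binary_code` are hypothetical, and using the first category as the dropped reference level in `dummy` is an assumption made for illustration.

```python
def one_hot(value, categories):
    # K columns; exactly one of them is 1.
    return [1 if value == category else 0 for category in categories]


def dummy(value, categories):
    # K-1 columns; the first category is the reference and encodes as all zeros.
    return one_hot(value, categories)[1:]


def binary_code(value, categories):
    # Base-2 digits of the category index; about log2(K) columns.
    width = max(1, (len(categories) - 1).bit_length())
    index = categories.index(value)
    return [int(bit) for bit in format(index, f"0{width}b")]


categories = ["blue", "green", "red", "yellow"]

print(one_hot("red", categories))      # [0, 0, 1, 0]
print(dummy("red", categories))        # [0, 1, 0]
print(dummy("blue", categories))       # [0, 0, 0]
print(binary_code("red", categories))  # [1, 0]
```

With four categories, one-hot encoding uses four columns, dummy coding three, and binary encoding two. Only the first two keep each category in a column of its own.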
USAGE NOTE
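One-hot encoding is commonly used to prepare nominal categorical features for algorithms that require numerical input, such as linear models and neural networks. Because each category adds a column, features with many distinct categories produce wide, sparse matrices.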