// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

Shadow Mode

Shadow mode means running a new model alongside the current production model. Users receive only the production model's predictions, while the new model's predictions are recorded for evaluation and never served.

TECHNICAL DEFINITION

Shadow Mode deployment involves running a new machine learning model in parallel with the currently active production model, routing a copy of live inference requests to the new model, but only using the active model's predictions for actual responses.
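The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's API: `ShadowRouter`, its `records` list, and the idea that a model is a plain callable are all hypothetical names chosen for the example. The caller always gets the active model's prediction. The shadow model runs on a background thread, and its failures are logged rather than raised.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Optional, Tuple

logger = logging.getLogger("shadow")

Model = Callable[[Any], Any]  # assumption: a model is any callable taking one request


class ShadowRouter:
    """Hypothetical router: serve from the active model, mirror requests to a shadow model."""

    def __init__(self, active: Model, shadow: Model, max_workers: int = 4) -> None:
        self.active = active
        self.shadow = shadow
        # Each record is (request, active_prediction, shadow_prediction or None on failure).
        self.records: List[Tuple[Any, Any, Optional[Any]]] = []
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def _run_shadow(self, request: Any, active_pred: Any) -> None:
        try:
            shadow_pred = self.shadow(request)
        except Exception:
            # A failing shadow model must never affect the user-facing response.
            logger.exception("shadow model failed")
            shadow_pred = None
        self.records.append((request, active_pred, shadow_pred))

    def predict(self, request: Any) -> Any:
        active_pred = self.active(request)
        # Mirror the request in the background; the result is only recorded.
        self._pool.submit(self._run_shadow, request, active_pred)
        return active_pred

    def close(self) -> None:
        # Wait for in-flight shadow calls so that all records are written.
        self._pool.shutdown(wait=True)
```

In a real system the records would more likely go to a log pipeline or an observability tool than to an in-memory list, and mirroring is often done at the proxy or service-mesh layer rather than in application code.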

BACKGROUND

ChatGPT is a generative artificial intelligence chatbot developed by OpenAI. Originally released in November 2022, the product utilizes large language models—specifically generative pre-trained transformers (GPTs)—to generate text, speech, and images in response to user prompts. ChatGPT accelerated the AI boom, an ongoing period marked by rapid investment and public attention toward the field of artificial intelligence (AI). OpenAI operates the service on a freemium model. Users can interact with ChatGPT through text, audio, and image prompts.

SYNONYMS & ALIASES

  • Dark launch
  • Passive deployment
  • Silent deployment

USAGE NOTE

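Shadow mode lets a team evaluate a new model on real production traffic without affecting what users see. Because the shadow model's outputs are never served, evaluation is limited to what can be logged and compared, such as agreement with the active model, latency, and error rate.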
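One simple way to summarize shadow-mode logs is an agreement rate between the two models. The sketch below assumes the logs have already been reduced to (active prediction, shadow prediction) pairs, with `None` marking a shadow failure. The function name `agreement_rate` and that input shape are illustrative assumptions, not part of any specific tool.

```python
from typing import Any, Iterable, Optional, Tuple


def agreement_rate(pairs: Iterable[Tuple[Any, Optional[Any]]]) -> float:
    """Fraction of requests where the shadow prediction equals the active one.

    Shadow failures (recorded as None) count as disagreements.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no logged predictions to compare")
    matches = sum(1 for active, shadow in pairs if shadow is not None and active == shadow)
    return matches / len(pairs)


# Hypothetical log: two matches, one mismatch, one shadow failure.
logged = [("cat", "cat"), ("dog", "bird"), ("fish", None), ("cow", "cow")]
rate = agreement_rate(logged)
```

Exact equality suits classification labels. For regression scores or generated text, a tolerance or a task-specific similarity measure would be a more reasonable comparison.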

DEVELOPERS

Organizations developing technology related to Shadow Mode.

  • Seldon

    An MLOps company providing tools to deploy, monitor, and manage machine learning models on Kubernetes. Their platform, Seldon Deploy, explicitly supports advanced deployment patterns including shadow deployments, allowing new models to receive real-time production traffic for testing without affecting user-facing systems.

  • Databricks

    A unified data and AI platform whose MLflow component is a standard for managing the machine learning lifecycle. It supports various model deployment strategies, including shadowing (or dark launching), to safely test and validate new models against production data streams.

  • LaunchDarkly

    A leading feature management and experimentation platform. While not exclusively for AI, its core technology is used to implement shadow mode by decoupling code deployment from feature release. This allows teams to route production traffic to a new AI model in the background, log its outputs, and analyze performance without any user impact.

  • Arize AI

    An ML observability platform that provides the critical analysis component for shadow mode deployments. Teams use Arize to monitor a shadow model's predictions, compare its performance and data drift against the live production model, and validate its readiness for promotion.

  • Verta

    An MLOps platform for model management, deployment, and monitoring. Verta provides capabilities for safe and controlled model rollouts, explicitly supporting shadow deployment to test a model's performance on live data before exposing it to users.

  • Langfuse

    An open-source observability and analytics platform for LLM applications. It enables developers to trace and evaluate different versions of prompts or models. This is essential for shadow mode, as it allows for direct comparison of a new prompt's performance against the production version using live traffic.

  • Humanloop

    A platform for prompt engineering and evaluating LLM applications. Humanloop's tooling is designed to A/B test prompts and models on live data. A/B testing is related to shadow testing but not identical: it serves the candidate's output to a share of users, whereas a shadow model's output is never served. Both are used to identify the best-performing version before a full rollout.

  • Split

    A feature delivery and experimentation platform that provides tools for controlled rollouts. Its infrastructure is used to implement shadow testing by sending a copy of production requests to a new model or system and comparing its behavior to the current one, without risk.

RELATED TERMS IN MLOPS & DEPLOYMENT