// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

Performance Monitoring

Performance monitoring tracks how well a deployed model performs its task in production, for example how accurate its outputs are and how quickly it responds.

TECHNICAL DEFINITION

Performance monitoring for ML models continuously tracks predictive-quality metrics (accuracy, precision, recall, F1-score) and operational metrics (latency, throughput) to assess the model's effectiveness and operational efficiency in a production environment.
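As a rough illustration of the metrics named above, the following Python sketch computes them for one monitoring window of a binary classifier. It is a minimal sketch under stated assumptions, not any vendor's implementation: the names `PerformanceReport` and `summarize_window` are hypothetical, labels are assumed to be 0 or 1 with ground truth already available, latency is reported as a nearest-rank 95th percentile, and throughput is simply requests divided by window length.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PerformanceReport:
    """Hypothetical container for one monitoring window's metrics."""
    accuracy: float
    precision: float
    recall: float
    f1: float
    p95_latency_ms: float
    throughput_rps: float


def _safe_div(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a metric is undefined (e.g. no positives).
    return numerator / denominator if denominator else 0.0


def summarize_window(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    latencies_ms: Sequence[float],
    window_seconds: float,
) -> PerformanceReport:
    """Compute quality and operational metrics for one window of requests."""
    n = len(y_true)
    if n == 0 or n != len(y_pred) or n != len(latencies_ms):
        raise ValueError("inputs must be non-empty and of equal length")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    # Confusion-matrix counts for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    accuracy = (tp + tn) / n

    # Nearest-rank 95th percentile of request latency.
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * n))
    p95_latency_ms = ordered[rank - 1]

    # Requests handled per second over the window.
    throughput_rps = n / window_seconds

    return PerformanceReport(
        accuracy, precision, recall, f1, p95_latency_ms, throughput_rps
    )


if __name__ == "__main__":
    report = summarize_window(
        y_true=[1, 0, 1, 1, 0, 1, 0, 0, 1, 0],
        y_pred=[1, 0, 0, 1, 1, 1, 0, 0, 1, 0],
        latencies_ms=[12, 15, 11, 40, 13, 14, 18, 16, 90, 17],
        window_seconds=5.0,
    )
    print(report)
```

In practice a function like this would be run on each time window and the results compared against a baseline, so that a drop in F1-score or a rise in tail latency can trigger an alert. Ground-truth labels often arrive late in production, so the quality metrics may lag the operational ones.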

BACKGROUND

Prompt injection is a cybersecurity exploit and attack vector in which innocuous-looking inputs are designed to cause unintended behavior in machine learning models, particularly large language models (LLMs). The attack exploits the model's inability to distinguish between developer-defined prompts and user inputs in order to bypass safeguards and influence model behavior. Although LLMs are designed to follow trusted instructions, carefully crafted inputs can manipulate them into producing unintended responses.

SYNONYMS & ALIASES

  • Model performance tracking
  • Metric monitoring
  • Operational monitoring

USAGE NOTE

Performance monitoring dashboards provide insights into a model's health and business impact.

DEVELOPERS

Organizations developing technology related to Performance Monitoring.

  • Arize AI

    Arize AI provides a machine learning observability platform that helps data science and ML engineering teams monitor model performance, detect issues like drift, and troubleshoot problems in production AI systems.

  • WhyLabs AI

    WhyLabs offers the AI Observability Platform, which provides continuous monitoring of data and model health, detecting data drift, model performance degradation, and data quality issues in AI applications.

  • Weights & Biases

    Weights & Biases (W&B) provides an MLOps platform for tracking, visualizing, and comparing machine learning experiments, including tools for model monitoring, performance logging, and prompt engineering evaluations.

  • LangChain (LangSmith)

    LangSmith, developed by the creators of LangChain, is a platform for debugging, testing, evaluating, and monitoring large language model (LLM) applications and agent systems, focusing on performance and reliability.

  • Helicone

    Helicone offers an open-source platform for monitoring and managing LLM usage, providing observability into prompt performance, cost, latency, and error rates for AI applications.

  • Vellum

    Vellum is an LLM development platform that includes features for prompt management, testing, and deployment, with built-in monitoring and analytics to track the performance and efficacy of prompts and models in production.

  • Fiddler AI

    Fiddler AI provides an MLOps platform for responsible AI, offering explainability, fairness, and performance monitoring for machine learning models across their lifecycle, including model drift detection and bias analysis.

  • Datadog

    Datadog, a monitoring and security platform, offers LLM observability capabilities that allow users to monitor the performance, cost, latency, and token usage of their large language model applications.

RELATED TERMS IN MLOPS & DEPLOYMENT