// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM
Performance Monitoring
Performance monitoring tracks how well a model performs its task in a production setting, for example how accurate and how fast it is.
TECHNICAL DEFINITION
Performance monitoring for ML models involves continuously tracking key metrics such as accuracy, precision, recall, F1-score, latency, and throughput, to assess the model's effectiveness and operational efficiency in a production environment.
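As a rough illustration of the metrics listed above, the following Python sketch summarizes one monitoring window of logged binary predictions. The names `PerformanceReport` and `summarize_window` are hypothetical, and the sketch assumes that ground-truth labels arrive alongside predictions, which is often not the case in production. It uses the nearest-rank method for the 95th-percentile latency; other percentile definitions would give slightly different values.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PerformanceReport:
    """Hypothetical container for one monitoring window's metrics."""
    accuracy: float
    precision: float
    recall: float
    f1: float
    p95_latency_ms: float
    throughput_rps: float


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is zero
    # (e.g. no positive predictions in the window).
    return num / den if den else 0.0


def summarize_window(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    latencies_ms: Sequence[float],
    window_seconds: float,
) -> PerformanceReport:
    """Compute quality and operational metrics for binary labels (0 or 1)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred) or n != len(latencies_ms):
        raise ValueError("inputs must be non-empty and of equal length")
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")

    # Confusion-matrix counts.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    accuracy = (tp + tn) / n

    # Nearest-rank 95th percentile of request latency.
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(0.95 * n))
    p95 = ordered[rank - 1]

    # Throughput as requests served per second over the window.
    throughput = n / window_seconds

    return PerformanceReport(accuracy, precision, recall, f1, p95, throughput)


# Example usage with made-up data for a 4-second window.
report = summarize_window(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 0, 1, 1, 0],
    latencies_ms=[12, 15, 11, 40, 13, 14, 90, 16],
    window_seconds=4.0,
)
print(report)
```

In practice such a summary would be computed on a schedule and compared against thresholds or earlier windows; the observability platforms listed under DEVELOPERS provide this kind of tracking as a managed service.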
BACKGROUND
Prompt injection is a cybersecurity exploit and attack vector in which innocuous-looking inputs are designed to cause unintended behavior in machine learning models, particularly large language models (LLMs). The attack takes advantage of the model's inability to distinguish between developer-defined prompts and user inputs in order to bypass safeguards and influence model behavior. While LLMs are designed to follow trusted instructions, carefully crafted inputs can manipulate them into producing unintended responses.
SYNONYMS & ALIASES
- Model performance tracking
- Metric monitoring
- Operational monitoring
USAGE NOTE
Performance monitoring dashboards provide insights into a model's health and business impact.
DEVELOPERS
Organizations developing technology related to performance monitoring.
Arize AI provides a machine learning observability platform that helps data science and ML engineering teams monitor model performance, detect issues like drift, and troubleshoot problems in production AI systems.
WhyLabs offers the AI Observability Platform, which provides continuous monitoring of data and model health, detecting data drift, model performance degradation, and data quality issues in AI applications.
Weights & Biases (W&B) provides an MLOps platform for tracking, visualizing, and comparing machine learning experiments, including tools for model monitoring, performance logging, and prompt engineering evaluations.
LangSmith, developed by the creators of LangChain, is a platform for debugging, testing, evaluating, and monitoring large language model (LLM) applications and agent systems, focusing on performance and reliability.
Helicone offers an open-source platform for monitoring and managing LLM usage, providing observability into prompt performance, cost, latency, and error rates for AI applications.
Vellum is an LLM development platform that includes features for prompt management, testing, and deployment, with built-in monitoring and analytics to track the performance and efficacy of prompts and models in production.
Fiddler AI provides an MLOps platform for responsible AI, offering explainability, fairness, and performance monitoring for machine learning models across their lifecycle, including model drift detection and bias analysis.
Datadog, a monitoring and security platform, offers LLM observability capabilities that allow users to monitor the performance, cost, latency, and token usage of their large language model applications.