// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

Capability Control

Methods designed to limit or restrict the abilities of an AI system to prevent it from causing harm or acting outside its intended purpose.

TECHNICAL DEFINITION

Capability control refers to a set of technical strategies and mechanisms aimed at limiting the scope, power, or autonomy of an AI system, particularly advanced or superintelligent AI, to prevent it from acquiring dangerous capabilities, escaping containment, or acting in ways that could lead to unintended or harmful outcomes.

BACKGROUND

Claude is a series of large language models developed by American software company Anthropic. Named after Claude Shannon, Claude was released as an AI-based chatbot in March 2023. It is also used in AI-assisted software development.

SYNONYMS & ALIASES

AI Containment
AI Restriction
AI Power Limitation

USAGE NOTE

Researchers explore capability control techniques like "AI boxing" to manage powerful AI systems.

DEVELOPERS

Organizations developing technology related to Capability Control.

OpenAI
Develops leading large language models and provides extensive tools for prompt engineering, system prompts, and moderation APIs to control AI behavior and capabilities, emphasizing safety and alignment.
Anthropic
Focuses on AI safety and alignment, notably with 'Constitutional AI' which uses a set of principles to guide and control model capabilities and outputs, making them more helpful, harmless, and honest.
Google DeepMind
Conducts extensive research in AI safety, alignment, and responsible AI, developing methods to understand and control the capabilities of advanced AI models to ensure beneficial outcomes.
Microsoft
Invests heavily in responsible AI development, offering content safety features, prompt engineering guidance, and tools within Azure AI to help developers control and steer AI model capabilities for safer and more reliable applications.
Alignment Research Center (ARC)
A research organization dedicated to ensuring that future AI systems are safe and aligned with human values, which fundamentally involves understanding and controlling their advanced capabilities.
UK AI Safety Institute
A government-backed organization focused on evaluating the safety of advanced AI models, including understanding and testing their capabilities to develop methods for robust control and risk mitigation.
Hugging Face
While primarily a platform, it hosts numerous open-source models, datasets, and tools for fine-tuning and alignment, enabling the community to develop and implement methods for capability control in AI engineering.

RELATED TERMS IN AI ETHICS & SAFETY

BACK TO AI ENGINEERING & PROMPT DESIGN LEXICON

TECHNICAL DEFINITION

BACKGROUND

SYNONYMS & ALIASES

USAGE NOTE

DEVELOPERS

OpenAI

Anthropic

Google DeepMind

Microsoft

Alignment Research Center (ARC)

UK AI Safety Institute

Hugging Face

RELATED TERMS IN AI ETHICS & SAFETY