// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

AI Risk

The potential for AI systems to cause harm, make errors, or have unintended negative consequences.

TECHNICAL DEFINITION

AI Risk refers to the potential for adverse outcomes, including economic, social, ethical, safety, or security harms, arising from the design, development, deployment, or misuse of artificial intelligence systems.

BACKGROUND

Prompt injection is a cybersecurity exploit and an attack vector in which innocuous-looking inputs are designed to cause unintended behavior in machine learning models, particularly large language models (LLMs). The attack takes advantage of the model's inability to distinguish between developer-defined prompts and user inputs to bypass safeguards and influence model behaviour. While LLMs are designed to follow trusted instructions, they can be manipulated into carrying out unintended responses through carefully crafted inputs.

SYNONYMS & ALIASES

AI hazards
AI threats
AI dangers
AI vulnerabilities

USAGE NOTE

Identifying and mitigating AI risk is a primary concern for developers and policymakers.

DEVELOPERS

Organizations developing technology related to AI Risk.

Anthropic
An AI safety and research company that builds reliable, interpretable, and steerable AI systems. They develop techniques like Constitutional AI to train models to be helpful and harmless without direct human supervision on harmful queries.
OpenAI
A major AI research and deployment company with dedicated teams focusing on safety and alignment. They develop governance frameworks and technical methods, such as Reinforcement Learning from Human Feedback (RLHF), to manage the risks of increasingly powerful models.
Google DeepMind
A leading AI research laboratory with extensive programs in AI safety, ethics, and robustness. Their work includes developing techniques for model interpretability, evaluating for social biases, and creating safer reinforcement learning agents.
Center for AI Safety (CAIS)
A non-profit organization that researches how to reduce societal-scale risks from AI. They conduct technical research and build evaluations to test for dangerous capabilities in frontier AI models.
Alignment Research Center (ARC)
A non-profit research organization focused on the theoretical and technical challenges of aligning advanced AI systems with human intent. They work on problems like scalable oversight and preventing models from engaging in deceptive alignment.
Credo AI
A company providing an AI governance platform designed to help organizations operationalize responsible AI. Their software enables businesses to assess, manage, and report on AI risks related to fairness, performance, security, and compliance.
Conjecture
An AI research company focused exclusively on developing scalable and verifiable AI alignment solutions. Their work involves creating technologies to ensure advanced AI systems remain controllable and aligned with human values.
FAR AI
A research organization dedicated to advancing AI safety and alignment through targeted research programs and competitions. They focus on evaluating large language models for dangerous capabilities and developing scalable oversight techniques.

RELATED TERMS IN AI ETHICS & SAFETY

BACK TO AI ENGINEERING & PROMPT DESIGN LEXICON

TECHNICAL DEFINITION

BACKGROUND

SYNONYMS & ALIASES

USAGE NOTE

DEVELOPERS

Anthropic

OpenAI

Google DeepMind

Center for AI Safety (CAIS)

Alignment Research Center (ARC)

Credo AI

Conjecture

FAR AI

RELATED TERMS IN AI ETHICS & SAFETY