// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

Safety

The measures and principles applied to AI systems to prevent them from generating harmful, biased, or inappropriate content and to ensure their responsible use.

TECHNICAL DEFINITION

AI Safety encompasses the design, implementation, and operational practices aimed at mitigating risks associated with AI systems, including preventing the generation of harmful, biased, toxic, or unethical content, ensuring alignment with human values, and establishing robust guardrails against misuse.

BACKGROUND

Prompt injection is a cybersecurity exploit and an attack vector in which innocuous-looking inputs are designed to cause unintended behavior in machine learning models, particularly large language models (LLMs). The attack takes advantage of the model's inability to distinguish between developer-defined prompts and user inputs to bypass safeguards and influence model behaviour. While LLMs are designed to follow trusted instructions, they can be manipulated into carrying out unintended responses through carefully crafted inputs.

READ MORE ON WIKIPEDIA

SYNONYMS & ALIASES

  • AI Ethics
  • Responsible AI
  • Content Moderation
  • Harm Reduction
  • Alignment

USAGE NOTE

Safety is a paramount concern in deploying AI models, requiring continuous monitoring and refinement of moderation techniques.

DEVELOPERS

Organizations developing technology related to Safety.

  • Anthropic

    A leading AI safety and research company that developed 'Constitutional AI', an approach to align AI models with human values by providing a set of principles, directly influencing prompt design and model engineering for safety.

  • OpenAI

    Develops advanced AI models like ChatGPT and DALL-E, with significant investment in AI safety, alignment research, and responsible deployment practices that influence both the engineering of their models and guidance on safe prompt design.

  • Google DeepMind

    Conducts extensive research in AI safety, ethics, and responsible AI, integrating these principles into the development of their foundational models (e.g., Gemini) and providing frameworks for safe AI engineering and prompt design.

  • Microsoft Azure AI

    Offers a comprehensive suite of Responsible AI tools, safety filters, and guidelines within its Azure AI platform to help developers engineer and deploy AI systems ethically and securely, including features for safe prompt engineering.

  • AI Safety Institute (US)

    A U.S. government organization dedicated to conducting advanced AI safety research, developing evaluations, and setting standards to ensure that frontier AI models are safe, secure, and trustworthy, directly impacting AI engineering safety.

  • Meta AI

    Conducts research and develops technologies related to responsible AI, including robustness against adversarial attacks, fairness, and mitigation of harmful outputs, contributing to safer AI engineering and prompting practices for their models.

  • Hugging Face

    Provides a platform for machine learning, hosting open-source models and tools. They promote responsible AI development, offering resources and evaluation tools for model safety, bias detection, and ethical considerations, supporting safer AI engineering and prompt design.

RELATED TERMS IN PROMPTING & LOGIC