// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

On-Device

AI processing that happens directly on your personal device (like a phone) instead of in the cloud.

TECHNICAL DEFINITION

AI model execution (inference) or training that occurs locally on an end-user device (e.g., a smartphone or IoT device) rather than on remote cloud servers. Benefits include reduced latency, offline capability, and enhanced data privacy.

BACKGROUND

Generative artificial intelligence (GenAI) is a subfield of artificial intelligence (AI) that uses generative models to produce text, images, videos, audio, software code, or other forms of data. These models learn the underlying patterns and structures of their training data and use them to generate new data in response to input, which often takes the form of natural language prompts.

SYNONYMS & ALIASES

Edge AI
local processing
client-side AI

USAGE NOTE

Running AI models on-device improves privacy and reduces reliance on internet connectivity.

DEVELOPERS

Organizations developing technology related to on-device AI.

Qualcomm
Designs Snapdragon processors with integrated AI Engines and provides SDKs and tools that let developers optimize and deploy AI models, including large language models, directly on mobile and edge devices.
Apple
Develops custom silicon (A-series, M-series) featuring powerful Neural Engines and provides the Core ML framework, empowering developers to integrate and run machine learning models, including advanced generative AI, natively on Apple devices for enhanced privacy and performance.
Google
Offers TensorFlow Lite (rebranded as LiteRT in 2024), a lightweight framework for on-device machine learning, and builds dedicated AI hardware (the TPU in its Google Tensor chips) into Pixel devices, focusing on efficient model deployment and inference for edge applications.
Meta Platforms
Contributes significantly to on-device AI through PyTorch Mobile/Lite and the development of efficient open-source models like Llama, fostering a strong ecosystem for deploying and optimizing large language models on consumer-grade hardware.
NVIDIA
Provides the Jetson platform, a range of embedded computing boards designed for edge AI, robotics, and IoT applications, enabling developers to deploy complex AI models and perform inference directly on the device.
ARM
Develops the CPU and GPU architectures that power most mobile and edge devices, along with dedicated Neural Processing Units (NPUs) such as Ethos, and licenses the intellectual property and tools needed for efficient on-device AI execution and optimization.
Hugging Face
Not a hardware developer. Hugging Face hosts a large platform of pre-trained models and provides libraries such as Transformers and Accelerate, plus tools for model optimization (e.g., quantization), which help in engineering and deploying efficient AI models, including LLMs, to run on-device.
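
The quantization mentioned in the Hugging Face entry can be illustrated with a minimal sketch. It assumes the simplest scheme, symmetric per-tensor int8 quantization, and uses hypothetical helper names (quantize_int8, dequantize) that do not belong to any library's API. Production tools typically use more elaborate schemes, so treat this only as an illustration of the idea: weights are stored as small integers plus one floating-point scale, which shrinks the model so it fits more easily on a device.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to int8-range integers.

    Hypothetical helper for illustration; not a library API.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 127; use 1.0 if every weight is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored weight differs from the original by at most half a step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now needs one byte instead of four (for 32-bit floats), at the cost of a rounding error bounded by half the scale.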

RELATED TERMS IN AI ETHICS & SAFETY
