// MODEL OPTIMIZATION AND PROMPT SYNTAX TERM

TensorFlow Serving

A system for deploying trained TensorFlow machine learning models into production, making them available for predictions through a simple API.

TECHNICAL DEFINITION

TensorFlow Serving is an open-source, high-performance serving system for machine learning models, designed primarily for TensorFlow models. It exposes gRPC and REST APIs for inference and can serve multiple models, and multiple versions of the same model, at the same time, which makes version rollouts and A/B testing possible.
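
As a rough illustration of the inference API, the sketch below builds and sends a prediction request to TensorFlow Serving's REST endpoint using only the Python standard library. The host, the model name my_model, and the sample inputs are hypothetical. Port 8501 is an assumption, chosen because it is the REST port conventionally used in TensorFlow Serving's Docker examples. The helper names are illustrative and not part of any TensorFlow Serving client library.

```python
import json
import urllib.request
from typing import Any, List, Optional


def build_predict_url(host: str, model_name: str,
                      version: Optional[int] = None, port: int = 8501) -> str:
    """Return the REST predict URL, optionally pinned to one model version."""
    path = f"/v1/models/{model_name}"
    if version is not None:
        # Pinning a version is how a client can target one arm of an A/B test.
        path += f"/versions/{version}"
    return f"http://{host}:{port}{path}:predict"


def build_predict_body(instances: List[Any]) -> bytes:
    """Encode inputs in the row-oriented {"instances": [...]} request format."""
    return json.dumps({"instances": instances}).encode("utf-8")


def predict(host: str, model_name: str, instances: List[Any],
            version: Optional[int] = None, timeout: float = 5.0) -> Any:
    """POST the inputs to a running model server and return its predictions."""
    request = urllib.request.Request(
        build_predict_url(host, model_name, version),
        data=build_predict_body(instances),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]
```

With a model server running locally, predict("localhost", "my_model", [[1.0, 2.0]], version=3) would query version 3 of the hypothetical model. Omitting version leaves the choice to the server's configured version policy.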

BACKGROUND

TensorFlow Serving was created at Google. Google LLC is an American multinational technology corporation focused on information technology, online advertising, search engine technology, email, cloud computing, software, quantum computing, e-commerce, consumer electronics, and artificial intelligence (AI). The BBC has referred to it as "the most powerful company in the world", and it is one of the world's most valuable brands. Google's parent company, Alphabet Inc., has been described as a Big Tech company.

SYNONYMS & ALIASES

TF Serving
TF Model Server
TensorFlow Deployment
Model Server

USAGE NOTE

TensorFlow Serving is commonly used to serve TensorFlow models at scale in production environments.

DEVELOPERS

Organizations developing technology related to TensorFlow Serving.

Google
Google created and is the primary maintainer of TensorFlow and TensorFlow Serving. Its AI and cloud divisions continue to develop, update, and support this open-source model serving system for production environments.
Netflix
Netflix relies heavily on machine learning for recommendations and content personalization. Its infrastructure integrates, and often extends, TensorFlow Serving or similar model serving technologies to handle high-throughput, low-latency inference at scale.
Uber
Uber's Michelangelo machine learning platform uses TensorFlow for many of its production models. Uber has built significant internal infrastructure for serving these models, which often includes, or is inspired by, TensorFlow Serving's principles for scalable, reliable inference.
NVIDIA
NVIDIA develops the Triton Inference Server, a high-performance open-source inference server designed to serve models from various ML frameworks efficiently, including models trained with TensorFlow. It is often used as an alternative to, or alongside, TensorFlow Serving for optimized GPU utilization.
Amazon Web Services (AWS)
AWS provides services such as Amazon SageMaker, which lets developers deploy and serve TensorFlow models in production. AWS develops the underlying infrastructure and tools that orchestrate and manage model serving, often integrating TensorFlow Serving or providing similar capabilities.
Microsoft Azure
Microsoft's Azure Machine Learning platform supports deploying and serving TensorFlow models at scale. Microsoft develops integrated services and tools for model management, monitoring, and high-performance inference that are analogous or complementary to TensorFlow Serving.
Alibaba Cloud
Alibaba Cloud offers AI and machine learning services that support deploying and serving TensorFlow models. It develops cloud-native solutions for model inference and lifecycle management, often building on or complementing open-source serving frameworks.
Tencent Cloud
Tencent Cloud provides AI and ML platforms for training and deploying models, including those built with TensorFlow. It develops scalable model serving solutions as part of its cloud offerings for enterprise AI applications.

RELATED TERMS IN MLOPS & DEPLOYMENT
