TensorFlow Serving
TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments and exposing both REST and gRPC interfaces.
Engineered for high-throughput production workloads, TensorFlow Serving manages the model lifecycle from versioning through inference. Teams can deploy new algorithms and experiments without changing client code or stopping the server: the server monitors each model's storage path, loads new versions as they appear, and swaps them in without dropping in-flight requests. It serves multiple models, and multiple versions of the same model, out of the box over its gRPC and REST APIs, and can take advantage of hardware acceleration (GPUs and TPUs) to keep inference latency low even for large deep learning architectures.
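As a rough sketch of the workflow described above, the snippet below exports a toy model (modeled on the half_plus_two example from the TensorFlow Serving documentation) into the numbered-version directory layout the server watches, then queries it through the REST predict endpoint. The paths, the port (8501 is the default REST port), and the assumption that a model server is already running are illustrative, not part of the original description.

```python
# Sketch: export a trivial model in the layout TensorFlow Serving expects,
# then query a running server over REST. Assumes the `requests` package and
# a model server started separately, e.g.:
#   tensorflow_model_server --rest_api_port=8501 \
#       --model_name=half_plus_two --model_base_path=/tmp/models/half_plus_two
import json

import requests
import tensorflow as tf


class HalfPlusTwo(tf.Module):
    """A toy model computing 0.5 * x + 2, mirroring the official demo model."""

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return 0.5 * x + 2.0


module = HalfPlusTwo()

# TensorFlow Serving watches the base path for numbered version directories
# (.../half_plus_two/1, .../half_plus_two/2, ...) and hot-swaps to the
# highest version it finds -- this is how new versions ship without a restart.
tf.saved_model.save(
    module,
    "/tmp/models/half_plus_two/1",
    signatures={"serving_default": module.__call__},
)

# The REST predict endpoint follows /v1/models/<name>[/versions/<n>]:predict.
url = "http://localhost:8501/v1/models/half_plus_two:predict"
payload = json.dumps({"instances": [1.0, 2.0, 5.0]})
response = requests.post(url, data=payload)
response.raise_for_status()
print(response.json()["predictions"])  # -> [2.5, 3.0, 4.5]
```

The same exported model is also reachable over gRPC (port 8500 by default) through the PredictionService API, so clients can choose whichever transport suits their latency and tooling requirements.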