PyTorch is one of the most popular open source libraries for deep learning. Developers and researchers particularly enjoy the flexibility it gives them in building and training models. Yet, this is only half the story, and deploying and managing models in production is often the most difficult part of the machine learning process: building bespoke prediction APIs, scaling them, securing them, etc.
One way to simplify the model deployment process is to use a model server, i.e. an off-the-shelf web application specially designed to serve machine learning predictions in production. Model servers make it easy to load one or several models, automatically creating a prediction API backed by a scalable web server. They’re also able to run preprocessing and postprocessing code on prediction requests. Last but not least, model servers also provide production-critical features like logging, monitoring, and security. Popular model servers include TensorFlow Serving and the Multi Model Server.
Deploying a Model
Configuring TorchServe for Remote Serving
Configuring TorchServe for HTTPS