Last year I began toying around with scikit-learn. I'm far from a statistician, though, and my aim was largely to understand the workflows required to support machine-learning work. It was fun and I learnt a lot, but I was still intimidated by one particular idea: putting a model into production.

The stateful nature of an ML model sounded like a nightmare, and then work commitments changed, so I barely gave it much thought... until now. So let's look at how I'd design a deployable machine-learning model!

In this example we're going to write a service that's capable of making predictions based on the Wisconsin Breast Cancer Database, a readily available dataset that's commonly used for experimenting with machine-learning techniques.
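To ground the rest of the post, here's a minimal sketch of training and persisting such a model. scikit-learn bundles a copy of the dataset; the choice of classifier, the split, and the model.joblib file name are assumptions for illustration rather than a prescription:

    import joblib
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # scikit-learn ships a copy of the Wisconsin Breast Cancer dataset.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Bundle the scaler and classifier into one pipeline so the scaling
    # parameters travel with the model artefact.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    print(f"Test accuracy: {model.score(X_test, y_test):.3f}")

    # Persist the fitted pipeline so the service can load it at startup.
    joblib.dump(model, "model.joblib")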

Requirements

  1. A defined API for sending a payload containing parameters to be run against the model.
  2. A cache to minimise resource-intensive operations against known inputs.
  3. Scaling of parameters to ensure they're compatible with the pre-trained model.
  4. Execution of the model against the provided parameters (sketched after this list).
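Requirements 2-4 can be sketched in a handful of lines, assuming the pipeline persisted above and an in-process functools.lru_cache standing in for the cache (a real deployment might reach for Redis or similar):

    from functools import lru_cache

    import joblib

    # Load the persisted pipeline (scaler + classifier) once, at startup.
    model = joblib.load("model.joblib")

    @lru_cache(maxsize=1024)
    def predict(features: tuple) -> int:
        # lru_cache requires hashable arguments, hence the tuple (req. 2).
        # The pipeline rescales the raw inputs before the classifier sees
        # them (req. 3), then runs the model itself (req. 4).
        return int(model.predict([list(features)])[0])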

High Level Architecture

Architecture diagram: https://s3-us-west-2.amazonaws.com/secure.notion-static.com/eb2787b7-4c52-44fb-a549-c8f25ee4230f/SKLEARN_DEPLOY.png

HTTP / API Layer

The API will be defined using OpenAPI and implemented with the connexion library for Python. This drastically reduces the amount of boilerplate required, and validation and failure handling come out of the box.
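Here's a rough sketch of the wiring, assuming connexion 2.x; the api.yaml spec file, the predictor module, and the /predict operation are assumptions for illustration. connexion maps each operationId in the spec to a Python callable and validates the request body against the schema before the handler runs:

    import connexion

    from predictor import predict as run_model  # hypothetical module holding the cached predictor

    # Referenced from the spec via operationId (e.g. "app.predict").
    # connexion validates the request body against the OpenAPI schema
    # before this handler is ever called.
    def predict(body):
        return {"prediction": run_model(tuple(body["features"]))}

    app = connexion.App(__name__, specification_dir=".")
    app.add_api("api.yaml")

    if __name__ == "__main__":
        app.run(port=8080)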

Test Strategy

Simple BDD-style (i.e. Gherkin) tests will be written against the API to verify the overall functionality, such as the illustrative scenario sketched below.
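As a rough illustration, assuming the behave test runner and the requests library (the endpoint, payload shape, and step wording are all assumptions, not a fixed contract):

    # features/predict.feature (Gherkin), illustrative only:
    #
    #   Scenario: A valid payload returns a prediction
    #     Given the model service is running
    #     When I POST a valid feature payload to /predict
    #     Then the response status is 200
    #     And the body contains a prediction

    import requests
    from behave import given, then, when

    @given("the model service is running")
    def step_service_running(context):
        context.base_url = "http://localhost:8080"

    @when("I POST a valid feature payload to /predict")
    def step_post_payload(context):
        payload = {"features": [0.5] * 30}  # scikit-learn's copy of the dataset has 30 features
        context.response = requests.post(f"{context.base_url}/predict", json=payload)

    @then("the response status is 200")
    def step_status_ok(context):
        assert context.response.status_code == 200

    @then("the body contains a prediction")
    def step_body_prediction(context):
        assert "prediction" in context.response.json()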

Cache Control