Key questions to answer for successful model deployment include:
- Which deployment approach will be used: offline or real-time?
- What is the expected volume of predictions: ten per week, ten per second, etc.?
- How does the system handle spikes in demand?
- What is the maximum acceptable latency between a prediction request and its response?
- Where will the model be deployed: in the cloud, on an edge device, on an ordinary server, etc.?
- What kind of network connectivity will be required to send prediction requests to the model and deliver its predictions to their destination?
- How will model drift be detected?
- How will outliers in data be detected?
- How robust are the fallback mechanisms?
- How robust is the model deployment pipeline?
- Is the training pipeline reproducible?
- Is the new model better than the old one?
- Is model explainability required?
- How much downtime (periods when predictions cannot be served) can the business afford?
- Will the model need to be updated?
- Will the history of prediction requests and responses be required in the future, e.g., by a government body or for further analysis?
- Will the benefit of operating and maintaining the deployment option outweigh the cost?
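
The question of how model drift will be detected can be made concrete with a small sketch. One commonly used heuristic, chosen here purely for illustration and not prescribed by this checklist, is the Population Stability Index (PSI), which compares the binned distribution of a numeric feature at training time with its distribution in live traffic. The function name `population_stability_index`, the ten equal-width bins, and the `1e-6` floor for empty bins are all assumptions of this sketch.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    live: Sequence[float],
    bins: int = 10,  # assumption: ten equal-width bins over the reference range
) -> float:
    """Return the PSI between a reference sample and a live sample of one feature."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread; cannot build bins")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Assumption: a small floor keeps empty bins from producing log(0).
        eps = 1e-6
        return [max(c / len(values), eps) for c in counts]

    p = proportions(reference)  # expected (training-time) proportions
    q = proportions(live)       # observed (production) proportions
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    training_sample = [i / 100 for i in range(100)]
    shifted_sample = [x + 0.5 for x in training_sample]
    print(population_stability_index(training_sample, training_sample))
    print(population_stability_index(training_sample, shifted_sample))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a shift worth investigating, but any such threshold is a judgment call and should be tuned per feature and use case.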
- Valohai, https://valohai.com/model-deployment
- Yvonne Cook, https://www.itproportal.com/features/overcoming-the-challenges-of-machine-learning-modeldeployment
Online references were accessed on 17 May 2022.