Differences in the Creation and Production Environments

 

When it comes to machine learning models, there are many differences between creation environments (where the model is built) and production environments (where the model is used, monitored, and has its life cycle managed). The creation environment is oriented toward a specific set of people working on the model, with specific system, data, and output configurations. The production environment may be quite different, with other people, systems, and requirements applied to the model. Understanding these differences allows organizations to be efficient in both environments and to navigate the full life cycle of their critical machine learning assets. Let's take a deeper look at the differences between the creation and production environments in order to increase the effectiveness of the deployment process.

The first difference between the creation and production environments is who is responsible for the model in each. Generally, in the creation environment, data scientists are in control of building and testing the model. Once the model has been created, more often than not it is handed to a partner organization (perhaps IT or analytic engineering) and operations to be put into production. In the production environment, it is the responsibility of IT and operations to ensure that the analytic asset runs as accurately and efficiently as possible. The handoff from data science to IT has been known to cause problems in organizations, and care should be taken to ensure that the roles and responsibilities at this critical handoff are understood, documented, and enforced. Learn how to increase efficiency in the handoff from data science to IT in our recent blog.

Another key difference between these two environments is the basic purpose of each, and the systems that will be employed. For example, in the creation environment teams are focused on the tools and packages used to build and run the model. The model creators may iterate on modeling techniques, libraries, and input data features often, looking for and then settling on the best outcome for their application. As a result, the tooling is oriented toward discovery, trial and error, and prototyping. As the model moves into production, however, the focus changes. Teams now work within a more rigid framework, enabling scheduled flows, rigorous testing, monitoring, and continuous integration. The tooling in production tends to be more systematic, process oriented, and scalable. These differences are to be expected, but they are critical to acknowledge and manage over the life of the machine learning asset.
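As a concrete illustration of that shift toward rigorous testing, here is a minimal sketch of the kind of automated check a production team might add to a continuous integration pipeline. The `score` function and the field names are hypothetical stand-ins for whatever contract the model actually exposes, not a prescribed interface.

```python
# Hypothetical CI check: verify that the model's scoring function
# still returns the fields that downstream systems expect.

import pandas as pd

EXPECTED_OUTPUT_FIELDS = {"record_id", "score"}  # assumed output contract


def score(batch: pd.DataFrame) -> pd.DataFrame:
    """Placeholder for the data science team's scoring function."""
    return pd.DataFrame({"record_id": batch["record_id"], "score": 0.5})


def test_score_output_contract():
    # A small, fixed sample plays the role of the test harness input.
    sample = pd.DataFrame({"record_id": [1, 2, 3], "feature_a": [0.1, 0.2, 0.3]})
    result = score(sample)
    # The output schema is locked down; CI fails if it drifts.
    assert set(result.columns) == EXPECTED_OUTPUT_FIELDS
    assert len(result) == len(sample)
```

A check like this is trivial to run on every change, which is exactly the kind of repeatable, process-oriented tooling that distinguishes production from the exploratory creation environment.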

Data, and the sources or locations of that data, is a third critical difference between the creation and production environments. The structure of the data, as specified by the data science team, stays the same, but the data is often delivered in different ways. As a simple example, the creation systems may point to flat local files managed by the model creation team, while the production system may take live, streaming data straight from sources in the field. Perhaps even more important to consider are the outputs of models. In production, teams may look for a wide degree of flexibility here to feed various application types, reporting, and dashboards. The creation team generally does not get involved in this "downstream" integration from the model. Rather, teams of people who specialize in managing the connection to live business applications are brought in, and they manage the asset's impact on the day-to-day business.
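One common way to handle this difference is to keep the scoring logic fixed and swap only the data binding. The sketch below is an illustration under assumed field names (`feature_a`, `feature_b`) and a placeholder `score_record` function; the point is that the same loop can consume either a flat local file or a live stream because the record schema is identical.

```python
# Sketch: identical scoring code in both environments; only the
# data binding changes from a local flat file to a live stream.

import csv
from typing import Dict, Iterable, Iterator


def records_from_csv(path: str) -> Iterator[Dict[str, str]]:
    """Creation environment: flat local file managed by the model team."""
    with open(path, newline="") as handle:
        yield from csv.DictReader(handle)


def records_from_stream(consumer: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Production environment: live records from a streaming source
    (for example, a message-queue consumer) that already match the agreed schema."""
    yield from consumer


def score_record(record: Dict[str, str]) -> float:
    """Placeholder model; the feature names are assumptions for illustration."""
    return 0.3 * float(record["feature_a"]) + 0.7 * float(record["feature_b"])


def run(records: Iterable[Dict[str, str]]) -> Iterator[float]:
    # The same loop serves both environments because the schema is fixed.
    for record in records:
        yield score_record(record)
```

During creation this might be driven by `run(records_from_csv("sample_data.csv"))`, while in production the same `run` function is fed by whatever streaming consumer IT and operations put in place.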

Although the model itself and the data schema stay the same, many aspects of the model change from creation to production. This is why it is so important to have systems that enable core abstractions, allowing flexibility where it is required while locking down the aspects that must not change. At Open Data Group, our Model Development Life Cycle (MDLC) approach provides a comprehensive framework for considering these and other challenges of enabling machine learning to transform the enterprise. Learn more about the MDLC approach here.
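To make the idea of "lock down some aspects, keep others flexible" concrete, here is a minimal sketch (not ModelOp's actual MDLC tooling) in which the input and output schemas are fixed as part of the model's contract, while the prediction function and the data bindings remain pluggable. The field names are assumptions for illustration.

```python
# Minimal sketch of a locked-down model contract with flexible bindings.

from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator

INPUT_FIELDS = {"feature_a", "feature_b"}   # locked down: agreed with data science
OUTPUT_FIELDS = {"score"}                   # locked down: agreed with downstream apps


@dataclass
class DeployedModel:
    predict: Callable[[Dict[str, float]], Dict[str, float]]  # flexible: any model

    def run(self, source: Iterable[Dict[str, float]]) -> Iterator[Dict[str, float]]:
        # The source can be a file reader, a stream consumer, or a test fixture.
        for record in source:
            if not INPUT_FIELDS.issubset(record):
                raise ValueError(f"record missing fields: {INPUT_FIELDS - set(record)}")
            output = self.predict(record)
            if set(output) != OUTPUT_FIELDS:
                raise ValueError("model output violates the agreed schema")
            yield output
```

For example, `DeployedModel(predict=lambda r: {"score": 0.5}).run(records)` would enforce the same contract regardless of which environment supplies `records` or which model supplies the predictions.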

 
