By Shrikant Dash – August 22, 2020
This is the first part of a 2-part blog series.
Part 1: The Organizational Case/Context for Model Production
In the last five years, the proliferation and maturing of open-source computational tools, an expansion in modeling techniques, and the availability of low-cost, elastic private and public cloud computing platforms have created opportunities both for firms with established analytics capabilities and for those relatively new to model-driven decision making.
We are at an inflection point in the emergence of rapidly implementable enterprise-class model production workflows and platforms. These platforms hold the promise of greater transparency and controls in the operationalization of model production, the tracking of metrics and KPIs that signal calibration/deterioration triggers for internal action and external reporting, as well as the measurement of business value and increased speed to market for newly developed models.
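To make the idea of a metric that signals a deterioration trigger concrete, here is a minimal sketch using the Population Stability Index (PSI), one commonly used drift measure. The post does not name a specific metric, so the choice of PSI, the function names, and the 0.25 review threshold (a widely quoted rule of thumb, not a standard) are all assumptions for illustration.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """PSI between two binned distributions, each given as per-bin proportions.

    `expected` is the distribution at model development time and `actual` is
    the distribution observed in production, over the same bins.
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        # Floor each proportion so an empty bin does not cause log(0) or division by zero.
        e = max(e, eps)
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


def needs_review(psi: float, threshold: float = 0.25) -> bool:
    """Hypothetical trigger: flag the model for review when PSI reaches the threshold."""
    return psi >= threshold


# Illustrative bins only: score-band proportions at development time vs. in production.
development = [0.25, 0.25, 0.25, 0.25]
production = [0.10, 0.20, 0.30, 0.40]
drift = population_stability_index(development, production)
flagged = needs_review(drift)
```

In a production workflow, a check like this would typically run on a schedule, with the result logged alongside the model's other KPIs so that both internal action and external reporting draw on the same record.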
To fully leverage model production workflow tools and platforms, firms must first establish a data architecture that reflects and anticipates their evolving business architecture through the lens of information management. This requires mapping customer, product, and channel information flows by business line within the enterprise. Those information flows must result in a set of data structures, relational or otherwise, with documented aggregation and transformation of data elements. These structures support four uses: establishing customer identity and relationships, service fulfilment protocols in the product processors, business intelligence and visualization of business performance, and predictive decision making via model and strategy frameworks in enterprise channels of engagement. An important addendum, often overlooked by technology teams, is the criticality of external data. It is frequently combined with internally aggregated data throughout the model production lifecycle, and both are subject to privacy and other regulatory constraints.
In addition to establishing or evolving a core enterprise information management function to execute on an analytics architecture roadmap, it is essential to establish data governance norms. This means creating clear data ownership and decision rights among business program executives for critical data elements, including source and change definitions and documented protocols for change management. The result is a robust and dynamic ecosystem of enterprise production metadata.
Once the data environment is well-defined and governed, it can form the groundwork for enterprise model/strategy production and governance. Model taxonomies are an important first step; the ownership and primary/secondary usage of models as well as their role in capital and other regulatory reporting, financial and business forecasting, and in behavioral decision-making for specific customer segments must be documented and controlled through a mechanism of enterprise data and model governance. This governance framework must include business program managers, data scientists, finance and technology leaders in the first line of defense as well as legal counsel, enterprise & model risk and the enterprise data office/audit in the second and third lines of defense.
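The taxonomy described above can be sketched as a simple model inventory record. This is only an illustration under stated assumptions: the class names, field names, and the rule that all three lines of defense must sign off before production are hypothetical, not a prescribed schema or policy.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class ModelUse(Enum):
    """Usage categories taken from the paragraph above."""
    REGULATORY_REPORTING = "capital and other regulatory reporting"
    FORECASTING = "financial and business forecasting"
    CUSTOMER_DECISIONING = "behavioral decision-making"


# Hypothetical sign-off keys, one per line of defense.
REQUIRED_SIGN_OFFS = ("first_line", "second_line", "third_line")


@dataclass
class ModelRecord:
    """One entry in a hypothetical enterprise model inventory."""
    model_id: str
    owner: str
    primary_use: ModelUse
    secondary_uses: List[ModelUse] = field(default_factory=list)
    customer_segments: List[str] = field(default_factory=list)
    sign_offs: Dict[str, bool] = field(default_factory=dict)


def is_approved_for_production(record: ModelRecord) -> bool:
    """Assumed rule: every line of defense must have signed off."""
    return all(record.sign_offs.get(line, False) for line in REQUIRED_SIGN_OFFS)


# Illustrative record; identifiers and segment names are made up.
example = ModelRecord(
    model_id="card-attrition-v1",
    owner="cards-business-program",
    primary_use=ModelUse.CUSTOMER_DECISIONING,
    secondary_uses=[ModelUse.FORECASTING],
    customer_segments=["existing-cardholders"],
    sign_offs={"first_line": True, "second_line": True},
)
approved = is_approved_for_production(example)  # third line has not signed off yet
```

Keeping ownership, usage, and sign-off status in one record is what lets a governance body answer "who owns this model and where is it used" without consulting separate systems.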
Finally, the development of production-ready models and strategies has three prerequisites: an established model development environment with defined user access and query rights; enterprise-class modeling tools that support a wide variety of interpretable ML techniques; and an appropriately configured, scalable computing environment that can run computationally intensive modeling techniques within the tolerances required in model production (essentially DevOps for modeling, with a robust test bed). PII obfuscation in modeling datasets, rapid query and extraction/visualization tools, and feature engineering capabilities are increasingly important, both for rapid and robust model development and for quick productionalization.
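As one illustration of PII obfuscation in a modeling dataset, the sketch below replaces selected fields with keyed-hash tokens using HMAC-SHA256 from the Python standard library. The post does not specify a technique, so this choice, the function names, and the field names are assumptions; a real deployment would also need key management and a review of whether pseudonymization meets the applicable privacy requirements.

```python
import hashlib
import hmac
from typing import Any, Dict, Set


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable token for `value` using HMAC-SHA256.

    The same value and key always yield the same token, so records can still
    be joined on the token without exposing the underlying identifier.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def obfuscate_record(
    record: Dict[str, Any],
    pii_fields: Set[str],
    secret_key: bytes,
) -> Dict[str, Any]:
    """Return a copy of `record` with the named PII fields replaced by tokens."""
    return {
        name: pseudonymize(str(value), secret_key) if name in pii_fields else value
        for name, value in record.items()
    }


# Illustrative data only; the key would come from a secrets manager in practice.
key = b"example-key-not-for-production"
raw = {"customer_id": "C-1001", "email": "jane@example.com", "balance": 2500.0}
safe = obfuscate_record(raw, pii_fields={"customer_id", "email"}, secret_key=key)
```

Because the tokens are deterministic for a given key, the same customer maps to the same token across extracts, which preserves joins and feature engineering while keeping raw identifiers out of the modeling environment.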
The foregoing is a high-level overview of the prerequisites for successfully deploying operationalized models in production. Each component requires detailed attention: establishing organizational structures, putting tools and platforms in place, and training qualified personnel who can leverage such an environment for maximum business value.
Shrikant Dash is a financial services executive and analytics leader who has led analytics and operational teams at large financial institutions, including Citi, Morgan Stanley, and Discover Financial Services.