Data and Processes First, Machine Learning Second
Machine learning is a hot topic among many businesses right now, with executives constantly hearing pitches promising keen insights and automation of all kinds of business tasks. After being sold on the potential of machine learning in the workplace, the first instinct of many managers is to try to apply it to anything and everything, from production control to customer experience and contracts management. However, machine learning isn’t a magic bullet. It needs to be applied to the right challenges, and more importantly, it needs to be built on top of the right foundation to be successful (or even useful). I’ve seen many teams try to implement machine learning in the workplace, only to fail and become disillusioned with the technology, either because they attempted to solve problems that really needed underlying process improvements or because the available data wasn’t up to the task.
Data science insights and machine learning models are only as good as the data and processes on which they are based. Without a strong foundation of clear, streamlined, and “gotcha”-free processes combined with well-structured and logically organized data, these new tools are hamstrung from the start. In fact, industry estimates indicate that data scientists spend up to 60% of their time cleaning and normalizing data sets before they can actually generate insights. And that’s assuming that the data is even robust enough and the business use case is tailored enough to get started in the first place. The best way to get started with machine learning is to evaluate data and process quality before spending large amounts of time and money chasing new technologies.
Solid Data = Solid Results
When looking at data science-based approaches to automation and insights, the logical place to start is with the data itself. Many organizations assume that just because they have large volumes of data, it will be easy to plug it into machine learning models or to quickly make sense of it with the right software tools. The reality is almost always very different, with corporate data sets plagued by years of lax standards, completely siloed repositories, and careless or error-prone data entry. Almost half of business leaders believe their corporate data is too siloed to make sense of, and nearly a quarter attribute inaccuracies in forecasts and predictions to poor data quality.
In a very real sense, the quality of machine learning outputs is tied directly to the quality of data inputs. Before embarking on new initiatives to make use of this technology, companies should first be looking at implementing strong data governance policies, cleaning and linking their existing data (with or without expert help), and eliminating silos that prevent access to data across teams. While data science tools can certainly help with some of this effort as part of broader machine learning adoption, there’s no replacement for a strong IT organization with a data-centric worldview.
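A first pass at judging data quality doesn’t require specialized tooling. The sketch below is a minimal illustration, not a definitive audit: it assumes records are available as simple dictionaries, and the function name `profile_records`, the field names, and the sample rows are all hypothetical. It reports how often required fields are blank and how many rows repeat a key that should be unique.

```python
from collections import Counter


def profile_records(records, required_fields, key_field):
    """Return simple data-quality indicators for a list of dict records.

    Hypothetical helper for illustration only; real data sets will need
    checks tailored to their own fields and business rules.
    """
    total = len(records)
    missing = Counter()
    for record in records:
        for field in required_fields:
            value = record.get(field)
            # Treat None and blank or whitespace-only strings as missing.
            if value is None or (isinstance(value, str) and not value.strip()):
                missing[field] += 1

    # Count rows that repeat a key already seen on another row.
    key_counts = Counter(record.get(key_field) for record in records)
    duplicate_rows = sum(count - 1 for count in key_counts.values() if count > 1)

    return {
        "total_rows": total,
        "missing_rate": {
            field: (missing[field] / total if total else 0.0)
            for field in required_fields
        },
        "duplicate_rows": duplicate_rows,
    }


# Made-up sample rows standing in for an export from a real system.
sample = [
    {"customer_id": "C1", "region": "EMEA", "contract_value": "1200"},
    {"customer_id": "C2", "region": "", "contract_value": "800"},
    {"customer_id": "C2", "region": "APAC", "contract_value": None},
    {"customer_id": "C3", "region": "emea ", "contract_value": "450"},
]

report = profile_records(sample, ["region", "contract_value"], "customer_id")
print(report)
```

Even a rough report like this gives a team something objective to discuss: a high missing rate or a large duplicate count is a sign that governance and cleanup work should come before any modeling. Note that the sketch does not catch inconsistent values such as “EMEA” versus “emea ”, which is exactly the kind of issue that agreed data standards are meant to prevent.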
Better Processes = Better Performance
An increasingly common goal of machine learning in the business world is augmenting or replacing humans in complex tasks like reviewing and updating contracts. But virtually all tasks that might be worth applying machine learning to are built on top of a chain of business processes with multiple inputs, outputs, and potential paths. In most cases, these processes have been built up over years, or sometimes decades, of human experience in response to business challenges. The net result is often a task that, while seemingly simple, actually follows a convoluted path to completion that relies on tribal knowledge to navigate effectively. It should be no surprise that attempting to apply machine learning to these processes without first mapping and streamlining them usually results in poor performance at best.
The upside of process improvement is that it doesn’t just help machines. It also helps people within an organization, especially those who are new to a role and need to get up to speed quickly while minimizing mistakes. And while many people equate process improvement with massive initiatives that require full management systems and lengthy ramp-up times, the truth is that many business activities can be improved by something as simple as creating a detailed process map and looking for obvious sources of confusion. Basic activities like this serve to get everyone on a team aligned with the “right” way to perform a task and help uncover inefficiencies and roadblocks, regardless of whether or not a machine learning initiative is ever undertaken.
Steps to Readiness
If machine learning is on your short list of initiatives to pursue, following a few steps before embarking can save you time, money, and headaches down the road. I recommend clients take these actions as a first step:
- Evaluate your data environment objectively and determine if it is really clean enough, organized enough, and unified enough for machine learning
- Work with your IT organization to shore up any data issues before hiring outside experts
- Perform detailed process mapping for the tasks you want to apply machine learning to
- Improve your processes to remove inconsistencies and “gotchas” while standardizing steps and eliminating reliance on tribal knowledge
Once these steps are complete, teams can more accurately gauge whether machine learning even makes sense, whether it’s realistically possible, and whether the return on investment will be worthwhile. If a machine learning initiative is the right next step, the outcomes of these pre-work items will make an implementation much faster and head off many potential issues along the way. And even if machine learning isn’t the right next step today, cleaner and more structured data combined with well-defined and reviewed processes will pay dividends regardless of their end use.