July 18, 2023
In recent months, the AE Business Solutions Data Intelligence team has started to field requests from clients that share a common theme: they want to know how AI can solve their problem.
Our response is the same as it has been for any project in the past:
"Before we talk about solutions, what exactly is the problem you are trying to solve?"
To put it mildly, there has been a lot of excitement in the marketplace due to recent advances in generative AI. ChatGPT and Stable Diffusion have reset expectations for what algorithms can do. The fervor over AI's newfound capabilities has led to debates over which industries will be next in line for automation, renewed calls for regulation, and fears of an impending AI apocalypse.
AI seems to be more capable than ever, and no one wants to get left behind. This has led organizations to search for the AI Easy Button: a tool branded with AI that advertises a simple solution to a complex problem.
The search for an AI Easy Button begins from the premise that AI is the necessary and best solution for a problem. It presumes that if AI can recommend music and write poems, it must also be able to identify criminals and approve loans. If ChatGPT can write an episode of Seinfeld in which Jerry teaches Darth Vader stand-up comedy to save Frodo Baggins, then surely there is an AI tool that can build a data warehouse/evaluate employees/write movie scripts/create content for a blog, right?
In the current technology marketplace, if you search for an AI Easy Button for your problem, you will almost certainly find someone willing to sell you one.
Will it actually solve your problem? Be skeptical.
It's common to criticize AI tools from the perspective of ethics, as we have ample evidence of algorithms reinforcing harmful societal outcomes. But there's an even simpler reason to be cautious of tools branded with AI: many of them simply do not work.
In short, the explosion of interest in AI has populated the marketplace with AI snake oil. In today's marketplace, it is all too easy to be taken in by a solution that functionally cannot deliver on its promise, no matter how many times it mentions "AI" or "deep learning" in its description.
For organizations that are evaluating potential solutions, this creates a real danger. The less you understand about your problem, the more likely you are to be taken in by the AI Easy Button - a solution that is exorbitantly expensive, doesn't work, or both.
The easiest way to be on guard against AI snake oil is to understand the problem you are trying to solve and how an algorithm might help.
There are, broadly speaking, two types of problems for which algorithms and models are helpful:

1. Automating a task that people currently do manually.
2. Answering a question about a process that generates data.

That's it. That's the list. You can pretty much place any data science use case in one of these two buckets.
Type 1 problems are the ones in which we have really seen improvements in recent years: image recognition, speech-to-text, spam detection, content recommendation. Advances in computing combined with greater volumes of data have increased our ability to train models capable of learning the patterns inherent in these tasks. These are the use cases that sit atop the list of Classic Data Science ROI because they provide immediate value in the form of saved time.
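At its core, a Type 1 system just learns a pattern from labeled examples and applies it to new inputs. As a toy sketch (with entirely made-up training messages, and a word-count heuristic standing in for a real trained model), a spam filter might look like this:

```python
from collections import Counter

# Toy training data, invented for illustration: messages labeled spam or not.
SPAM = ["win a free prize now", "free money click now", "claim your free prize"]
HAM = ["meeting moved to noon", "can you review the report", "lunch at noon today"]

def word_counts(messages):
    """Count how often each word appears across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.split())
    return counts

spam_counts = word_counts(SPAM)
ham_counts = word_counts(HAM)

def spam_score(message):
    """Score a message by how many of its words appear more often in the
    spam examples than in the legitimate ones. The 'pattern' learned here
    is nothing more than word frequencies from the labeled data above."""
    return sum(1 if spam_counts[w] > ham_counts[w] else -1
               for w in message.split())

print(spam_score("claim your free money now"))   # positive score: flagged as spam
print(spam_score("review the report at noon"))   # negative score: passed through
```

A production spam filter is far more sophisticated, but the shape of the problem is the same: labeled historical examples in, an automated decision out. Which is exactly why the quality and quantity of those examples matters so much.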
Type 2 problems can best be thought of as seeking answers to questions. When should a head coach decide to go for it on fourth down? Why do some of our products fail within the first 60 days of their life cycle? What price should we set for our tickets when it's raining? For these types of problems, the goal is to better understand a data generating process so that we can make an informed decision, a decision that saves time, money, or both.
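For a Type 2 problem, the output is not an automated decision but an insight a person can act on. Taking the product-failure question above as a sketch (with a hypothetical dataset of invented warranty records and a made-up supplier field), even a simple grouped comparison starts to answer "why":

```python
# Hypothetical records: (supplier, days_until_failure). All values are
# invented for illustration; a real analysis would pull from warranty data.
records = [
    ("A", 45), ("A", 200), ("A", 30), ("A", 150), ("A", 55),
    ("B", 180), ("B", 220), ("B", 40), ("B", 300), ("B", 250),
]

def early_failure_rate(records, supplier, window=60):
    """Share of a supplier's units that failed within the first `window` days."""
    units = [days for s, days in records if s == supplier]
    return sum(days <= window for days in units) / len(units)

for supplier in ("A", "B"):
    rate = early_failure_rate(records, supplier)
    print(f"Supplier {supplier}: {rate:.0%} of units failed within 60 days")
```

The "model" here is just a grouped failure rate, but the goal is the same one a more elaborate statistical model would serve: characterize the data-generating process well enough to make a better decision, such as which supplier relationship to revisit.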
For either type of problem, models and algorithms provide value by learning some pattern from data. For all of their seeming complexity and godlike power, models are only as good as the data on which they are trained. And the patterns they learn depend on the task we give them.
This goes back to the importance of understanding the problem: what type of problem do you have? Or, stated another way, what pattern do you need a model to learn?
You want to automate some task: what does a model need to learn to accomplish that task? Is it a general pattern, such as translating audio into text, or is it something specific to your organization? Do you have historical data of what has been done manually in the past? What would be an improvement over the current process?
You want to understand why something is happening in your organization. Some of your products fail within 90 days while others do not; some customers decide not to renew their subscriptions. Why? What decisions are you currently making that could affect these outcomes? What could you learn that would lead to a better decision? What information do you wish you had?
These are the types of questions that can get the ball rolling towards defining a successful data science use case: there is some unknown pattern that would be useful to learn and there is data available to learn it. But, even then, there's never a guarantee that machine learning will outperform what you're currently doing.
That's the thing about (data) science - it's never quite as simple or as easy as you expect it to be. Data is messy. Problems are complicated. Models are uncertain. You usually don't nail everything on your first attempt.
That's what (data) science is about - you make an attempt, learn from where it went wrong, and try to do better the next time. Iteration, failure, and incremental progress are the name of the game.
It might not sound as flashy, but it's a better approach than searching for the AI Easy Button.