Imagine an innovation in pipes. If this innovation were general, something that made all kinds of pipes cheaper to build and maintain, the total benefits could be large, perhaps even comparable to the total amount we spend on pipes today. (Or even much larger.) And if most of the value of pipe use were in many small uses, then that is where most of these economic gains would be found.
In contrast, consider an innovation that only improved the very largest pipes. This innovation might, for example, cost a lot to use per meter of pipe, and so only make sense for the largest pipes. Such an innovation might make for very dramatic demonstrations, with huge vivid pipes, and so get media coverage. But the total economic gains here will probably be smaller; as most of pipe value is found in small pipes, gains to the few biggest pipes can only do so much.
Now consider my most viral tweet so far:
Good CS expert says: Most firms that think they want advanced AI/ML really just need linear regression on cleaned-up data.
This got almost universal agreement from those who see such issues play out behind the scenes. And by analogy with the pipe innovation case, this fact tells us something about the potential near-term economic impact of recent innovations in Machine Learning. Let me explain.
Most firms have piles of data they aren’t doing much with, and far more data that they could collect at a modest cost. Sometimes they use some of this data to predict a few things of interest. Sometimes this creates substantial business value. Most of this value is achieved, as usual, in the simplest applications, where simple prediction methods are applied to simple small datasets. And the total value achieved is only a small fraction of the world economy, at least as measured by income received by workers and firms who specialize in predicting from data.
Many obstacles limit such applications. For example, the value of better predictions for related decisions may be low, data may be in a form poorly suited to informing predictions, making good use of predictions might require larger reorganizations, and organizations that hold parts of the data may not want to lose control of that data. Available personnel may lack sufficient skills to apply the most effective approaches for data cleaning, merging, analysis, and application.
No doubt many errors are made in choices of when to analyze what data how much and by whom. Sometimes they will do too much prediction, and sometimes too little. When tech changes, orgs will sometimes wait too long to try new tech, and sometimes will not wait long enough for tech to mature. But in ordinary times, when the relevant technologies improve at steady known rates, we have no strong reason to expect these choices to be greatly wrong on average.
In the last few years, new “deep machine learning” prediction methods are “hot.” In some widely publicized demonstrations, they seem to allow substantially more accurate predictions from data. Since they shine more when data is plentiful, and they need more skilled personnel, these methods are most promising for the largest prediction problems. Because of this new fashion, at many firms those who don’t understand these issues well are pushing subordinates to seek local applications of these new methods. Those subordinates comply, at least in appearance, in part to help they and their organization appear more skilled.
One result of this new fashion is that a few big new applications are being explored, in places with enough data and potential prediction value to make them decent candidates. But another result is the one described in my tweet above: fashion-induced overuse of more expensive new methods on smaller problems to which they are poorly matched. We should expect this second result to produce a net loss on average. The size of this loss could be enough to outweigh all the gains from the few big new applications; after all, most value is usually achieved in many small problems.
But I don’t want to draw a conclusion here about the net gain or loss. I instead want to consider the potential for this new prediction tech to have an overwhelming impact on the world economy. Some see this new fashion as just first swell of a tsunami that will soon swallow the world. For example, in 2013 Frey and Osborne famously estimated:
About 47 percent of total US employment is at risk .. to computerisation .. perhaps over the next decade or two.
If new prediction techs induced a change that big, they would be creating a value that is a substantial fraction of the world economy, and so consume a similar fraction of world income. If so, the prediction industry would in a short time become vastly larger than it is today. If today’s fashion were the start of that vast growth, we should not only see an increase in prediction activity, we should also see an awe-inspiring rate of success within that activity. The application of these new methods should be enabling huge new revenue streams, across a very wide range of possible application areas.
But I instead hear that within the areas where most prediction value lies, most attempts to apply this new tech actually produce less net value than would be achieved with old tech. I hear that prediction analysis tech is usually not the most important part the process, and that recently obsession with showing proficiency in this new analysis tech has led to neglect of the more important and basic issues of thinking carefully about what you might want to predict with what data, and then carefully cleaning and merging your data into a more useful form.
Yes, there must be exceptions, and some of those may be big. So a few big applications may enable big value. And self-driving cars seem a plausible candidate, a case where prediction is ready to give large value, high enough to justify using the most advanced prediction tech, and where lots of the right sort of data is available. But even if self-driving vehicles displace most drivers within a few decades, that rate of job automation wouldn’t be out of the range of our historical record of job automation. So it wouldn’t show that “this time is different.” To be clearly out of that range, we’d need another ten jobs that big also displaced in the same period. And even that isn’t enough to automate half of all jobs in two decades.
The bottom line here is that while some see this new prediction tech as like a new pipe tech that could improve all pipes, no matter their size, it is actually more like a tech only useful on very large pipes. Just as it would be a waste to force a pipe tech only useful for big pipes onto all pipes, it can be a waste to push advanced prediction tech onto typical prediction tasks. And the fact that this new tech is mainly only useful on rare big problems suggests that its total impact will be limited. It just isn’t the sort of thing that can remake the world economy in two decades. To the extend that the current boom is based on such grand homes, this boom must soon bust.