The financial industry generates a lot of reports, and the part of it that looks at tech generates perhaps more than most. Tech is complex, and not surprisingly some of those reports offer little or no insight and present no useful information to readers. Occasionally something worthwhile comes along, as it has with a Goldman Sachs report on the software industry and generative AI. It’s not all beer and roses, but there are some blooms (and thorns) we should assess.
The report (remember, this is a Wall Street analyst) opens, after noting the excitement around generative AI in 2023, with some great questions: “What happened?”, “Why are we not seeing more meaningful changes?”, “Where is the disruption?”, “Why are we not seeing meaningful impact to Software despite the massive Capital spending cycle undertaken by the Hyperscalers?” Let’s see whether they can be answered in the report, based on what enterprises tell me, or both. One way we’ll try to do that is by introducing another report!
The report, looking at AI software players first, notes that the analyst’s survey of enterprises found that 9% of their IT budgets would be allocated to generative AI within three years, up from 7% in the same survey in January. I can’t validate this from my own contacts with enterprises, none of whom had any real idea what was being spent company-wide on generative AI services. However, enterprises are getting more bullish on AI overall, and if we discard both the notion of AI services (from hyperscalers) and the “generative” qualifier, these figures aren’t unreasonable in my view.
The report then makes the point that ROI for generative AI is hard to quantify, just as it was for the early-day cloud. I agree, but I’m uncomfortable with the presumption that because the cloud eventually proved out (the report notes that Azure’s ROI was negative early on, yet Microsoft now leads the cloud in growth), AI will too. I think that’s true, but I think justifying the statement requires more than just a cloud comparison. The cloud, for years, suffered from the misapprehension that everything would inevitably go there because economy of scale made the cloud cheaper. It isn’t cheaper and everything isn’t going there, but the cloud has a clear mission in support of the “front-end” GUI-intensive elements of applications. AI’s success is visible even now, but only if you forget the cloud model and look at self-hosting.
The figure (Exhibit 3 in the report) that shows the framework for an AI buildout shows a clear bias toward the as-a-service model of AI, and the applications proposed are not nearly enough to justify the multi-trillion-dollar TAM the report suggests. However, Exhibit 13 in the report describes early-, mid- and late-cycle evolution of generative AI, and for input data shows a shift away from web training to proprietary-data training. That’s why I believe (and why enterprises tell me) that data sovereignty issues for AI, as they have for the cloud, will force the important stuff to be hosted on premises. That’s one key trend that I don’t think the report addresses.
Another key trend is toward model simplification. The resources needed to host an AI language model depend primarily on two factors: the complexity of the problem and the number of users to be supported. IBM, the vendor enterprises tell me best understands their AI needs, sponsored a Reuters Plus report, “From black box to open book,” which argues that the visible LLMs are simply not designed to support enterprise applications. To quote: “I don’t need a model to be able to write me poems or help me plan my holiday,” [says the global managing partner for generative AI in IBM Consulting, quoted in the report]. “We believe in the advantages of having smaller code and language models specifically built and trained for the job at hand.” In short, missions that support enterprise benefits need only small (or smaller) language models, require fewer resources, and so can be run in house, which protects data sovereignty. IBM/Red Hat, working with the InstructLab open-source AI project, lets users customize IBM’s Granite model (or any other LLM, including Llama 2 and a Mistral derivative), essentially subsetting it and training it with user data for a specific mission. The result can be run in house.
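Why smaller models change the hosting equation comes down to simple arithmetic: the memory needed to serve a model scales roughly with its parameter count. The sketch below is my own back-of-envelope illustration, not from the Goldman Sachs or IBM reports; the parameter counts and the fp16 assumption are illustrative assumptions, and the function ignores KV-cache and activation overhead.

```python
# Back-of-envelope math (my own illustration, not from the report): serving
# memory for an LLM scales roughly as parameter count times bytes per parameter.

def serving_memory_gb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    """Approximate GB of accelerator memory just to hold the model weights.

    bytes_per_param defaults to 2.0 (fp16/bf16); 4-bit quantization would
    roughly quarter it. KV cache and activations are ignored here.
    """
    return params_billions * 1e9 * bytes_per_param / 1e9

# A frontier-scale LLM (~70B parameters, illustrative) vs. a small
# mission-specific model (~7B): the first needs a multi-GPU server, the
# second fits on a single workstation-class GPU.
print(f"70B model: ~{serving_memory_gb(70):.0f} GB of weights")  # ~140 GB
print(f" 7B model: ~{serving_memory_gb(7):.0f} GB of weights")   # ~14 GB
```

That order-of-magnitude gap is the whole self-hosting argument in miniature: a specialized model that fits one GPU can live in the enterprise data center, next to the proprietary data.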
Logically, you can run such a model on a single laptop with a decent GPU, which to me is the problem with heady revenue projections for the as-a-service forms of AI already familiar to most of us, and for some of the winners named in the report. Intuit, for example, has announced a big layoff and a refocusing on AI, leading to the presumption that AI-powered tax and small-business accounting tools would sell like hotcakes. Five AI experts told me that running personal and small-business taxes on a laptop, using a specialized model like the one described above, would work fine. No as-a-service needed. A small group of accountants working with an AI expert could update it as needed. I think AI is more a threat than an opportunity to Intuit, and the same is likely true for other players, even Adobe.
Making general LLMs work for specialized chatbot missions was the primary goal of RAG (retrieval-augmented generation). RAG lets a generalized LLM grab proprietary data at query time, reducing the data sovereignty issues of AI-as-a-service, but the InstructLab approach can create something that’s easier to host yourself, so RAG doesn’t raise the TAM for AI-as-a-service. Enterprise experience with RAG is very limited, but ten enterprises tell me that while RAG improves cloud-hosted chatbots, it’s not an ideal strategy.
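The RAG flow is simple enough to sketch: retrieve the proprietary documents relevant to a query, then stuff them into the prompt so a general LLM answers from data it was never trained on. The toy below is my own illustration, not any vendor’s product; the word-overlap retriever is deliberately naive (real systems use vector embeddings), and the function names are hypothetical.

```python
# Toy RAG sketch (my own illustration): retrieve relevant proprietary text,
# then build a prompt that grounds a general LLM in that text.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query.

    A deliberately naive word-overlap scorer standing in for a real
    embedding-based vector search.
    """
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Stuff the retrieved context into the prompt sent to the LLM."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical proprietary documents the base LLM has never seen:
docs = [
    "Q3 revenue for the widget division was 4.2 million dollars.",
    "The cafeteria menu rotates weekly.",
    "Widget division headcount grew 12 percent in Q3.",
]
print(build_prompt("What was Q3 widget revenue?", docs))
```

Note that the proprietary documents still travel to wherever the LLM runs, which is why RAG only *reduces*, rather than eliminates, the sovereignty concern when the model is cloud-hosted.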
This brings up what I believe is the weakness of the report: its focus on the cloud model of AI as though it were the only model. This, I think, is due to an investor bias that’s understandable in a report intended for Wall Street. Technology you can play in the market is what matters to investors. Many would say that the Street has a consumer bias in the way it looks at technology, because (1) there are a lot of consumers, and (2) consumers respond more emotionally and are thus more susceptible to the hype that creates bubbles. Enterprises, though, want a strong business case for AI, meaning an AI that can offer them a significant improvement in operations. They don’t see how you get that without an AI driven by their own data rather than data pulled from the Internet, and their own data is going to stay in their own data centers. Real enterprise AI, then, depends on self-hosting.
The report has a good section on the data center implications of self-hosting AI, including the power/cooling impact, but since InstructLab is never mentioned (nor is Red Hat), the material doesn’t reflect the impact of SLMs (small language models) on hosting, or the impact of InstructLab’s approach on training load. Thus, while it’s helpful for understanding the general issues of powering and cooling AI clusters, it’s not as helpful as it would be with more specific comments on model-specialized hosting.
Let’s answer those opening questions now. “What happened?” Hype, and the Street’s love of a bubble. Public generative AI was something everyone could fiddle with, so fiddle they did, and media and the Street were eager to cover it. “Why are we not seeing more meaningful changes?” Because hype can’t drive real business decisions. There needed to be a business case for AI, as for anything else, and hype papered over that need, so we’ve lagged in producing the business cases. “Where is the disruption?” It’s in the kinds of AI that business cases are now exposing. Those business cases favor smaller, self-hosted AI. “Why are we not seeing meaningful impact to Software despite the massive Capital spending cycle undertaken by the Hyperscalers?” Because the value proposition for basic hyperscaler generative AI really doesn’t exist. Software impact will come from the business cases too. So in short, the report asks the right questions, but doesn’t come up with the right answers.
There’s a lot of good in this piece, though. Any financial analysis of tech is valuable because companies respond to Wall Street first and foremost. If you have access to it (it’s limited-circulation), it’s worth a read for that reason alone. It also illustrates a problem Wall Street shares with media overall, which is a lack of understanding of the details of technology. That’s not surprising given how challenging it is for anyone, even those like me who are actually in tech, to keep up to date on things, but it shows that there’s a risk in accepting popular wisdom without question. Take tech commentary as a data point, whatever the source, but try to understand things for yourself.