OK, generative AI is getting a lot of good ink, and some bad as well. There’s little question that AI in general, and generative AI and “large language models” (LLMs) in particular, are going to bring about major changes to technology and even to our lives. There’s also little question that the worst of the negatives, like eliminating most jobs or driving humanity to extinction, are overplayed. What are the real risks, the things that could stall actual progress, and in particular what do users think they are?
Enterprises rate the problem of “hallucinations” as the biggest risk by a substantial margin. All of the enterprises I’ve chatted with on the topic, whatever their current use of generative AI might be, say that there are too many cases where the technology just seems to go off the rails. Over the holidays in the US, an enterprise technologist who had previously said that generative AI error rates were too high groused that a study he’d read praised changes that reduced hallucinations to a mere ten percent of results. “What good is that? I’d fire an expert who was wrong ten percent of the time.”
AI experts I’ve emailed with are divided on how hallucinations should, or will, be resolved. Most say the problem is most easily addressed by adding a kind of boundary layer: a set of constraints that tests the validity of a stream of tokens as it’s generated, to weed out the chaff. Most also agree that this approach would add to the resource cost of running the model. Some say that improved models and training will also reduce hallucinations, but those changes are likely to increase resource costs too.
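To make the boundary-layer idea concrete, here is a minimal sketch of a validation wrapper around a model call. The generate, extract_claims, and verify functions are hypothetical placeholders supplied by the caller; this illustrates the pattern, not any vendor’s implementation.

```python
# A minimal sketch of a "boundary layer" that screens generated output before
# it reaches the user. The generate/extract_claims/verify callables are
# hypothetical placeholders supplied by the caller, not a specific vendor API.

from typing import Callable, List


def screened_response(
    generate: Callable[[str], str],              # wraps the LLM call
    extract_claims: Callable[[str], List[str]],  # splits a draft into checkable statements
    verify: Callable[[str], bool],               # checks one statement against trusted data
    prompt: str,
    max_retries: int = 2,
) -> str:
    """Generate a draft, validate its claims, and retry until it passes or retries run out."""
    for _ in range(max_retries + 1):
        draft = generate(prompt)
        failures = [claim for claim in extract_claims(draft) if not verify(claim)]
        if not failures:
            return draft
        # Feed the unsupported statements back so the next draft can avoid them.
        prompt += "\n\nAvoid unsupported statements such as: " + "; ".join(failures)
    return "No fully verified answer could be produced."
```

Every retry and every verification check in a wrapper like this is another round of computation, which is exactly why the experts quoted above expect this style of hallucination control to raise the cost of running the model.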
Which brings us to the next point: enterprises are largely insensitive to the issue of resource costs. A few enterprises with a strong commitment to responsible tech (less than 10%) are concerned that generative AI draws too much power and contributes too much to emissions, but none seemed to link the resources generative AI needs to the efforts to reduce hallucinations. That means they aren’t pushing back on solution paths that could raise the cost of running, and training, LLMs. The only enterprises that seem to think about resources are the ones that expect to run a private model on their own data, which is the case for only 15% of the enterprises who offered me data.
The second risk, cited by over 80% of enterprises, is the difficulty of finding and retaining AI experts. Given the explosion in generative AI technology options, figuring out what choices are available and which one might be best requires a reasonable level of in-house AI skill. One enterprise with a small team collecting data on generative AI developments tells me the team uncovers between six and fifteen announcements per week. Three-quarters of that material is written for AI experts, and enterprises say they simply cannot get those people.
A related issue, raised at about the same frequency, is the fear of stranding a lot of effort, and perhaps cost, because the optimum generative AI approach is obviously a moving target. One company started testing one approach (vanilla ChatGPT), switched to a narrower, specialized model, and was considering yet another switch, to a model that promised faster training and lower resource requirements. Considering that the company believed it could be getting useful results from generative AI within sixty days, and that it had already spent longer than that just switching approaches, you can see how frustrating this particular problem can be.
The third risk, cited by 54% of enterprises, relates to security and privacy. While almost all enterprises believe that “public” generative AI services like those of OpenAI, Google, and Microsoft pose risks to their data privacy and even their business security, almost half think those risks can be addressed by a combination of contractual guarantees and internal information security policies. Still, they admit that employees could decide to push a pile of company data through a public model without anyone being aware it was happening. As a note, this issue is the one most cited by enterprises as the reason they’d prefer to host their own generative AI model if it’s to operate on business-critical data.
The largest specific threat this group cites relates to quarterly financial data and securities laws on disclosure and insider trading. Companies expressed concern that employee use of generative AI to prepare for an earnings call or to draft financial filings could end up disclosing earnings data before its general public release, which could then create a compliance problem. Recall that generating documents, and even generating scripts for things like earnings calls, is one of the most accepted uses of generative AI.
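One way the shadow-use worry above might be backed up technically, rather than by policy alone, is an outbound screen that checks prompts before they leave for a public service. The patterns below are purely illustrative assumptions (made-up identifiers and phrases), sketching the idea rather than any real data-loss-prevention product.

```python
import re

# Illustrative patterns only; a real screen would use the organization's own
# data-classification rules, not this short, made-up list.
SENSITIVE_PATTERNS = {
    "internal project code": re.compile(r"\bPROJ-\d{4}\b"),
    "pre-release financials": re.compile(r"\b(unaudited|pre-release)\s+(revenue|earnings)\b", re.IGNORECASE),
    "customer account id": re.compile(r"\bACCT-\d{6,}\b"),
}


def screen_outbound_prompt(prompt: str) -> list:
    """Return the labels of any sensitive patterns found in a prompt bound for a public model."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items() if pattern.search(prompt)]


if __name__ == "__main__":
    findings = screen_outbound_prompt("Summarize the unaudited revenue figures for ACCT-1234567.")
    if findings:
        print("Blocked:", ", ".join(findings))
    else:
        print("Cleared for the public service.")
```

A screen like this only catches what someone thought to list, of course, which is why the enterprises that worry most about this risk lean toward hosting their own model instead.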
The good news for the AI pundits is that nearly all enterprises (96%) told me they believed all these issues would be resolved; the average expected resolution date was mid-2024. The bad news is that almost as many (94%) said the current state of generative AI either curtailed or prevented their use of it for “serious” missions, and 21% said they were allowing its use only for applications that included an explicit human review step. The typical applications in this group were generating ad copy and other boilerplate material.
The really bad news is that while there’s almost universal conviction that all these issues will go away, there are few serious guesses as to how any of them might actually be resolved. There’s a general view that generative AI training and execution will become “much more efficient,” but no clear view of what will bring that about. The same is true of the hallucination issue, and there isn’t even a general view of how the security and compliance problems might be resolved. Even companies that are attempting to run their own generative AI software and do their own training aren’t completely sure where their resource requirements are heading or how they’d justify any additional hardware. In fact, most enterprises don’t know what resources would be needed to run generative AI internally, which suggests few have even looked into the idea.
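For a sense of the question few seem to have asked, here is a back-of-envelope sketch of the memory needed just to hold a model’s weights in-house. The parameter counts and precisions are assumptions chosen for the arithmetic, and the figures ignore activations, context caching, and anything related to training, all of which add more.

```python
# Back-of-envelope memory footprint for hosting model weights in-house.
# Sizes and precisions are illustrative assumptions, not a recommendation,
# and the result ignores activations, context caches, and training overhead.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    # One billion parameters at N bytes each is roughly N GB of weights.
    return params_billions * BYTES_PER_PARAM[precision]


for size in (7, 70):
    for precision in ("fp16", "int8", "int4"):
        print(f"{size}B parameters at {precision}: ~{weight_memory_gb(size, precision):.0f} GB of weights")
```

Even this crude arithmetic would tell an enterprise whether a candidate model fits on hardware it already owns, which is exactly the kind of homework most haven’t done.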
All this reinforces my view that, to paraphrase a CIO who offered a personal opinion, “[management here] is chasing AI rainbows.” All the hype must mean something great is just beyond reach, right? The problem, of course, is that past tech hype waves have shown us that exaggerated hopes create exaggerated disillusionment, and that can result in something being devalued just because it didn’t meet unrealistic claims. I think it’s clear that AI, and in particular generative AI and LLMs, are going to transform a lot of how we work and live. We just have to get realistic.