In our tech world of hype, it’s common to find a new term emerging. What’s not common is for the new term to actually represent something meaningful, much less important. We now have such a term in the AI/ML lexicon: explainability, and there’s a nice article on it in Medium. The point it makes is one I’ve made myself, based on what enterprises have been telling me: AI’s problems with trust could be resolved if it were possible to understand how it reached a conclusion, if it could “explain” its process.
I did a distributed information system project decades ago that involved creating technical documents assigned specific keywords, then allowing complex queries, expressed in the typical parenthetical way, to retrieve a subset. To do this quickly, one of the things I had to do was convert the parenthetical query to reverse-Polish notation (RPN). But how to convince people it was working, given that few people could read RPN? By showing how the RPN query was processed to create the answer. Explainability. The article cites IBM’s Global Adoption Index, saying “80% of businesses cite the ability to determine how their model arrived at a decision as a crucial factor”, and that’s very close to what I heard from enterprises.
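To make that old example concrete, here’s a minimal sketch, in Python and obviously not the code from that project, of how a parenthetical keyword query can be converted to RPN and then evaluated step by step against a document’s keyword set, with every step recorded so it can be shown to a doubting user. The operators, keywords, and sample document are illustrative assumptions.

```python
# Minimal sketch: convert a parenthetical boolean keyword query to
# reverse-Polish notation (RPN), then evaluate it with a step-by-step
# trace that can be shown to a user as an "explanation".
# Assumptions: AND/OR operators, parentheses, single-word keywords.

def to_rpn(tokens):
    """Shunting-yard conversion of an infix boolean query to RPN."""
    precedence = {"AND": 2, "OR": 1}
    output, ops = [], []
    for tok in tokens:
        if tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            ops.pop()  # discard the "("
        elif tok in precedence:
            while ops and ops[-1] != "(" and precedence.get(ops[-1], 0) >= precedence[tok]:
                output.append(ops.pop())
            ops.append(tok)
        else:
            output.append(tok)  # a keyword
    while ops:
        output.append(ops.pop())
    return output

def evaluate_with_trace(rpn, doc_keywords):
    """Evaluate an RPN query against one document's keyword set,
    recording each step so the result can be explained."""
    stack, trace = [], []
    for tok in rpn:
        if tok in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            result = (left and right) if tok == "AND" else (left or right)
            trace.append(f"{left} {tok} {right} -> {result}")
            stack.append(result)
        else:
            hit = tok in doc_keywords
            trace.append(f"'{tok}' in document -> {hit}")
            stack.append(hit)
    return stack[0], trace

if __name__ == "__main__":
    query = "( routing AND protocol ) OR multicast".split()
    rpn = to_rpn(query)
    matched, steps = evaluate_with_trace(rpn, {"routing", "protocol", "ospf"})
    print("RPN:", rpn)
    for step in steps:
        print(" ", step)
    print("Document retrieved:", matched)
```

The trace is the explanation: someone who can’t read RPN can still follow a list of “keyword found, keyword not found, AND/OR applied” steps and see why a document was or wasn’t retrieved.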
Enterprises have consistently said that generative AI suffers from hallucinations at a rate that impacts about a quarter of its answers. I think that’s high if you assume that users are constructing their prompts (AI-speak for the queries entered) carefully; my own experience with Google and Microsoft AI tools shows the real rate is a bit more than 18%, and on fairly simple questions it’s lower, but analysis of ML data is hardly simple, and so enterprise statistics may actually be a bit optimistic. However, they say that even a five percent hallucination rate would make it hard to trust generative AI, or any AI, and the issue of trust remains the main barrier to broad AI adoption.
The problem with providing AI explainability, say my AI gurus, is that it risks exposing the trade secrets behind a model. Obviously that wouldn’t be an issue for a fully open-sourced AI model, but it surely is one for the majority of models out there. For this reason, as the article notes, there’s a chance that “interpretability” could be substituted. That’s a way to explain an AI/ML output by referencing inputs without exposing the specifics of the process.
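To illustrate the distinction, here’s a minimal sketch of interpretability in that sense: the model is treated as a black box whose only visible surface is its predict function, and the “explanation” is a ranking of which inputs move the output the most. The model, feature names, and perturbation size below are hypothetical, not any vendor’s actual method.

```python
# Minimal sketch of input-referencing interpretability: rank inputs by how
# much perturbing each one shifts the model's output, without ever looking
# inside the model itself (so no internals/trade secrets are exposed).

def input_sensitivity(predict, inputs, delta=0.05):
    """Rank inputs by the output shift caused by a small perturbation of each."""
    baseline = predict(inputs)
    impact = {}
    for name, value in inputs.items():
        bumped = dict(inputs)
        bumped[name] = value * (1 + delta)  # nudge one input by 5%
        impact[name] = abs(predict(bumped) - baseline)
    return sorted(impact.items(), key=lambda kv: kv[1], reverse=True)

# Stand-in for a proprietary model: only its prediction is visible to us.
def opaque_risk_model(x):
    return 0.6 * x["link_utilization"] + 0.3 * x["error_rate"] + 0.1 * x["latency_ms"] / 100

features = {"link_utilization": 0.8, "error_rate": 0.2, "latency_ms": 40.0}
for name, score in input_sensitivity(opaque_risk_model, features):
    print(f"{name}: output shift {score:.4f}")
```

The output tells a user “this answer was driven mostly by link utilization, then error rate”, which references inputs only and leaves the model’s internals out of the conversation.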
Another factor the article introduces is “observability”, a term that sadly is used regularly in a different context. When used in AI, it means the ability to observe the performance of AI in reference to “the real answer”. In many AI/ML contexts, it’s possible to know what the “real answer” is, either by observing what happens if a result is accepted or by testing it against an accepted standard. However, enterprises tell me that neither of these is likely to be acceptable in their AI missions, because acting on a wrong result could be catastrophic, because there is no objective standard to apply, or both. Thus, the explainability/interpretability pathway to trust is doubly important.
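Where an accepted standard does exist, observability can be as simple as logging every AI answer against it and tracking the agreement rate over time. Here’s a minimal sketch of that idea; the record structure and the sample cases are purely illustrative.

```python
# Minimal sketch of AI observability: compare each AI answer to an accepted
# standard (or the observed outcome) and track the agreement rate over time.

from dataclasses import dataclass, field

@dataclass
class ObservabilityLog:
    checked: int = 0
    agreed: int = 0
    disagreements: list = field(default_factory=list)

    def record(self, case_id, ai_answer, accepted_answer):
        """Log one AI answer against the accepted/observed answer."""
        self.checked += 1
        if ai_answer == accepted_answer:
            self.agreed += 1
        else:
            self.disagreements.append((case_id, ai_answer, accepted_answer))

    @property
    def agreement_rate(self):
        return self.agreed / self.checked if self.checked else 0.0

log = ObservabilityLog()
log.record("fault-101", "restart line card", "restart line card")
log.record("fault-102", "reroute traffic", "replace optic")
print(f"Agreement so far: {log.agreement_rate:.0%}, "
      f"{len(log.disagreements)} case(s) flagged for review")
```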
We rarely have either explainability or interpretability, which is why AI is hard to trust and hard to use optimally. For example, five times as many network operations types say they’d like AI to suggest strategies to respond to a fault as would accept AI taking the recommended action by itself. A third of netops professionals admit that “checking on AI” is eroding the impact AI tools could have on their productivity. But half this group says that, over time, if their checks on results are positive, they’ll likely take similar recommendations with less, or even no, pre-validation. In other words, their own experience is providing observability reinforcement.
There’s an interesting, if statistically tentative, link between AI success and the support of a vendor with high strategic influence on the enterprise launching the project. My data here is sparse, but it appears that where AI projects involve a vendor with very high strategic influence, they’re twice as likely to be successful as projects involving a vendor with little strategic influence, and four times as likely as projects driven entirely by internal expertise. This is why, as I’ve noted, IBM is a standout in AI; they have the highest strategic IT influence of any vendor, and have sustained that position for decades.
There’s another fact with limited statistical significance behind it: an enterprise that launches an AI project that fails is unlikely to have a successful one within a year, and it appears that almost half may abandon their AI plans, except for the kind of casual personal-productivity generative AI stuff that line organizations can simply expense and support without central approval. How long this flight from AI will continue can’t be determined at this point.
We all know that technology tends to cycle from being over-hyped to being over-criticized, and even digests of tech news produced by Google or Microsoft (both big AI promoters) are starting to show a lot of negative stories. Combine this with the once-bitten-twice-shy finding I’ve shared on the results of disappointing AI project outcomes and you can see that it may become harder and harder to gain AI approval.
The truth about AI is simple. The form of AI most of us are familiar with, the free or integrated chatbots that do much of what a web search could do, is in fact overhyped and unlikely to lead any company to a business transformation. On the other hand, there have been incredibly valuable AI projects completed, and one CIO even reported a project whose benefits were more than double the projected level. We can’t afford to lose AI value any more than we can afford to accept AI that can’t demonstrate value. The problem with AI is us…
…because we’ve not learned enough about it to make good decisions. We’ve not thought through the meaning of those three words in the article. Yes, coverage has been shallow and largely useless, but realistic stuff is getting out now. Vendors still oversell, but some are helping users make good decisions. If you’re disappointed in the outcome of an early AI project, doubt your own decisions a little, and take another look at it through the lens of explainability, interpretability, and observability. It might earn you a nice profit.