AI project ‘failure’ has little to do with AI

Reports galore about AI project failures suggest that generative AI (genAI) and agentic AI are not yet ready for enterprise use. Although there is truth to that, enterprise AI project failures are rarely the result of faulty algorithms or models. More often than not, management simply doesn't understand the technology.

Let’s define failure. In business terms, a project fails when the initial goals and objectives set by decision-makers are not met. By that definition, in most of the reports about AI performing poorly in proofs of concept (POCs), the problem is decidedly not that the technology didn’t work.

Let’s put this AI problem into the context of a traditional business problem. What if a construction company needed to move 50 tons of dirt from a worksite to a location 20 miles away, and instead of paying for earth movers and appropriate trucks, it bought every one of its workers a ball-peen hammer?

In no time, the project would be seen as a failure because no dirt was moved to the new location. Did the workers fail? Did the hammers malfunction? Or is management to blame for forcing them to use an absurdly wrong tool? Or what if the project involved moving the dirt 9,000 miles to an overseas location, and management allowed only the use of trucks?

The biggest data issue with genAI is a lack of reliability. That unreliability stems from many factors: hallucinations, bad training data, bad fine-tuning, misinterpreted queries, badly phrased queries, and a lack of proper data weighting (where low-quality sources are given the same credibility as high-quality ones).

But someone who understands those realities can still get a ton of useful information from the technology. It simply has to be independently verified. I’ve used genAI tools for math problems, but I always verify the answer with a legacy calculator. I will also use it for research — but only as a pointer. Every detail must still be verified. Think about investor call transcripts. You can use genAI to find a statement, but you still have to find a copy of the original audio on a high-reliability site and listen to the passages to verify the transcript is correct.

That is why autonomous agents are so problematic. Their deployment mandates a higher degree of trust than the technology merits. 

Some companies are embracing humans-in-the-loop. That is a great concept, as long as executives are being realistic and reasonable: are the tasks, and the expected volume of work, reasonable for a skilled human to perform?

Consider a hospital chain that wants to use genAI to analyze lab tests or X-rays more efficiently. For legal reasons, it needs humans verifying and signing off on the results. So far, so good. An experienced and skilled radiologist should be able to review and approve a test analysis faster than performing the analysis directly.

But human review takes time. In his classic NTSB testimony in the “Miracle on the Hudson” case, Capt. Chesley Sullenberger demanded that 20 seconds of human reaction time be added to the simulations, because that is the span necessary for even the most experienced operator (Sullenberger had been a commercial pilot for 29 years) to recognize an emergency, seize control, and respond effectively.

Before the deployment of AI, most of those humans had been reviewing anywhere from eight to 10 test results an hour, and that included the time to write up the results.

I have heard reports of hospital chains that are now pushing these same humans to review/approve/decline more than 300 test results per hour. That leaves an average of, at most, 12 seconds of review per test result (3,600 seconds in an hour divided by 300 reviews). That is barely enough time to look at the AI recommendation and glance at the original image. There’s no time for meaningful thought or analysis.
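
To make that arithmetic concrete, here is a minimal Python sketch of the kind of sanity check a manager could run before setting a review quota. The quota figures come from the numbers above; the 60-second minimum review time is a hypothetical assumption for illustration, not a clinical standard.

```python
# Back-of-the-envelope check: does a human-in-the-loop quota leave enough
# time for meaningful review? Numbers are illustrative, not measurements
# from any specific hospital chain.

SECONDS_PER_HOUR = 3600

def seconds_per_review(quota_per_hour: int) -> float:
    """Average seconds a reviewer gets per item at a given hourly quota."""
    return SECONDS_PER_HOUR / quota_per_hour

def quota_is_realistic(quota_per_hour: int, min_review_seconds: float) -> bool:
    """True only if the quota leaves at least the minimum meaningful review time."""
    return seconds_per_review(quota_per_hour) >= min_review_seconds

# Pre-AI pace: 8 to 10 results per hour, including write-up time.
print(seconds_per_review(10))    # 360.0 seconds per result
# Reported post-AI quota: 300 results per hour.
print(seconds_per_review(300))   # 12.0 seconds per result

# Hypothetical assumption: a skilled radiologist needs at least 60 seconds
# to meaningfully check an AI recommendation against the original image.
print(quota_is_realistic(300, min_review_seconds=60))  # False
print(quota_is_realistic(10, min_review_seconds=60))   # True
```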

That’s not using a person to verify the results; that’s setting up an employee to take the blame when AI invariably screws up. If you want to put humans in the loop, keep the expectations realistic — and human.

I’ve already documented various problems with agentic systems, such as the inability to track a hijacked agent and alert the other agents it has poisoned with malicious instructions. The fact that such an attack cannot be halted should keep enterprises from touching these systems until they are safe.

And yet, enterprises are plowing ahead. 

AI can deliver incredibly powerful tools, but by trusting them too much and placing unreasonable demands on the remaining humans, we are condemning these early AI projects to fail. That’s 100% the fault of managers who knew the risks, grabbed for profits, and later fired AI project managers when things went sour. 
