
The Death and Rebirth of Data: Part 2

  • Cameron Price
  • Oct 10
  • 7 min read

We Industrialized the Plumbing and Forgot the Water.


In Part 1, I argued that merging “data” and “engineering” was the day the industry died. In this continuation, we explore how the obsession with building ever more complex data infrastructure (the plumbing) caused us to forget the water: the clean, trusted, usable data that drives business outcomes. Gartner reports that up to 90% of enterprise data budgets are spent on integration and management rather than insights. The industry must now rebalance, focusing less on the pipes and more on the outcomes they’re meant to deliver.

 

In Part 1 of this series, The Day They Merged “Data” and “Engineering” is the Day the Data Industry Died, I argued that the day we merged the words “data” and “engineering” was the day the industry lost its way. Data became an engineering-first problem rather than an outcome-first opportunity.

 

Now let’s go deeper into what that meant in practice. I like to use the analogy of plumbing to explain. We industrialized the plumbing of data, while somehow forgetting about the water that was supposed to flow through it.

 

This is what I like to call the rise of the data plumbing industry. Walk into any modern data team today and you’ll find a sprawling network of connectors, pipelines, orchestrators, transformations, storage layers, and catalogs. Each has its own vendor, its own pricing model, its own “must-have” pitch.

 

We’ve created an entire industry around data plumbing. Moving data from one place to another, cleaning it, reshaping it, monitoring it, and checking it. This is being compounded further as we see AI teams and data science teams duplicating such processes with little communication between them (Accenture, 2023).

 

The result? 80–90% of the budget goes to building and maintaining the pipes. Less than 20% goes toward actually understanding the water (Gartner, 2022; McKinsey, 2023). This puts significant strain on customers’ ability to leverage AI. In reality, we need budgets to move to the right, toward funding AI initiatives, especially when the data effort required to deliver an AI outcome can be 100x the actual AI effort itself (Forrester, 2023).

 

And in many organizations, the water still isn’t drinkable. Data leaders proudly show architecture diagrams covered in logos and arrows. Consultants provide amazing, shiny slides, methodologies, and presentations. They demonstrate their orchestration DAGs with hundreds of tasks, and the team celebrates the way forward.

 

But here’s the uncomfortable truth:

 

  • Business users don’t care about orchestration DAGs.

  • Executives don’t care about schema evolution.

  • Customers don’t care that you containerized your ingestion pipelines.

 

What they care about are answers, insights, and outcomes. They have a business to run.

The tragedy of the modern data stack is that we mistake movement for progress. We’re incredibly busy building pipelines, managing tools, and refactoring technology, but often the business impact is negligible.

 

So what are the symptoms of a plumbing-first culture? How do you know your organization has forgotten about the water? Look for these telltale signs:

 

  • Tool addiction. Every problem is solved by adding another tool to the stack. We see this even more today, where every issue can supposedly be solved with AI, whether that is true or not. (See my blog: Every Hammer Doesn’t Need a Nail: A Pragmatic Approach to AI in Data Strategy). IDC (2023) estimates global spending on data management software surpassed $90 billion, much of it driven by tool sprawl and “AI washing”.

 

  • Pipeline vanity. Success is measured in the number of jobs running, not the number of decisions improved (a small sketch of the difference follows this list). Monte Carlo highlights the same challenge, noting that many data teams focus on output metrics (like the number of pipelines, DAGs, or tables) rather than impact metrics that show how their work drives real business outcomes. As they put it, “At the highest level, we must be able to quantify the impact driven by the platform — not just how many pipelines or dashboards it supports.” (Monte Carlo, Measuring the Impact of Your Data Platform, 2023).

 

  • Delayed answers. It takes weeks or months of engineering work to onboard a new data request or produce a new metric. A NewVantage Partners (2021) survey found 92% of firms struggle to deliver timely, data-driven decisions despite heavy investment in data infrastructure. This continues in 2025.

 

  • Opaque lineage. Nobody can explain how a number in a dashboard was calculated. The UK Parliament’s 2023 inquiry into government data quality failures cited “inability to trace calculations” as a direct cause of poor decision-making during COVID-19.

 

  • Budget burn. Data team bills climb each month. Cloud data warehouse costs (notably Snowflake and Databricks) continue to rise sharply without matching business value, prompting CFOs to ask, “What exactly are we getting from all this data?” (The Information, 2023).
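
To make the “pipeline vanity” point concrete, here is a minimal sketch in Python of the difference between an output metric and an impact metric for a hypothetical pipeline registry. The data and field names (such as decisions_supported and monthly_cost_usd) are entirely illustrative assumptions, not a prescribed model.

    # A minimal, illustrative sketch (not a real tool): contrasting an "output
    # metric" with an "impact metric" for a hypothetical pipeline registry.
    # Field names like decisions_supported and monthly_cost_usd are assumptions.

    pipelines = [
        {"name": "crm_sync",        "jobs_per_day": 48, "monthly_cost_usd": 4200, "decisions_supported": ["churn outreach"]},
        {"name": "clickstream_etl", "jobs_per_day": 96, "monthly_cost_usd": 9800, "decisions_supported": []},
        {"name": "finance_rollup",  "jobs_per_day": 4,  "monthly_cost_usd": 1100, "decisions_supported": ["weekly pricing review"]},
    ]

    # Output metric: how busy the plumbing is.
    total_jobs = sum(p["jobs_per_day"] for p in pipelines)

    # Impact metrics: how much of the plumbing is tied to a named business decision.
    linked = [p for p in pipelines if p["decisions_supported"]]
    spend_total = sum(p["monthly_cost_usd"] for p in pipelines)
    spend_linked = sum(p["monthly_cost_usd"] for p in linked)

    print(f"Jobs per day (output metric): {total_jobs}")
    print(f"Pipelines linked to a decision (impact): {len(linked)} of {len(pipelines)}")
    print(f"Share of spend supporting a decision (impact): {spend_linked / spend_total:.0%}")

The point is not the code; it is that only the second set of numbers means anything to the people running the business.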

 

These are not fringe cases; they’re the mainstream experience, and wherever I go in the world, the challenges are the same.

 

 

Why Did We Forget the Water?

 

The reasons are both cultural and commercial:

 

  • Cultural drift. The rise of data engineering as a discipline brought in brilliant technologists. But many lacked the grounding in data semantics, business context, and analytics purpose. The focus shifted to technical excellence over business relevance. This created an island, or a silo. Today we see silos of teams between data engineering, data science, AI, and the business. In many cases AI is making this problem even worse.

    BARC Germany (2024) found that only 24% of organizations say their data engineers understand business priorities, a gap that limits trust in and adoption of data-driven decision-making.

 

  • Vendor incentives. Every vendor promises simplification, yet most add another layer of complexity. Their incentive is to make you dependent on their slice of the stack, not to simplify your ecosystem. It is a standard business model: start somewhere and then move up the stack. Snowflake and Databricks are following the same pattern SAP perfected decades ago. But it stifles innovation and the organization’s ability to flex as business conditions and technology change.

    Gartner’s 2023 Magic Quadrant warned of “ecosystem lock-in”, which limits innovation and agility.

 

  • Fear of missing out. Leaders worried that if they didn’t have Hadoop, Spark, Snowflake, dbt, or Databricks, they were falling behind. The fear of irrelevance led to over-investment in tech for tech’s sake.

    Deloitte (2022) reported that 72% of organizations adopt new data technology due to perceived competitive pressure rather than clear business need.

 

Forgetting the water has real consequences, and they are painful:

 

  • High failure rates. Gartner (2022) reported that up to 85% of data projects failed to deliver measurable value, a figure echoed by PwC’s (2022) findings for AI initiatives. We are still seeing similar failure rates in AI. We continue to fail by not learning from the past.

 

  • Eroded trust. Business stakeholders stop believing in data when numbers are inconsistent or slow to appear. And even more concerning, they’ve stopped believing in their data teams. I have had many conversations with customers where business units or departments decide to go it alone due to the lack of perceived value. My experience is backed by similar findings in the market: Experian (2023) found that 77% of business leaders don’t trust their organization’s own data, and many no longer trust the teams responsible for it.

 

  • Talent burnout. Engineers burn out maintaining complex integrations, whilst analysts grow frustrated waiting for usable data. This drives an adverse effect: hiring more engineers, resulting in large teams of highly skilled people performing tasks that deliver little value. dbt Labs (2023) found analytics engineer churn is 50% higher than the industry average, largely due to pipeline maintenance fatigue.

 

  • Opportunity loss. Competitors who keep things simple often move faster, delivering insights while others are still debugging jobs. Netflix’s embedded analyst model (MIT Sloan, 2022) is a good example: speed and simplicity drive agility. With AI leveling the playing field, everyone is talking about their potential “Kodak” moment. Every organization’s moat just got a lot smaller, so agility and speed, whilst maintaining accuracy, are key. Unfortunately, most organizations are not set up to succeed in that way.

 

The irony is devastating. The more we invested in plumbing, the less we quenched the thirst for data-driven outcomes.

 

So How Do We Reclaim the Focus on the Water?

 

Plumbing is necessary, but it should never be the star of the show. The star is the “water”: the clean, trusted, usable data that flows into the hands of the people who need it.

 

To get back on track, we must ask:

 

  • What decisions will this pipeline improve? How is it connected to an outcome?


  • What outcome justifies this tool?


  • What is the simplest way to deliver value now, not in 12 months?

 

This needs to be done holistically, avoiding the danger of building bespoke capability to answer these three questions. I see this today in the discussion and building of data products, where companies build complex bespoke pipelines just to satisfy the “data product” narrative.
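
As a thought experiment, those three questions can be encoded as a lightweight intake check rather than another bespoke build. The sketch below is purely illustrative: the field names, the review function, and the 90-day threshold are my assumptions, not a framework the industry has standardized on.

    # An illustrative sketch: the three outcome questions as a lightweight
    # intake check for any new pipeline or tool request. All names and the
    # 90-day threshold are assumptions for the sake of the example.
    from dataclasses import dataclass

    @dataclass
    class DataRequest:
        name: str
        decision_improved: str      # What decision will this pipeline improve?
        outcome_justification: str  # What outcome justifies this tool?
        days_to_first_value: int    # How soon does it deliver value?

    def review(request: DataRequest, max_days_to_value: int = 90) -> list[str]:
        """Return reasons to push back; an empty list means proceed."""
        issues = []
        if not request.decision_improved.strip():
            issues.append("No business decision named: plumbing for its own sake.")
        if not request.outcome_justification.strip():
            issues.append("No outcome justifies adding another tool to the stack.")
        if request.days_to_first_value > max_days_to_value:
            issues.append("First value arrives too late: find a simpler path now.")
        return issues

    # A request driven by competitive pressure rather than a decision fails the check.
    print(review(DataRequest("clickstream_v2", "", "parity with competitor stack", 240)))

If every new pipeline, tool, or “data product” had to pass even this crude a test, far fewer unnecessary pipes would ever get laid.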

 

As BARC Germany concluded in their Data Products Study (2024):

 

“Organizations that start with business outcomes and backtrack to the minimum plumbing required consistently outperform those who start with plumbing and hope outcomes will follow.”

 

We don’t need fewer pipes; we need fewer unnecessary pipes.

 

In short, we need to stop celebrating the plumbing and start celebrating the outcomes.

 

Join a data conversation,

Cameron Price.

 


 


References:

  1. Accenture. State of AI 2023.

  2. BARC Germany. Data Products Study 2024.

  3. BARC Germany. Data Culture Benchmark 2024.

  4. dbt Labs. Analytics Engineering Workforce Survey 2023.

  5. Deloitte Insights. Enterprise Data Trends 2022.

  6. Experian. Global Data Trust Report 2023.

  7. Forrester. State of AI and Data Readiness 2023.

  8. Gartner. Market Trends in Data Management 2022–2023.

  9. Gartner. Data & Analytics Leaders Survey 2023.

  10. Harvard Business Review. The Data Trust Deficit 2021.

  11. IDC. Worldwide Data Management Spending Guide 2023.

  12. McKinsey Digital. Modern Data Platform Survey 2023.

  13. MIT Sloan Management Review. The Data-Driven Enterprise 2022.

  14. Monte Carlo. Measuring the Impact of Your Data Platform. 2023.

  15. NewVantage Partners. Big Data and AI Executive Survey 2021.

  16. PwC. AI Adoption Benchmark 2022.

  17. The Information. CIO Cloud Spend Report 2023.

  18. UK Parliament. Data Quality and Governance Inquiry 2023.

 
