The Day They Merged “Data” and “Engineering” Is the Day the Data Industry Died

  • Cameron Price
  • Jul 31
  • 5 min read

 

This is the first instalment in a new series: “The Death and Rebirth of Data”, a brutally honest exploration of how the data industry lost its way, and what it will take to bring it back.



I’ve spent more than 35 years in the data and analytics industry, either hands-on (as an ETL developer building pipelines, designing databases, modelling data, and building and managing data warehouses, lakes and everything in between) or leading global data organizations. I’ve seen the rise and fall of every trend, from relational databases to MPP systems, Hadoop clusters to cloud data warehouses, and more recently lakehouses and the explosion of data options now available. I’ve seen new acronyms rise like weeds and vendors rebrand the same ideas with shinier buzzwords.


But through it all, one thing has become painfully clear.


The moment we merged the words “data” and “engineering” was the moment the data industry lost its way.


What started as a discipline meant to make data accessible, understandable, and actionable has become an engineering-centric industrial complex, optimized for pipelines, code, tools, orchestration, and control, but increasingly disconnected from the very thing it was supposed to serve: the outcome.


“Only 10% of companies report achieving significant financial benefit from their data and AI investments.” – McKinsey, 2023 [1].


 

Data Engineering. The Great Hijack


Data engineering was supposed to be the plumbing: invisible, reliable, and in service of insight.


Instead, it has become the cathedral. We now have entire organizations where 80–90% of the data budget is sunk into data engineering teams, platforms, and tooling, and yet business users still struggle to get their hands on trusted, usable data.

 

“Nearly three‑quarters (72%) of businesses are struggling to unlock value from data, despite multicloud adoption”. – TechRadar, 2023 [2]

 

Meanwhile, Hightouch calls out fragmented, siloed data teams as a core barrier.

 

“Most data practitioners…are often deeply siloed from the stakeholders they’re serving”. – Hightouch, 2024 [3]

 

This illustrates that the fragmentation is structural, not accidental.

 

We chase lineage graphs, deploy orchestration tools, build CDC pipelines, and obsess over schema evolution, but we rarely stop to ask, what value did this create?

 

From my perspective, the more tools we add to the modern data stack, the less modern it feels. What was meant to simplify access has turned into a maze of orchestrated chaos.

 

Ask a data engineer what they shipped this quarter, and they’ll talk about DAGs, dbt jobs, Airflow runs, and new S3 partitions.

Ask a business leader what they got from it, and you’ll get a shrug.

 

We’ve industrialized process, not progress.

 

 

When Did It Go So Wrong?


It went wrong the day we turned data into an engineering problem instead of a business opportunity.


When the vocabulary shifted from “insight” to “pipeline,” from “decisions” to “deployments,” and from “meaning” to “metadata.”

 

Today’s tooling ecosystem enables control but often obscures value.

 

“The tooling explosion has created more silos than it’s solved. It’s less a stack and more a tax”. – Benn Stancil, 2023 [4]

 

And here’s the real issue: most data engineers are not data people.


They’re software engineers: brilliant, skilled, and well-intentioned, but often with limited understanding of data semantics, and disconnected from the business context and analytical relevance that turn data into decisions.

 

Business analysts and data engineers frequently work in parallel universes, causing misalignment and friction.


“Data engineers often optimize for infrastructure and stability, while analysts optimize for iteration and impact, this misalignment creates friction.” – dbt Labs + Mode, 2021 [5].


Data should be owned by the business and governed by the domain. Not engineered into inaccessibility.


It’s not their fault. It’s ours.


We let the tooling tail wag the data dog.

 

The Cost of Complexity


Every new layer of tooling (ingestion frameworks, lake formats, streaming layers, catalog services, and orchestration engines) promises simplification. And yet, somehow, it all keeps getting more complicated and confusing.


We’ve created jobs to manage the tools, teams to manage the jobs, and roles to govern the teams. The further we go, the more abstracted we become from what really matters: enabling people to understand their world through data.


We’ve mistaken movement for momentum.


Organizations reported data downtime almost doubled year-over-year, and poor data quality impacted an average of 31% of revenue, with business stakeholders identifying issues first in 74% of organizations. – Monte Carlo, 2023-2024 [6][7].


Investing in reliability and observability isn’t optional, it’s urgent.


Meanwhile, ThoughtSpot research suggests that democratizing analytics, especially through search-driven, GUI-first tools, can dramatically boost adoption and accelerate returns on data investments [8].


“GUI-based analytics tools increased data adoption by 52% compared to CLI environments”. – ThoughtSpot Research, 2023 [9]


So prioritizing ease of use and AI-powered search isn’t a luxury, it’s a requirement for transformation.


In a recent program, I asked why engineers still work from blank screens and command-line interfaces. The response? “It’s easier, it’s best practice.” But it reminded me of the MS-DOS era. I’ve been in the game a long time, and I couldn’t help thinking of the parallels. We typed and typed, until someone invented Windows.


Isn’t there a GUI for that?

 

It’s Time to Rethink


It’s time we asked hard questions:

  • What if 80% of the data engineering work we do is unnecessary, because it delivers no direct business outcome?

  • What if the modern data stack is more about vendor lock-in than about user access and enablement?

  • What if the real solution is less data engineering, more product thinking?


dbt Labs reports that analytics engineers are increasingly essential, but the role is blurring boundaries between analysts and engineers.


Teams that embrace product thinking and data trust are the ones delivering outcomes, not just deploying pipelines. As a colleague of mine once said, “I shouldn’t need a sprint cycle to get my sales data from last week”.

 

Ironically, many of these advisory roles, “analytics engineers”, “platform owners”, emerge not because they solve business problems, but because they manage the complexity we created.

 

We need to return to first principles.

 

Make data useful. Understandable. Trusted.

 

That doesn’t require five orchestration tools and a lakehouse. It requires empathy with the user, clarity on the outcome, and ruthless simplification of the path from source to decision.

 

AI doesn’t just accelerate access. It demands simplification. If you still need engineers to interpret the output, you’ve missed the point.

 

Reclaiming Data’s Soul


Data is not code. It’s context.


It’s the story of a customer, a product, a transaction, a moment.


“The missing piece in most data teams is storytelling” – Harvard Business Review & BigEye, 2022-2023 [10][11]

 

Without narrative, insights rarely drive change.

 

When we reduce data to schema files and pipeline scripts, we strip it of meaning and magic.

In my experience, turning data into code turns context into syntax, but business decisions are not made in YAML.

 

It’s time to bring the soul back into data.

 

That means putting insight over infrastructure, outcomes over ops, and people over pipelines.

 

The era of engineering-as-value is over. We must stop engineering data and start empowering it.

 

Because the day we merged "data" and "engineering" was the day the industry lost its soul. The day we un-merge them? That might just be the day we bring it back.

 

 

Join the data conversation,

Cameron Price.

 



References

  1. McKinsey & Company (2023). The State of AI and Data in Business: 2023 Survey. https://www.mckinsey.com

  2. TechRadar Pro (2023). 72% of businesses still struggling to derive value from their data. https://www.techradar.com

  3. Hightouch (2024). The State of the Modern Data Team. https://hightouch.com/blog/state-of-the-modern-data-team

  4. Stancil, Benn (2023). The Modern Data Stack Tax. https://benn.substack.com/p/the-modern-data-stack-tax

  5. dbt Labs & Mode (2021). The Analytics Engineer: 2021 Benchmark Report. https://mode.com/state-of-data-2021

  6. Monte Carlo Data (2023). Data Downtime and the Business Impact of Bad Data. https://www.montecarlodata.com/blog/2023-data-observability-report

  7. Monte Carlo Data (2024). 2024 Data Observability Trends. https://www.montecarlodata.com

  8. ThoughtSpot (2023). Analytics Adoption Report: GUI and Search-Led Analytics. https://www.thoughtspot.com/resources

  9. ThoughtSpot (2023). GUI vs CLI: The Impact on Data Tool Adoption. https://www.thoughtspot.com

  10. Harvard Business Review (2022). Why Data Storytelling Is So Important. https://hbr.org/2022/04/why-data-storytelling-is-so-important

  11. BigEye (2023). The State of Data Quality and Trust. https://www.bigeye.com/resources

 
