Data & Analytics · Financial Services

Analytics Projects Fail When Engineering Teams Treat Data as a Reporting Problem

Valli Nayagam

The Dashboard Was Not the Product

A dashboard ticket closes a request. A data product creates an interface other teams can trust after the meeting ends.

How Reporting Queues Hide Ownership

A SaaS company had five MRR numbers: finance used invoice date, growth used activation date, RevOps joined Salesforce fields, and customer success excluded paused accounts differently. The board saw three numbers in one meeting. The fix was not another report. It was mrr_daily as a product: owner, grain, refund rule, SLA, reconciliation check, changelog, and a migration window from old dashboards.
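One way to make "mrr_daily as a product" concrete is to write the product fields down as a reviewable object. The sketch below is illustrative only: the class name DataProductSpec, the owner alias, the grain columns, the six-hour SLA, and the refund rule wording are all assumptions, not details from the company described above.

```python
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass(frozen=True)
class DataProductSpec:
    """Hypothetical descriptor for a published dataset; field names are illustrative."""
    name: str
    version: int
    owner: str                  # a team or on-call alias, not an individual analyst
    grain: tuple[str, ...]      # columns that identify exactly one row
    freshness_sla: timedelta    # maximum acceptable age of the latest publish
    rules: dict[str, str] = field(default_factory=dict)  # business rules, e.g. refunds
    changelog: tuple[str, ...] = ()

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left empty."""
        missing = []
        if not self.owner:
            missing.append("owner")
        if not self.grain:
            missing.append("grain")
        if self.freshness_sla <= timedelta(0):
            missing.append("freshness_sla")
        if not self.changelog:
            missing.append("changelog")
        return missing


# All values below are assumed for illustration.
mrr_daily = DataProductSpec(
    name="mrr_daily",
    version=3,
    owner="revenue-data-oncall",
    grain=("account_id", "as_of_date"),
    freshness_sla=timedelta(hours=6),
    rules={"refunds": "subtract on the refund date"},
    changelog=("v3: definition extended for a new product line",),
)
```

A spec like this can be checked in review: if missing_fields() returns anything, the dataset is still a report, not a product.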

Where Data Teams Fool Themselves

Engineering teams often treat analytics demand as a visualization backlog. More dashboards appear, but each dashboard forks SQL, definitions, freshness expectations, and ownership.

A senior review of a data product should ask which trust decision is being made, which evidence supports it, and what signal would force the team to pause.

What a Data Product Must Prove

The Part Teams Underestimate

Dashboard factory

Engineers become chart printers. Backlog grows faster than platform investment. Each request forks logic because the fastest path is a new SQL file, not extending a shared layer. Six months later, finance has three "official" churn definitions and nobody will delete any of them.

No SLAs on shared tables

revenue_daily updates when someone remembers. Finance closes books on stale numbers. Marketing plans spend against a funnel mart that missed weekend events. When there is no SLA, there is no pager, and stale data is indistinguishable from correct data in the UI.
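A freshness SLA can start as a very small check that an orchestrator or monitor runs on a schedule. The sketch below assumes a 24-hour SLA and a last-publish timestamp that the pipeline records somewhere; the function name freshness_status and the example timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(last_published: datetime, sla: timedelta, now: datetime) -> str:
    """Classify a table as 'fresh' or 'stale' by comparing its age to the SLA."""
    age = now - last_published
    return "fresh" if age <= sla else "stale"


# Assumed scenario: the last successful publish was Friday night,
# and the check runs on Monday morning.
now = datetime(2024, 3, 4, 9, 0, tzinfo=timezone.utc)
last_run = datetime(2024, 3, 1, 23, 0, tzinfo=timezone.utc)
status = freshness_status(last_run, sla=timedelta(hours=24), now=now)
```

Here status is "stale", which is the signal that should page an owner instead of leaving the dashboard to render old numbers as if they were current.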

Consumers confused with stakeholders

Everyone wants a dashboard; few sign up to own the underlying dataset quality. Stakeholders attend demo meetings. Consumers depend on the data daily. Without a named owner, quality debates happen only after a bad decision.

Platform team as cost center

Without product framing, data platform funding competes with features and loses until an incident forces attention. Platform work looks like overhead because its outputs are invisible until something breaks. Framing datasets as products makes the value explicit: reused interfaces, fewer forks, faster audits.

Fork and forget

Analyst leaves. Notebook and SQL remain. New hire rebuilds from raw tables because they do not trust the old dashboard. Institutional knowledge walks out while duplicate pipelines stay on the bill.

The Diagram Behind the Decision

Bad: ticket queue analytics

Analyst ticket --> engineer writes SQL --> dashboard --> next ticket (forked SQL)

Every ticket adds branching logic. Shared tables become dumping grounds. Changes require archaeology.

Good: product-oriented analytics

Domain owner --> data product backlog --> published semantic layer / dataset API
                                              |
                                    BI, ML, ops consumers

Changes flow through versioned contracts. Consumers know whom to call when numbers drift. New dashboards compose existing products instead of scraping raw zones.

What Changes After You See It

Data powers decisions like APIs power applications. Treat published datasets and metrics as interfaces with compatibility promises.

CTOs align engineering and analytics when data product owners sit in the same planning rhythm as product managers. Roadmap reviews ask which datasets need SLAs, which definitions change, and which forks to retire, not only which charts to build.

Self-service analytics works when curated products exist. It fails when self-service means unlimited access to raw tables without contracts. The goal is not fewer analysts. The goal is fewer one-off pipelines serving the same business concept.

Worked Example: SaaS revenue metrics

RevOps files weekly tickets for MRR views. Each engineer joins different subscription tables, applies different refund rules, and ships another dashboard.

Reporting queue          | Data product mrr_daily
five MRR definitions     | one versioned metric
fragile dashboards       | consumers use semantic layer
blame at month-end       | owner paged on SLA breach
new hire rebuilds SQL    | documented contract and changelog

Engineering time shifted from chart printing to one maintained product. The board deck, finance close, and growth experiments all read the same mrr_daily v3. When the definition changed for a new product line, consumers got two weeks' notice and a migration guide.

Where This Shows Up: financial services and SaaS

Financial services: advisor and firm metrics must match regulatory reports. Product-owned definitions reduce examiner friction. When each region built its own AUM logic, compliance spent quarters reconciling. A single versioned aum_daily with owner and changelog turned audits into demonstrations, not scavenger hunts.

SaaS: growth teams need consistent funnel and retention metrics. Forked SQL across teams kills trust in board numbers. Product-led growth adds product usage data that lives outside the warehouse's copy of the CRM. A product layer that joins account, subscription, and usage data with clear SLAs lets sales, success, and product argue about strategy, not about whose SQL is right.

Operating rhythm: data product owners attend the same planning cadence as application product managers. They publish a small roadmap: new attributes, breaking changes, deprecations. Consumers subscribe to changelog notifications like they would for a public API.

Deprecation policy: when a metric version changes, run both v2 and v3 in parallel for one close cycle with a diff report. Consumers migrate deliberately instead of discovering renames in a board meeting.
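The parallel-run diff report can be sketched as a comparison of the two versions over the same keys. This is a minimal illustration, assuming each version is available as a mapping from date string to metric value; the function name diff_report, the tolerance, and the sample figures are made up for the example.

```python
def diff_report(old: dict[str, float], new: dict[str, float], tolerance: float = 0.01):
    """Compare two metric versions keyed by date; return rows that differ or are missing."""
    rows = []
    for key in sorted(old.keys() | new.keys()):
        a, b = old.get(key), new.get(key)
        if a is None or b is None:
            rows.append((key, a, b, "missing in one version"))
        elif abs(a - b) > tolerance:
            rows.append((key, a, b, f"delta {b - a:+.2f}"))
    return rows


# Sample values are invented for illustration.
v2 = {"2024-03-01": 1000.0, "2024-03-02": 1010.0}
v3 = {"2024-03-01": 1000.0, "2024-03-02": 1025.5, "2024-03-03": 1030.0}
report = diff_report(v2, v3)
```

Only the rows that disagree appear in the report, so consumers can review the impact of the new definition line by line before the old version is retired.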

The Product Boundary

Reporting queue:
Request --> SQL fork --> Dashboard --> next request --> next fork

Data product:
Domain source --> product pipeline --> versioned contract --> BI / ML / Ops consumers

What Data Product Discipline Costs

Saying no to a dashboard until the product underneath exists feels slow. It is faster than reconciling five official answers every month.

The useful review of a data product is not a generic architecture checklist. It should inspect ownership, grain, freshness, lineage, consumer impact, and change safety. If those fields are missing, the team may still be busy, but leadership does not yet have a decision-quality artifact.

The Operating Review

For data work, the release review should treat datasets like APIs. Start with the published interface: table, semantic model, stream, feature set, or dashboard metric. Who owns it? What is its grain? How fresh must it be? Which consumers depend on it? What schema changes are compatible, and what requires a version bump?
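The question of which schema changes are compatible can be encoded as a policy check. The sketch below assumes one common policy (adding a column is compatible; removing or retyping a column requires a version bump); your own compatibility rules may differ, and the function name and column names are hypothetical.

```python
def classify_schema_change(old: dict[str, str], new: dict[str, str]) -> str:
    """Return 'compatible' or 'breaking' under one assumed policy:
    adding columns is compatible; removing or retyping a column is breaking."""
    for column, dtype in old.items():
        if column not in new:
            return "breaking"   # a consumer may still select this column
        if new[column] != dtype:
            return "breaking"   # a consumer may rely on the old type
    return "compatible"


# Column names and types are assumed for illustration.
published = {"account_id": "string", "as_of_date": "date", "mrr": "decimal"}
added_column = {**published, "currency": "string"}
retyped = {**published, "mrr": "float"}
```

With these inputs, added_column classifies as "compatible" and retyped as "breaking", which is the point where the release review would ask for a new version and a migration window.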

The second artifact is a contract. Freshness, volume, uniqueness, null rates, referential checks, reconciliation totals, and schema expectations should run before consumers receive the data. A green Spark job, Glue job, dbt run, or Flink checkpoint is not enough. The contract should answer whether the data is fit for the decision it supports.
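A contract gate of this kind can be sketched as a function that returns violations before anything is published. The example below covers volume, uniqueness, null rate, and a reconciliation total only; the function name check_contract, the thresholds, and the sample rows are assumptions for illustration, not a prescribed set of checks.

```python
def check_contract(rows, key_fields, amount_field, source_total,
                   max_null_rate=0.0, tolerance=0.01):
    """Return a list of contract violations; an empty list means safe to publish."""
    if not rows:
        return ["volume: no rows"]
    violations = []
    keys = [tuple(r[k] for k in key_fields) for r in rows]
    if len(set(keys)) != len(keys):
        violations.append("uniqueness: duplicate grain keys")
    nulls = sum(1 for r in rows if r[amount_field] is None)
    if nulls / len(rows) > max_null_rate:
        violations.append("nulls: rate above threshold")
    total = sum(r[amount_field] or 0 for r in rows)
    if abs(total - source_total) > tolerance:
        violations.append("reconciliation: total does not match source")
    return violations


# Invented sample: the first row was loaded twice.
rows = [
    {"account_id": "a1", "as_of_date": "2024-03-01", "mrr": 100.0},
    {"account_id": "a2", "as_of_date": "2024-03-01", "mrr": 250.0},
    {"account_id": "a1", "as_of_date": "2024-03-01", "mrr": 100.0},
]
violations = check_contract(rows, ("account_id", "as_of_date"), "mrr", source_total=350.0)
```

The job that produced these rows could have finished green; the contract still reports a duplicate grain key and a reconciliation mismatch, so the publish step should not run.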

The third artifact is lineage. When a dashboard, metric, or downstream model is wrong, the team should trace it to upstream sources, jobs, table snapshots, owners, and last good publish without starting a forensic SQL project. Iceberg snapshots, warehouse query history, orchestrator metadata, and transformation manifests can all help, but only if lineage capture is part of the deploy path rather than an audit retrofit.
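If lineage is captured at deploy time as a simple mapping from each asset to its direct upstreams, tracing a wrong number becomes a graph walk instead of a forensic project. The sketch below assumes such a mapping exists; the asset names and the function trace_upstream are hypothetical, and real lineage would usually come from orchestrator or transformation metadata.

```python
from collections import deque


def trace_upstream(lineage: dict[str, list[str]], node: str) -> list[str]:
    """Return every upstream dependency of node, nearest first (breadth-first)."""
    seen = {node}
    order = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Assumed lineage: each key lists its direct upstream assets.
lineage = {
    "board_dashboard": ["mrr_daily"],
    "mrr_daily": ["subscriptions_clean", "refunds_clean"],
    "subscriptions_clean": ["raw.billing_subscriptions"],
    "refunds_clean": ["raw.billing_refunds"],
}
upstream = trace_upstream(lineage, "board_dashboard")
```

The result lists mrr_daily first and the raw billing tables last, which gives the on-call owner an ordered set of places to look.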

Finally, inspect access and purpose. Healthcare and financial services cannot treat broad warehouse roles as harmless convenience. Retail and SaaS teams also suffer when unrestricted exports become shadow systems. Access should match role, sensitivity, and purpose, with review cadence and break-glass behavior written down.

Failure Drills

Data release tests should include empty extracts, duplicated batches, schema drift, late events, timezone shifts, replayed files, null explosions, source deletes, and upstream business-rule changes. A dashboard that still renders during all of those cases is not proof of health. It may simply be hiding the failure.
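Two of these drills, the duplicated batch and the empty extract, are easy to express as tests against the load step. The sketch below uses an in-memory dictionary in place of a real table and assumes an upsert keyed on the grain; load_batch and the sample rows are hypothetical.

```python
def load_batch(table: dict, batch: list[dict], key_fields: tuple[str, ...]) -> dict:
    """Upsert rows by grain key so replaying a batch does not duplicate rows.
    Raises on an empty extract instead of silently publishing nothing."""
    if not batch:
        raise ValueError("empty extract: refusing to publish")
    merged = dict(table)
    for row in batch:
        merged[tuple(row[k] for k in key_fields)] = row
    return merged


KEY = ("account_id", "as_of_date")  # assumed grain
batch = [
    {"account_id": "a1", "as_of_date": "2024-03-01", "mrr": 100.0},
    {"account_id": "a2", "as_of_date": "2024-03-01", "mrr": 250.0},
]

# Drill 1: the same batch delivered twice must not change the row count.
once = load_batch({}, batch, KEY)
twice = load_batch(once, batch, KEY)

# Drill 2: an empty extract must fail loudly.
try:
    load_batch(once, [], KEY)
    empty_extract_rejected = False
except ValueError:
    empty_extract_rejected = True
```

The remaining drills (schema drift, late events, timezone shifts, source deletes) follow the same pattern: feed the failure in deliberately and assert that the pipeline reports it.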

For streaming or CDC-backed systems, test lag, reordering, compaction, tombstones, and consumer restart behavior. For warehouse and lakehouse systems, test partition gaps, snapshot rollback, and reconciliation against the source. The goal is not perfect data. The goal is knowing when data stopped meeting its contract.
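A partition-gap test for a daily-partitioned table can be sketched in a few lines. It assumes you can list the partition dates that actually exist; the function name missing_partitions and the example dates are illustrative.

```python
from datetime import date, timedelta


def missing_partitions(present: set[date], start: date, end: date) -> list[date]:
    """Return every date in [start, end] that has no partition."""
    days = (end - start).days
    expected = (start + timedelta(days=n) for n in range(days + 1))
    return [d for d in expected if d not in present]


# Assumed scenario: the Saturday and Sunday partitions never landed.
present = {date(2024, 3, 1), date(2024, 3, 4)}
gaps = missing_partitions(present, date(2024, 3, 1), date(2024, 3, 4))
```

A non-empty result is exactly the signal described above: the data has stopped meeting its contract, even if every downstream chart still renders.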

Other Operating Models That Can Work

Raw self-service is good for exploration. A semantic layer helps when shared metrics need consistency. Domain-owned data products scale when each domain has platform rules for ownership, contracts, and deprecation.

These operating models are not steps on a ladder. Each one carries a different mix of risk, cost, and learning. The weak choice is the one that hides the tradeoff until users, operators, or auditors discover it for you.

The Rule for Shared Data

A dashboard ticket closes a request. A data product creates an interface other teams can trust after the meeting ends. The practical lesson is to demand evidence specific to the data product, not a universal checklist. The artifact should expose ownership, grain, freshness, lineage, consumer impact, and change safety clearly enough for another team to challenge the decision.

If adopting data as a product is the decision in front of your team, use the Data and Analytics Readiness Session to pressure-test the boundary before it hardens.