
Most Organisations Have Too Much Data and Too Little Visibility
Most organisations we work with have more data than they know what to do with. A retail business with five years of transaction history and a CRM they have outgrown. A healthcare provider whose clinical data lives in one system, billing in another, and patient engagement in a third. A financial services firm with 12 years of customer records spread across a legacy core banking system, a separate analytics warehouse, and a collection of spreadsheets maintained by the finance team.
The problem is not that these organisations lack data. The problem is that the data exists in incompatible formats, governed by different definitions, with no single agreed view of even the most basic entities — customer, product, account — that the business depends on every day. When three business units each maintain a different version of "customer," none of which agree, the organisation cannot answer a question as simple as "how many active customers do we have?" with confidence.
A data strategy is the plan for resolving that problem in a way that produces lasting improvement rather than a temporary fix. This is what one actually looks like in practice.
Start With the Questions, Not the Technology
The most common mistake in data strategy work is starting with tooling. Which data warehouse? Which BI platform? Which ETL pipeline? These are legitimate questions — but they are second-order questions. The first-order question is: what decisions does this organisation need to make, and what data would improve those decisions?
A distribution business might need to decide which routes to prioritise when capacity is constrained. The data question is whether the current systems can surface the relevant variables — delivery time, margin by route, customer priority, vehicle utilisation — in a single view, in time to be useful. If the answer is no, the gap between that question and the current data architecture defines the scope of the strategy.
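To make the "single view" idea concrete, here is a minimal sketch in Python. The systems, route IDs, and field names (margin, delivery hours, utilisation) are illustrative assumptions, not a real schema; the point is that the decision-shaped metric only exists once the variables from separate systems sit in one record.

```python
# Data as it might arrive from three separate systems (illustrative only):
delivery_system = {"R1": {"avg_delivery_hours": 6.5}, "R2": {"avg_delivery_hours": 9.0}}
finance_system  = {"R1": {"margin": 0.18}, "R2": {"margin": 0.31}}
fleet_system    = {"R1": {"utilisation": 0.92}, "R2": {"utilisation": 0.64}}

def route_view(route_id):
    """Join the variables a capacity decision needs into one record."""
    return {
        "route": route_id,
        **delivery_system[route_id],
        **finance_system[route_id],
        **fleet_system[route_id],
    }

# Rank routes by margin per delivery hour: a metric that maps directly
# to the decision of which route to prioritise under constrained capacity.
routes = [route_view(r) for r in delivery_system]
routes.sort(key=lambda r: r["margin"] / r["avg_delivery_hours"], reverse=True)
print([r["route"] for r in routes])  # highest-priority route first
```

In a real engagement this join happens in a warehouse or transformation layer, not application code, but the shape of the question is the same: can these variables be brought into one row, in time to be useful?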
Starting from decisions also forces prioritisation. Not every data problem is worth solving immediately. A data strategy that tries to fix everything at once usually fixes nothing, because the scope becomes unmanageable before anything reaches production. The right approach is to identify the three to five decisions that most directly affect revenue, risk, or operational efficiency — and build the data infrastructure to support those first.
"The data problem worth solving first is not the largest one. It is the one where improved visibility would most directly change a decision the business makes every week."
The Silo Problem Is Structural, Not Technical
Data silos — separate systems holding separate data with no reliable way to connect them — are almost universal in organisations that have grown faster than their data governance. They are not primarily a technology problem. They are an organisational problem that manifests as a technology problem.
The marketing team chose their automation platform. The sales team chose their CRM. The finance team runs on an ERP that was implemented eight years ago and generates reports nobody fully understands. Each system was the right choice at the time it was made. The problem is that nobody designed how they would share data — what the authoritative source for each data entity would be, how changes would propagate, and what would happen when they disagreed.
Solving this requires two things working together. First, a technical integration layer that connects the systems and creates a shared data foundation — typically a data warehouse or lakehouse architecture with defined ingestion pipelines and transformation logic. Second, a data governance framework that assigns ownership of each data entity, defines what the canonical version looks like, and establishes a process for resolving conflicts when they arise.
The technical piece is often easier than the governance piece. Getting agreement across departments on who owns the definition of "customer" — and what that definition includes — requires organisational alignment that no data platform can substitute for.
What a Single Source of Truth Actually Requires
"Single source of truth" is one of the most repeated phrases in data strategy — and one of the most frequently misunderstood. It does not mean all data lives in one system. It means each data entity has one authoritative source, and that other systems consume from it rather than maintaining their own competing version.
In practice, this means identifying the master record for each entity type. The CRM is the master for customer contact data. The ERP is the master for financial records. The operational database is the master for transaction history. A data warehouse aggregates from all of them for reporting and analytics, but does not compete with them as a record of transactions or customer accounts.
The infrastructure to support this — data pipelines, transformation layers, schema definitions, data cataloguing, lineage tracking — is not trivial to build. But the more important work is the upfront decisions: which system wins when two systems disagree about a customer's address? What is the refresh frequency for each entity type? Which fields are authoritative and which are derived?
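One hedged sketch of what "which system wins" looks like once documented: a survivorship rule, where a field-level precedence order decides which system's value is kept when sources disagree. The precedence table below is an assumption for illustration, not a recommendation.

```python
# Field-level precedence: earlier systems win. Illustrative only.
FIELD_PRECEDENCE = {
    "address": ["crm", "erp"],        # CRM is authoritative for contact data
    "credit_limit": ["erp", "crm"],   # ERP is authoritative for financial fields
}

def resolve(field, values):
    """Pick the surviving value for a field.

    values maps system name -> candidate value; the first system in the
    precedence list with a non-null value wins.
    """
    for system in FIELD_PRECEDENCE[field]:
        if values.get(system) is not None:
            return values[system], system
    return None, None

# Two systems disagree about a customer's address; the rule decides.
value, source = resolve("address", {"erp": "12 Old St", "crm": "4 New Ave"})
print(value, source)
```

Encoding the rule is the easy part. What makes it durable is that the precedence order was agreed, documented, and applied consistently in the transformation layer, rather than re-decided ad hoc by whoever built the last report.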
These decisions, documented and enforced, are what make a data strategy durable. Without them, the warehouse becomes a third copy of the same contested data — slightly more accessible, no more trustworthy.
The Analytics Layer: What It Can and Cannot Tell You
Once the data foundation is in place — clean, integrated, governed — analytics becomes significantly more useful. Descriptive analytics tell you what happened. Diagnostic analytics tell you why it happened. Predictive analytics project what is likely to happen. Prescriptive analytics recommend what to do.
Most organisations start at the descriptive level and find real value there: dashboards that surface the information decision-makers need without requiring a data analyst to pull a report every time. The goal is not to replace analysts — it is to free them from routine reporting so they can focus on the diagnostic and predictive work that actually requires judgement.
The most frequent analytics failure mode is building dashboards that nobody uses. This happens when the analytics layer is designed around what data is available rather than what decisions it needs to support. A dashboard that shows 47 metrics, none of which map to a specific operational decision, will not get used. The discipline of starting with decisions — and working backward to the data that informs them — is the same discipline that makes analytics valuable at every maturity level.
"The measure of a good data strategy is not how much data you have access to. It is how often data changes a decision that would otherwise have been made on instinct."
Data Strategy as a Prerequisite for AI
There is a reason legacy systems are the primary barrier to AI adoption for most large organisations. The data needed to train, evaluate, and operate AI tools is locked inside systems that were built before machine-readable data pipelines were a design consideration. Getting it out, into the formats AI tools can use, requires either the integration work described above — or replacing the systems that hold it.
Organisations that have invested in a solid data foundation — unified pipelines, governed entity definitions, reliable analytics — are the ones that can move from AI pilot to AI production in months rather than years. The data infrastructure is not a separate investment from AI readiness. It is the same investment. The organisations that treat their data strategy as a prerequisite to their AI strategy are consistently the ones that get to production faster and with fewer expensive course corrections.
Where to Start
The most effective data strategy engagements we run start with an audit of the current state: which systems exist, what data they hold, how it moves between them, where it is trusted and where it is not, and which business decisions are currently being made on data the organisation would not endorse if its quality were made visible.
That audit produces a clear picture of the gap between the current data architecture and what the business actually needs. From there, the strategy is a sequenced plan to close that gap — starting with the highest-value decisions and building the infrastructure that supports them, in an order that produces usable output at each stage rather than waiting for a complete transformation before anything is useful.
At iMSX, we have run data strategy and analytics architecture engagements for organisations across healthcare, financial services, resources, and government — from NSW Health to Glencore. If your data is creating more questions than answers, we can help you work out what the right architecture looks like and build toward it in a way that produces visible, measurable improvement at each step.
Data Creating More Questions Than Answers?
We run data strategy audits that map your current architecture against the decisions you need to make — and define a sequenced path to close the gap.
Contact Us