Data unification in Customer Insights — Data — a deep dive

How Customer Insights — Data unifies customer records across sources — match rules, merge logic, golden record creation, and the operational rhythm of identity resolution.

Updated 2026-10-17

A customer named "John Smith" in CRM, "J. Smith" in e-commerce, "John A. Smith" in loyalty — all the same person. Reconciling these into one unified profile is the core problem Customer Insights — Data (Microsoft's CDP) solves. Match rules, merge logic, and a unified profile produce the foundation for personalisation, analytics, and journey orchestration.

The unification problem. Multiple source systems each have their own customer records:

  • CRM has accounts and contacts.
  • E-commerce has user accounts.
  • Loyalty has members.
  • Support has cases linked to "Customer."
  • Marketing automation has subscribers.

Each system identifies customers differently — by email, phone, customer ID, account ID. Reconciling means deciding which records in different systems refer to the same real person.

The CI-Data unification flow.

  1. Ingest sources — connectors pull data.
  2. Map to a unified schema — standardise field meanings.
  3. Match — identify candidates that are the same person.
  4. Merge — combine matched records into one unified profile.
  5. Enrich — augment with calculated and inferred attributes.
  6. Export — push unified profile back to consuming systems.
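
Step 2 of the flow can be sketched in Python as below. This is a minimal illustration, not the product's own mapping mechanism: the source names, field names, and the to_unified helper are all hypothetical.

```python
# Hypothetical per-source field mappings; source and field names are illustrative only.
FIELD_MAPS = {
    "crm": {"EmailAddress1": "email", "FullName": "full_name", "Telephone1": "phone"},
    "ecommerce": {"user_email": "email", "display_name": "full_name", "mobile": "phone"},
}


def to_unified(source: str, record: dict) -> dict:
    """Rename source-specific fields to the unified schema and lightly normalise values."""
    unified = {"source": source}
    for src_field, target in FIELD_MAPS[source].items():
        value = record.get(src_field)
        if isinstance(value, str):
            value = value.strip()
        # Treat empty strings and missing fields alike.
        unified[target] = value or None
    if unified.get("email"):
        # Lower-case emails so later exact matching is case-insensitive.
        unified["email"] = unified["email"].lower()
    return unified
```

Normalising at mapping time keeps the match rules simple: an exact match on email only works reliably if every source spells the email the same way.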

Match rules. The match step uses configurable rules:

  • Exact match on email — strong signal.
  • Exact match on phone — strong.
  • Fuzzy match on name — weaker; can produce false positives.
  • Address proximity — for offline records.
  • Combined matching — name + email; name + phone; etc.

Different match rules carry different confidence levels.
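
As a rough illustration of how rules of different strength can be combined, the sketch below scores a pair of records in Python. The scores (1.0, 0.9, and a 0.7 cap for name-only matches) and the match_score function are assumptions chosen for the example, and difflib's SequenceMatcher stands in for whatever fuzzy algorithm the platform actually uses.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two names, in the range 0..1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(a: dict, b: dict) -> float:
    """Score a candidate pair; stronger rules are checked first (weights are assumed)."""
    if a.get("email") and a.get("email") == b.get("email"):
        return 1.0  # exact email: strong signal
    if a.get("phone") and a.get("phone") == b.get("phone"):
        return 0.9  # exact phone: strong, but numbers can be shared or recycled
    if a.get("full_name") and b.get("full_name"):
        # Name alone is weak, so cap its contribution below the exact-match rules.
        return 0.7 * name_similarity(a["full_name"], b["full_name"])
    return 0.0
```

Capping the name-only score means a fuzzy name match on its own can never outrank an exact identifier match, which is one way to limit false positives.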

Match confidence.

  • High confidence — auto-merge.
  • Medium confidence — match candidate; flagged for review.
  • Low confidence — no merge.

Thresholds are configurable, and tuning them is iterative.
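
The three bands above amount to a pair of thresholds. A minimal Python sketch follows; the threshold values 0.9 and 0.6 and the classify function are illustrative assumptions, not product defaults.

```python
def classify(score: float, auto_merge_at: float = 0.9, review_at: float = 0.6) -> str:
    """Map a match score (0..1) to one of the three confidence bands."""
    if score >= auto_merge_at:
        return "auto_merge"
    if score >= review_at:
        return "review"
    return "no_merge"
```

Raising auto_merge_at reduces wrongly merged profiles at the cost of a longer review queue; lowering review_at surfaces more candidates but adds reviewer workload.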

Merge logic. When two records match:

  • Source priority — which source wins per field? Email from CRM, phone from e-commerce?
  • Most recent wins — latest update preserved.
  • Most complete wins — non-null value preferred.

Merge rules are set per field, so one field may follow a different rule than another.
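
Per-field merging can be sketched as a table that maps each field to a rule. The Python below is a simplified illustration: the priority orders, the "updated" field, and the function names are assumptions made for the example.

```python
from datetime import date

# Assumed priority order per field: email prefers CRM, phone prefers e-commerce.
SOURCE_PRIORITY = {
    "email": ["crm", "ecommerce", "loyalty"],
    "phone": ["ecommerce", "crm", "loyalty"],
}


def by_source_priority(field, records):
    """Take the first non-empty value, walking sources in priority order."""
    for source in SOURCE_PRIORITY[field]:
        for record in records:
            if record["source"] == source and record.get(field):
                return record[field]
    return None


def most_recent(field, records):
    """Take the non-empty value from the most recently updated record."""
    candidates = [r for r in records if r.get(field)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])[field]


# One rule per field; both rules skip nulls, so a populated value beats an empty one.
MERGE_RULES = {"email": by_source_priority, "phone": by_source_priority, "full_name": most_recent}


def merge(records):
    """Build the merged profile one field at a time."""
    return {field: rule(field, records) for field, rule in MERGE_RULES.items()}
```

Given a CRM record with no phone and an e-commerce record updated later, merge keeps the CRM email, the e-commerce phone, and the e-commerce name.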

Survivorship rules. Survivorship decides which value survives into the unified profile for each field. Typical choices:

  • First name — most recent.
  • Email — most recent verified.
  • Phone — verified preferred.
  • Address — most recent.
  • Custom attributes — per-attribute rules.

The golden record is constructed field by field.
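
A rule such as "most recent verified" combines two criteria. One possible reading, sketched in Python under the assumption that each candidate value carries hypothetical "verified" and "updated" fields, is to prefer verified values and break ties by recency:

```python
from datetime import date


def pick_email(candidates):
    """Prefer verified values; among equally verified values, prefer the latest update."""
    usable = [c for c in candidates if c.get("value")]
    if not usable:
        return None
    # Tuples compare left to right: verification status first, then update date.
    best = max(usable, key=lambda c: (bool(c.get("verified")), c["updated"]))
    return best["value"]
```

Under this reading an older verified address beats a newer unverified one; if no value is verified, the most recent one survives.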

Manual override. When auto-merge gets it wrong:

  • Operator can split incorrectly merged profiles.
  • Manually merge profiles that didn't auto-match.
  • Block specific records from matching.

The audit trail captures manual interventions.
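
One way to keep manual decisions from being lost is to store them as data and reapply them on every run. The Python sketch below assumes hypothetical record IDs and a simple union-find grouping; it is not how CI-Data stores overrides internally.

```python
def cluster(record_ids, matched_pairs, forced_merges=(), blocked_pairs=()):
    """Group records into profiles, honouring stored manual overrides."""
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    blocked = {frozenset(pair) for pair in blocked_pairs}
    # Blocking removes only the direct link; two records can still end up in the
    # same profile through a chain of other matches.
    allowed = [pair for pair in matched_pairs if frozenset(pair) not in blocked]
    for a, b in list(forced_merges) + allowed:
        parent[find(a)] = find(b)

    groups = {}
    for rid in record_ids:
        groups.setdefault(find(rid), set()).add(rid)
    return sorted(sorted(group) for group in groups.values())
```

Because forced_merges and blocked_pairs are inputs to every run rather than one-off edits, re-running unification reproduces the operator's decisions.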

Match performance. With millions of profiles, match runs are expensive:

  • Initial unification — bulk; can take hours.
  • Incremental — only new/changed records.
  • Match in batch — scheduled.

Run frequency trades off against latency and cost: hourly runs suit high-volume sources, while daily runs are enough for moderate volumes.
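
Incremental processing usually relies on a watermark: the latest modification time already processed. A minimal Python sketch, assuming each record carries a hypothetical "modified" timestamp:

```python
from datetime import datetime


def changed_since(records, watermark):
    """Select only records modified after the last successful run."""
    return [r for r in records if r["modified"] > watermark]


def next_watermark(records, watermark):
    """Advance the watermark to the newest modification time seen so far."""
    return max((r["modified"] for r in records), default=watermark)
```

Each scheduled run matches only the output of changed_since and then stores next_watermark for the following run, so the cost scales with the volume of change rather than with the total number of profiles.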

Enrichment.

  • Calculated attributes — total purchases, recency, frequency, monetary.
  • Inferred attributes — lifetime value prediction, churn risk.
  • External enrichment — third-party data (demographics, firmographics).
  • Behavioural enrichment — derived from interactions.

The enriched profile is the consumable artifact.
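
The calculated attributes named above (recency, frequency, monetary) are simple aggregates over a profile's purchase history. A minimal Python sketch, assuming each purchase is a dict with hypothetical "date" and "amount" fields:

```python
from datetime import date


def rfm(purchases, today):
    """Compute recency (days since last purchase), frequency, and monetary total."""
    if not purchases:
        return {"recency_days": None, "frequency": 0, "monetary": 0.0}
    last_purchase = max(p["date"] for p in purchases)
    return {
        "recency_days": (today - last_purchase).days,
        "frequency": len(purchases),
        "monetary": sum(p["amount"] for p in purchases),
    }
```

Recency depends on the day the calculation runs, which is why these attributes go stale unless they are refreshed on a schedule.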

Common matching challenges.

  • Name variations — Bill vs William vs W.
  • Address differences — "123 Main St" vs "123 Main Street".
  • Multiple email addresses per person.
  • Identity over time — names change (marriage), emails change.
  • B2B vs B2C blending — same person at work and home.

Each requires a specific match rule strategy.
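
For name and address variations, a common strategy is to normalise values before comparing them. The Python sketch below uses tiny lookup tables that are assumptions made for illustration; a real deployment would need far larger, locale-aware tables.

```python
import re

# Illustrative lookup tables only; real lists are much longer and locale-specific.
NICKNAMES = {"bill": "william", "will": "william", "bob": "robert"}
STREET_ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def normalise_first_name(name: str) -> str:
    """Lower-case the name and expand known nicknames to a canonical form."""
    key = name.strip().lower().rstrip(".")
    # A bare initial such as "W." cannot be expanded and is returned as "w".
    return NICKNAMES.get(key, key)


def normalise_address(address: str) -> str:
    """Lower-case, drop punctuation, and expand common street abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    # Caveat: "st" can also mean "saint"; this sketch always expands it to "street".
    return " ".join(STREET_ABBREVIATIONS.get(token, token) for token in tokens)
```

With this normalisation, "Bill" and "William" compare as equal, as do "123 Main St" and "123 Main Street", so an exact-match rule on the normalised values catches both cases.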

B2B unification. Beyond person:

  • Account unification — different systems' company records.
  • Account-contact relationships preserved.
  • Hierarchical accounts — parent / subsidiary.

CI-Data handles both B2C and B2B.

Match auditing.

  • Sample review — periodic sample of matches to verify correctness.
  • False positive rate — over-matching.
  • False negative rate — under-matching.

Audit drives match rule tuning.
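
A reviewed sample can be turned into two numbers that guide tuning. The Python sketch below assumes each sampled pair is recorded as (predicted_match, actual_match), where the second value is the reviewer's verdict; the function name and the exact rate definitions are choices made for this example.

```python
def audit_rates(sample):
    """Summarise a reviewed sample of (predicted_match, actual_match) pairs."""
    false_positives = sum(1 for predicted, actual in sample if predicted and not actual)
    false_negatives = sum(1 for predicted, actual in sample if not predicted and actual)
    predicted_matches = sum(1 for predicted, _ in sample if predicted)
    actual_matches = sum(1 for _, actual in sample if actual)
    return {
        # Share of system matches that reviewers rejected (over-matching).
        "over_match_rate": false_positives / predicted_matches if predicted_matches else 0.0,
        # Share of true matches that the system missed (under-matching).
        "under_match_rate": false_negatives / actual_matches if actual_matches else 0.0,
    }
```

A rising over_match_rate suggests tightening fuzzy rules or raising thresholds; a rising under_match_rate suggests the opposite.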

Privacy and GDPR.

  • Right to know — fulfilled from unified profile.
  • Right to erasure — must propagate to all sources.
  • Consent — tracked per source.
  • DSAR — combined data subject access request.

CI-Data centralises the handling of these requests around the unified profile.
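
Propagating an erasure request means fanning out from the unified profile to every contributing source record. A minimal Python sketch, assuming the profile keeps a hypothetical "source_records" list of links back to its sources:

```python
def erasure_tasks(profile):
    """Group the profile's linked source records by system, one deletion task list each."""
    tasks = {}
    for link in profile["source_records"]:
        tasks.setdefault(link["source"], []).append(link["record_id"])
    return tasks
```

Each entry in the result would be handed to the owning system's own deletion process; the sketch only shows the fan-out, not the deletion itself.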

Common pitfalls.

  • Too-aggressive matching. Different people merged; PII leakage.
  • Too-conservative matching. Duplicates persist; analytics noisy.
  • Source schema drift. New source field; not mapped; data missing.
  • No audit. Quality of unification unknown.
  • Stale enrichment. Calculated attributes not refreshed; misleading.
  • Manual fixes lost. Manual overrides don't persist through re-unification.

Operational rhythm.

  • Continuous / hourly — incremental unification.
  • Weekly — match quality audit.
  • Monthly — rule tuning.
  • Quarterly — schema and source review.

Strategic positioning. Customer data unification is the foundation of customer-centric strategy. Without it, marketing is disjointed, service is impersonal, analytics is partial. With it, every consumer of customer data sees the same unified view — enabling consistent, personalised experiences. The investment is meaningful — months for initial unification — but the payoff compounds. CI-Data is Microsoft's CDP; for organisations on Dynamics 365 or Microsoft cloud, it's the natural choice. The technical setup is significant; the operational discipline of maintaining match quality and enrichment relevance is the longer commitment.
