Test data management for Dynamics 365

How to build and maintain test data for Dynamics 365 environments — anonymisation, synthetic data, data subsetting, and the workflows for keeping non-prod environments useful.

Updated 2026-08-02

Realistic test data is one of the most underrated assets in any Dynamics 365 programme. Without it, testing happens against thin or stale data that doesn't reveal the issues real production data would. With it, defects are caught early and stakeholders trust UAT outcomes. Test data management is the discipline of designing, generating, anonymising, and maintaining data for non-production environments.

The problem.

Production-like data is essential for realistic testing but contains sensitive content that can't be exposed in non-prod.
Synthetic data is safer but rarely captures real-world data quirks that cause production issues.
Subsets of production balance realism and volume but bring referential integrity challenges.
Stale data in test environments means tests pass in non-prod that would fail against current production data.

The right answer is usually a mix: subsetted, anonymised production data refreshed periodically, supplemented with crafted synthetic data for specific scenarios.

Anonymisation strategies.

Substitute — replace names with fictional alternatives.
Mask — replace specific characters (e.g., emails become j***@e***.com).
Shuffle — re-order column values across rows (preserves distribution, breaks identity).
Hash — irreversibly transform identifiers.
Tokenise — replace with consistent placeholder values.

Each has trade-offs:

Substitution preserves natural reading but loses statistical properties.
Shuffling preserves stats but breaks within-record consistency.
Hashing is irreversible but identifiers don't look natural.
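The five strategies above can be sketched in a few lines of Python. This is a minimal illustration, not a production anonymiser: the sample rows, column names, salt value, and fake-name list are all assumptions made for the example, and none of it reflects an actual Dynamics 365 schema or API.

```python
import hashlib
import random

# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"id": "C001", "name": "Alice Smith", "email": "alice@contoso.com", "salary": 52000},
    {"id": "C002", "name": "Bob Jones", "email": "bob@fabrikam.com", "salary": 61000},
    {"id": "C003", "name": "Carol White", "email": "carol@contoso.com", "salary": 47000},
]

FAKE_NAMES = ["Jordan Reed", "Sam Patel", "Alex Moreno", "Riley Chen"]


def substitute_name(original, salt="demo-salt"):
    """Substitute: pick a fictional name, chosen deterministically per input."""
    digest = hashlib.sha256((salt + original).encode()).digest()
    return FAKE_NAMES[digest[0] % len(FAKE_NAMES)]


def mask_email(email):
    """Mask: keep the first character of each part, e.g. j***@e***.com."""
    local, _, domain = email.partition("@")
    host, _, tld = domain.rpartition(".")
    return f"{local[:1]}***@{host[:1]}***.{tld}"


def shuffle_column(records, column, seed=0):
    """Shuffle: re-order one column's values across rows."""
    values = [r[column] for r in records]
    random.Random(seed).shuffle(values)
    return [{**r, column: v} for r, v in zip(records, values)]


def hash_identifier(value, salt="demo-salt"):
    """Hash: salted SHA-256, truncated to 16 hex characters for readability."""
    return hashlib.sha256((salt + value).encode()).hexdigest()[:16]


class Tokeniser:
    """Tokenise: the same input always maps to the same placeholder."""

    def __init__(self, prefix="TOK"):
        self.prefix = prefix
        self.mapping = {}

    def token(self, value):
        if value not in self.mapping:
            self.mapping[value] = f"{self.prefix}-{len(self.mapping) + 1:05d}"
        return self.mapping[value]
```

The shuffle keeps the set of salaries intact while detaching each value from its original row, which is the "preserves stats, breaks within-record consistency" trade-off in practice. The tokeniser's mapping table is itself sensitive, since it links real values to placeholders, so in a real pipeline it would need to stay out of non-prod.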

What to anonymise.

Names — first, last, full names.
Emails — fictional or hashed.
Phone numbers — fictional or masked.
Addresses — fictional, but geographically plausible.
Tax IDs / SSNs — never preserved in non-prod.
Credit card data — should never be in Dynamics; certainly not in non-prod.
Health data — strict regulatory requirements.
Salary data — masked or shuffled.
Free-text fields — most problematic; may contain accidentally-pasted sensitive content.

Free-text challenges. Notes, descriptions, email bodies, case narratives — these contain unpredictable content. Solutions:

Pattern detection — find SSN-like patterns, credit-card-like patterns, replace.
Wholesale replacement — replace free text with synthetic content.
Manual review — for small datasets.

Each has a cost; complete free-text anonymisation is hard.
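As a rough sketch of the pattern-detection approach, the snippet below redacts US-style SSN patterns and card-like digit runs from free text. The regular expressions and the placeholder strings are assumptions chosen for illustration; the Luhn checksum is used only to cut down false positives on long digit sequences such as order numbers.

```python
import re

# Illustrative patterns only: US-style SSNs and 13 to 16 digit card-like numbers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def luhn_valid(digits):
    """Luhn checksum over a string of digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text):
    """Replace SSN-like and Luhn-valid card-like patterns with placeholders."""
    text = SSN_RE.sub("[SSN]", text)

    def replace_card(match):
        digits = re.sub(r"\D", "", match.group())
        return "[CARD]" if luhn_valid(digits) else match.group()

    return CARD_RE.sub(replace_card, text)


print(redact("Customer SSN 123-45-6789, card 4111 1111 1111 1111."))
```

Pattern detection of this kind only catches what it has a pattern for. Names, addresses, and medical details typed into a notes field pass straight through, which is why wholesale replacement is often the safer default for high-risk columns.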

Referential integrity in subsets. Subsetting production means picking, say, 1,000 customers from 100,000 and then bringing along everything that belongs to them:

Related contacts.
Open opportunities.
Cases.
Orders.
All recursive related data.

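A naive subset that copies only the selected customer rows breaks referential integrity, because related records are either left behind or point at parents that were not copied. Tools that follow relationships (parent-to-child) preserve integrity. Building such tools for complex schemas (F&O especially) is significant work.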
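A relationship-following subset can be sketched as a fixed-point traversal over a list of foreign keys. Everything here is hypothetical: the table names, the relationship list, and the in-memory dictionaries stand in for whatever metadata and storage a real tool would use. One design assumption is worth calling out: each relationship carries a flag saying whether to follow it downward (parent to child). Lookups such as an owner are only followed upward, otherwise keeping one user would drag in every record that user owns.

```python
from collections import defaultdict

# Hypothetical schema: (child_table, fk_column, parent_table, follow_down).
RELATIONSHIPS = [
    ("contact", "account_id", "account", True),
    ("opportunity", "account_id", "account", True),
    ("case", "contact_id", "contact", True),
    ("opportunity", "owner_id", "user", False),  # lookup only, never cascades
]


def subset(tables, seed_table, seed_ids, relationships=RELATIONSHIPS):
    """Return {table: set(ids)} closed under the given relationships."""
    keep = defaultdict(set)
    keep[seed_table].update(seed_ids)
    changed = True
    while changed:
        changed = False
        for child, fk, parent, follow_down in relationships:
            for row_id, row in tables.get(child, {}).items():
                parent_id = row.get(fk)
                if parent_id is None:
                    continue
                # Parent kept: bring the child along (only for cascading links).
                if follow_down and parent_id in keep[parent] and row_id not in keep[child]:
                    keep[child].add(row_id)
                    changed = True
                # Child kept: its lookup target must exist in the subset.
                if row_id in keep[child] and parent_id not in keep[parent]:
                    keep[parent].add(parent_id)
                    changed = True
    return dict(keep)


tables = {
    "account": {"A1": {}, "A2": {}},
    "contact": {"C1": {"account_id": "A1"}, "C2": {"account_id": "A2"}},
    "opportunity": {
        "O1": {"account_id": "A1", "owner_id": "U1"},
        "O2": {"account_id": "A2", "owner_id": "U1"},
    },
    "case": {"K1": {"contact_id": "C1"}, "K2": {"contact_id": "C2"}},
    "user": {"U1": {}},
}
result = subset(tables, "account", {"A1"})
```

With account A1 as the seed, the result keeps C1, O1, K1, and U1, and leaves A2 and its records out even though O2 shares the same owner.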

Synthetic data generation.

Templated — predefined templates for common entities.
Generative — rule-based generation (names from lists, addresses from city tables).
AI-generated — LLMs generate plausible text content.
Combinatorial — exhaustive coverage of choice combinations.

For specific test scenarios (boundary conditions, rare combinations), synthetic data is more valuable than production subsets — production rarely contains the edge case you want to test.
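The generative and combinatorial approaches are simple to prototype. The sketch below uses small hard-coded lists and a seeded random generator so that the same data can be regenerated on demand; the field names, value lists, and choice options are illustrative assumptions rather than real Dynamics columns or option sets.

```python
import itertools
import random

FIRST_NAMES = ["Amara", "Luis", "Mei", "Tomasz"]
LAST_NAMES = ["Okafor", "Garcia", "Tanaka", "Nowak"]
CITIES = [("Leeds", "LS1"), ("Bristol", "BS1"), ("Cardiff", "CF10")]


def generate_contacts(count, seed=42):
    """Rule-based generation: names from lists, city and postcode kept consistent."""
    rng = random.Random(seed)
    contacts = []
    for i in range(count):
        first, last = rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)
        city, postcode_area = rng.choice(CITIES)
        contacts.append({
            "firstname": first,
            "lastname": last,
            "email": f"{first}.{last}.{i}@example.com".lower(),
            "city": city,
            "postcode_area": postcode_area,
        })
    return contacts


def combinatorial_cases(choices):
    """Combinatorial generation: one record per combination of choice values."""
    keys = list(choices)
    combos = itertools.product(*(choices[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]


# Hypothetical choice columns for a case record.
cases = combinatorial_cases({
    "priority": ["low", "normal", "high"],
    "origin": ["email", "phone"],
    "is_escalated": [True, False],
})
```

Three priorities, two origins, and two escalation states give twelve cases. Exhaustive combination grows multiplicatively, so for records with many choice columns it is usually applied to the handful of fields that drive business rules rather than to all of them.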

Tools.

Microsoft Test Data Generator — limited capability.
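Configuration Migration Tool — Dataverse-focused (customer engagement apps); F&O relies on the Data management framework and data packages instead.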
Third-party — Tonic.ai, IBM Optim, Delphix.
Custom scripts — Power Automate, PowerShell, custom .NET.

For large-scale TDM, dedicated commercial tools save time; for small projects, scripted custom approaches suffice.

Refresh cadence.

Per UAT cycle — refresh UAT data before each round.
Per major test phase — refresh SIT before integration testing.
Quarterly — refresh dev / playground environments.

Refresh always includes re-anonymisation; bypassing this is a compliance risk.
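One way to make a skipped re-anonymisation visible is an automated check that runs after every refresh. The sketch below scans selected fields for email addresses outside an allow-list of safe domains. The allow-list, field names, and record shape are assumptions for illustration; a real gate would cover more identifier types than email alone.

```python
import re

# Assumption: anonymised data only ever uses these reserved domains.
SAFE_EMAIL_DOMAINS = {"example.com", "example.org"}
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def find_leaks(records, fields):
    """Return (row_index, field, value) for each email outside the safe domains."""
    leaks = []
    for index, record in enumerate(records):
        for field in fields:
            value = record.get(field) or ""
            for match in EMAIL_RE.finditer(value):
                if match.group(1).lower() not in SAFE_EMAIL_DOMAINS:
                    leaks.append((index, field, match.group(0)))
    return leaks


refreshed = [
    {"email": "a.user@example.com", "notes": "Please call bob@contoso.com back"},
    {"email": "b.user@example.org", "notes": None},
]
leaks = find_leaks(refreshed, ["email", "notes"])
```

Here the structured email column is clean but the notes field still carries a real-looking address, so the check reports one leak. Failing the refresh pipeline when the list is non-empty turns the compliance rule into something enforced rather than remembered.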

Test data for specific scenarios.

Performance testing — large volume data; realistic distribution.
UAT — recent production-like data; familiar to stakeholders.
Functional testing — small, carefully crafted data covering specific scenarios.
Demo — clean, presentable data without anomalies.

Each scenario has different needs; one dataset rarely serves all.

Data versioning. As schemas evolve:

New columns need data.
Deleted columns leave gaps.
Renamed columns break references.

Test data needs versioning aligned with solution versioning — restore data from a backup matched to a solution version.
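A lightweight way to keep data and solution versions aligned is a manifest of data snapshots, each tagged with the solution version it was built against. The sketch below is hypothetical throughout: the snapshot names, paths, and the fallback rule (use the newest snapshot that does not exceed the target version when there is no exact match) are assumptions, not an established convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSnapshot:
    name: str
    solution_version: tuple  # e.g. (1, 4, 0)
    path: str


def parse_version(text):
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_snapshot(snapshots, target_version):
    """Newest snapshot whose solution version does not exceed the target."""
    target = parse_version(target_version)
    candidates = [s for s in snapshots if s.solution_version <= target]
    if not candidates:
        raise LookupError(f"no snapshot compatible with {target_version}")
    return max(candidates, key=lambda s: s.solution_version)


# Hypothetical manifest entries.
manifest = [
    DataSnapshot("baseline", (1, 0, 0), "snapshots/baseline.zip"),
    DataSnapshot("after-column-rename", (1, 4, 0), "snapshots/v1_4.zip"),
    DataSnapshot("v2-schema", (2, 0, 0), "snapshots/v2.zip"),
]
chosen = pick_snapshot(manifest, "1.9.3")
```

Comparing parsed tuples rather than raw strings avoids the classic mistake of sorting "1.10.0" before "1.9.0".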

Common pitfalls.

No regular refresh. Test data is six months stale; UAT participants report issues that production resolved long ago.
Anonymisation skipped. Production data refreshed to non-prod without anonymisation; compliance breach.
Free-text leaked. Anonymisation script handles structured fields but misses free-text; sensitive data slips through.
Subset broken. Subsetting tool misses relationships; tests fail with FK errors.
Synthetic data unrealistic. Generic names, default addresses; tests don't catch edge cases that real data would.
Environment data drift. Each test environment has different data; reproducing issues across environments impossible.
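Environment data drift in particular can be detected cheaply by fingerprinting each dataset and comparing the results across environments. The sketch below hashes a canonical JSON form of the records; the record shape and the "id" key are assumptions for the example.

```python
import hashlib
import json


def dataset_fingerprint(records, key="id"):
    """Order-independent SHA-256 fingerprint of a list of record dicts."""
    canonical = sorted(records, key=lambda r: str(r[key]))
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


sit_accounts = [{"id": 1, "name": "Northwind"}, {"id": 2, "name": "Tailspin"}]
uat_accounts = [{"id": 2, "name": "Tailspin"}, {"id": 1, "name": "Northwind"}]
drifted = [{"id": 1, "name": "Northwind"}, {"id": 2, "name": "Tailspin Toys"}]

same = dataset_fingerprint(sit_accounts) == dataset_fingerprint(uat_accounts)
differs = dataset_fingerprint(sit_accounts) != dataset_fingerprint(drifted)
```

Matching fingerprints mean two environments hold the same seed data regardless of row order, so a defect found in one should be reproducible in the other. A mismatch says only that something differs, not what, so it works best as a trigger for a closer comparison.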

Operational rhythm.

Pre-UAT — refresh, anonymise, validate.
Pre-release — production-like data for pre-prod.
Continuous — synthetic data templates for ongoing testing.
Periodic — full test data audit; identify stale or compromised sets.

Strategic positioning. Test data management is an investment that pays back continuously. Teams that skimp on TDM ship more defects to production. Teams that invest in it have higher-confidence releases and more meaningful UAT. The investment scales with environment count and regulatory exposure; smaller projects can manage with lighter tooling, larger ones need dedicated TDM platforms. Either way, treat test data as a programme asset: designed, maintained, governed.
