Azure Data Factory with Dynamics 365
How to use Azure Data Factory for Dynamics 365 data integration — connectors, common patterns, performance tuning, and when ADF is the right tool vs alternatives.
Azure Data Factory (ADF) is Microsoft's cloud-native ETL/ELT platform. For Dynamics 365 customers, ADF is one of several options for data movement — to and from Dataverse, F&O, and across hybrid scenarios. Understanding when ADF wins (and when other tools win) is foundational for designing the right integration architecture.
What ADF provides.
- Connectors to 90+ sources — Dataverse, F&O, SQL, Synapse, files, APIs.
- Pipelines — orchestrated workflows of data movement and transformation.
- Mapping data flows — visual transformation logic.
- Triggers — schedule, event-based, manual.
- Monitoring — pipeline runs, errors, performance.
- Integration with Azure DevOps / Git — pipeline definitions in code.
It's a comprehensive platform — large investment, large capability.
Connectors for Dynamics 365.
- Dataverse connector — read and write to Dataverse tables.
- Common Data Service: the former name of Dataverse; use the Dataverse connector for current work.
- Dynamics AX connector: the ADF connector that covers Finance and Operations; it reads F&O data entities over OData.
- Generic OData connector — fallback for entities not in dedicated connector.
For F&O, the data entity model is the integration surface. ADF typically connects through the OData endpoint; direct database access applies only in specific patterns, such as reading from a database that F&O has exported entities to (BYOD).
Pipeline patterns.
- Daily full refresh — read everything from source, write to destination.
- Incremental — only changed records since last run; uses last-modified-date or change tracking.
- Bulk historical migration — one-time large load.
- Continuous replication — frequent small batches.
Incremental is the dominant pattern for ongoing integrations; full refresh is for small data or where change tracking isn't available.
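The incremental pattern usually rests on a stored watermark. The following is a minimal Python sketch of that logic, not ADF's built-in mechanism. It assumes the source exposes a last-modified column (Dataverse's `modifiedon` is used here as the default), and the function names are hypothetical. In a pipeline, the same steps are typically a lookup of the old watermark, a filtered copy, and a watermark update that runs only on success.

```python
from datetime import datetime, timezone

def build_incremental_filter(last_watermark: datetime,
                             new_watermark: datetime,
                             column: str = "modifiedon") -> str:
    """Return an OData $filter selecting rows changed in (last, new]."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    lo = last_watermark.astimezone(timezone.utc).strftime(fmt)
    hi = new_watermark.astimezone(timezone.utc).strftime(fmt)
    # Upper bound is inclusive and lower bound exclusive, so consecutive
    # windows neither overlap (duplicates) nor leave gaps (missed rows).
    return f"{column} gt {lo} and {column} le {hi}"

def run_incremental(rows, last_watermark: datetime, now: datetime):
    """Pick rows changed since the last run; return them with the watermark to persist."""
    changed = [r for r in rows if last_watermark < r["modifiedon"] <= now]
    # Persist `now` as the new watermark only after the load has succeeded.
    return changed, now
```

Fixing the upper bound at the start of the run, rather than reading "now" again at the end, is what keeps rows modified during the run from being skipped on the next one.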
Common scenarios.
- Dataverse → data lake for analytics (alongside or instead of Synapse Link).
- F&O → data warehouse for reporting.
- External SQL → Dataverse for master data sync.
- Files (CSV, JSON, Parquet) → F&O for migrations.
- Dataverse → SQL for cross-application analytics.
Performance tuning.
- Parallel copies — multiple data partitions read concurrently.
- Compute scale: Data Integration Units (DIUs) for copy activities on the Azure integration runtime; core count for data flows.
- Filtering at source — push filters down rather than pulling all then filtering.
- Compression: prefer Parquet over CSV for file-based stages; columnar, compressed files are typically several times smaller and faster to read.
- Batch sizes — tune insert/update batch sizes for destination performance.
A well-tuned pipeline can move millions of rows per minute; a poorly tuned one can take hours to move the same data.
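As an illustration of where these knobs live, the sketch below builds a Copy activity definition as a Python dictionary and prints it as JSON. The property names (`parallelCopies`, `dataIntegrationUnits`, `writeBatchSize`, `sqlReaderQuery`) follow the Copy activity schema as I understand it; the activity name, the query, and the numeric values are assumptions chosen for the example, not recommended settings. Dataset references are omitted for brevity.

```python
import json

def tuned_copy_activity(source_query: str,
                        parallel_copies: int = 4,
                        dius: int = 8,
                        write_batch_size: int = 100) -> dict:
    """Build a Copy activity definition with the main tuning knobs set explicitly."""
    return {
        "name": "CopySqlToDataverse",  # hypothetical activity name
        "type": "Copy",
        "typeProperties": {
            "source": {
                "type": "AzureSqlSource",
                # Filter at the source so only the needed rows leave the database.
                "sqlReaderQuery": source_query,
            },
            "sink": {
                "type": "CommonDataServiceForAppsSink",
                "writeBehavior": "Upsert",
                # Rows sent to Dataverse per batch; tune against throttling limits.
                "writeBatchSize": write_batch_size,
            },
            "parallelCopies": parallel_copies,   # concurrent partitions/readers
            "dataIntegrationUnits": dius,        # compute for the copy itself
        },
    }

activity = tuned_copy_activity(
    "SELECT id, name FROM dbo.Customer WHERE modified_at > '2024-01-01'"
)
print(json.dumps(activity, indent=2))
```

Setting these values explicitly, then changing one at a time while watching the copy run's throughput in monitoring, tends to be more informative than leaving everything on automatic defaults.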
Authentication.
- Managed Identity — preferred; ADF runs with a managed identity, granted access to sources.
- Service Principal — when managed identity isn't applicable.
- Linked services with credentials — for SQL connections.
Managed identity eliminates secrets — major security advantage.
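Where a service principal is needed, the secret can still stay out of the linked service definition by referencing Azure Key Vault. The sketch below builds such a Dataverse linked service as a Python dictionary. The property names reflect the connector's JSON schema as I understand it and should be checked against the current connector documentation; the organization URL, client ID, Key Vault linked service name, and secret name are all hypothetical.

```python
import json

def dataverse_linked_service(service_uri: str,
                             client_id: str,
                             key_vault_linked_service: str,
                             secret_name: str) -> dict:
    """Build a Dataverse linked service that reads its secret from Key Vault."""
    return {
        "name": "DataverseLinkedService",  # hypothetical name
        "properties": {
            "type": "CommonDataServiceForApps",
            "typeProperties": {
                "deploymentType": "Online",
                "serviceUri": service_uri,
                "authenticationType": "AADServicePrincipal",
                "servicePrincipalId": client_id,
                "servicePrincipalCredentialType": "ServicePrincipalKey",
                # A reference to the secret, never the secret value itself.
                "servicePrincipalCredential": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": key_vault_linked_service,
                        "type": "LinkedServiceReference",
                    },
                    "secretName": secret_name,
                },
            },
        },
    }

linked_service = dataverse_linked_service(
    service_uri="https://contoso.crm.dynamics.com",      # hypothetical org URL
    client_id="00000000-0000-0000-0000-000000000000",    # placeholder app ID
    key_vault_linked_service="KeyVaultLinkedService",    # hypothetical
    secret_name="dataverse-sp-secret",                   # hypothetical
)
print(json.dumps(linked_service, indent=2))
```

Because the JSON holds only a reference, the definition can be committed to source control and promoted between environments without exposing the credential.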
Mapping data flows. Visual transformation:
- Filter, aggregate, join, pivot.
- Conditional split.
- Surrogate key generation.
- Slowly Changing Dimension handling.
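Behind the scenes, mapping data flows run on Apache Spark clusters that ADF provisions and manages; performance depends on the compute type and core count configured for the data flow's integration runtime.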
Comparison with alternatives.
| Tool | Strengths | Weaknesses |
|---|---|---|
| ADF | Comprehensive, scalable, code-first | Setup complexity, cost |
| Power Automate | Simple, low-code | Limited scale, throttling |
| Synapse Link | Real-time Dataverse → Synapse | One-way, less flexible |
| Fabric Pipelines | Newer, Fabric-integrated | Less mature than ADF |
| DMF (in F&O) | F&O-native | F&O-only |
| Custom code | Total flexibility | Maintenance burden |
For high-volume, complex, scheduled integrations: ADF. For low-volume reactive integrations: Power Automate. For real-time analytics replication: Synapse Link / Fabric Link.
Cost. ADF pricing:
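- Orchestration: a small charge per activity run (trigger and debug runs count too), billed per thousand runs.
- Pipeline activity execution: charged by execution time; the rate depends on activity type and integration runtime.
- Data flow compute: charged per vCore-hour while the Spark cluster runs.
- Data movement: charged per DIU-hour for copy operations.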
High-volume daily runs can be tens of dollars per day. Budget and monitor.
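A rough daily estimate is simple arithmetic over those meters. The Python sketch below shows the shape of the calculation. The rates in it are placeholders invented for the example, not Microsoft's prices; substitute the current figures for your region from the pricing page.

```python
from dataclasses import dataclass

@dataclass
class Rates:
    """Unit prices in your billing currency. Values are supplied by the caller."""
    per_1000_activity_runs: float
    per_diu_hour: float
    per_vcore_hour: float

def daily_cost(rates: Rates, activity_runs: int,
               diu_hours: float, vcore_hours: float) -> float:
    """Sum orchestration, data movement, and data flow compute for one day."""
    orchestration = activity_runs / 1000 * rates.per_1000_activity_runs
    movement = diu_hours * rates.per_diu_hour
    data_flows = vcore_hours * rates.per_vcore_hour
    return orchestration + movement + data_flows

# Placeholder rates for illustration only; NOT actual ADF pricing.
example_rates = Rates(per_1000_activity_runs=1.00,
                      per_diu_hour=0.25,
                      per_vcore_hour=0.30)

estimate = daily_cost(example_rates, activity_runs=2000,
                      diu_hours=40, vcore_hours=16)
print(f"Estimated daily cost: {estimate:.2f}")
```

With these made-up rates, 2,000 activity runs, 40 DIU-hours, and 16 vCore-hours come to roughly 16.80 per day, which shows how data movement and data flow compute, not orchestration, usually dominate the bill.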
Source control. ADF pipelines as JSON in git:
- Pipeline definitions reviewable.
- Environment promotion via Azure DevOps releases or GitHub Actions.
- Branch-based development.
Mature ADF deployments treat pipelines as code, not as artifacts edited in the portal.
Error handling.
- Activity-level retry: automatic retries, with a configurable retry count and interval between attempts.
- Failure paths — pipeline branches on success/failure.
- Email / Teams notifications — alerts on failure.
- Logging — to Log Analytics for centralised observability.
Robust pipelines handle expected failure modes (transient errors, downstream throttling) and alert on unexpected.
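A retry policy and a failure path can be expressed directly in the pipeline definition. The sketch below builds two activities as Python dictionaries: a Copy activity with a retry policy, and a Web activity that runs only when the copy fails. The `policy` and `dependsOn` property names follow the pipeline JSON schema as I understand it; the activity names, the webhook URL, and the message body are hypothetical, and the copy's source and sink are omitted.

```python
import json

def copy_with_failure_alert(webhook_url: str,
                            retries: int = 3,
                            retry_interval_s: int = 60) -> list:
    """Return a Copy activity with retries plus an alert that fires on failure."""
    copy = {
        "name": "CopyAccounts",  # hypothetical activity name
        "type": "Copy",
        "policy": {
            "retry": retries,                            # attempts after the first failure
            "retryIntervalInSeconds": retry_interval_s,  # wait between attempts
            "timeout": "0.02:00:00",                     # d.hh:mm:ss
        },
        "typeProperties": {},  # source and sink omitted in this sketch
    }
    alert = {
        "name": "NotifyOnFailure",
        "type": "WebActivity",
        # Runs only if the copy still fails after its retries are exhausted.
        "dependsOn": [
            {"activity": copy["name"], "dependencyConditions": ["Failed"]}
        ],
        "typeProperties": {
            "url": webhook_url,
            "method": "POST",
            "body": {"text": "CopyAccounts failed"},
        },
    }
    return [copy, alert]

activities = copy_with_failure_alert("https://example.com/hooks/adf-alerts")
print(json.dumps(activities, indent=2))
```

Retries absorb transient errors and throttling; the failure branch covers whatever is left, so that an unexpected failure produces an alert instead of a silent gap in the data.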
Common pitfalls.
- No incremental logic. Full refresh daily; expensive and slow.
- No watermarking. Incremental logic without proper checkpointing; data missed or duplicated.
- Hardcoded credentials. Secrets stored directly in linked services are a security risk; use managed identity or Azure Key Vault references instead.
- No alerting. Pipeline silently fails; data integration broken for days.
- Source schema changes. The source adds a column and the pipeline fails. Schema-on-read patterns, such as schema drift handling in mapping data flows, help.
- Performance unoptimized. Default settings; slow operations; high cost.
Integration with Microsoft Fabric. Fabric pipelines (the spiritual successor to ADF) are emerging:
- Built into Fabric.
- Same conceptual model as ADF.
- Better integration with Lakehouse / Warehouse.
- More limited capability initially; closing the gap.
For new Fabric-centric deployments, Fabric pipelines may be preferred; for existing ADF investments, no urgency to migrate.
Strategic positioning. ADF is the workhorse for serious data integration in Microsoft's cloud. For Dynamics 365 customers, it's the right tool for high-volume, complex, scheduled integrations between Dynamics and other systems. The investment in setup pays back through reliability and scale; the cost is real but manageable. Most enterprise Dynamics 365 deployments end up with significant ADF footprint — it's an architectural pillar of the integration story.