ml-connector
Microsoft Dynamics 365 F&ODatabricks

Microsoft Dynamics 365 F&O and Databricks integration

Microsoft Dynamics 365 F&O runs financials, procurement, and supply chain. Databricks runs analytics, data engineering, and machine learning on Delta Lake. Connecting the two moves your ERP records out of the transactional system and into a warehouse where they can be modeled and reported without straining the ERP. ml-connector reads finance and procurement entities from F&O over OData and writes them as rows into Databricks tables, then triggers a Databricks Job to transform them. The flow is one-directional into Databricks, because Databricks holds no source-of-truth financial records to send back.

How Microsoft Dynamics 365 F&O works

Microsoft Dynamics 365 F&O exposes all public data entities through OData v4 (REST and JSON) at a single tenant-specific root URL such as https://contoso.operations.dynamics.com/data/. Key finance and procurement entities include VendorsV2, VendorInvoiceHeaders and VendorInvoiceLines, PurchaseOrderHeadersV2, CustomersV3, ProductsV2, MainAccounts, and the read-heavy GeneralJournalAccountEntries. Authentication is OAuth 2.0 client credentials through Microsoft Entra ID, with a Bearer token scoped to the environment host. F&O can push outbound Business Events (for example vendor invoice posted or purchase order confirmed), but those payloads are lightweight stubs carrying identifiers and a ControlNumber, so the full record is still read back over OData; large or steady reads are done by polling with server-driven paging.

How Databricks works

Databricks is a cloud data platform, not an ERP, so it has no native vendor, invoice, purchase order, payment, or GL account objects. Its REST API manages compute (clusters, Jobs, SQL warehouses) and data governance through Unity Catalog (catalogs, schemas, tables), but the REST table endpoints handle metadata only. Actual rows are written through the SQL Statement Execution API against a running SQL warehouse, or through a Databricks Job that runs a notebook. Authentication is OAuth 2.0 client credentials (machine-to-machine) using a service principal, with a Basic-auth call to the workspace /oidc/v1/token endpoint returning a one-hour Bearer token. Databricks has no inbound webhook to announce that ERP data should be pulled, so the load is driven on a schedule.

What moves between them

The flow runs one direction, from Microsoft Dynamics 365 F&O into Databricks. ml-connector reads vendors, vendor invoices, purchase orders, customers, products, main accounts, and posted general journal entries over OData and lands each entity as rows in its own Delta Lake table inside a Unity Catalog schema. F&O Business Events such as invoice posted or purchase order confirmed can trigger an incremental pull as records change, and a scheduled poll backfills any event that did not arrive. After a batch lands, ml-connector triggers a Databricks Job to model the raw tables. Nothing is written back to F&O, since Databricks holds analytics copies, not source financial records.

How ml-connector handles it

ml-connector stores both credential sets encrypted. On the ERP side it requests an Entra ID client-credential token scoped to the F&O environment host and refreshes it when a call returns 401, and it accepts the full tenant URL per customer because F&O publishes no shared base address. On the Databricks side it gets a service-principal OAuth token from the workspace /oidc/v1/token endpoint, also refreshed before the one-hour expiry. Because Databricks has no inbound table webhook, F&O is polled on a schedule, following @odata.nextLink for server-driven paging and honoring the Retry-After header on HTTP 429. Each F&O entity maps to a Delta Lake table, and natural keys such as InvoiceNumber with VendorAccount become the row identity so re-reads upsert instead of duplicate. Every load is submitted with a Jobs idempotency_token so a network retry returns the existing run rather than loading the batch twice. Real edge cases handled: cross-company reads need cross-company=true plus a dataAreaId filter for multi-entity tenants; financial dimensions arrive as a formatted display string; and the first OData call after an AOS restart can take 30 to 60 seconds, so timeouts are generous on the warm-up request. Every record carries a full audit trail and can be replayed if a downstream Databricks write fails.

A real-world example

A mid-sized industrial distributor with roughly 600 employees runs Microsoft Dynamics 365 F&O for finance, procurement, and inventory across several legal entities. The finance and operations teams wanted spend and margin dashboards that joined vendor invoices, purchase orders, and GL entries, but running those heavy queries directly against the ERP slowed down day-to-day transactions, and exporting to spreadsheets each month meant the numbers were always stale. With F&O connected to Databricks, vendors, invoices, purchase orders, and posted journal entries land in Delta Lake on a schedule, a Databricks Job models them into reporting tables, and analysts query the warehouse instead of the ERP. Dashboards refresh on current data and the transactional system is left to do its job.

What you can do

  • Land vendors, vendor invoices, purchase orders, customers, products, and posted general journal entries from F&O as rows in Databricks Delta Lake tables.
  • Map each F&O OData entity to its own Unity Catalog table, using natural keys so re-reads upsert rather than duplicate.
  • Bridge Entra ID client-credential tokens on the F&O side and a Databricks service-principal OAuth token on the warehouse side, refreshing both before expiry.
  • Trigger a Databricks Job after each batch lands, with an idempotency token so a retry never double-loads.
  • Poll F&O on a schedule with server-driven paging and Retry-After handling, plus a full audit trail and error replay on every record.

Questions

Which direction does data move between Microsoft Dynamics 365 F&O and Databricks?
It moves one direction, from F&O into Databricks. Vendors, invoices, purchase orders, customers, products, main accounts, and posted general journal entries are read over OData and written into Delta Lake tables. Databricks holds analytics copies rather than source financial records, so ml-connector does not write anything back into F&O.
Can Databricks store F&O invoices and GL entries directly through its REST API?
Not directly. The Databricks REST API manages table metadata in Unity Catalog, but it does not insert rows. ml-connector writes the actual F&O records using the SQL Statement Execution API against a running SQL warehouse, or through a Databricks Job, so the data lands in Delta Lake while the REST API governs the table definitions.
How does the integration know when to pull new F&O data without a Databricks webhook?
Databricks has no inbound webhook to request an ERP pull, so the load runs on a schedule you set. F&O Business Events such as invoice posted or purchase order confirmed can also trigger an incremental read, but because those payloads are only identifier stubs, ml-connector still calls back over OData for the full record, and a scheduled poll backfills anything an event missed.

Related integrations

Connect Microsoft Dynamics 365 F&O and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.

Get started