Workday Financial Management and Databricks integration

Workday Financial Management runs accounts payable, procurement, and general ledger. Databricks stores and transforms that financial data for reporting and analysis. Connecting the two moves invoices, purchase orders, GL accounts, and journal entries from Workday into Databricks on a schedule you control, so your data warehouse always reflects the latest transactional state. No manual exports or middleware required.

How Workday Financial Management works

Workday Financial Management exposes suppliers, supplier invoices, purchase orders, GL accounts, and journal entries through both SOAP and REST APIs. The SOAP endpoint at https://{hostname}.myworkday.com/ccx/service/{tenant}/Financial_Management/v46.1 handles full CRUD operations with WS-Security UsernameToken authentication. The REST endpoint at https://{hostname}.workday.com/ccx/api/v1/{tenant}/ provides lighter reads with OAuth2 refresh-token authentication. Workday has no native webhooks for financial events, so data is retrieved by polling with date-range filters. Recommended polling intervals are 15 to 60 minutes for transactional entities and daily for reference data such as suppliers. The Integration System User account must be set up and assigned appropriate security groups before the connector can authenticate.
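
As a rough illustration of the WS-Security UsernameToken authentication described above, the sketch below wraps a request body in a SOAP 1.1 envelope with a plain-text UsernameToken header. The namespace URIs come from the SOAP 1.1 and OASIS WS-Security specifications. The function name build_envelope, the Example_Get_Request element, and its Updated_From and Updated_Through children are hypothetical placeholders, not Workday operation names; the real request elements depend on the Workday WSDL for your tenant.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSSE_NS = (
    "http://docs.oasis-open.org/wss/2004/01/"
    "oasis-200401-wss-wssecurity-secext-1.0.xsd"
)
PASSWORD_TEXT = (
    "http://docs.oasis-open.org/wss/2004/01/"
    "oasis-200401-wss-username-token-profile-1.0#PasswordText"
)


def build_envelope(username: str, password: str, body: ET.Element) -> bytes:
    """Wrap `body` in a SOAP envelope carrying a WS-Security UsernameToken."""
    ET.register_namespace("soapenv", SOAP_NS)
    ET.register_namespace("wsse", WSSE_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(token, f"{{{WSSE_NS}}}Username").text = username
    pw = ET.SubElement(token, f"{{{WSSE_NS}}}Password", {"Type": PASSWORD_TEXT})
    pw.text = password

    soap_body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    soap_body.append(body)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Hypothetical request element with a date-range filter (names are placeholders).
request = ET.Element("Example_Get_Request")
ET.SubElement(request, "Updated_From").text = "2024-01-01T00:00:00Z"
ET.SubElement(request, "Updated_Through").text = "2024-01-01T01:00:00Z"

payload = build_envelope("isu_user", "secret", request)
```

The resulting bytes would be sent as the body of an HTTP POST to the SOAP endpoint. Building the header with an XML library rather than string formatting keeps credentials containing characters such as & or < correctly escaped.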

How Databricks works

Databricks is a cloud data platform that ingests and stores data in Delta Lake tables via REST APIs. The workspace-specific endpoint is https://{workspace-id}.cloud.databricks.com (AWS), https://{workspace-name}.azuredatabricks.net (Azure), or https://{workspace-id}.gcp.databricks.com (GCP), all using /api/2.0/ or /api/2.1/ prefixes. Authentication uses OAuth2 Client Credentials with a Service Principal client_id and client_secret, or optionally a Personal Access Token. OAuth2 bearer tokens expire after 3600 seconds, so token refresh is required for long-running data loads. Databricks has no native ERP or financial objects; it stores whatever structure the source provides.
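
One way to deal with the 3600-second token lifetime is a small cache that fetches a new token shortly before the old one expires. The sketch below is illustrative only: TokenCache, fetch_databricks_token, and the 300-second safety margin are assumptions, not part of ml-connector. The token request assumes the workspace-level /oidc/v1/token endpoint with grant_type=client_credentials and scope=all-apis; verify this against the Databricks OAuth documentation for your cloud before relying on it.

```python
import base64
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Optional, Tuple


def fetch_databricks_token(
    workspace_url: str, client_id: str, client_secret: str
) -> Tuple[str, int]:
    """Request a token with the client-credentials grant (endpoint is assumed)."""
    data = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    req = urllib.request.Request(f"{workspace_url}/oidc/v1/token", data=data)
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    req.add_header("Authorization", f"Basic {basic}")
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return payload["access_token"], int(payload["expires_in"])


class TokenCache:
    """Return a cached token, refreshing it `skew_seconds` before expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, int]],
        skew_seconds: int = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            token, ttl = self._fetch()
            self._token = token
            self._expires_at = now + ttl
        return self._token
```

A caller would construct the cache once, for example TokenCache(lambda: fetch_databricks_token(url, client_id, client_secret)), and call get() before each API request, so a load that runs longer than an hour never sends an expired token.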

What moves between them

Financial data flows from Workday Financial Management into Databricks. Invoices, purchase orders, GL accounts, and journal entries are polled from Workday on a regular schedule (15- to 60-minute intervals for transactional data) and written as new or updated rows into Databricks Delta Lake tables. No data flows back from Databricks to Workday. Reference data such as GL account hierarchies and supplier lists can be refreshed daily to keep dimension tables current. Each record ingested into Databricks carries the Workday source timestamp and the ml-connector sync batch timestamp, so lineage is preserved and records can be re-synced if downstream processing fails.
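
The polling and lineage behavior described above can be pictured as two small steps: split the time since the last successful sync into date-range windows, then stamp each record with both timestamps before it is written. This is a minimal sketch under assumptions; the function names and the _source_updated_at and _sync_batch_at column names are hypothetical and are not ml-connector's actual schema.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator, Tuple


def polling_windows(
    last_synced: datetime, now: datetime, interval: timedelta
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive [start, end) date ranges covering last_synced..now."""
    start = last_synced
    while start < now:
        end = min(start + interval, now)
        yield start, end
        start = end


def stamp(record: dict, source_ts_field: str, batch_ts: datetime) -> dict:
    """Copy a record and attach lineage columns (column names are illustrative)."""
    out = dict(record)
    out["_source_updated_at"] = record[source_ts_field]
    out["_sync_batch_at"] = batch_ts.isoformat()
    return out


batch_ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
windows = list(
    polling_windows(batch_ts, batch_ts + timedelta(minutes=70), timedelta(minutes=30))
)
row = stamp(
    {"id": "INV-1", "last_updated": "2024-01-01T00:05:00Z"}, "last_updated", batch_ts
)
```

Because the windows are contiguous and the last one is clipped to the current time, a missed poll simply produces more windows on the next run rather than a gap in coverage.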

How ml-connector handles it

ml-connector uses Workday's SOAP API to retrieve financial records with date-range filters, polling at your configured schedule (typically 15 to 60 minutes for transactional data, daily for reference data). It attaches a valid WS-Security UsernameToken header to every SOAP request using the Integration System User credentials you provide. On the Databricks side, ml-connector refreshes the OAuth2 bearer token before its 3600-second expiry and writes transformed records as INSERT or UPSERT operations into Delta Lake tables. Before attempting the first insert, ml-connector creates or updates table schemas in your specified Databricks catalog and schema to match the Workday financial record structure. Invoices and POs are keyed by their Workday external ID to prevent duplicate rows on re-sync; GL entries are keyed by journal entry ID plus line number. If a Databricks write fails, ml-connector queues the batch for replay so no records are lost in transit.
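
Keyed upserts of this kind are commonly expressed in Delta Lake as a MERGE INTO statement. The sketch below builds one from a list of key columns; it is an illustration of the idea, not ml-connector's actual SQL. The function name and the table and column names (fin.workday.gl_lines, gl_lines_batch, journal_entry_id, line_number) are hypothetical.

```python
import re
from typing import Sequence

# Accept only simple identifiers, optionally dotted (catalog.schema.table).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_merge_sql(target: str, source: str, key_columns: Sequence[str]) -> str:
    """Build a Delta Lake MERGE that updates matching keys and inserts the rest."""
    if not key_columns:
        raise ValueError("at least one key column is required")
    for name in (target, source, *key_columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    on_clause = " AND ".join(f"t.{col} = s.{col}" for col in key_columns)
    return (
        f"MERGE INTO {target} t\n"
        f"USING {source} s\n"
        f"ON {on_clause}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


# GL lines use a composite key: journal entry ID plus line number.
gl_sql = build_merge_sql(
    "fin.workday.gl_lines", "gl_lines_batch", ["journal_entry_id", "line_number"]
)
```

Since the match condition covers the full key, running the same batch twice updates the existing rows instead of inserting new ones, which is what makes replay safe.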

A real-world example

A mid-market professional services firm runs Workday Financial Management for billing, procurement, and GL, and uses Databricks as its analytics platform. The finance team needs near-real-time visibility into invoice aging, AP cash flow, and project profitability, but currently exports Workday invoice and PO data monthly and reloads it into Databricks by hand. With Workday and Databricks connected, invoices and POs flow into Databricks every hour, and the analytics team can build dashboards and reports directly against current data. Month-end close reconciliation starts with fresh numbers, and the finance team can spot AP aging issues or budget overages the same day they occur.

What you can do

  • Poll Workday Financial Management SOAP APIs on a schedule and ingest invoices, purchase orders, GL accounts, and journal entries into Databricks Delta Lake.
  • Handle Databricks OAuth2 token refresh and Workday SOAP WS-Security authentication, with credentials encrypted at rest.
  • Map Workday financial record structures into Databricks schemas and create or update tables as needed.
  • Track source timestamps and sync batch identifiers in every ingested record, preserving data lineage and enabling replay on downstream failures.
  • Deduplicate records by Workday external ID and replay failed batches without creating duplicates.

Questions

Does Workday Financial Management support webhooks so data can be pushed to Databricks in real time?
No. Workday Financial Management has no native webhook system for financial events. ml-connector polls the SOAP API on a schedule you define, typically every 15 to 60 minutes for invoices and POs, and daily for reference data like suppliers and GL accounts. Databricks therefore lags Workday by at most one polling interval.

How does ml-connector handle authentication for both systems?
For Workday, ml-connector stores your Integration System User credentials and attaches a WS-Security UsernameToken header to every SOAP request. For Databricks, it uses OAuth2 Client Credentials with your Service Principal client_id and client_secret, refreshing the bearer token before its 3600-second expiry. Both credential sets are encrypted in ml-connector's store.

What happens if a Databricks write fails or the connection drops mid-batch?
ml-connector queues failed batches for replay and retries with exponential backoff. Because each Workday record is keyed by its external ID, replaying the batch does not create duplicate rows in Databricks. You can also manually re-sync a date range if needed.
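
The retry-and-replay behavior in the last answer can be sketched as a write wrapper with exponential backoff plus a queue that keeps any batch that still fails. This is a minimal sketch under assumptions: the function names, the five-attempt limit, and the one-second base delay are illustrative and are not ml-connector's actual settings.

```python
import time
from collections import deque
from typing import Callable, Deque, List


def write_with_backoff(
    write: Callable[[List[dict]], None],
    batch: List[dict],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a write up to max_attempts times, doubling the delay after each failure."""
    for attempt in range(max_attempts):
        try:
            write(batch)
            return True
        except Exception:  # broad on purpose for the sketch; narrow this in practice
            if attempt == max_attempts - 1:
                return False
            sleep(base_delay * 2 ** attempt)
    return False


def drain(
    queue: Deque[List[dict]], write: Callable[[List[dict]], None], **retry_options
) -> int:
    """Replay each queued batch once; batches that still fail stay queued."""
    written = 0
    for _ in range(len(queue)):
        batch = queue.popleft()
        if write_with_backoff(write, batch, **retry_options):
            written += 1
        else:
            queue.append(batch)
    return written
```

Replay of this kind is only safe when the write itself is idempotent, which is why it is paired with keyed upserts on the Workday external ID rather than plain inserts.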

Connect Workday Financial Management and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.