QAD and Databricks integration

QAD runs manufacturing, procurement, and finance. Databricks runs analytics and data engineering on Apache Spark and Delta Lake. Connecting the two moves QAD business records into Databricks so reporting, modeling, and spend analysis run on current operational data instead of stale exports. Databricks holds no invoices, vendors, or GL accounts of its own, so this connection treats it as a governed data destination rather than a financial system. ml-connector reads QAD on a schedule and writes the data into Delta Lake tables that are registered and access-controlled in Unity Catalog.

How QAD works

QAD Adaptive ERP exposes suppliers, purchase orders, supplier invoices, GL accounts, cost centers, items, goods receipts, AP payments, and customers through REST business document APIs, documented in Swagger inside each customer instance. The cloud product authenticates with a JWT session or OAuth2 bearer token against a tenant-specific URL, so there is no shared hostname. Older on-premise sites run QAD Enterprise Edition with the QXtend SOAP framework instead. QAD has no public webhook system for cloud connectors, so finance records are read by polling on a schedule.

How Databricks works

Databricks exposes compute, data governance, and identity through the Databricks REST API, a JSON-over-HTTPS interface split across 2.0 and 2.1 versioned paths under a workspace-specific URL. It authenticates with OAuth 2.0 client credentials using a service principal, returning a Bearer token that expires after one hour. No REST endpoint accepts table rows directly; instead, ml-connector runs INSERT and MERGE statements through the SQL Statement Execution API against a running SQL warehouse, or triggers Databricks Jobs for larger loads. Databricks has no platform-wide webhooks for table writes or job completion, so loads are driven and confirmed by polling.
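
As a rough illustration of the client credentials step, the sketch below builds the token request without sending it. It is not ml-connector's actual code: the function name is hypothetical, and the /oidc/v1/token path and the all-apis scope are assumptions to check against your own workspace.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    # Assumption: the workspace token endpoint is /oidc/v1/token and the
    # "all-apis" scope is wanted; confirm both for your workspace.
    url = workspace_url.rstrip("/") + "/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The service principal's client id and secret travel as HTTP Basic auth.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", f"Basic {basic}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req
```

Passing the returned request to urllib.request.urlopen would perform the exchange; the JSON response is expected to carry the access token and its lifetime in seconds.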

What moves between them

Data moves from QAD into Databricks. ml-connector reads QAD suppliers, purchase orders, supplier invoices, goods receipts, AP payments, GL accounts, and cost centers on a schedule and writes each record set into a matching Delta Lake table, keyed so that re-reads update existing rows rather than duplicate them. New target tables are created and registered in Unity Catalog with the right catalog and schema before data lands. The flow is one-directional for business records: ml-connector only reads from QAD and only writes to Databricks, so analytics output is never written back into the ERP. Cadence follows your reporting needs, from hourly polls to nightly batch loads.
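
The keyed write described above is typically a MERGE. The sketch below shows one way to generate such a statement with named parameter markers; the function name and the table and column names used with it are hypothetical, and the SQL ml-connector actually issues may differ.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check(name: str) -> str:
    # Reject anything that is not a plain identifier, so names cannot
    # smuggle SQL into the generated statement.
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def build_merge(table: str, key_cols: list[str], value_cols: list[str]) -> str:
    """Return a MERGE that upserts one row supplied as named parameters."""
    # table is expected as a Unity Catalog name: catalog.schema.table
    target = ".".join(_check(part) for part in table.split("."))
    keys = [_check(c) for c in key_cols]
    values = [_check(c) for c in value_cols]
    if not keys:
        raise ValueError("at least one key column is required")
    cols = keys + values
    select = ", ".join(f":{c} AS {c}" for c in cols)
    on = " AND ".join(f"t.{c} = s.{c}" for c in keys)
    sql = f"MERGE INTO {target} AS t USING (SELECT {select}) AS s ON {on}"
    if values:
        update = ", ".join(f"t.{c} = s.{c}" for c in values)
        sql += f" WHEN MATCHED THEN UPDATE SET {update}"
    insert_cols = ", ".join(cols)
    insert_vals = ", ".join(f"s.{c}" for c in cols)
    sql += f" WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    return sql
```

For example, build_merge("finance.qad.suppliers", ["supplier_id"], ["name"]) yields a statement that updates the supplier's name when the id already exists and inserts a new row otherwise, which is what makes a re-read safe.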

How ml-connector handles it

ml-connector stores both credential sets encrypted. On the QAD side it accepts the full tenant URL per customer, since QAD publishes no shared base address, and validates entity paths against that instance. On the Databricks side it runs the OAuth 2.0 client credentials flow for the service principal and refreshes the one-hour Bearer token before it expires rather than waiting for a 401. Because neither QAD cloud nor Databricks offers webhooks for this data, the connection polls QAD on your schedule and loads each batch into Delta Lake through the SQL Statement Execution API, polling the returned statement id until the load finishes. Larger loads can run as Databricks Jobs, which accept an idempotency token so a retry after a network error returns the existing run instead of starting a duplicate. Tables are registered in Unity Catalog first, so every write targets a catalog and schema that already exist, and workspaces still on the legacy Hive Metastore are detected so the correct path is used. Databricks signals rate limiting with HTTP 429, so ml-connector backs off with jitter and retries. Every record carries a full audit trail and can be replayed if a load fails.
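
Refreshing ahead of expiry, rather than reacting to a 401, can be sketched as a small cache. This is an illustration only: the TokenCache class, the fetch callable, and the five-minute margin are assumptions, not ml-connector internals.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch is a hypothetical callable that performs the token exchange
        # and returns (access_token, expires_in_seconds).
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet, or when the remaining lifetime
        # has dropped to the safety margin or below.
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

With a one-hour token and a 300-second margin, a new token is requested once the current one is 55 minutes old, so a request in flight never carries a token that is about to lapse.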

A real-world example

A mid-sized contract manufacturer runs QAD Adaptive ERP across several plants and wants a single view of procurement spend and supplier performance. Before the integration, an analyst pulled CSV extracts of purchase orders and supplier invoices from QAD each week and uploaded them into the data platform by hand, which meant the dashboards were always days behind and broke whenever a column changed. With QAD and Databricks connected, purchase orders, invoices, goods receipts, and AP payments land in governed Delta Lake tables automatically on a nightly schedule. Analysts query current data through SQL warehouses, and the manual extract-and-upload step is gone.

What you can do

  • Land QAD suppliers, purchase orders, supplier invoices, goods receipts, AP payments, and GL records into Delta Lake tables on a schedule.
  • Register and govern the target tables in Unity Catalog before data is written.
  • Run loads through the SQL Statement Execution API or Databricks Jobs, with idempotency tokens to prevent duplicate runs.
  • Bridge QAD tenant login and Databricks OAuth 2.0 service principal tokens, refreshing the one-hour token before it expires.
  • Poll on a cadence you control, with 429 backoff, retries, and a full audit trail on every record.
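
Two of the items above, idempotency tokens and 429 backoff, can be sketched together. The helper names are hypothetical, the send callable stands in for a real HTTP call, and the 64-character token limit and the presence of a Retry-After hint are assumptions to verify against the Databricks documentation.

```python
import hashlib
import random
import time
from typing import Callable, Optional, Tuple


def idempotency_token(job_id: int, batch_key: str) -> str:
    """Derive a stable token so a retried trigger maps to the same run."""
    # SHA-256 hex is 64 characters; assumption: that fits the token limit.
    return hashlib.sha256(f"{job_id}:{batch_key}".encode("utf-8")).hexdigest()


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full-jitter exponential backoff that honors a Retry-After hint."""
    ceiling = min(cap, base * (2 ** attempt))
    delay = rng() * ceiling
    if retry_after is not None:
        # Never retry sooner than the server asked for.
        delay = max(delay, retry_after)
    return delay


def send_with_retry(
    send: Callable[[], Tuple[int, Optional[float]]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> int:
    """Call send() until it returns a non-429 status or attempts run out."""
    # send is a hypothetical callable returning (http_status, retry_after_seconds).
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt, retry_after=retry_after, rng=rng))
    return 429
```

Because the token is derived from the job id and a batch key rather than generated at random, every retry of the same batch sends the same token, which is what lets the Jobs API return the existing run.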

Questions

Which direction does data move between QAD and Databricks?
Data moves from QAD into Databricks. ml-connector reads QAD records such as purchase orders, supplier invoices, goods receipts, and GL postings and writes them into Delta Lake tables. QAD is treated as a read-only source and Databricks as the destination, so ml-connector never writes analytics output back into QAD.
Can Databricks store QAD invoices and GL data directly?
Databricks has no native invoice, vendor, or GL account objects, so the data is stored as rows in Delta Lake tables rather than as ERP records. ml-connector creates and registers those tables in Unity Catalog, then loads the QAD data into them through the SQL Statement Execution API. From there the records are queryable through SQL warehouses for reporting and modeling.
How does the integration handle authentication and the lack of webhooks?
ml-connector accepts the full QAD tenant URL per customer and authenticates Databricks with OAuth 2.0 client credentials using a service principal, refreshing the one-hour Bearer token before it expires. Because neither QAD cloud nor Databricks offers webhooks for this data, the connection polls QAD on your schedule and loads each batch into Delta Lake, confirming completion by polling the returned statement or job run.
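
The completion check in the last answer can be sketched as a polling loop. The function name and the get_state callable are hypothetical; the state names are those the SQL Statement Execution API reports, and the interval and timeout are arbitrary defaults.

```python
import time
from typing import Callable

# Terminal states reported by the SQL Statement Execution API.
TERMINAL = {"SUCCEEDED", "FAILED", "CANCELED", "CLOSED"}


def wait_for_statement(
    get_state: Callable[[str], str],
    statement_id: str,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the statement reaches a terminal state or the timeout passes."""
    # get_state is a hypothetical callable that fetches the statement's
    # current state (for example via GET on the statement id).
    deadline = clock() + timeout
    while True:
        state = get_state(statement_id)
        if state in TERMINAL:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"statement {statement_id} still {state}")
        sleep(interval)
```

A caller would treat SUCCEEDED as a confirmed load and anything else, including a timeout, as a batch to replay.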

Connect QAD and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.
