QuickBooks Online and Databricks integration
QuickBooks Online is an accounting system for small and mid-market businesses that manages invoices, bills, expenses, payroll, and the general ledger. Databricks is a data intelligence platform where finance teams land, combine, and analyze that accounting data alongside operational and sales data. Syncing QuickBooks Online to Databricks connects your accounting ledger to your analytics stack, so your CFO can report on cash, expenses, and payroll trends without re-keying data or reconciling spreadsheet versions. ml-connector handles the OAuth 2.0 token exchange and refresh, and maps QuickBooks Online entities into Databricks tables in your chosen catalog.
What moves between them
Financial records flow from QuickBooks Online into Databricks. ml-connector reads vendors, customers, invoices, bills, journal entries, GL accounts, and departments from QuickBooks Online and writes them as rows into Databricks tables in your chosen catalog and schema. Full syncs run on a schedule you define (hourly, daily, or weekly), while webhooks and polling pick up individual changes as they happen. When QuickBooks Online fires a webhook for an update or delete, ml-connector fetches the full record and updates the corresponding Databricks table row. Records that QuickBooks Online marks inactive instead of hard-deleting (vendors, customers, and accounts) are flagged in Databricks so your archival logic can act on them. The sync is one-way: Databricks does not write back to QuickBooks Online.
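As a rough illustration of how an inactive record could end up flagged rather than dropped, here is a minimal Python sketch. The function name `to_row` and the column names (`qbo_id`, `is_archived`, `synced_at`, and so on) are hypothetical and are not ml-connector's actual schema; the input keys (`Id`, `Active`, `DisplayName`, `Name`, `MetaData.LastUpdatedTime`) are assumed to follow the QuickBooks Online record shape.

```python
from datetime import datetime, timezone
from typing import Any


def to_row(entity: str, record: dict[str, Any], operation: str = "Update") -> dict[str, Any]:
    """Flatten one QuickBooks Online record into a row for a Databricks table.

    Column names here are illustrative assumptions, not a fixed schema.
    """
    # Vendors, customers, and accounts carry an Active flag. A missing flag is
    # treated as active so that transactions are not flagged by accident.
    inactive = record.get("Active", True) is False
    return {
        "qbo_id": str(record["Id"]),
        "entity": entity,
        "display_name": record.get("DisplayName") or record.get("Name"),
        "last_updated": record.get("MetaData", {}).get("LastUpdatedTime"),
        # Flag the row instead of removing it; downstream retention logic decides.
        "is_archived": inactive or operation == "Delete",
        "synced_at": datetime.now(timezone.utc).isoformat(),
    }
```

A downstream view could then filter on `is_archived` to hide archived vendors from dashboards while keeping their history available for audits.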
How ml-connector handles it
ml-connector stores the QuickBooks Online OAuth 2.0 credentials encrypted and refreshes the one-hour access token automatically. It subscribes to QuickBooks Online webhooks for Create, Update, Delete, and Void events on accounts, invoices, bills, vendors, customers, and journal entries. Because the webhook payload is minimal, ml-connector fetches the full record when a webhook arrives, normalizes its shape to match your Databricks schema, and upserts the row into the target table. Webhook delivery is best-effort, so ml-connector also polls the QuickBooks Online CDC endpoint every few hours to catch any events that webhooks missed.

On the Databricks side, ml-connector uses Service Principal credentials (client_id and client_secret) to obtain a bearer token, writes to tables in your catalog and schema, and refreshes the token every 60 minutes before it expires.

To bridge the two systems, you map each QuickBooks Online entity (invoice, bill, GL account) to a Databricks table schema ahead of time, and ml-connector validates that required fields exist before inserting. If a write fails (network timeout, invalid schema, quota exceeded), ml-connector retries with exponential backoff and logs the failure for audit.
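The retry behavior described above can be sketched in a few lines of Python. This is a generic exponential-backoff helper under stated assumptions, not ml-connector's implementation: the name `retry_with_backoff`, the default of five attempts, the one-second base delay, and the jitter range are all illustrative choices.

```python
import logging
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("sync.retry")


def retry_with_backoff(
    write: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call write() until it succeeds or max_attempts is reached."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return write()
        except Exception as exc:
            if attempt == max_attempts - 1:
                # Out of attempts: log for audit and surface the error.
                log.error("write failed after %d attempts: %r", max_attempts, exc)
                raise
            # Double the delay on each attempt, capped at max_delay, then add
            # up to 50% random jitter so parallel workers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            delay += random.uniform(0, delay / 2)
            log.warning("write failed (%r); retrying in %.1fs", exc, delay)
            sleep(delay)
    raise RuntimeError("unreachable")
```

The `sleep` parameter exists only so the delays can be observed or skipped in tests. In practice you would probably retry only transient failures such as timeouts, since a schema error will not fix itself on the second attempt.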
A real-world example
A 150-person professional services firm runs QuickBooks Online for accounting and uses Databricks as a finance data warehouse shared by Excel-fluent controllers, project managers, and the CFO. Before the integration, the controller spent about 4 hours each month exporting invoices and expenses from QuickBooks Online, massaging the data in Python, and loading it into Databricks by hand. The process also carried version-skew risk: if an invoice was deleted or amended between export and load, the mismatch wasn't caught until reconciliation. With QuickBooks Online synced to Databricks, new invoices, bill payments, and GL postings land in Databricks within minutes of creation. Controllers can build dashboard queries against a single source of truth, and the CFO can run month-end close reports on live data without waiting for the manual export cycle.
What you can do
- Sync QuickBooks Online vendors, customers, invoices, bills, and journal entries into Databricks tables on a schedule you control.
- Subscribe to QuickBooks Online webhooks for Create, Update, Delete, and Void events, and fetch full record details since webhook payloads are minimal.
- Poll the QuickBooks Online CDC endpoint every few hours to catch events missed by best-effort webhook delivery.
- Handle OAuth 2.0 token refresh for both QuickBooks Online (1-hour access tokens) and Databricks (3600-second bearer tokens) automatically.
- Validate entity schemas, upsert rows into your Databricks catalog and schema, and retry failed writes with a full audit trail.
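To make the webhook step concrete, the following Python sketch turns a notification body into a list of records to fetch. It assumes the `eventNotifications` / `dataChangeEvent` / `entities` payload shape that Intuit has documented for QuickBooks Online webhooks; check the current Intuit documentation before relying on it. The names `ChangeEvent`, `parse_webhook`, and `collapse` are hypothetical.

```python
import json
from typing import NamedTuple


class ChangeEvent(NamedTuple):
    realm_id: str      # QuickBooks Online company ID
    entity: str        # e.g. "Invoice", "Bill"
    entity_id: str
    operation: str     # e.g. "Create", "Update", "Delete", "Void"
    last_updated: str


def parse_webhook(body: str) -> list[ChangeEvent]:
    """Extract change events from a webhook body (assumed payload shape)."""
    payload = json.loads(body)
    events = []
    for notification in payload.get("eventNotifications", []):
        realm_id = notification.get("realmId", "")
        entities = notification.get("dataChangeEvent", {}).get("entities", [])
        for item in entities:
            events.append(
                ChangeEvent(
                    realm_id,
                    item["name"],
                    item["id"],
                    item["operation"],
                    item.get("lastUpdated", ""),
                )
            )
    return events


def collapse(events: list[ChangeEvent]) -> list[ChangeEvent]:
    """Keep one event per record so each full record is fetched only once.

    The last event in payload order wins for each (realm, entity, id) key.
    """
    latest: dict[tuple[str, str, str], ChangeEvent] = {}
    for event in events:
        latest[(event.realm_id, event.entity, event.entity_id)] = event
    return list(latest.values())
```

Each remaining event carries only an ID and an operation, which is why a follow-up fetch of the full record is needed before anything is written to Databricks.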
Questions
- Does ml-connector support hard deletes in QuickBooks Online?
- No. QuickBooks Online does not hard-delete vendor, customer, or account records; they are marked inactive instead. ml-connector syncs these inactive records into Databricks and flags them so your archive or retention logic can decide whether to drop or preserve them.
- How does ml-connector handle the minimal webhook payloads from QuickBooks Online?
- QuickBooks Online webhooks contain only the entity ID, operation, and timestamp. When a webhook arrives, ml-connector immediately fetches the full record via GET request and normalizes it to your Databricks schema before writing the row. ml-connector also polls the CDC endpoint every few hours to catch any events missed by best-effort webhook delivery.
- What happens if a Databricks table schema does not match the QuickBooks Online record shape?
- ml-connector validates that required fields exist in your Databricks table before inserting. If a field is missing, the write is rejected, logged for audit, and retried on the next sync cycle. You define the mapping between QuickBooks Online entities and Databricks table schemas before the first sync.
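A required-field check of the kind described in the last answer might look like the Python sketch below. It is an illustration under assumptions: the exception `SchemaMismatch`, the function `check_row`, and the example column names are hypothetical, and the real validation rules may differ.

```python
from typing import Any


class SchemaMismatch(Exception):
    """Raised when a row cannot be written to the mapped table."""


def check_row(required: set[str], table_columns: set[str], row: dict[str, Any]) -> None:
    """Reject a row if any required field is missing from the table or the row."""
    # Required columns that the target table does not define at all.
    absent_from_table = sorted(required - table_columns)
    # Required columns that the incoming row leaves unset.
    absent_from_row = sorted(c for c in required if row.get(c) is None)
    if absent_from_table or absent_from_row:
        raise SchemaMismatch(
            f"missing in table: {absent_from_table}; missing in row: {absent_from_row}"
        )


# Example mapping for an invoice table (column names are assumptions).
required = {"qbo_id", "customer_id", "total_amount"}
table_columns = {"qbo_id", "customer_id", "total_amount", "txn_date"}

try:
    check_row(required, table_columns, {"qbo_id": "9", "customer_id": "17"})
except SchemaMismatch as err:
    rejected_reason = str(err)  # would be logged for audit before the next cycle
```

Running the check before the write keeps a mapping mistake from surfacing as a partial or failed insert on the Databricks side.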
Connect QuickBooks Online and Databricks
Free to use. Add your credentials, ping your real systems, and see if we fit.