ml-connector

Wave Accounting and Databricks integration

Wave Accounting holds the books for small businesses. Databricks holds the analytics. Connecting the two lets your finance and data teams see the same numbers at the same time. Invoices, payments, customers, and transactions flow from Wave into Databricks automatically as they occur, so your data warehouse always reflects the current state of your accounts. No batch exports, no manual reconciliation.

How Wave Accounting works

Wave Accounting exposes customers, invoices, products, accounts, transactions, vendors, and sales taxes through a single-endpoint GraphQL API at https://gql.waveapps.com/graphql/public. Authentication uses the OAuth 2.0 Authorization Code flow with 2-hour access tokens and refresh tokens, and the connected business must have an active Wave Pro subscription. Wave pushes webhook events for invoice.created, invoice.updated, invoice.paid, payment.created, customer.created, customer.updated, transaction.created, product.created, and product.updated. Wave retries failed deliveries: ml-connector returns HTTP 200 to acknowledge an event and HTTP 500 to request a retry. Each webhook payload carries an HMAC-SHA256 signature in the x-wave-signature header and must be verified within a 5-minute window.
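The signature check described above can be sketched in Python. This is a minimal illustration, not Wave's documented scheme: the header layout ("t=<unix seconds>,v1=<hex digest>"), the signed string (timestamp, a dot, then the raw body), and the function names are all assumptions made for the example. Only the HMAC-SHA256 algorithm, the x-wave-signature header, and the 5-minute window come from the description above.

```python
import hashlib
import hmac
import time
from typing import Optional, Tuple

MAX_SKEW_SECONDS = 5 * 60  # the 5-minute verification window


def parse_signature_header(header: str) -> Tuple[int, str]:
    """Split a hypothetical 't=<unix seconds>,v1=<hex digest>' header value."""
    parts = dict(item.split("=", 1) for item in header.split(","))
    return int(parts["t"]), parts["v1"]


def verify_webhook(secret: bytes, header: str, body: bytes,
                   now: Optional[float] = None) -> bool:
    """Return True only if the signature matches and the timestamp is fresh."""
    try:
        timestamp, received = parse_signature_header(header)
    except (KeyError, ValueError):
        return False  # a missing or malformed header is treated as unsigned
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # outside the 5-minute window
    # Assumed signed string: "<timestamp>.<raw body>"
    signed = str(timestamp).encode() + b"." + body
    expected = hmac.new(secret, signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())
```

Verifying against the raw request bytes, before any JSON parsing, matters here: re-serializing the payload can change whitespace or key order and break the digest.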

How Databricks works

Databricks is accessed through a REST API with workspace-level or account-level base URLs and OAuth 2.0 client credentials (a service principal with a client_id and client_secret). Bearer tokens expire after 3600 seconds and must then be renewed. Key entities include catalogs, schemas, tables, and compute clusters. Databricks has no native accounting objects, so ml-connector creates financial tables (Invoices, Transactions, Customers, Payments, Products) in a schema of your choice, either in the default workspace scope or in a Unity Catalog for multi-workspace governance. REST calls against tables are metadata operations and row inserts go through SQL, so ml-connector executes CREATE TABLE IF NOT EXISTS and INSERT statements to materialize Wave records.
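To make the SQL side concrete, here is a hedged Python sketch that builds the statements and wraps them in a request body for the Databricks SQL Statement Execution API (POST /api/2.0/sql/statements). The column set, the table name, and the helper names are hypothetical, and MERGE INTO is used here as one way to express an upsert. It is not necessarily the statement ml-connector itself issues. Nothing is sent over the network; the functions only produce strings and dictionaries.

```python
from typing import Dict, Optional

# Hypothetical column layout; the real one depends on your connection mappings.
INVOICE_COLUMNS = {
    "invoice_id": "STRING",
    "business_id": "STRING",
    "status": "STRING",
    "total": "DECIMAL(18,2)",
}


def create_table_sql(table: str, columns: Dict[str, str]) -> str:
    """Idempotent DDL: safe to run on every startup."""
    cols = ", ".join("%s %s" % (name, sql_type) for name, sql_type in columns.items())
    return "CREATE TABLE IF NOT EXISTS %s (%s)" % (table, cols)


def merge_sql(table: str, columns: Dict[str, str], key: str) -> str:
    """Build a MERGE that updates the row matching `key` or inserts a new one."""
    source = ", ".join("CAST(:%s AS %s) AS %s" % (c, t, c) for c, t in columns.items())
    updates = ", ".join("t.%s = s.%s" % (c, c) for c in columns if c != key)
    names = ", ".join(columns)
    values = ", ".join("s.%s" % c for c in columns)
    return (
        "MERGE INTO %s AS t USING (SELECT %s) AS s ON t.%s = s.%s "
        "WHEN MATCHED THEN UPDATE SET %s "
        "WHEN NOT MATCHED THEN INSERT (%s) VALUES (%s)"
        % (table, source, key, key, updates, names, values)
    )


def statement_payload(warehouse_id: str, sql: str,
                      row: Optional[dict] = None) -> dict:
    """Request body for the Statement Execution API, with named parameters."""
    payload = {"warehouse_id": warehouse_id, "statement": sql, "wait_timeout": "30s"}
    if row:
        # Values travel as strings; the CAST in the statement restores the type.
        payload["parameters"] = [
            {"name": k, "value": None if v is None else str(v)} for k, v in row.items()
        ]
    return payload
```

Passing row values as named parameters (:invoice_id and so on) rather than formatting them into the SQL text keeps customer-supplied strings out of the statement itself.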

What moves between them

Data flows one way, from Wave into Databricks. When a Wave invoice is created, paid, or updated, or when a payment, transaction, or customer record changes, ml-connector receives the webhook, verifies its signature, enriches it if needed (e.g., by fetching the full invoice detail from Wave's GraphQL API), and upserts the corresponding row in the Databricks table. Lookup tables (Customers, Products, Accounts) are refreshed to keep dimensions current. Webhook events are processed in near-real-time, and a once-daily polling pass picks up any events that were missed.
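The receive, verify, enrich, upsert sequence can be sketched as a small Python handler. The event shape (a "type" key), the target table names, and the enrich and upsert callables are hypothetical stand-ins. The 401 for a bad signature and the 200 for an unrecognized event type are assumptions of this sketch. Only the 200 to acknowledge and 500 to request a retry come from the description of Wave's delivery behavior.

```python
from typing import Callable

# Wave webhook event types mapped to hypothetical target tables.
EVENT_TABLES = {
    "invoice.created": "invoices",
    "invoice.updated": "invoices",
    "invoice.paid": "invoices",
    "payment.created": "payments",
    "customer.created": "customers",
    "customer.updated": "customers",
    "transaction.created": "transactions",
    "product.created": "products",
    "product.updated": "products",
}


def handle_webhook(event: dict, signature_ok: bool,
                   enrich: Callable[[dict], dict],
                   upsert: Callable[[str, dict], None]) -> int:
    """Process one webhook event and return the HTTP status to send back."""
    if not signature_ok:
        return 401  # assumption: unsigned or expired payloads are rejected
    table = EVENT_TABLES.get(event.get("type", ""))
    if table is None:
        return 200  # assumption: acknowledge unknown events so they are not retried
    try:
        upsert(table, enrich(event))
    except Exception:
        return 500  # ask Wave to redeliver the event later
    return 200
```

Because the write is an upsert, a redelivered event after a 500 overwrites the same row instead of adding a second one.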

How ml-connector handles it

ml-connector stores the Wave OAuth refresh token encrypted and uses it to obtain fresh access tokens as needed. On the Databricks side, it stores the service principal credentials encrypted and renews the bearer token before expiry. For incoming webhooks, ml-connector verifies the HMAC-SHA256 signature in the x-wave-signature header using the shared webhook secret and rejects any unsigned or expired payload.

It then transforms Wave's GraphQL objects (which nest invoice line items, customer addresses, and tax breakdowns) into flat, normalized table rows, applies any mappings or filters defined for the connection, and executes SQL INSERT or UPSERT statements in Databricks. Schema creation is idempotent: ml-connector runs CREATE TABLE IF NOT EXISTS on startup to ensure the target tables exist.

Wave does not support patch operations on invoices (only create, approve, send, or delete), and a delete in Wave is reflected in Databricks as a soft delete or status update rather than a removed row. Bills and purchase orders are not available in Wave's API, so the integration excludes accounts payable; only sales and customer data flow. Token expiry and transient network errors trigger automatic retries with exponential backoff, and every record carries an audit trail in Databricks for compliance and debugging.
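As an illustration of the flattening step, the following Python sketch turns one nested invoice object into a header row plus one row per line item. The field names (customer, address, items, unitPrice, and so on) are assumed for the example and may not match Wave's actual GraphQL schema. Tax breakdowns would follow the same pattern as line items.

```python
from typing import Dict, List, Tuple


def flatten_invoice(invoice: dict) -> Tuple[Dict, List[Dict]]:
    """Split a nested invoice into one header row and one row per line item."""
    customer = invoice.get("customer") or {}
    address = customer.get("address") or {}
    header = {
        "invoice_id": invoice["id"],
        "status": invoice.get("status"),
        "customer_id": customer.get("id"),
        "customer_city": address.get("city"),
        "total": (invoice.get("total") or {}).get("value"),
    }
    lines = []
    for position, item in enumerate(invoice.get("items") or [], start=1):
        lines.append({
            "invoice_id": invoice["id"],  # foreign key back to the header row
            "line_no": position,          # stable ordering within the invoice
            "product_id": (item.get("product") or {}).get("id"),
            "quantity": item.get("quantity"),
            "unit_price": item.get("unitPrice"),
        })
    return header, lines
```

Keying line rows on (invoice_id, line_no) gives the upsert a deterministic match condition, so replaying the same invoice rewrites its lines in place.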

A real-world example

A small accounting firm manages ten Wave Accounting businesses and reports on them through a single multi-tenant Databricks workspace. Finance staff use Wave daily to enter invoices, record customer payments, and track business expenses, but the firm needs unified reporting across all ten clients at month-end close. Before the integration, the firm exported CSV reports from each Wave account and wrote custom SQL to union and reconcile them. With Wave and Databricks connected, every invoice, payment, and transaction posted in any Wave account flows automatically into a shared Databricks schema, partitioned by Wave business ID. Month-end reporting queries run against live Databricks tables instead of stale exports, and the reconciliation logic is centralized in a single dbt transformation pipeline rather than scattered across spreadsheets.

What you can do

  • Stream Wave invoices, payments, customers, and transactions into Databricks in near-real-time via webhooks.
  • Normalize Wave's nested GraphQL objects (line items, addresses, tax breakdown) into flat Databricks table rows.
  • Verify webhook signatures with HMAC-SHA256 and reject expired or tampered payloads.
  • Refresh Wave OAuth tokens and Databricks service principal tokens automatically before expiry.
  • Upsert records idempotently so reruns and retries do not create duplicates.

Questions

Can ml-connector sync Wave bills and accounts payable into Databricks?
No. Wave Accounting does not expose bills or purchase orders via its GraphQL API, so the integration handles only sales and customer data (invoices, payments, transactions, customers, products). If you need accounts payable analytics, you will need a separate accounts payable system or vendor management platform connected to Databricks.
What happens if a Wave invoice is deleted or marked as invalid?
Wave allows invoices to be created, approved, sent, or deleted, but does not support a patch operation to modify an existing invoice. ml-connector records deletions as status updates in the Databricks table so the audit trail remains complete. You can filter by status in your BI queries to show only active invoices.
How does ml-connector handle Wave OAuth token expiry and Databricks service principal rotation?
ml-connector stores the Wave OAuth refresh token encrypted and automatically obtains a fresh access token before making API calls. Similarly, it stores the Databricks service principal client_secret encrypted and renews the bearer token before the 3600-second expiry. If a credential is rotated in Wave or Databricks, update the connection secret in ml-connector; the next sync uses the new credential automatically.
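Renewing a token before it expires can be sketched as a small Python cache. This is an illustrative pattern, not ml-connector's internal code: the TokenCache class, the 60-second safety margin, and the fetch callable (which would wrap the real token request and return the token with its lifetime in seconds) are all assumptions.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Cache a bearer token and renew it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, int]], margin: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch    # returns (access_token, expires_in_seconds)
        self._margin = margin  # renew this many seconds ahead of expiry
        self._clock = clock    # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one if the old one is near expiry."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = self._clock() + expires_in
        return self._token
```

With a 3600-second Databricks token and the default margin, the cache would request a new token once 3540 seconds have passed. The same class could hold a Wave access token by passing a fetch function that uses the stored refresh token.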

Connect Wave Accounting and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.

Get started