Wave Accounting and Google BigQuery integration

Wave Accounting runs invoicing, the chart of accounts, and bookkeeping for a small business. Google BigQuery is the warehouse where that financial data is stored and queried for reporting. Connecting the two copies Wave invoices, customers, products, GL accounts, and money transactions into BigQuery tables on a schedule, so analysts work from current numbers without exporting CSVs by hand. ml-connector handles the very different APIs on each side and keeps the warehouse tables in step with Wave. Because BigQuery is a passive store with no accounting logic, the books stay in Wave Accounting and the warehouse holds a read copy.

How Wave Accounting works

Wave Accounting exposes customers, invoices, products, chart-of-accounts GL accounts, money transactions, vendors, and sales taxes through a single GraphQL endpoint at gql.waveapps.com/graphql/public, where reads are queries and writes are mutations. It authenticates with OAuth 2.0 authorization-code tokens that last about two hours, refreshed with the offline_access scope, and every call is scoped to a businessId fetched after login. Wave paginates by page number, using page and pageSize arguments. It returns GraphQL errors with an HTTP 200, so the errors array must be checked on every response. Wave can push invoice, customer, transaction, and product events by webhook, signed with HMAC-SHA256 in an x-wave-signature header, so changes can arrive as a push rather than only by polling.
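Because a failed Wave call can still come back with an HTTP 200, the response body has to be inspected before its data is trusted. The following is a minimal sketch of that check, not ml-connector's actual code: the names unwrap_graphql and WaveGraphQLError are hypothetical, and the function assumes the response has already been parsed from JSON into a dict with the standard GraphQL data and errors keys.

```python
from typing import Any


class WaveGraphQLError(Exception):
    """Hypothetical exception: the response body carried GraphQL errors."""


def unwrap_graphql(body: dict[str, Any]) -> dict[str, Any]:
    """Return the data payload, or raise if the errors array is non-empty.

    The HTTP status code is deliberately not consulted, because errors
    can arrive alongside a 200.
    """
    errors = body.get("errors") or []
    if errors:
        messages = []
        for err in errors:
            # GraphQL errors are normally objects with a "message" field;
            # fall back to str() for anything unexpected.
            if isinstance(err, dict):
                messages.append(str(err.get("message", err)))
            else:
                messages.append(str(err))
        raise WaveGraphQLError("; ".join(messages))
    return body.get("data") or {}
```

A caller would pass every parsed response through this function and treat the exception as a failed read, regardless of the status code.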

How Google BigQuery works

Google BigQuery exposes datasets, tables, rows, and jobs through its REST API v2 at bigquery.googleapis.com/bigquery/v2, with paths scoped by project and dataset. It authenticates with a Google service account using the JWT-bearer grant, exchanging a signed assertion for an access token that lasts one hour. Rows are written with the streaming insertAll call, which dedupes best-effort on a caller-supplied insertId, or batched through load jobs that take a caller-supplied jobId for idempotency. BigQuery has no outbound webhooks: data is pushed in by the caller, and reading it back means running a query job and paging through the results. Tables are customer-defined, so the connector accepts a configurable dataset and table name for each entity.
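As an illustration of the streaming path, the sketch below builds the URL and request body for an insertAll call with one insertId per row. The function names, the entity label, and the choice to hash the entity and Wave id into the insertId are assumptions made for this example; only the endpoint path and the rows, insertId, and json fields come from the BigQuery REST API.

```python
import hashlib
from typing import Any

BQ_BASE = "https://bigquery.googleapis.com/bigquery/v2"


def insert_all_url(project: str, dataset: str, table: str) -> str:
    """Build the tabledata.insertAll URL for a customer-defined table."""
    return (
        f"{BQ_BASE}/projects/{project}/datasets/{dataset}"
        f"/tables/{table}/insertAll"
    )


def stable_insert_id(entity: str, wave_id: str) -> str:
    """Derive a fixed-length insertId from the entity type and Wave id.

    Hashing is an assumption of this sketch; it keeps the id short and
    identical across retries of the same record.
    """
    return hashlib.sha256(f"{entity}:{wave_id}".encode("utf-8")).hexdigest()


def build_insert_all_body(entity: str, records: list[dict[str, Any]]) -> dict[str, Any]:
    """Wrap each record as an insertAll row carrying its insertId."""
    rows = [
        {"insertId": stable_insert_id(entity, str(rec["id"])), "json": rec}
        for rec in records
    ]
    return {"rows": rows}
```

The body would be sent as JSON in a POST to the URL, with the service-account access token as a bearer token.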

What moves between them

The main flow runs from Wave Accounting into Google BigQuery. ml-connector reads invoices, customers, products, GL accounts, and money transactions from Wave and writes them as rows into the matching customer-defined BigQuery tables, such as invoices, customers, items, gl_entries, and transactions. Wave invoice, customer, transaction, and product webhooks trigger a write as soon as a record changes, and a scheduled poll backfills anything a webhook missed and picks up records created before the connection was live. BigQuery is treated as a read copy for analytics, so ml-connector does not write financial entries back into Wave; the chart of accounts and the books stay in Wave Accounting.

How ml-connector handles it

ml-connector stores both credential sets encrypted. On the Wave side it refreshes the OAuth 2.0 bearer token before its roughly two-hour expiry using the offline_access refresh token, and it fetches the businessId once after login because every Wave query and mutation is scoped to it. On the BigQuery side it signs a JWT with the service account private key and exchanges it for a one-hour access token, re-requesting the token before it expires. Wave records are written to BigQuery with the streaming insertAll call, and each row carries a stable insertId derived from the Wave record id, so a re-read invoice is not double-inserted within the dedup window. Large historical backfills go through a load job with a caller-supplied jobId, so a retried batch returns the existing job rather than a duplicate. Wave invoice line items, customers, and GL accounts are mapped to the columns of the target tables first, so every row lands in a table whose schema already has those fields. Two edge cases are handled. First, Wave returns GraphQL errors with an HTTP 200, so the connector checks the errors array rather than the status code. Second, the BigQuery streaming buffer can take up to about ninety seconds before new rows appear in a table list, so a poll does not treat a just-written row as missing. Wave does not publish firm rate limits, so the connector backs off on a 429. Every record carries a full audit trail and can be replayed if a BigQuery write fails.
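The refresh-before-expiry behaviour on both sides can be sketched as a small token cache. This is an illustrative pattern rather than ml-connector's implementation: TokenCache, the fetch callback, and the five-minute margin are all assumptions of this example. The fetch callback stands in for either the Wave refresh-token exchange or the BigQuery JWT-bearer exchange and returns the token with its lifetime in seconds.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    value: str
    expires_at: float  # epoch seconds


class TokenCache:
    """Hand out a cached token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], tuple[str, float]],  # returns (token, lifetime_seconds)
        margin: float = 300.0,                   # refresh this many seconds early
        clock: Callable[[], float] = time.time,  # injectable for testing
    ) -> None:
        self._fetch = fetch
        self._margin = margin
        self._clock = clock
        self._token: Optional[Token] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet, or when the remaining
        # lifetime has dropped to the safety margin or below.
        if self._token is None or now >= self._token.expires_at - self._margin:
            value, lifetime = self._fetch()
            self._token = Token(value, now + lifetime)
        return self._token.value
```

One cache per side would be enough: a lifetime of about 7200 seconds for Wave and 3600 seconds for BigQuery, each with its own fetch callback.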

A real-world example

A ten-person design studio invoices its clients in Wave Accounting and wants real revenue dashboards in Looker Studio, which reads from Google BigQuery. Before the integration, the bookkeeper exported invoice and transaction CSVs from Wave each month and loaded them into the warehouse by hand, so the dashboards were always a few weeks stale and a missed export left gaps. With Wave Accounting and Google BigQuery connected, each new or paid invoice and each money transaction flows into the warehouse tables within the polling window, and webhooks push the urgent changes sooner. The owner sees current billings and cash by client without waiting for a monthly reload, and the manual export step is gone.

What you can do

  • Load Wave Accounting invoices, customers, and products into customer-defined Google BigQuery tables.
  • Copy Wave GL accounts and money transactions into the warehouse so reporting reflects the current ledger.
  • Trigger writes from Wave invoice, customer, transaction, and product webhooks, with a scheduled poll as backfill.
  • Bridge Wave OAuth 2.0 tokens and the BigQuery service-account JWT, refreshing each before it expires.
  • Deduplicate each row on a stable insertId and replay failed writes, with a full audit trail on every record.

Questions

Which direction does data move between Wave Accounting and Google BigQuery?
The main flow is Wave Accounting into Google BigQuery. Invoices, customers, products, GL accounts, and money transactions move from Wave into customer-defined warehouse tables. BigQuery is a read copy for reporting, so ml-connector does not write financial entries back into Wave; the books stay in Wave.
Does Wave push changes, or does ml-connector poll for them?
Both. Wave can push invoice, customer, transaction, and product events by webhook, signed with HMAC-SHA256, and ml-connector verifies each one before acting on it. A scheduled poll runs alongside the webhooks to backfill anything a push missed and to load records created before the connection went live, since BigQuery itself sends no events.
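A webhook check of this kind can be sketched as follows. The function name and the assumption that the x-wave-signature header carries a hex-encoded digest of the raw request body are illustrative; the actual encoding should be confirmed against Wave's webhook documentation.

```python
import hashlib
import hmac


def verify_wave_signature(secret: bytes, raw_body: bytes, signature_header: str) -> bool:
    """Return True if the header matches HMAC-SHA256(secret, raw_body).

    Assumes a hex-encoded signature (an assumption of this sketch).
    The raw bytes of the body must be used, not a re-serialized copy.
    """
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

A request that fails this check would be rejected before any BigQuery write is attempted.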
How does the integration avoid duplicate rows in BigQuery?
Each Wave record is written with a stable insertId derived from its Wave id, which gives BigQuery streaming inserts best-effort deduplication inside a roughly one-minute window. Large historical backfills go through a load job with a caller-supplied jobId, so a retried batch returns the existing job instead of creating a duplicate. Every write is also tracked in the audit trail for replay.
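For the load-job path, the key point is that a retry must present the same jobId as the first attempt. One way to get that, shown here as a sketch under assumptions, is to derive the jobId from the batch contents. The function names and the wave_backfill prefix are hypothetical; the projectId and jobId fields are those of a BigQuery job reference.

```python
import hashlib
import json
import re
from typing import Any


def backfill_job_id(entity: str, records: list[dict[str, Any]]) -> str:
    """Derive a deterministic jobId from the entity name and batch contents.

    Key order inside each record does not affect the id, but the order
    of records in the list does.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
    # Job ids may contain only letters, digits, underscores, and dashes.
    safe_entity = re.sub(r"[^A-Za-z0-9_-]", "_", entity)
    return f"wave_backfill_{safe_entity}_{digest}"


def job_reference(project: str, entity: str, records: list[dict[str, Any]]) -> dict[str, str]:
    """Build the jobReference object to send with the load job."""
    return {"projectId": project, "jobId": backfill_job_id(entity, records)}
```

Because the id depends only on the batch, retrying after a timeout sends the same jobReference instead of starting a second load of the same rows.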

Connect Wave Accounting and Google BigQuery

Free to use. Add your credentials, ping your real systems, and see if we fit.
