ml-connector
QuickBooks OnlineSnowflake

QuickBooks Online and Snowflake integration

QuickBooks Online holds your company's invoices, bills, expenses, and accounts. Snowflake holds your data warehouse. Connecting the two moves every accounting transaction from QuickBooks Online into Snowflake automatically, so your financial data is always current and queryable alongside other business data. No more manual exports; no more stale snapshots in your warehouse. ml-connector handles the OAuth dance, the different API styles, and keeps your books and your warehouse in lockstep.

How QuickBooks Online works

QuickBooks Online exposes vendors, customers, employees, items, accounts, departments, bills, invoices, payments, purchase orders, and journal entries through the QuickBooks Online Accounting API (v3) over REST HTTPS. Authentication uses OAuth 2.0 Authorization Code flow with access tokens that expire in 1 hour and refresh tokens that rotate every 24 to 26 hours. Webhooks fire on Create, Update, Delete, Merge, and Void for most entities, but webhook payloads contain only the entity ID and operation -- the full record must be fetched via GET. QuickBooks Online also provides a CDC endpoint at /v3/company/{realmId}/cdc with 30-day history, so you can also poll for changes. All creates and updates require exact object representation and a SyncToken for concurrency control.

How Snowflake works

Snowflake is a cloud data warehouse where you define your own tables and schemas. It exposes data through REST APIs (for SQL queries and resource operations) and accepts both Key Pair Authentication (RSA private key plus JWT) and Programmatic Access Tokens (long-lived bearer tokens). Snowflake's SQL API is pull-only; there are no webhooks to receive data. Change detection can be driven by timestamp watermarks, Snowflake Streams for native CDC, or Snowflake Tasks for scheduled checks. Async queries return HTTP 202 with a statement handle and require polling; the SQL API uses partition-based pagination, not offset and limit. Network policies require service-account egress IPs to be whitelisted, and the warehouse must have AUTO_RESUME enabled or queries fail.

What moves between them

Data flows from QuickBooks Online into Snowflake. Vendors, customers, employees, items, accounts, departments, bills, invoices, payments, purchase orders, and journal entries are synced from QuickBooks Online into Snowflake tables. Webhooks from QuickBooks Online notify ml-connector of changes, and CDC polling via the /cdc endpoint serves as a fallback or supplement. Full records are fetched from QuickBooks Online after each change and inserted or updated in Snowflake, preserving entity IDs and timestamps so the warehouse record matches the QuickBooks Online source. Snowflake Streams can be used to detect downstream changes within the warehouse, and the relationship is audit-logged so any record can be replayed if a downstream query or transformation fails.

How ml-connector handles it

ml-connector stores QuickBooks Online OAuth credentials (client ID, client secret, realm ID) encrypted and automatically refreshes the access token when it expires or a call returns a 401. It listens for QuickBooks Online webhooks and also polls the CDC endpoint periodically to catch any missed events, since webhooks may arrive out-of-order or be duplicated. When a change is detected, ml-connector fetches the full entity using the current OAuth token, transforms it into a Snowflake table row (inserting new records or updating existing ones using the QuickBooks Online ID as the primary key), and handles Snowflake's async query pattern by polling the statement handle until the query completes. If Snowflake rate-limits with HTTP 429, ml-connector backs off exponentially and retries. SyncTokens from QuickBooks Online are preserved in the Snowflake row so replay-from-this-point is always possible. Network policies are configured upfront so the connector's egress IP is whitelisted, and the Snowflake warehouse is set to AUTO_RESUME so it wakes automatically when queries arrive.

A real-world example

A mid-sized software-as-a-service company runs QuickBooks Online for invoicing, expenses, and reporting, and maintains a Snowflake warehouse for analytics and business intelligence. Before the integration, the finance team exported invoice and payment data from QuickBooks Online weekly and loaded it manually into Snowflake, creating a two-day lag and manual data-entry errors that broke downstream forecasting reports. With QuickBooks Online and Snowflake connected, every invoice, payment, bill, and journal entry flows into Snowflake within minutes of posting to QuickBooks Online. The company's analysts now query the warehouse daily without waiting for finance to prepare exports, and the data matches QuickBooks Online at all times.

What you can do

  • Sync vendors, customers, employees, items, accounts, and journal entries from QuickBooks Online into Snowflake tables in real-time via webhooks or CDC polling.
  • Preserve QuickBooks Online entity IDs and SyncTokens in Snowflake rows so the warehouse record is always traceable to the source and can be replayed.
  • Automate OAuth 2.0 token refresh and handle QuickBooks Online access-token expiry without manual re-authentication.
  • Poll Snowflake's async SQL API and handle HTTP 429 rate limits with exponential backoff so large batch loads do not fail mid-flight.
  • Build native CDC with Snowflake Streams to detect downstream changes and keep your warehouse and books in audit-ready lockstep.

Questions

How does ml-connector handle QuickBooks Online webhook payloads that contain only the entity ID?
Webhook payloads from QuickBooks Online include only the entity ID and the operation (Create, Update, Delete). ml-connector uses the entity ID to fetch the full record from QuickBooks Online via REST GET, so the complete data (all fields, amounts, metadata) is inserted into Snowflake. This two-step pattern ensures the warehouse always has the full record, not a partial stub.
What if a webhook from QuickBooks Online is delayed or duplicated?
ml-connector uses the QuickBooks Online entity ID as the primary key and the SyncToken as the version number. If a duplicate webhook arrives late, the upsert into Snowflake uses the newer SyncToken to avoid rolling back a more recent version. The CDC polling endpoint serves as a safety net to catch any webhooks that were lost, so no record is ever permanently missed.
How does ml-connector manage Snowflake's async query pattern and rate limits?
Snowflake's SQL API returns HTTP 202 with a statement handle when a query is submitted. ml-connector polls that handle in a loop until the query completes (success or error). If Snowflake returns HTTP 429 (rate limit), ml-connector backs off with exponential jitter and retries, ensuring large batch loads of invoice data do not fail partway through.

Related integrations

Connect QuickBooks Online and Snowflake

Free to use. Add your credentials, ping your real systems, and see if we fit.

Get started