Which direction does data move between Sage 50 and Google BigQuery?

Data moves one way, from Sage 50 into Google BigQuery. Vendors, customers, invoices, payments, and GL journal entries are read from the Sage 50 company file and written into BigQuery tables for reporting. BigQuery is a data warehouse rather than an accounting system, so ml-connector never writes records back into Sage 50, which stays the system of record.

Why does Sage 50 need a local agent instead of a cloud connection?

Sage 50 is desktop, on-premise software with no REST API and no webhooks. Its data can only be reached through a local SDK, the .NET SDK on the US edition or Sage Data Objects on the UK edition, which requires Sage 50 installed on the same machine and a Windows desktop session. ml-connector deploys a lightweight agent on that machine to read the data and send it to BigQuery over HTTPS.

How are duplicate rows avoided when neither system sends events?

Both systems are pull-only, so the agent polls Sage 50 on a schedule and filters by last modified or transaction date to read only new and changed records. Each row is streamed into BigQuery with a stable insertId derived from the Sage reference number, which gives best-effort deduplication. Large historical loads use a load job with a caller-supplied jobReference.jobId, so re-running the same batch returns the existing job rather than creating duplicates.

Sage 50Google BigQuery

Sage 50 and Google BigQuery integration

Sage 50 holds the books: vendors, customers, invoices, payments, and the general ledger. Google BigQuery is the cloud warehouse where that data gets queried and reported on. Connecting the two copies your accounting records out of the desktop software and into BigQuery tables on a schedule, so finance and analytics teams can run SQL against current data without touching the live company file. ml-connector handles the very different access models on each side: a local agent for Sage 50, the REST API for BigQuery. Because BigQuery is a warehouse and not an accounting system, the flow runs one way into BigQuery and the books stay in Sage 50.

How Sage 50 works

Sage 50 is desktop, on-premise accounting software with no REST API and no webhooks. Its data is reached through a local SDK: the Sage 50 .NET SDK on the US edition or Sage Data Objects on the UK edition, both of which require Sage 50 installed on the same machine and a desktop Windows session. It exposes vendors, customers, purchase and sales invoices, purchase and sales orders, AP payments, AR receipts, GL accounts and journal entries, employees, and inventory through SDK object calls. Authentication is a Sage 50 username and password verified against the company data file; there is no token, no OAuth, and no event push, so changes are found by polling on a schedule.

How Google BigQuery works

Google BigQuery is a serverless cloud data warehouse exposed through the BigQuery REST API v2 at https://bigquery.googleapis.com/bigquery/v2, scoped by project, dataset, and table. It has no fixed ERP objects; the customer defines their own tables such as invoices, vendors, and gl_entries inside a dataset. Rows are written with the streaming insert call tabledata.insertAll, or batch-loaded with a load job, and read back by submitting a query job and paging the results. Authentication is OAuth 2.0 with a Google service account using the JWT-bearer grant, and access tokens expire after one hour. BigQuery sends no webhooks, so it is a push-in and pull-out system only.

What moves between them

The flow runs one direction, from Sage 50 into Google BigQuery. The local agent polls Sage 50 for vendors, customers, invoices, payments, receipts, and GL journal entries, and ml-connector streams each record into the matching BigQuery table in the customer's dataset. Master data such as vendors and customers and transactional data such as invoices and ledger lines are mapped to the column schema the customer has defined for each table. BigQuery is treated as a read and report destination, so ml-connector never writes accounting records back into Sage 50; the desktop company file stays the system of record and the warehouse holds a queryable copy refreshed on the cadence you set.

How ml-connector handles it

ml-connector stores both credential sets encrypted. On the Sage 50 side it runs a local Windows agent in an interactive desktop session, since the SDK cannot run as a headless service, and opens the company file with the dedicated integration username and password and, on the US edition, the registered Application ID. Because neither Sage 50 nor BigQuery pushes events, the agent polls on a schedule, filtering Sage records by LastModifiedDate or transaction date against the last sync so only new and changed rows are read. On the BigQuery side ml-connector signs a JWT with the service account private key, exchanges it for a one-hour access token, and refreshes that token before it expires. Each Sage record is written with tabledata.insertAll and carries a stable insertId built from the Sage reference number, which gives best-effort dedup so a re-read row is not double counted; for large historical backfills it switches to a load job keyed by a caller-supplied jobReference.jobId for idempotency. Sage fields map to the columns the customer defined per table, and adding new columns is safe while renames require a new table. Edge cases the connector handles: only one SDK session may open a company file at a time, so the agent uses a separate Sage user from any interactive operator; the service account private key keeps its literal newline characters so JWT signing works; and writes need a billing-enabled BigQuery project because the free sandbox blocks streaming inserts. Every record carries a full audit trail and can be replayed if a write fails.

A real-world example

A regional construction firm with about 120 staff runs Sage 50 UK Accounts on a Windows server at the head office for all vendor bills, customer invoices, and the nominal ledger. The finance director wants a single dashboard that combines job costs, supplier spend, and cash position, but the data is locked inside the desktop company file and the team rebuilds the same spreadsheets by hand every month from CSV exports. With Sage 50 and Google BigQuery connected, a local agent polls the company file each night and streams suppliers, invoices, payments, and nominal journal lines into BigQuery tables, where the firm's reporting tool queries them directly. The monthly export-and-rebuild step disappears and every dashboard reads from the same refreshed numbers.

What you can do

Stream Sage 50 vendors, customers, invoices, payments, and GL journal entries into Google BigQuery tables on a schedule.
Run a local Windows agent next to the Sage 50 install so the desktop SDK can read data without a cloud API.
Bridge the Sage 50 username and password login to a Google service account JWT, refreshing the one-hour token automatically.
Dedupe each row with a stable insertId from the Sage reference, and backfill history with idempotent load jobs.
Keep Sage 50 as the system of record while BigQuery holds a queryable copy, with retries and a full audit trail on every record.

Questions

Which direction does data move between Sage 50 and Google BigQuery?: Data moves one way, from Sage 50 into Google BigQuery. Vendors, customers, invoices, payments, and GL journal entries are read from the Sage 50 company file and written into BigQuery tables for reporting. BigQuery is a data warehouse rather than an accounting system, so ml-connector never writes records back into Sage 50, which stays the system of record.
Why does Sage 50 need a local agent instead of a cloud connection?: Sage 50 is desktop, on-premise software with no REST API and no webhooks. Its data can only be reached through a local SDK, the .NET SDK on the US edition or Sage Data Objects on the UK edition, which requires Sage 50 installed on the same machine and a Windows desktop session. ml-connector deploys a lightweight agent on that machine to read the data and send it to BigQuery over HTTPS.
How are duplicate rows avoided when neither system sends events?: Both systems are pull-only, so the agent polls Sage 50 on a schedule and filters by last modified or transaction date to read only new and changed records. Each row is streamed into BigQuery with a stable insertId derived from the Sage reference number, which gives best-effort deduplication. Large historical loads use a load job with a caller-supplied jobReference.jobId, so re-running the same batch returns the existing job rather than creating duplicates.

Connect Sage 50 and Google BigQuery

Free to use. Add your credentials, ping your real systems, and see if we fit.

Get started