What financial records move from Intacct to Databricks?

Vendor master data, AP bill headers and line items, GL accounts, and dimension hierarchies flow from Intacct into Databricks tables. Payments and accruals are also extracted. The integration only reads from Intacct; ml-connector never writes financial data back into the source system.

How does ml-connector handle Intacct's XML gateway and session caching?

ml-connector calls Intacct's getAPISession function on the XML gateway with your senderId and user credentials and caches the returned sessionid for its full 50-minute validity period. Each polling request reuses the cached session until it expires, then refreshes it automatically. XML responses are parsed for application-level errors embedded in the body, since Intacct returns HTTP 200 even when errors occur.
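The caching behavior described above can be sketched roughly as follows. This is a minimal illustration, not ml-connector's actual code: `SessionCache`, the `fetch` callable (standing in for a real getAPISession call), and the injectable `clock` are all hypothetical names introduced here.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

SESSION_TTL_SECONDS = 50 * 60  # the 50-minute validity period stated above


@dataclass
class CachedSession:
    session_id: str
    fetched_at: float


class SessionCache:
    """Reuses one session id until its TTL elapses, then fetches a new one."""

    def __init__(
        self,
        fetch: Callable[[], str],  # hypothetical: performs getAPISession
        ttl: float = SESSION_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._cached: Optional[CachedSession] = None

    def get(self) -> str:
        now = self._clock()
        expired = (
            self._cached is None
            or now - self._cached.fetched_at >= self._ttl
        )
        if expired:
            # Only hit the gateway when there is no usable session.
            self._cached = CachedSession(self._fetch(), now)
        return self._cached.session_id
```

Every polling request would call `cache.get()` instead of authenticating again, so the gateway sees one getAPISession call per validity window.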

How do records land in Databricks, and can they be queried immediately?

Extracted records are serialized to JSON and sent via REST to your Databricks workspace using OAuth2 Service Principal auth. Each record is appended or upserted into your target table, and the data is immediately queryable via SQL or Spark without additional loading steps.


Sage Intacct and Databricks integration

Sage Intacct runs your accounting and AP operations. Databricks analyzes your financial data at scale. Connecting the two lets you extract invoices, vendor details, and GL account structures from Intacct and load them into Databricks tables for reporting, audit trails, and financial analytics without manual export. Your accounting team works in Intacct while the finance analytics team queries Databricks, and both stay in sync on a schedule you define.

How Sage Intacct works

Sage Intacct exposes vendors, AP bills, payments, GL accounts, and dimensions through a single XML gateway endpoint at https://api.intacct.com/ia/xml/xmlgw.phtml. Authentication is session-based, using a senderId, senderPassword, companyId, userId, and userPassword; the first call to getAPISession exchanges these credentials for a sessionid, which the caller can cache for 50 minutes. Intacct does not push webhooks, so all reads are polling-driven. An HTTP 200 response may still contain application-level errors inside the XML body, so each response must be parsed for errormessage elements and status values.
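To make the HTTP 200 caveat concrete, here is a small sketch of pulling a sessionid out of a getAPISession response while treating any embedded error as a failure. The element names other than errormessage, status, and sessionid (`error`, `errorno`, `description`, `description2`) reflect an assumed response layout, and `IntacctError` and `extract_session_id` are hypothetical names.

```python
import xml.etree.ElementTree as ET


class IntacctError(Exception):
    """Raised when the XML body reports a failure despite an HTTP 200."""


def extract_session_id(xml_body: str) -> str:
    root = ET.fromstring(xml_body)

    # Gather every <error> element, wherever it sits in the response.
    errors = []
    for err in root.iter("error"):
        number = err.findtext("errorno", "")
        detail = err.findtext("description2") or err.findtext("description", "")
        errors.append(f"{number}: {detail}")

    # Any <status> other than "success" also counts as a failure.
    statuses = [s.text for s in root.iter("status")]
    if errors or any(s != "success" for s in statuses):
        raise IntacctError("; ".join(errors) or "request failed")

    session_id = root.findtext(".//sessionid")
    if not session_id:
        raise IntacctError("no sessionid in response")
    return session_id
```

Checking the body on every response, rather than trusting the HTTP status, is what keeps a failed sign-in from being mistaken for an empty result.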

How Databricks works

Databricks provides REST APIs at workspace-specific URLs (https://<workspace-id>.cloud.databricks.com, with variants for Azure and GCP), and all API paths are prefixed with /api/2.0/ or /api/2.1/. Authentication uses the OAuth 2.0 client credentials flow (a Service Principal with a client_id and client_secret), and tokens expire after 3600 seconds. Databricks has no native finance or accounting objects; it is a data platform that stores data as tables within schemas, organized in catalogs under Unity Catalog governance. Catalog, schema, and table definitions are metadata; the row data itself is stored in Delta Lake format for SQL and Spark analysis.
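As an illustration of the client credentials flow, the sketch below builds (but does not send) a token request for a Service Principal. It assumes the workspace-level token endpoint at /oidc/v1/token and the all-apis scope; `build_token_request` is a hypothetical helper, and the workspace URL in any real call would be your own.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    # The client id and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/oidc/v1/token",  # assumed endpoint
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the exchange; the JSON response is expected to carry an access token and its lifetime in seconds, which is used as the bearer token on later API calls.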

What moves between them

Financial records flow from Intacct into Databricks. Vendor master data, AP bill headers and line items, and GL account hierarchies are extracted from Intacct via its XML gateway on a polling schedule. Each record is transformed into a JSON row and sent to Databricks REST endpoints, where it is appended to or upserted into a target table within your chosen schema. The flow is one-way: ml-connector only reads from Intacct and never writes back into the source system.
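The text above does not say which Databricks endpoint receives the rows. One plausible shape, sketched here under that caveat, is a MERGE statement submitted through the SQL Statement Execution API (POST /api/2.0/sql/statements) with named parameters. The function name, the table and column names, and the warehouse id are all hypothetical.

```python
from typing import Any, Dict


def build_upsert_payload(
    table: str, key: str, row: Dict[str, Any], warehouse_id: str
) -> Dict[str, Any]:
    """Builds a request body that upserts one row, matched on `key`."""
    columns = list(row)
    non_key = [c for c in columns if c != key]
    if key not in row or not non_key:
        raise ValueError("row needs the key column plus at least one other")

    # Table and column names are interpolated, so they must come from
    # trusted configuration; only the values are passed as parameters.
    select_list = ", ".join(f":{c} AS {c}" for c in columns)
    updates = ", ".join(f"t.{c} = s.{c}" for c in non_key)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)
    statement = (
        f"MERGE INTO {table} AS t USING (SELECT {select_list}) AS s "
        f"ON t.{key} = s.{key} "
        f"WHEN MATCHED THEN UPDATE SET {updates} "
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )
    # Values are sent as strings here; a real pipeline would likely
    # declare a type per parameter instead.
    parameters = [
        {"name": c, "value": None if v is None else str(v)}
        for c, v in row.items()
    ]
    return {
        "warehouse_id": warehouse_id,
        "statement": statement,
        "parameters": parameters,
    }
```

For example, `build_upsert_payload("finance.ap.vendors", "vendor_id", {"vendor_id": "V100", "name": "Acme"}, "wh1")` produces a body that updates the vendor if it already exists and inserts it otherwise, so re-running a sync does not create duplicate rows.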

How ml-connector handles it

ml-connector calls Intacct's getAPISession function with your senderId and user credentials, caches the returned sessionid for the full 50-minute duration, and then polls your chosen entities (VENDOR, APBILL, GLACCOUNT, DIMENSION) on a schedule. Each extracted record is serialized to JSON and sent via REST to Databricks using OAuth2 bearer token auth. The OAuth2 token is refreshed before each request so that it cannot expire mid-operation. Intacct XML responses are stripped of control characters and then parsed for application-level errors inside the body. Retried operations set Intacct's uniqueid flag in the request control block so that the server can detect a repeated request and avoid processing it twice. In Databricks, records land in your specified table and can be queried immediately via SQL or Spark without further loading steps.
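Two of the safeguards above, control-character stripping and the uniqueid flag, can be sketched as follows. This is an illustration under assumptions: the control block element names other than uniqueid (`senderid`, `password`, `controlid`, `dtdversion`) and the `3.0` value reflect an assumed request layout, and both function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# XML 1.0 disallows these control characters; tab, LF, and CR are legal.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_control_chars(xml_text: str) -> str:
    """Removes characters that would make an XML parser reject the body."""
    return _CONTROL_CHARS.sub("", xml_text)


def build_control_block(
    sender_id: str, sender_password: str, control_id: str
) -> ET.Element:
    """Builds a <control> element with uniqueid enabled.

    A retry must reuse the control_id of the original attempt so the
    server can recognize the repeated request.
    """
    control = ET.Element("control")
    for tag, text in (
        ("senderid", sender_id),
        ("password", sender_password),
        ("controlid", control_id),
        ("uniqueid", "true"),
        ("dtdversion", "3.0"),  # assumed value
    ):
        ET.SubElement(control, tag).text = text
    return control
```

The design point is that the control_id is chosen once per logical operation and stored alongside it, so a retry after a timeout sends an identical control block rather than a fresh identifier.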

A real-world example

A mid-market financial services firm runs Sage Intacct for vendor management and AP processing, and Databricks as its central analytics warehouse. Each month, the accounting team records vendor invoices, payments, and accruals in Intacct. Before the integration, the finance analytics team manually exported vendor and invoice reports from Intacct and imported them as CSV into Databricks, then wrote SQL to reconcile invoice dates, amounts, and aging. With Intacct and Databricks connected, each new vendor and invoice flows into Databricks automatically on a nightly schedule. The analytics team queries the previous night's Intacct data directly in Databricks for aging analysis, cash flow forecasts, and vendor spend trends without re-keying or waiting for manual exports.

What you can do

Extract vendor master data and AP bill details from Intacct and load them into Databricks tables on a schedule tied to your accounting cycle.
Map Intacct GL accounts and dimensions into Databricks for accounting hierarchy and cost center analysis.
Refresh Intacct sessions automatically and parse application-level XML errors so that errors inside HTTP 200 responses are not silently ignored.
Authenticate Intacct with session-based credentials and Databricks with OAuth2 Service Principal, storing both encrypted.
Sync read-only financial records without writing back into Intacct, keeping the source system as the single source of truth.

Connect Sage Intacct and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.