Oracle NetSuite and Databricks integration
Oracle NetSuite runs your financials and operational records. Databricks ingests and transforms data for analytics. Connecting the two moves your accounts payable (AP) and general ledger (GL) records from NetSuite into Databricks on a schedule you define, so your finance team can analyze spending, reconcile invoices across systems, and audit transaction flows without manual exports and re-keying. The two systems use different APIs and authentication flows; ml-connector bridges them.
What moves between them
Invoice records, purchase orders, and GL account transactions flow from Oracle NetSuite into Databricks. Vendor bills and invoices are read via SuiteQL queries on a schedule (typically daily or weekly) and loaded into Databricks tables under a finance catalog. GL transactions and account balances are appended to a transactions table for audit and analysis. Account and vendor master records are synced weekly to keep dimension tables current. All data is append-only in Databricks; historical records are preserved for audit trails.
How ml-connector handles it
ml-connector uses OAuth 2.0 Client Credentials to authenticate to NetSuite and retrieves records via SuiteQL queries rather than relying on Event Subscriptions, since those webhooks lack HMAC signatures and require IP allowlists. The NetSuite OAuth certificate is stored encrypted and presented on each request. NetSuite access tokens expire after 60 minutes and Databricks tokens after 3600 seconds (also one hour); ml-connector obtains a new token on each side before the current one expires. On the Databricks side, ml-connector creates and manages tables within a finance catalog, maps NetSuite GL accounts and vendors to Databricks dimension tables, and appends transaction records to fact tables. Because Databricks has no native webhooks for data writes, all loads use SQL inserts or batch upserts. ml-connector tracks the last-synced timestamp for each query so repeat runs skip already-ingested records, and every record carries a full audit trail: the source query timestamp, the NetSuite internal ID, and the sync run date.
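The last-synced timestamp tracking and audit fields described above can be sketched roughly as follows. This is a minimal illustration, not ml-connector's actual code: the `SyncState` class, the `fetch` callable, the audit field names, and the `transaction` / `lastmodifieddate` names in the query text are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class SyncState:
    """Hypothetical per-query watermark: the newest timestamp already loaded."""
    last_synced: datetime


def build_query(last_synced: datetime) -> str:
    # Illustrative SuiteQL text; table and column names are assumptions.
    # The timestamp comes from internal sync state, not from user input.
    stamp = last_synced.strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SELECT id, lastmodifieddate FROM transaction "
        f"WHERE lastmodifieddate > TO_DATE('{stamp}', 'YYYY-MM-DD HH24:MI:SS') "
        "ORDER BY lastmodifieddate"
    )


def run_sync(
    state: SyncState,
    fetch: Callable[[str], List[Dict]],
    run_started: datetime,
) -> List[Dict]:
    """Return only rows newer than the watermark, each wrapped with audit fields."""
    rows = fetch(build_query(state.last_synced))
    newest = state.last_synced
    loaded = []
    for row in rows:
        modified = row["lastmodifieddate"]
        # Defensive check: skip anything at or before the watermark,
        # even if the source returned it anyway.
        if modified <= state.last_synced:
            continue
        loaded.append({
            "netsuite_internal_id": row["id"],
            "source_query_timestamp": run_started.isoformat(),
            "sync_run_date": run_started.date().isoformat(),
            "payload": row,
        })
        newest = max(newest, modified)
    # Advance the watermark so the next run skips these rows.
    state.last_synced = newest
    return loaded
```

In this sketch the watermark only advances after the rows have been collected, so a run that fails partway leaves the previous timestamp in place and the same window is read again on the next run.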
A real-world example
A mid-sized professional services firm runs Oracle NetSuite for accounting and procurement across multiple offices, and uses Databricks to build a data lake for finance analytics, project costing, and vendor spend analysis. Before the integration, finance analysts exported invoice and GL data from NetSuite monthly in CSV format, transformed it manually in Excel, then loaded it into Databricks for analysis. The process took two to three days and introduced transcription errors. With NetSuite and Databricks connected, invoice and GL records flow automatically on a weekly schedule into Databricks tables, historical data is preserved for trend analysis, and the finance team can run spend reports and cost allocation analyses the same week payables close. The monthly export-and-transform cycle is eliminated, and reconciliation between NetSuite and Databricks is deterministic.
What you can do
- Sync vendor bills, invoices, and purchase orders from Oracle NetSuite into Databricks tables on a scheduled interval for spend analysis and audit.
- Load GL account transactions and balances into Databricks fact and dimension tables for general ledger analytics and reconciliation.
- Authenticate Oracle NetSuite with OAuth 2.0 Client Credentials and a certificate, and Databricks with OAuth 2.0 at the workspace or account level.
- Poll SuiteQL queries to retrieve historical and incremental invoice and transaction records, with timestamp tracking to avoid duplicate loads.
- Preserve full audit trails on every record, including NetSuite internal IDs, sync timestamps, and source query details for compliance and root-cause analysis.
Questions
- Why does ml-connector use SuiteQL polling instead of NetSuite Event Subscriptions?
- NetSuite Event Subscriptions lack native HMAC signatures and require IP allowlists for security, making them less reliable for a multi-tenant platform like ml-connector. Polling via SuiteQL gives ml-connector full control over retry logic, timestamp tracking, and audit trails, and it handles bulk reads and historical backfills without relying on webhook push delivery.
- How does ml-connector handle token expiry on both sides?
- NetSuite OAuth tokens expire after 60 minutes and come with no refresh token, so ml-connector requests a new token before the current one expires. Databricks bearer tokens expire after 3600 seconds (also one hour), and ml-connector renews them automatically as part of each request cycle. Both operations use the stored client credentials and do not interrupt the sync schedule.
- What happens to historical data when a new invoice is loaded into Databricks?
- All data loads are append-only into Databricks tables. ml-connector tracks the last-synced timestamp for each SuiteQL query, so repeat runs load only new and updated records, avoiding duplicates. Historical records remain in the table, giving the finance team a complete audit trail and the ability to run trend reports across multiple sync periods.
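As a rough illustration of the token handling described in the answers above, the following sketch caches a token and requests a new one shortly before it expires. It is an assumption-laden example, not ml-connector's implementation: `TokenCache`, `fetch_token`, and the five-minute safety margin are hypothetical, and the real call to the NetSuite or Databricks token endpoint is left to the caller-supplied function.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and replaces it before it expires.

    fetch_token is a caller-supplied function (hypothetical here) that
    performs the client-credentials request and returns
    (token, lifetime_in_seconds).
    """

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        margin_seconds: float = 300.0,  # assumed safety margin before expiry
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Request a new token if none is cached or the cached one is
        # within the safety margin of its expiry time.
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Because there is no refresh token on the NetSuite side, "refresh" here simply means running the client-credentials request again; the same cache shape works for either system with a different `fetch_token`.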
Connect Oracle NetSuite and Databricks
Free to use. Add your credentials, ping your real systems, and see if we fit.
Get started