ml-connector

Oracle PeopleSoft and Databricks integration

Oracle PeopleSoft manages finance, procurement, and HCM across your organization. Databricks powers analytics and reporting on cloud infrastructure. Connecting the two moves transactional data from PeopleSoft into Databricks tables, where finance teams can query, aggregate, and report on vendor activity, customer records, journal entries, and payroll without manual exports. ml-connector handles the self-hosted nature of PeopleSoft, the credential management across both platforms, and the polling schedule.

How Oracle PeopleSoft works

Oracle PeopleSoft is a self-hosted ERP and HCM platform. Each customer operates their own environment, either on their own servers or on Oracle Cloud Infrastructure. PeopleSoft exposes finance and HR data through REST listening connectors at customer-specific hostnames and port numbers, with endpoints for vendors, customers, purchase orders, journal entries, and employee records. Authentication uses HTTP Basic Auth (OPRID and password) on all PeopleTools versions, or OAuth2 Bearer tokens on PeopleTools 8.58 and later. PeopleSoft has no webhooks, so data must be polled via REST, with date-range filters keeping each request small enough to avoid overloading the system. Because each customer has a unique hostname and node configuration, there is no Oracle-managed sandbox; testing requires access to the customer's own development environment or to images deployed via Cloud Manager.
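
As a rough sketch of how a client might address one customer-specific environment, the Python below builds a REST URL and a Basic Auth header. The /PSIGW/RESTListeningConnector/<node>/<service>.v1 layout follows the usual Integration Broker convention, but the host, port, node, service name (VENDOR_GET), and the from_date/to_date query parameters are hypothetical placeholders; your own service operations define the real URI templates.

```python
import base64
from urllib.parse import urlencode


def peoplesoft_url(host: str, port: int, node: str, service: str, params: dict) -> str:
    """Build a REST listening connector URL for one customer environment."""
    base = f"https://{host}:{port}/PSIGW/RESTListeningConnector/{node}/{service}.v1"
    return f"{base}?{urlencode(params)}" if params else base


def basic_auth_header(oprid: str, password: str) -> dict:
    """Encode an OPRID and password as an HTTP Basic Auth header."""
    token = base64.b64encode(f"{oprid}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Hypothetical values for illustration only.
url = peoplesoft_url(
    "psft.example.internal", 8443, "PSFT_EP", "VENDOR_GET",
    {"from_date": "2024-01-01", "to_date": "2024-01-02"},
)
headers = basic_auth_header("INTEG_USER", "secret")
```

Keeping the host, port, and node as parameters rather than constants reflects the fact that no two PeopleSoft environments share an address.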

How Databricks works

Databricks is a cloud data platform built on Apache Spark and Delta Lake, accessed via REST APIs over HTTPS with workspace-specific base URLs that vary by cloud provider and workspace ID. API calls are authenticated with OAuth2 client credentials for a service principal: the client obtains a bearer token from the workspace-level OIDC endpoint. The token expires after 3600 seconds, so a fresh one must be obtained once it lapses. Databricks provides SQL tables, SQL warehouses, and schema objects for storing structured data. Unlike finance ERPs, Databricks has no native finance objects such as invoices or GL accounts; in this integration it serves mainly as the data destination. REST calls that create or alter tables are metadata operations; the rows themselves are written through SQL statements or Spark jobs, which ml-connector can trigger once the metadata is in place.
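
The client-credentials exchange can be sketched as follows. This example only prepares the HTTP request and does not send it. It assumes the workspace token endpoint lives at /oidc/v1/token and accepts the all-apis scope; the workspace URL, client ID, and secret are placeholders, and build_token_request is a hypothetical helper name.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_token_request(workspace_url: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Prepare (but do not send) an OAuth2 client-credentials token request."""
    body = urlencode({"grant_type": "client_credentials", "scope": "all-apis"}).encode("ascii")
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{workspace_url.rstrip('/')}/oidc/v1/token",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder workspace and credentials.
req = build_token_request("https://example.cloud.databricks.com/", "my-client-id", "my-secret")
```

Passing req to urllib.request.urlopen would perform the exchange; the JSON response is expected to carry the access token and its lifetime in seconds.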

What moves between them

Data flows from Oracle PeopleSoft into Databricks tables. ml-connector polls PeopleSoft REST endpoints on a schedule you define, extracting vendors, customers, purchase orders, journal entries, and payroll records with date-range filters to stay current. Each record is transformed into Databricks SQL table format and loaded into schemas you specify. Reference data such as vendor and customer master records can be synced in both directions if you need to control hierarchies or attributes from Databricks back to PeopleSoft. The sync runs on your schedule, not on PeopleSoft events, because PeopleSoft has no webhook capability.
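
One way to picture schedule-driven, date-range polling is a watermark that advances with each successful sync. The sketch below is illustrative rather than ml-connector's actual logic: next_window and catch_up are hypothetical helpers, and the one-day cap on each window is an assumed default.

```python
from datetime import datetime, timedelta, timezone


def next_window(last_synced_to, now, max_span=timedelta(days=1)):
    """Return the (start, end) range for the next poll, or None if nothing is due.

    The window starts where the previous one ended and is capped at max_span,
    so a long outage is caught up in several small requests.
    """
    if last_synced_to >= now:
        return None
    return last_synced_to, min(now, last_synced_to + max_span)


def catch_up(last_synced_to, now, max_span=timedelta(days=1)):
    """List every window needed to bring the watermark up to now."""
    windows = []
    cursor = last_synced_to
    while (window := next_window(cursor, now, max_span)) is not None:
        windows.append(window)
        cursor = window[1]  # advance the watermark to the end of this window
    return windows


last_sync = datetime(2024, 1, 1, tzinfo=timezone.utc)
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
windows = catch_up(last_sync, now)
```

Each window's start equals the previous window's end, so no interval is skipped; whether a record stamped exactly on a boundary is returned once or twice depends on how the PeopleSoft endpoint treats inclusive bounds.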

How ml-connector handles it

ml-connector accepts the self-hosted PeopleSoft hostname, port, and node name for each customer, then authenticates with HTTP Basic Auth or OAuth2 depending on the PeopleTools version you run. It fetches data from PeopleSoft REST endpoints using date-range filters to retrieve only records created or changed since the last sync, avoiding large full refreshes. On the Databricks side, ml-connector obtains an OAuth2 bearer token from your workspace OIDC endpoint using client credentials, stores the workspace URL and the catalog and schema names you provide, and writes the transformed records to Databricks SQL tables. When the Databricks token expires after one hour, ml-connector automatically refreshes it before the next request. Because PeopleSoft REST endpoints return loosely structured result sets and Databricks requires defined schemas, ml-connector maps source fields to the table columns you define, handles data type conversions, and preserves audit information on every load. If a sync fails partway through, ml-connector logs the error for review and retries the failed batch on the next cycle.
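
The field mapping and type conversion step might look something like this minimal sketch. The mapping table, the source field names, the target column names, and the _loaded_at and _source_system audit columns are all hypothetical; a real mapping comes from the columns you define.

```python
from datetime import date, datetime, timezone
from decimal import Decimal

# Hypothetical mapping: source field -> (target column, converter).
VENDOR_MAPPING = {
    "VENDOR_ID": ("vendor_id", str),
    "NAME1": ("vendor_name", str.strip),
    "BALANCE_AMT": ("balance_amount", Decimal),
    "LAST_ACTIVITY_DT": ("last_activity_date", date.fromisoformat),
}


def map_record(source: dict, mapping: dict, loaded_at: datetime) -> dict:
    """Convert one source record into a typed row with audit columns."""
    row = {}
    for field, (column, convert) in mapping.items():
        raw = source.get(field)
        # Missing or empty values become NULLs instead of failing conversion.
        row[column] = None if raw in (None, "") else convert(raw)
    row["_loaded_at"] = loaded_at
    row["_source_system"] = "peoplesoft"
    return row


row = map_record(
    {"VENDOR_ID": "0000012345", "NAME1": " Acme Supply ", "BALANCE_AMT": "1250.50", "LAST_ACTIVITY_DT": ""},
    VENDOR_MAPPING,
    loaded_at=datetime(2024, 1, 2, tzinfo=timezone.utc),
)
```

Monetary amounts are parsed as Decimal rather than float here so that balances survive the conversion without binary rounding.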

A real-world example

A mid-sized diversified manufacturer runs Oracle PeopleSoft for procurement, finance, and payroll across multiple plants and a head office. Finance and supply chain teams need daily visibility into vendor spending, invoice volumes, and payment status, but today they export PeopleSoft reports by hand and paste them into spreadsheets for analysis. With PeopleSoft and Databricks connected, vendor, purchase order, and invoice data flows automatically from PeopleSoft into Databricks tables overnight. The finance team can then run SQL queries on current vendor balances, invoice aging, and spend by category without any manual work, and supply chain uses Databricks dashboards to monitor PO receipt rates and supplier performance as of the latest nightly load.

What you can do

  • Load Oracle PeopleSoft vendors, customers, purchase orders, and journal entries into Databricks SQL tables on a daily or weekly schedule.
  • Authenticate PeopleSoft with HTTP Basic Auth or OAuth2 depending on PeopleTools version, and Databricks with workspace-scoped OAuth2 credentials.
  • Use date-range filters on PeopleSoft queries to fetch only new and changed records, avoiding full refreshes and unnecessary load.
  • Transform and map PeopleSoft REST result fields to Databricks table columns, handling data type conversions and field validation.
  • Track every sync in audit logs with record counts, timestamps, and errors so you can replay failed batches and verify data completeness.
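
To illustrate the audit-and-replay idea from the last item, here is a small sketch in which a failed batch is recorded rather than aborting the whole sync. SyncAudit, run_batches, and the flaky_load stand-in are hypothetical names, not part of ml-connector's interface.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SyncAudit:
    started_at: datetime
    records_loaded: int = 0
    failed_batches: list = field(default_factory=list)  # (batch_id, error text)


def run_batches(batches: dict, load, audit: SyncAudit) -> SyncAudit:
    """Load each batch, recording counts and errors instead of aborting the sync."""
    for batch_id, rows in batches.items():
        try:
            load(rows)
            audit.records_loaded += len(rows)
        except Exception as exc:  # broad on purpose: every failure is kept for replay
            audit.failed_batches.append((batch_id, str(exc)))
    return audit


def flaky_load(rows):
    """Stand-in loader that rejects any batch containing a flagged row."""
    if any(r.get("bad") for r in rows):
        raise ValueError("schema mismatch")


batches = {"b1": [{"id": 1}, {"id": 2}], "b2": [{"id": 3, "bad": True}]}
audit = run_batches(batches, flaky_load, SyncAudit(started_at=datetime(2024, 1, 2, tzinfo=timezone.utc)))
# Batches to replay on the next cycle.
to_replay = {batch_id: batches[batch_id] for batch_id, _ in audit.failed_batches}
```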

Questions

Does Oracle PeopleSoft support webhooks so ml-connector can be notified of new records?
No. PeopleSoft has no webhooks for REST endpoints. ml-connector polls your PeopleSoft instance on a schedule you define, using date-range filters to fetch records created or changed since the last sync. This protects your self-hosted system from unexpected load spikes and keeps the polling cadence under your control.
How does ml-connector handle the fact that each PeopleSoft customer has a different hostname and port?
You provide the full hostname, port, and node name for your PeopleSoft instance when you set up the integration. ml-connector uses that information to build the correct REST endpoint URL for your environment. Because PeopleSoft is self-hosted with no Oracle-managed sandbox, testing uses your own development environment or Cloud Manager images.
What happens when the Databricks token expires after one hour?
ml-connector automatically refreshes the OAuth2 bearer token from your workspace OIDC endpoint before the next API call. You do not need to manually renew credentials; token refresh is handled transparently as part of each sync cycle.
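
Transparent token refresh is commonly implemented as a small cache that renews the token shortly before it expires. The sketch below shows that general pattern, not ml-connector's internals: TokenCache is a hypothetical class, the 60-second safety margin is an assumption, and fetch_token stands in for whatever performs the real OIDC request.

```python
import time


class TokenCache:
    """Reuse a bearer token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin_seconds=60):
        self._fetch_token = fetch_token  # returns (access_token, expires_in_seconds)
        self._clock = clock
        self._margin = margin_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is inside the safety margin.
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, expires_in = self._fetch_token()
            self._expires_at = now + expires_in
        return self._token


# Demonstration with a fake clock and a fake fetcher (no network involved).
fake_now = [0.0]
issued = []


def fake_fetch():
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1], 3600


cache = TokenCache(fake_fetch, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 3000.0
still_first = cache.get()
fake_now[0] = 3545.0
second = cache.get()
```

Injecting the clock and the fetch function keeps the refresh rule testable without waiting an hour or calling a live workspace.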

Connect Oracle PeopleSoft and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.