SAP S/4HANA and Snowflake integration

SAP S/4HANA runs your enterprise finance and procurement. Snowflake is the cloud data warehouse where finance teams query actuals, build analytics, and run reconciliations. Connecting the two loads source transactions from SAP directly into Snowflake on a schedule tied to your close calendar, so your data warehouse reflects the latest GL postings, supplier invoices, and purchase orders without re-keying or bulk file exports. Your finance team queries current data rather than month-old extracts, and your BI tools feed from a single source of truth.

How SAP S/4HANA works

SAP S/4HANA, in its Cloud Public, Cloud Private, and On-Premise editions, exposes suppliers, customers, purchase orders, invoices, GL accounts, cost centers, and GL line items through OData V2 and OData V4 REST APIs. Each SAP tenant has a unique API endpoint URL; the Cloud Public edition uses https://<tenant-id>-api.s4hana.ondemand.com/sap/opu/odata/sap/<SERVICE_NAME>/. Authentication uses OAuth 2.0 client credentials, with scopes defined per Communication Arrangement, a configuration object that an SAP admin must create before API access works. SAP has no native webhooks for external systems, so changes are read by polling, using OData filters on LastChangeDateTime or delta tokens obtained during the initial sync. GL accounts, cost centers, and GL line items are read-only; paid invoice status and payment details are maintained in SAP.
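
The polling approach described above can be sketched as a small URL builder. This is a minimal illustration, not ml-connector's actual code: the function name, the page size, and the entity set name in the usage note are hypothetical, and it assumes the service exposes LastChangeDateTime as an OData V2 DateTimeOffset field. Check the service metadata for the real field type before relying on the literal format.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode


def build_changed_since_url(base_url, entity_set, since, page_size=1000):
    """Build an OData V2 query URL for rows changed after `since`.

    `since` should be a timezone-aware datetime; it is converted to UTC.
    """
    # Assumed literal format: datetimeoffset'YYYY-MM-DDTHH:MM:SSZ' (OData V2).
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        "$filter": f"LastChangeDateTime gt datetimeoffset'{stamp}'",
        "$orderby": "LastChangeDateTime asc",
        "$top": str(page_size),
        "$format": "json",
    }
    # Keep "$" and "'" readable; encode spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote, safe="$'")
    return f"{base_url.rstrip('/')}/{entity_set}?{query}"
```

For example, calling build_changed_since_url(service_url, "A_SupplierInvoice", last_poll_time), where the entity set name is only illustrative, yields a URL whose $filter restricts the result to rows changed after the previous poll.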

How Snowflake works

Snowflake is a cloud data warehouse where connectors write financial records into user-defined tables through the SQL REST API, authenticating with Key Pair Authentication (an RSA-signed JWT). Snowflake has no built-in finance objects; entities such as invoices, GL accounts, and cost centers live in a schema you design. The warehouse persists data indefinitely and serves analytics, reporting, and reconciliation queries. Snowflake's SQL API is pull-only; it has no webhooks for external systems to call when data changes. Snowflake uses partition-based pagination and gzip compression, and requires a warehouse with AUTO_RESUME enabled and network policies that allow the connector's service account egress IP. Programmatic access tokens (PATs) expire after 1 to 365 days; Key Pair JWT tokens expire in 1 hour and are preferred for server-to-server integration.
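
As a rough sketch of the Key Pair JWT mentioned above, the claims can be assembled with the standard library alone. The function name and parameters are hypothetical, and the example stops before signing: producing the RS256 signature with the private key needs a JWT or cryptography library and is not shown here.

```python
import base64
import hashlib
import time


def snowflake_jwt_claims(account, user, public_key_der, lifetime_s=3600, now=None):
    """Return the (unsigned) claims for a Snowflake key-pair JWT.

    `public_key_der` is the DER-encoded public key as bytes.
    """
    if lifetime_s > 3600:
        raise ValueError("key-pair JWTs are valid for at most 1 hour")
    issued = int(time.time() if now is None else now)
    # Account and user names are upper-cased in the claims.
    qualified_user = f"{account.upper()}.{user.upper()}"
    # Fingerprint: "SHA256:" + base64 of the SHA-256 digest of the public key.
    digest = hashlib.sha256(public_key_der).digest()
    fingerprint = "SHA256:" + base64.b64encode(digest).decode("ascii")
    return {
        "iss": f"{qualified_user}.{fingerprint}",
        "sub": qualified_user,
        "iat": issued,
        "exp": issued + lifetime_s,
    }
```

Because the token lives for an hour at most, a connector would typically regenerate it shortly before expiry rather than store it.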

What moves between them

Supplier invoices, GL transactions, and purchase orders flow from SAP S/4HANA into Snowflake. ml-connector polls SAP on a schedule (typically daily or after month-end close), reads new and changed invoices and GL line items using OData filters, and inserts them into Snowflake tables with deduplication keys so reruns do not create duplicate rows. GL transactions are read-only in SAP, so ml-connector never writes GL postings back into SAP. Purchase orders and supplier master data (business partners) are synced to support invoice-to-PO matching and vendor reconciliation in Snowflake.

How ml-connector handles it

ml-connector stores the SAP OAuth client credentials and the Snowflake JWT private key encrypted and handles the distinct authentication flow on each side. SAP tokens are short-lived (typically 12 hours), so ml-connector caches the token and refreshes it before expiry to avoid mid-sync failures. The connector reads SAP OData with a LastChangeDateTime filter to fetch only the invoices and GL items changed since the last poll, avoiding full table scans. For initial loads, it requests a delta token where the service supports one, so that later runs can fetch only changes; otherwise it issues multiple filtered requests to load large datasets in batches. On the Snowflake side, ml-connector assumes a warehouse with AUTO_RESUME enabled and checks that the warehouse is online before executing inserts. SQL statements are parameterized and use Snowflake's COPY INTO command for bulk loads where possible, falling back to INSERT statements for smaller batches, with the MULTI_STATEMENT_COUNT parameter set when several statements are sent in one request. Deduplication is tracked with a composite key (vendor_id, invoice_number, line_number) so repeated connector runs do not double-post invoices. Network egress from the connector must be allowed in Snowflake's network policy, and retry logic uses exponential backoff when Snowflake returns HTTP 429 rate limit errors.
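
The exponential backoff mentioned above might look like the following sketch. It is an assumption-laden illustration rather than ml-connector's implementation: RateLimited is a hypothetical exception that the caller's own request function would raise on HTTP 429, and the attempt count, delays, and jitter strategy are arbitrary choices.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical: raised by the caller's function on an HTTP 429 response."""


def with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                 sleep=time.sleep):
    """Run `call`, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles each attempt: 1s, 2s, 4s, ... up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads retries so parallel workers do not retry in step.
            sleep(random.uniform(0, ceiling))
```

Retrying like this is only safe because the writes are keyed for deduplication, so a replayed batch does not create duplicate rows.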

A real-world example

A mid-sized industrial manufacturer runs SAP S/4HANA on premises with a complex vendor landscape and tight month-end close deadlines. The finance team currently exports AP registers and GL detail from SAP weekly via a batch job, loads them into an on-premise data warehouse, and manually reconciles invoice counts and GL totals against a shadow ledger kept in spreadsheets. The accounting manager spends three days at month-end matching what SAP says happened against what the data warehouse shows, often discovering data sync errors or incomplete uploads. With SAP S/4HANA and Snowflake connected, new supplier invoices and GL postings flow into Snowflake automatically on a daily schedule, tagged with their SAP document keys so they can be traced back. The finance team runs a reconciliation query in Snowflake each morning and sees invoices by vendor, cost center, and GL account with zero manual steps. Month-end close starts with the AP and GL accounts already validated and reconciled.

What you can do

  • Read supplier invoices, GL line items, and purchase orders from SAP S/4HANA on a polling schedule, using OData filters to fetch only changed records since the last run.
  • Write supplier invoice headers and line items into Snowflake tables with deduplication keys so reruns do not create duplicate rows.
  • Authenticate SAP S/4HANA with OAuth 2.0 client credentials stored per Communication Arrangement, refreshing short-lived tokens before expiry.
  • Authenticate Snowflake with Key Pair JWT authentication, whitelisting the connector service account IP in Snowflake network policies.
  • Handle OData pagination and gzip compression from SAP, and use Snowflake's COPY INTO command for efficient bulk loading of large GL and invoice datasets.
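
The token refresh behavior in the list above can be sketched as a small cache. The class name, the five-minute refresh margin, and the shape of fetch_token are assumptions made for illustration; the real token request goes to the tenant-specific OAuth endpoint and is not shown.

```python
import time


class TokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(self, fetch_token, refresh_margin_s=300, clock=time.monotonic):
        # fetch_token is a caller-supplied function (hypothetical) that
        # returns a tuple: (access_token, expires_in_seconds).
        self._fetch = fetch_token
        self._margin = refresh_margin_s
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a token that is valid for at least the refresh margin."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = self._clock() + expires_in
        return self._token
```

Calling get() before every request keeps the refresh logic in one place, so a long sync is less likely to fail midway on an expired token.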

Questions

Do I need to create a Communication Arrangement in SAP before the integration starts?
Yes. An SAP admin must create a Communication System, a Communication User, and a Communication Arrangement that grants OAuth client credentials and the OData scopes needed for invoices and GL queries. The OAuth token endpoint URL varies per tenant and must be copied from the Communication Arrangement OAuth details, not constructed manually. Without this setup, API calls return 401 Unauthorized.
What happens if Snowflake's warehouse is offline when ml-connector tries to write invoices?
ml-connector checks that the warehouse is online before executing any inserts. If the warehouse is offline or AUTO_RESUME is not enabled, ml-connector queues the request and retries after a delay. The operation is idempotent, so replaying is safe.
How does ml-connector prevent duplicate invoices when it runs multiple times?
Each invoice line is written with a composite deduplication key (vendor_id, invoice_number, line_number). If the same invoice is read from SAP in a second run, a MERGE statement matched on that key updates the existing row instead of inserting a new one, so the invoice is not written twice.
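
To make the deduplication answer concrete, here is one way a keyed MERGE statement could be generated. The function name and the table and column names in the usage note are hypothetical, and the statement assumes values are supplied as ? bind variables; this is a sketch of the technique, not the SQL ml-connector actually issues.

```python
import re

KEY_COLUMNS = ("vendor_id", "invoice_number", "line_number")
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_sql(table, columns, key_columns=KEY_COLUMNS):
    """Build a MERGE that upserts one row matched on the deduplication key."""
    # Identifiers are interpolated into the SQL text, so accept simple names only.
    for name in (table, *columns):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    missing = [k for k in key_columns if k not in columns]
    if missing:
        raise ValueError(f"key columns missing from row: {missing}")

    source = ", ".join(f"? AS {c}" for c in columns)
    on = " AND ".join(f"t.{k} = s.{k}" for k in key_columns)
    updates = ", ".join(f"t.{c} = s.{c}" for c in columns if c not in key_columns)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)

    parts = [f"MERGE INTO {table} AS t", f"USING (SELECT {source}) AS s", f"ON {on}"]
    if updates:  # skip the update clause when every column is part of the key
        parts.append(f"WHEN MATCHED THEN UPDATE SET {updates}")
    parts.append(f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})")
    return " ".join(parts)
```

For a table such as ap_invoice_lines with an extra amount column, the generated statement updates amount when the key already exists and inserts a new row otherwise, so rerunning the same batch leaves the row count unchanged.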

Connect SAP S/4HANA and Snowflake

Free to use. Add your credentials, ping your real systems, and see if we fit.

Get started