DATEV and TaxJar integration

DATEV runs the accounting ledger for German tax advisors and their clients. TaxJar calculates US sales tax and stores the order and refund transactions a merchant reports to state authorities. Connecting the two brings the sales tax TaxJar has already collected into DATEV as bookings, without anyone re-keying figures from a tax report. ml-connector reads TaxJar transactions on a schedule and submits them to DATEV as EXTF CSV jobs against the correct GL accounts and tax codes.

How DATEV works

DATEV does not expose a conventional REST API for bookings. A REST surface exists for client metadata (accounting:clients) and for document uploads to DATEV Unternehmen Online, or DUO (accounting:documents), but actual bookings are written by submitting EXTF-format CSV files to DATEV Rechnungswesen, or DXSO XML files to DUO, as asynchronous jobs and then polling the job status endpoint until each job completes. Authentication is OAuth 2.0 Authorization Code with PKCE through Login mit DATEV; there is no machine-to-machine flow, so a tax advisor or client must consent interactively, and access tokens last only 900 seconds (15 minutes). DATEV offers no webhooks, finalized bookings are write-only and cannot be read back, and the standard chart of accounts is not retrievable over the API.
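As an illustration of the PKCE step in that login flow, here is a minimal sketch using only the Python standard library. It follows the generic S256 method from RFC 7636 and is not DATEV-specific code: the function names, the authorize endpoint, the redirect URI, and the scope string are all placeholders the caller must supply, and the state length simply satisfies the 20-character minimum mentioned elsewhere on this page.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def challenge_for(verifier: str) -> str:
    """Derive the S256 code_challenge for a PKCE code_verifier (RFC 7636)."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # URL-safe base64 without padding, as RFC 7636 requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_authorize_request(authorize_endpoint: str, client_id: str,
                            redirect_uri: str, scope: str) -> dict:
    """Build an authorization URL plus the secrets to keep for the callback.

    All four arguments are caller-supplied placeholders (assumptions);
    nothing here is taken from DATEV's own documentation.
    """
    verifier = secrets.token_urlsafe(32)  # 43 characters, the RFC minimum
    state = secrets.token_urlsafe(24)     # 32 characters, above the 20 minimum
    nonce = secrets.token_urlsafe(24)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "nonce": nonce,
        "code_challenge": challenge_for(verifier),
        "code_challenge_method": "S256",
    })
    return {
        "url": f"{authorize_endpoint}?{query}",
        "code_verifier": verifier,  # sent later with the token request
        "state": state,             # compared against the callback value
        "nonce": nonce,
    }
```

The verifier, state, and nonce stay on the server side; only the challenge travels in the browser redirect, which is what lets the flow work without exposing a secret to the user agent.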

How TaxJar works

TaxJar exposes a REST JSON API over HTTPS, versioned with an x-api-version header and authenticated by a single API token, with no OAuth, client secret, or account id. It calculates sales tax in real time through POST /taxes and stores completed orders and refunds under /transactions, each keyed by a unique transaction_id, alongside customers with exemption types and the merchant's nexus regions. TaxJar publishes no webhooks, so the order and refund transaction lists are read by polling with from_date and to_date filters. Transaction endpoint calls do not count toward the monthly API threshold, and creating a transaction follows an upsert pattern: a POST that returns 422 falls back to a PUT, and a PUT that returns 404 falls back to a POST.
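The upsert fallback described above can be sketched as follows. This is an illustrative outline rather than TaxJar client code: `send` is a hypothetical transport callable that performs the HTTP request and returns the status code, and `collection_path` is whatever path the caller uses for the order or refund collection.

```python
from typing import Callable, Mapping, Tuple

# Hypothetical transport: send(method, path, payload) -> HTTP status code.
Send = Callable[[str, str, Mapping], int]


def upsert_transaction(send: Send, collection_path: str, txn: Mapping,
                       try_update_first: bool = False) -> Tuple[str, int]:
    """Create or update a transaction with a single fallback attempt.

    POST that returns 422 falls back to PUT; PUT that returns 404 falls
    back to POST. Returns the last method tried and its status code.
    """
    item_path = f"{collection_path}/{txn['transaction_id']}"
    # Each attempt names the status that triggers the fallback to the next one.
    attempts = [("POST", collection_path, 422), ("PUT", item_path, 404)]
    if try_update_first:
        attempts.reverse()

    method, status = "", -1
    for method, path, fallback_status in attempts:
        status = send(method, path, txn)
        if status != fallback_status:
            break  # success, or an error the fallback cannot fix
    return method, status
```

Limiting the sequence to two attempts keeps a misbehaving endpoint from bouncing the request between POST and PUT indefinitely.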

What moves between them

The flow runs from TaxJar into DATEV. ml-connector pulls finalized order and refund transactions from TaxJar over a chosen date window and turns each into DATEV booking rows, carrying the collected sales tax, net amount, document number, and posting date, with refunds posted as reversing entries. Those rows are assembled into an EXTF CSV file and submitted to DATEV as an asynchronous job, mapped to the matching DATEV GL accounts, tax codes, and cost centers. Cadence is scheduled rather than event-driven, because neither system emits webhooks, so the poll of TaxJar and the submission to DATEV run on the interval you set. DATEV bookings are write-only, so ml-connector never reads posted journals back into TaxJar.
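To make the shape of that transformation concrete, here is a simplified sketch of turning one transaction into one booking row. It assumes the input is a dict with `transaction_id`, `transaction_date`, `amount` (taken here to be the pre-tax total), and `sales_tax`. The `BookingRow` type and its field names are hypothetical; they stand in for the real EXTF column layout, which DATEV defines and which is not reproduced here. The GL account and tax code come from configured mapping, not from the API.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class BookingRow:
    """Illustrative intermediate row; NOT the actual EXTF column set."""
    document_number: str
    posting_date: date
    net_amount: Decimal
    tax_amount: Decimal
    gl_account: str
    tax_code: str
    reversal: bool  # True for refunds, which post as reversing entries


def to_booking_row(txn: dict, is_refund: bool,
                   gl_account: str, tax_code: str) -> BookingRow:
    """Map one order or refund transaction onto a booking row."""
    return BookingRow(
        document_number=str(txn["transaction_id"]),
        # Accept either a plain date or an ISO timestamp by keeping the date part.
        posting_date=date.fromisoformat(str(txn["transaction_date"])[:10]),
        # Amounts are stored unsigned; the reversal flag carries the direction,
        # whichever sign convention the source uses for refunds.
        net_amount=abs(Decimal(str(txn["amount"]))),
        tax_amount=abs(Decimal(str(txn["sales_tax"]))),
        gl_account=gl_account,
        tax_code=tax_code,
        reversal=is_refund,
    )
```

Using `Decimal` built from strings avoids the rounding drift that binary floats would introduce into ledger amounts.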

How ml-connector handles it

ml-connector stores both credential sets encrypted. For DATEV it runs the OAuth 2.0 Authorization Code flow with PKCE (S256, a state value of at least 20 characters, and a nonce), sends the Bearer token and the X-DATEV-Client-Id header on every call, and refreshes the 15-minute access token using client_id alone, never the client secret. For TaxJar it sends the API token on each request and pins a version with the x-api-version header.

Because both sides are pull-only, it polls the TaxJar order and refund lists by date window and, after submitting an EXTF job to DATEV, polls the job status endpoint with exponential backoff and jitter until the job reports complete or failed. TaxJar sales tax fields and product tax codes are mapped to DATEV tax codes and GL accounts up front, so every booking row lands on an account that already exists; the DATEV chart of accounts cannot be read over the API, so that mapping is configured, not discovered.

EXTF files are written as UTF-8 with precomposed (NFC) characters and deterministic filenames. DATEV silently rejects non-precomposed text and detects duplicate files by filename and document type, so a stable filename makes a retried submission safe. On the TaxJar read side, the transaction_id serves as the natural dedup key, so the same order is never posted twice. Every record carries a full audit trail and can be replayed if a DATEV job fails.
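The job-status polling with exponential backoff and jitter can be sketched like this. It is a generic illustration, not ml-connector's actual implementation: `get_status` is a hypothetical callable that queries the job status endpoint, the status strings "complete" and "failed" are placeholder values, and the delay parameters are arbitrary defaults.

```python
import random
import time
from typing import Callable


def poll_job(get_status: Callable[[], str], *,
             base_delay: float = 1.0, max_delay: float = 60.0,
             max_attempts: int = 10,
             sleep: Callable[[float], None] = time.sleep,
             rng: Callable[[], float] = random.random) -> str:
    """Poll until the job reaches a terminal state or attempts run out.

    The wait before each retry is drawn uniformly from [0, cap), where cap
    doubles per attempt up to max_delay ("full jitter").
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status in ("complete", "failed"):  # placeholder terminal states
            return status
        cap = min(max_delay, base_delay * 2 ** attempt)
        sleep(rng() * cap)
    raise TimeoutError(f"job still pending after {max_attempts} polls")
```

Injecting `sleep` and `rng` keeps the function testable without real waiting, and the randomized delay spreads out retries when many jobs are polled at once.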

A real-world example

A mid-sized e-commerce seller of consumer goods is registered in Germany and keeps its books with a DATEV tax advisor, while selling into several US states through an online store that uses TaxJar to calculate and report sales tax. Before the integration, the bookkeeper exported a TaxJar transaction report each month and hand-entered the collected sales tax and net sales into DATEV, lining up tax codes and accounts by eye and correcting the inevitable transposition errors at close. With DATEV and TaxJar connected, each batch of finalized orders and refunds flows into DATEV as EXTF bookings on a schedule, mapped to the right tax codes and GL accounts, with refunds reversed automatically. The monthly re-keying step disappears and the sales tax already ties out when the advisor reviews the ledger.

What you can do

  • Post TaxJar order and refund transactions into DATEV as EXTF CSV bookings on a schedule.
  • Carry the sales tax TaxJar collected onto DATEV booking rows with the correct tax codes and GL accounts.
  • Submit DATEV jobs asynchronously and poll job status with backoff until each one completes or fails.
  • Bridge DATEV's OAuth 2.0 and PKCE login with TaxJar's static API token, refreshing the short-lived DATEV token automatically.
  • Deduplicate by TaxJar transaction_id and use stable EXTF filenames so retried submissions never double-post.
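As a rough illustration of the last point, the sketch below shows NFC normalization, deduplication by transaction_id, and a filename derived deterministically from a batch's contents. The filename scheme and the function names are hypothetical and chosen only for this example; DATEV's own naming requirements are not reproduced here.

```python
import hashlib
import unicodedata
from typing import Iterable, List


def to_nfc(text: str) -> str:
    """Return text with precomposed (NFC) characters."""
    return unicodedata.normalize("NFC", text)


def dedupe_by_transaction_id(transactions: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each transaction_id, preserving order."""
    seen = {}
    for txn in transactions:
        seen.setdefault(txn["transaction_id"], txn)
    return list(seen.values())


def batch_filename(transaction_ids: Iterable[str],
                   window_start: str, window_end: str) -> str:
    """Derive a stable filename from the date window and the batch contents.

    The same set of ids over the same window always yields the same name,
    regardless of input order, so a retry resubmits under an identical name.
    The naming pattern itself is an assumption made for this sketch.
    """
    joined = "\n".join(sorted(transaction_ids))
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
    return f"taxjar_{window_start}_{window_end}_{digest}.csv"
```

Hashing the sorted ids means the filename changes only when the batch contents change, which is what lets filename-based duplicate detection protect a retried submission.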

Questions

Which direction does data move between DATEV and TaxJar?
The flow is TaxJar into DATEV. Finalized order and refund transactions, with their collected sales tax, move from TaxJar into DATEV as EXTF booking rows. DATEV bookings are write-only and cannot be read back over the API, so ml-connector never writes financial entries from DATEV into TaxJar.

How does the integration handle the very different authentication on each side?
DATEV uses OAuth 2.0 Authorization Code with PKCE, so a tax advisor or client consents interactively through Login mit DATEV, and the access token lasts only 900 seconds. ml-connector stores that grant, sends the Bearer token and X-DATEV-Client-Id header, and refreshes using client_id alone. TaxJar uses a single static API token sent on every request, which ml-connector stores encrypted, so the two credential models are bridged without manual steps each run.

Why does the sync run on a schedule instead of in real time?
Neither system pushes events. DATEV has no outbound webhooks and TaxJar publishes none either, so both are read by polling. ml-connector pulls TaxJar transactions over a date window, submits them to DATEV as an asynchronous job, and polls the DATEV job status until it completes, all on the interval you set.

Connect DATEV and TaxJar

Free to use. Add your credentials, ping your real systems, and see if we fit.
