TallyPrime and Databricks integration

TallyPrime runs accounting and inventory management on the desktop; Databricks provides the cloud data platform for analytics and reporting. Connecting the two keeps your accounting records synchronized with your data warehouse without manual exports. Ledgers, vouchers, and inventory masters flow from TallyPrime into Databricks tables on a polling schedule, and your finance team can query, transform, and analyze the data in place without re-keying or intermediate staging.

How TallyPrime works

TallyPrime is a desktop accounting application that exposes ledgers, groups, vouchers (purchase, sales, payment, receipt, purchase order), and stock items as XML or JSON over HTTP POST to a local server on port 9000. The server is not enabled by default and must be turned on manually in TallyPrime Settings. The transport layer itself requires no credentials: each request identifies its target by the TallyPrime host IP, port, and company name, plus a company password if the target company is protected. TallyPrime has no webhooks or event streams, so changes must be discovered by polling the Day Book or specific collections with date range filters. Port 9000 is reachable only on the local network, so a local agent on the same machine or LAN as TallyPrime is needed to bridge requests to cloud connectors.
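As a rough illustration of what one polling request could look like, the Python sketch below builds an Export Data envelope for the Day Book and wraps it in an HTTP POST aimed at port 9000. The element layout follows the commonly documented TallyPrime XML envelope and should be checked against your TallyPrime version; the function names `build_day_book_request` and `make_post` are hypothetical and are not part of ml-connector.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_day_book_request(company: str, from_date: str, to_date: str) -> bytes:
    """Build an Export Data envelope for the Day Book (dates as YYYYMMDD).

    The envelope layout is an assumption; verify it against your TallyPrime version.
    """
    envelope = ET.Element("ENVELOPE")
    header = ET.SubElement(envelope, "HEADER")
    ET.SubElement(header, "TALLYREQUEST").text = "Export Data"
    body = ET.SubElement(envelope, "BODY")
    export = ET.SubElement(body, "EXPORTDATA")
    desc = ET.SubElement(export, "REQUESTDESC")
    ET.SubElement(desc, "REPORTNAME").text = "Day Book"
    static = ET.SubElement(desc, "STATICVARIABLES")
    # The company name is case-sensitive and must match the loaded company.
    ET.SubElement(static, "SVCURRENTCOMPANY").text = company
    ET.SubElement(static, "SVFROMDATE").text = from_date
    ET.SubElement(static, "SVTODATE").text = to_date
    # ElementTree escapes special characters such as "&" in the company name.
    return ET.tostring(envelope, encoding="utf-8")


def make_post(host: str, payload: bytes, port: int = 9000) -> urllib.request.Request:
    """Wrap the payload in an HTTP POST to the local TallyPrime server."""
    return urllib.request.Request(
        url=f"http://{host}:{port}",
        data=payload,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(request, timeout=30)` would send it, but only from a machine on the same LAN and only while TallyPrime is running with its server enabled and the target company loaded.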

How Databricks works

Databricks is a cloud data intelligence platform accessed through REST APIs at workspace-specific URLs (workspace-id.cloud.databricks.com for AWS, workspace-name.azuredatabricks.net for Azure, workspace-id.gcp.databricks.com for GCP). Authentication uses the OAuth 2.0 client credentials flow with a service principal's client_id and client_secret; tokens expire after 3600 seconds, so clients need refresh logic. Unity Catalog organizes data into catalogs, schemas, and tables for governance, but Databricks acts purely as a data warehouse here, with no native finance objects or GL account hierarchy. Data is written to tables through SQL or Spark; REST API writes are metadata-only. Webhooks exist only for the MLflow Model Registry and do not cover cluster or table events.
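The refresh logic for a 3600-second token can be sketched as a small cache that requests a new token shortly before the old one expires. This is a minimal Python sketch, not the connector's actual implementation: `TokenCache`, the `fetch` callable, and the 300-second safety margin are all hypothetical. A real `fetch` would post the service principal's client_id and client_secret to the workspace token endpoint and return the token with its lifetime.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical fetcher signature: returns (access_token, expires_in_seconds).
TokenFetcher = Callable[[], Tuple[str, int]]


class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch: TokenFetcher,
        refresh_margin: int = 300,  # assumed margin; tune to your write cadence
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._margin = refresh_margin
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one inside the refresh margin."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

Injecting the clock and the fetcher keeps the expiry logic testable without network access or real waiting.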

What moves between them

The primary flow is TallyPrime to Databricks. Ledgers (accounts), groups (dimensions), vouchers (purchase, sales, payment, receipt, and purchase order transactions), and stock items are polled from TallyPrime on a configurable schedule and appended to Databricks tables. The connector queries TallyPrime's Day Book or specific collections using date range filters (YYYYMMDD format) and compares the returned records against the last synchronized state to detect new and modified records. Scheduling can align with your accounting period close cycle or run at shorter intervals, with 5 to 15 minutes as the practical minimum, depending on your analytics requirements. No data flows back from Databricks to TallyPrime; the integration is read-only on the TallyPrime side.
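One way to compare polled records against the last synchronized state is to keep a content fingerprint per record ID. The Python sketch below assumes each record has already been parsed into a dict with an `id` field; the helper names and the SHA-256 fingerprint are illustrative choices, not a description of ml-connector's internals.

```python
import hashlib
import json
from datetime import date
from typing import Dict, Iterable, List, Tuple


def tally_date(d: date) -> str:
    """Format a date as YYYYMMDD for a date range filter."""
    return d.strftime("%Y%m%d")


def fingerprint(record: dict) -> str:
    """Stable hash of a record's content, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(
    last_state: Dict[str, str],
    records: Iterable[dict],
    id_field: str = "id",  # hypothetical field name for the record ID
) -> Tuple[List[dict], List[dict], Dict[str, str]]:
    """Split polled records into new and modified, and return the updated state."""
    new: List[dict] = []
    modified: List[dict] = []
    next_state = dict(last_state)
    for record in records:
        key = str(record[id_field])
        digest = fingerprint(record)
        if key not in last_state:
            new.append(record)
        elif last_state[key] != digest:
            modified.append(record)
        next_state[key] = digest
    return new, modified, next_state
```

Unchanged records fall through both branches and are skipped, so only new and modified rows are written on each poll. A scheme like this does not detect deletions, which would need separate handling.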

How ml-connector handles it

ml-connector requires a local agent running on the same machine or LAN as TallyPrime to access port 9000 and bridge HTTP requests to the cloud connector service. The agent addresses each Export Data request to the TallyPrime host IP on port 9000 and sets the company name (required, case-sensitive) in SVCURRENTCOMPANY, with the optional company password in the envelope's authentication fields. TallyPrime must be running with the target company loaded for API calls to succeed. The connector polls with date range filters (SVFROMDATE, SVTODATE) to retrieve ledgers, groups, and vouchers from the Day Book or specific collections, and compares record IDs against the last synchronized state to detect new and modified entries. TallyPrime returns all matching records in a single response with no pagination, so the response size depends on the date range selected.

On the Databricks side, the connector uses OAuth 2.0 Service Principal credentials (client_id and client_secret) to obtain a 3600-second bearer token from the workspace token endpoint, and refreshes the token before expiry to maintain continuous writes. Records are inserted into Databricks tables as raw TallyPrime XML or as JSON-parsed rows; downstream SQL or Spark transformations can normalize and model the data for analysis. The connector tracks the last synchronized timestamp per table to avoid re-processing older records, and retries on transient failures.
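To make the "raw XML or parsed rows" idea concrete, the Python sketch below flattens voucher elements from a response into row dicts that keep both a few parsed fields and the original XML fragment. The sample response is invented for illustration, and the element and attribute names (VOUCHER, REMOTEID, VCHTYPE, DATE, VOUCHERNUMBER) are assumptions about the payload shape that should be confirmed against real TallyPrime output.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Invented sample; real responses carry many more fields per voucher.
SAMPLE_RESPONSE = """<ENVELOPE>
  <BODY>
    <TALLYMESSAGE>
      <VOUCHER REMOTEID="abc-001" VCHTYPE="Sales">
        <DATE>20240401</DATE>
        <VOUCHERNUMBER>1</VOUCHERNUMBER>
      </VOUCHER>
    </TALLYMESSAGE>
    <TALLYMESSAGE>
      <VOUCHER REMOTEID="abc-002" VCHTYPE="Payment">
        <DATE>20240402</DATE>
        <VOUCHERNUMBER>7</VOUCHERNUMBER>
      </VOUCHER>
    </TALLYMESSAGE>
  </BODY>
</ENVELOPE>"""


def vouchers_to_rows(xml_text: str) -> List[Dict[str, str]]:
    """Flatten every VOUCHER element into a row, keeping the raw XML alongside."""
    root = ET.fromstring(xml_text)
    rows: List[Dict[str, str]] = []
    # iter() finds vouchers at any depth, so the wrapper elements can vary.
    for voucher in root.iter("VOUCHER"):
        rows.append(
            {
                "remote_id": voucher.get("REMOTEID", ""),
                "voucher_type": voucher.get("VCHTYPE", ""),
                "voucher_date": voucher.findtext("DATE", default=""),
                "voucher_number": voucher.findtext("VOUCHERNUMBER", default=""),
                "raw_xml": ET.tostring(voucher, encoding="unicode"),
            }
        )
    return rows


rows = vouchers_to_rows(SAMPLE_RESPONSE)
```

Keeping `raw_xml` next to the parsed columns lets downstream SQL or Spark jobs re-derive fields later without polling TallyPrime again.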

A real-world example

A mid-sized accounting firm in India uses TallyPrime across multiple company books to manage its own accounting and to provide bookkeeping services to small business clients. The firm needs to consolidate all client ledgers into a central data warehouse for cross-client analytics, compliance reporting, and audit trails, but exporting from TallyPrime by hand and loading into its Databricks warehouse is a monthly manual process that takes days and introduces errors. With TallyPrime and Databricks connected, the firm's ledgers, vouchers, and inventory masters sync automatically on a configurable schedule (daily, weekly, or by accounting period). Databricks tables become the single source of truth for aggregated reporting, audit validation, and client invoicing, eliminating the export and load delay.

What you can do

  • Sync ledgers, groups, and vouchers from TallyPrime to Databricks tables on a configurable polling schedule that aligns with your accounting close cycle.
  • Deploy a local network agent to securely bridge TallyPrime's port 9000 to the cloud connector, requiring no changes to TallyPrime itself.
  • Store TallyPrime company names, passwords, and HTTP credentials encrypted, and refresh Databricks OAuth tokens automatically before expiry.
  • Detect new and modified records in TallyPrime using date range filters, comparing against the last synchronized state to avoid re-processing.
  • Load TallyPrime data as raw XML or JSON-parsed rows into Databricks tables for downstream SQL and Spark transformations.

Questions

What records flow from TallyPrime to Databricks?
Ledgers (accounting accounts), groups (organizational and analytical dimensions), vouchers across all categories (purchase, sales, payment, receipt, purchase order), and stock items (inventory masters) flow from TallyPrime to Databricks. The connector polls TallyPrime's Day Book or specific collections using date range filters and syncs records to Databricks tables on a schedule you control.
Why does TallyPrime integration require a local agent?
TallyPrime's HTTP server on port 9000 is LAN-only and not accessible over the internet, so a local agent running on the same machine or network as TallyPrime must be deployed to bridge the requests. The agent handles the HTTP POST operations to TallyPrime and relays the XML or JSON responses to the cloud connector.
How does the connector handle TallyPrime company passwords and Databricks authentication?
TallyPrime company names and optional passwords are stored encrypted in the connector's secure credential vault. Databricks authentication uses OAuth 2.0 Service Principal with a client_id and client_secret to obtain bearer tokens that expire in 3600 seconds; the connector refreshes the token automatically before expiry to ensure continuous data writes.

Connect TallyPrime and Databricks

Free to use. Add your credentials, ping your real systems, and see if we fit.
