Insights

The Commerce Data API Layer: What Developers Keep Getting Wrong

Daniel Nguyen 22 April 2024

Anyone who has tried to build a product that aggregates commerce data across multiple platforms will tell you that the most underestimated part of the problem is not the API rate limits, the schema inconsistencies, or the authentication complexity. It is the semantic fragmentation. A "product" in one platform's data model is an atomic entity with variants. In another, it's a parent with child SKUs. In a third, it's a catalogue entry that may or may not map to a physical inventory item depending on whether the seller is first-party or third-party. The same physical object has a fundamentally different data representation depending on where it lives, and the code that normalises those representations requires either deep familiarity with each platform's data model or a normalisation layer built specifically for the problem.
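To make the fragmentation concrete, here is a minimal sketch of what normalising those three shapes into one record might look like. Everything in it is an assumption for illustration: the field names (`variants`, `children`, `inventory_item_id` and so on), the `NormalisedProduct` type, and the three adapter functions are hypothetical and do not describe any real platform's API or any vendor's schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class NormalisedProduct:
    """Hypothetical common schema that every source shape is mapped into."""
    source: str
    product_id: str
    title: str
    skus: list[str] = field(default_factory=list)
    # None means the source does not tell us either way.
    inventory_linked: Optional[bool] = None


def from_variant_platform(raw: dict[str, Any]) -> NormalisedProduct:
    # Assumed shape A: one atomic product carrying a list of variants.
    return NormalisedProduct(
        source="variant_platform",
        product_id=str(raw["id"]),
        title=raw["title"],
        skus=[v["sku"] for v in raw.get("variants", [])],
    )


def from_parent_child_platform(raw: dict[str, Any]) -> NormalisedProduct:
    # Assumed shape B: a parent record whose children are the sellable SKUs.
    return NormalisedProduct(
        source="parent_child_platform",
        product_id=str(raw["parent_id"]),
        title=raw["name"],
        skus=[c["sku_code"] for c in raw.get("children", [])],
    )


def from_catalogue_platform(raw: dict[str, Any]) -> NormalisedProduct:
    # Assumed shape C: a catalogue entry that may or may not point at
    # a physical inventory item.
    item_id = raw.get("inventory_item_id")
    return NormalisedProduct(
        source="catalogue_platform",
        product_id=str(raw["entry_id"]),
        title=raw["display_name"],
        skus=[str(item_id)] if item_id is not None else [],
        inventory_linked=item_id is not None,
    )
```

Even in this toy version, each adapter encodes a different answer to "what is a product, and where do its SKUs live?". Real platforms add many more edge cases than the three assumed here.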

The default approach that most development teams take when they need to ingest commerce data is to build point integrations: a connector for each platform, with the schema translation logic maintained in-house and updated as each platform evolves its API. This is workable at small scale and becomes a significant engineering liability at medium scale. A marketplace discovery product that needs to index product data from twelve commerce platforms is maintaining twelve sets of normalisation logic, twelve authentication flows, twelve rate-limiting strategies, and twelve sets of update handling for when any of those platforms changes its schema. The total engineering cost of that maintenance burden is rarely accounted for honestly in the build/buy decision, because it's diffuse: it shows up as interrupt-driven platform engineering work that slows down product development without appearing as a discrete cost line.
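The shape of a point integration can be sketched as an interface that every platform connector has to implement in full. This is an illustrative assumption, not a description of any team's actual codebase: `PlatformConnector`, `ExampleShopConnector`, `ingest`, and all method names are hypothetical, and schema-update handling is represented only by the fact that `normalise` must be rewritten whenever the upstream shape changes.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class PlatformConnector(ABC):
    """One point integration. Each platform needs its own copy of all of this."""

    @abstractmethod
    def authenticate(self) -> None:
        """Platform-specific auth flow."""

    @abstractmethod
    def min_seconds_between_calls(self) -> float:
        """Platform-specific rate-limiting policy."""

    @abstractmethod
    def fetch_raw(self) -> Iterable[dict[str, Any]]:
        """Pull product records in the platform's native shape."""

    @abstractmethod
    def normalise(self, raw: dict[str, Any]) -> dict[str, Any]:
        """Translate the native shape; breaks when the platform changes schema."""


class ExampleShopConnector(PlatformConnector):
    """Toy connector for a made-up platform, returning canned data."""

    def __init__(self) -> None:
        self.authenticated = False

    def authenticate(self) -> None:
        self.authenticated = True

    def min_seconds_between_calls(self) -> float:
        return 0.0

    def fetch_raw(self) -> Iterable[dict[str, Any]]:
        if not self.authenticated:
            raise RuntimeError("authenticate() must be called first")
        return [{"id": 1, "title": "Mug"}]

    def normalise(self, raw: dict[str, Any]) -> dict[str, Any]:
        return {"product_id": str(raw["id"]), "title": raw["title"]}


def ingest(connectors: Iterable[PlatformConnector]) -> list[dict[str, Any]]:
    """Run every connector and collect the normalised records."""
    products: list[dict[str, Any]] = []
    for connector in connectors:
        connector.authenticate()
        for raw in connector.fetch_raw():
            products.append(connector.normalise(raw))
    return products
```

With twelve platforms, a team owns twelve subclasses like `ExampleShopConnector`, each with four methods that can break independently. That is where the diffuse maintenance cost comes from.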

Carted, which we backed at pre-seed in 2024, is building the commerce data API layer that abstracts this complexity: a developer-facing normalisation infrastructure that presents commerce product and merchant data in a consistent schema regardless of the source platform. It handles the authentication, rate limiting, and schema evolution management so that the consuming application doesn't have to. The product architecture bet is that the normalisation problem is generic enough across commerce applications that a shared infrastructure layer creates more value than bespoke integration work for each consuming product. That bet is more defensible in 2024 than it was in 2018, because the number of commerce platforms that matter in the APAC market has grown to the point where covering them adequately is a substantial engineering project in its own right.

The mistake we consistently see teams in this space make is optimising for API breadth, meaning the number of platforms covered, at the expense of normalisation depth for the platforms that actually matter. A developer building a product for the Australian mid-market retailer segment needs the major hosted and open-source commerce platforms covered with high fidelity, not forty platforms covered at the level of a basic product title and price. The commercial value of a commerce data API layer depends on how well it handles the edge cases and data quality issues of the platforms your customers actually use, not on the count of integrations in the marketing material.

The medium-term opportunity we see in this category is the accumulation of behavioural signals across the normalised data layer: not just product data, but pricing history, availability patterns, and demand signals that build up as the platform processes transaction data from multiple merchants. A commerce data API that starts as an infrastructure service and develops into a data network has a compounding value proposition that is qualitatively different from a pure connectivity play. The teams that understand that trajectory and are building toward it from the start are the ones we're most interested in backing.