
Connecting to one SaaS API is a weekend project. Connecting to twenty, each with different authentication flows, rate limits, schemas, and permission models, is a months-long engineering investment that grows with every new source.
For AI engineers building agent applications, the problem compounds: agents need not just synced records but fresh, permission-scoped context from customer tools delivered to vector databases or accessible through Model Context Protocol (MCP) servers.
TL;DR
- Custom SaaS integrations follow a predictable collapse: the first connector works, the tenth consumes more engineering time than your actual product.
- Authentication, rate limiting, and schema normalization each multiply in complexity with every new source, and they compound together.
- AI agents demand capabilities beyond traditional record syncing: unstructured data handling, embedding generation, vector database delivery, and permission-scoped context.
- Event-driven architectures, incremental sync with Change Data Capture (CDC), and multi-tenant isolation from day one are the patterns that hold up at scale.
What Makes SaaS Integrations Hard to Scale?
Authentication complexity, rate limit variance, and schema divergence each multiply independently with every new source. Worse, they compound together in ways the first few connectors don't reveal.
Authentication Across Providers
Salesforce uses OAuth 2.0 with refresh tokens that expire. HubSpot uses private app tokens. Google Workspace uses service accounts with domain-wide delegation. Managing credential storage, token refresh, rotation, and revocation across dozens of providers and hundreds of customer connections is a full authentication infrastructure project on its own.
The failure modes are subtle. When multiple concurrent requests detect token expiration simultaneously, each independently attempts a refresh. This creates race conditions that can invalidate tokens entirely. Auth0 limits active refresh tokens to 200 per user per application; exceeding that silently revokes the oldest token. A token expires at 2 AM, a sync fails without alerting anyone, and the agent serves stale data until someone notices.
A custom script handles one provider's auth flow. At ten providers, you're maintaining ten different auth implementations with ten different failure patterns. At fifty, authentication alone becomes a dedicated engineering workstream that never ships a product feature.
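The refresh race described above has a standard mitigation: serialize refreshes behind a lock so that concurrent callers observing an expired token trigger exactly one refresh. A minimal Python sketch, where `fetch_token` is a stand-in for a provider-specific refresh call rather than any real SDK:

```python
import threading
import time

class TokenCache:
    """Serialize token refresh so concurrent callers never race.

    `fetch_token` is a placeholder for a provider-specific refresh call.
    """

    def __init__(self, fetch_token, ttl_seconds=3600):
        self._fetch = fetch_token
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = time.monotonic()
        if self._token and now < self._expires_at:
            return self._token  # fast path: token still valid
        with self._lock:
            # Re-check under the lock: another thread may have refreshed
            # while we were waiting, so only one refresh actually runs.
            if not self._token or time.monotonic() >= self._expires_at:
                self._token = self._fetch()
                self._expires_at = time.monotonic() + self._ttl
            return self._token
```

The double-check inside the lock is the important part: without it, every thread that saw the expired token would still perform its own refresh, reintroducing the race.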
Rate Limits and Reliability
Every API enforces its own rate limits with its own scoping rules (per user, per app, per endpoint, per time window), yet integration layers often apply a single generalized retry strategy, such as exponential backoff with jitter, across all of them. In practice each provider requires its own tracking and backoff logic.
Hitting a rate limit mid-sync means partial data, broken state, and retry complexity. Each provider returns rate limit errors differently: HTTP 429 with varying Retry-After headers, custom error codes like Salesforce's REQUEST_LIMIT_EXCEEDED, or Stripe's Stripe-Rate-Limited-Reason header indicating which specific limiter was exceeded.
Production integrations need queuing infrastructure, exponential backoff with jitter, dead-letter queues for persistent failures, and per-provider rate limit tracking. Building this for one API is straightforward. Building it for dozens, each with different limits and error patterns, is infrastructure work that has nothing to do with your product. Every new provider you add inherits the full complexity of every provider before it, and the retry infrastructure never converges.
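The retry shape described above can be sketched in Python. `RateLimited` and its `retry_after` field are illustrative stand-ins for parsing a provider's HTTP 429 response; a production version would add per-provider limit tracking and hand exhausted requests to a dead-letter queue:

```python
import random
import time

class RateLimited(Exception):
    """Raised on an HTTP 429; `retry_after` mirrors the Retry-After header."""
    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after

def request_with_backoff(call, max_attempts=5, base_delay=1.0, cap=60.0,
                         sleep=time.sleep):
    """Retry `call()` on rate-limit errors with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # exhausted: hand off to a dead-letter queue
            if err.retry_after is not None:
                delay = err.retry_after  # provider told us when to come back
            else:
                # Full jitter: random delay in [0, min(cap, base * 2^attempt)]
                delay = random.uniform(0, min(cap, base_delay * 2 ** attempt))
            sleep(delay)
```

Honoring an explicit `Retry-After` when present, and falling back to jittered backoff when it is not, covers both styles of 429 response described above.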
Schema Normalization
Salesforce calls it "Account." NetSuite calls it "Customer." HubSpot calls it "Company." Zendesk calls it "Organization." The same concept, four different names, four different object structures. The same phone number arrives as "+1-555-123-4567" from one system, "5551234567" from another, and "(555) 123-4567" from a third. HubSpot stores dates as Unix timestamps in milliseconds; Salesforce uses ISO 8601 format. Address fields are compound objects in Salesforce, separate properties in HubSpot, and plain text in Zendesk.
The divergence goes deeper than naming. Salesforce maintains two separate objects, Contacts and Leads, to represent individuals, while HubSpot uses a single unified Contact object. Syncing between them requires architectural decisions about field availability, validation rules, and duplicate handling that cascade through your entire pipeline.
Without normalization, downstream systems like agents, analytics tools, and vector databases receive inconsistent data that requires per-source parsing logic everywhere it's consumed. Normalization should happen once, at the integration layer. The teams that push normalization to consumers end up maintaining the same transformation logic in five different places, and none of them agree.
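As an illustration of what that single normalization layer looks like, here is a hedged Python sketch built from the examples above. The US-only phone handling and the two date formats are deliberate simplifications; a real pipeline would handle international formats and far more field types:

```python
import re
from datetime import datetime, timezone

def normalize_phone(raw):
    """Strip formatting down to digits; assumes US numbers for brevity."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        digits = "1" + digits  # add the US country code
    return "+" + digits

def normalize_date(value, source):
    """HubSpot sends Unix timestamps in milliseconds; Salesforce sends ISO 8601."""
    if source == "hubspot":
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    return datetime.fromisoformat(value)

# One canonical entity name regardless of what the source calls it.
CANONICAL_OBJECT = {
    ("salesforce", "Account"): "company",
    ("netsuite", "Customer"): "company",
    ("hubspot", "Company"): "company",
    ("zendesk", "Organization"): "company",
}
```

Every downstream consumer then sees one shape: `+15551234567`, timezone-aware datetimes, and a single `company` entity, no matter which of the four systems the record came from.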
What Architecture Patterns Matter at Scale?
Those challenges explain why integrations break. The architecture decisions that follow determine whether they hold up under production load.
Event-Driven vs. Polling
Polling (checking for changes on a schedule) is simple but wasteful. It adds load to source systems with repeated queries and introduces latency bounded by the polling interval. Event-driven patterns like webhooks deliver changes as they happen, but they require reliability infrastructure: retry queues with exponential backoff, idempotency handling for at-least-once delivery, and dead-letter queues for events that exhaust retries.
In practice, most production integrations use a hybrid. Webhooks handle events where the source supports them, and polling with incremental sync covers everything else. The architecture decision is which pattern to use per source and how to fall back gracefully when a webhook provider drops events or a polling source changes its API. Getting this wrong means silent data loss that only surfaces when an agent serves an answer based on records that stopped syncing weeks ago.
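Because webhook delivery is at-least-once, the handler itself must be idempotent. A minimal Python sketch of the dedupe pattern; a production version would back the seen-set with a durable store (for example, Redis with a TTL) rather than process memory:

```python
class WebhookProcessor:
    """Idempotent handling for at-least-once webhook delivery (sketch)."""

    def __init__(self, apply_change):
        self._apply = apply_change
        self._seen = set()

    def handle(self, event):
        event_id = event["id"]
        if event_id in self._seen:
            return "duplicate"      # provider redelivered; safe to ack
        self._apply(event)
        self._seen.add(event_id)    # mark only after a successful apply
        return "processed"
```

Marking the event as seen only after a successful apply means a crash mid-processing results in a harmless reprocess rather than a lost event.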
Data Freshness: Full Refresh vs. Incremental Sync
Full refresh re-syncs everything on every run. It requires no cursor tracking, no delete handling, and no state management, which makes it simple. But when a source contains millions of records, transferring the full dataset on every sync consumes API quota, compute, and time.
Incremental sync tracks what changed since the last run and processes only new or modified records. It requires cursor management, handling of deletes (often through soft delete flags or periodic full refresh reconciliation), and state persistence across runs. The complexity is higher, but it's essential when sources are large or API rate limits are tight.
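The cursor mechanics can be sketched in a few lines of Python. `fetch_page` is a hypothetical stand-in for a provider's list endpoint filtered by `updated_at`, and deletes are assumed to arrive as soft-delete flags, per the pattern above:

```python
def incremental_sync(fetch_page, state):
    """One incremental-sync run: fetch only records modified since the cursor.

    `state` is persisted across runs; a cursor of None triggers a full backfill.
    """
    cursor = state.get("cursor")
    records = fetch_page(since=cursor)
    upserts = [r for r in records if not r.get("deleted")]
    deletes = [r["id"] for r in records if r.get("deleted")]
    if records:
        # Advance the cursor to the newest change actually seen, so a crash
        # before persisting state just reprocesses (idempotently) a few records.
        state["cursor"] = max(r["updated_at"] for r in records)
    return upserts, deletes
```

Advancing the cursor only to timestamps that were actually observed, rather than to "now", avoids silently skipping records committed between the query and the cursor update.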
Change Data Capture (CDC) detects changes at the database transaction log level for sub-minute freshness. CDC reads insert, update, and delete operations directly from logs without querying production tables. This provides the lowest source system impact. The tradeoff: CDC requires source-level support and tables with primary keys. Most teams use a tiered approach where CDC handles high-frequency database sources, incremental sync covers API-based SaaS tools, and full refresh applies to small or infrequently updated datasets. Choosing the wrong tier for a given source either wastes API quota or delivers stale context to agents at exactly the moment freshness matters.
Per-User Scoping and Multi-Tenant Isolation
Customer-facing integrations demand three things working simultaneously: per-customer credential storage where Customer A's Salesforce tokens are isolated from Customer B's, per-user permission scoping where the agent sees only what the end user is authorized to access, and per-tenant configuration because each customer maps fields differently.
Multi-tenancy does not have to mean one shared integration layer; per-customer isolated and hybrid topologies are also credible. Whatever the topology, tenant identity must be resolved before any business logic executes: before credential lookups, before data queries, before sync operations. Credentials, sync configurations, and data routing must be scoped to individual customers and users from the start. Bolting multi-tenancy onto a system designed for single-tenant use creates security gaps where valid authentication tokens can cross tenant boundaries through leaky application logic. By the time you discover the gap, the blast radius already includes customer data you were contractually obligated to isolate.
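One way to make cross-tenant leakage structurally impossible is to make tenant identity part of the credential key itself, so no code path can return another tenant's token. A hedged Python sketch of the isolation shape, not a real secrets vault:

```python
class TenantScopedStore:
    """Credentials keyed by (tenant, provider): lookups cannot cross tenants."""

    def __init__(self):
        self._creds = {}

    def put(self, tenant_id, provider, credential):
        self._creds[(tenant_id, provider)] = credential

    def get(self, tenant_id, provider):
        # Tenant identity is baked into the key: even a valid provider name
        # cannot retrieve a credential belonging to a different tenant.
        try:
            return self._creds[(tenant_id, provider)]
        except KeyError:
            raise PermissionError(
                f"no {provider} credential for tenant {tenant_id}")
```

The same keying discipline applies to sync state and data routing: scoping by tenant at the storage layer, rather than in request-handling logic, removes a whole class of leaky-check bugs.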
How Do AI Agents Change the Requirements?
Traditional SaaS integrations sync structured records between systems: Customer Relationship Management (CRM) contacts to marketing automation, Human Resources Information System (HRIS) records to payroll. AI agent applications introduce requirements that go beyond what conventional integration patterns were designed to handle, and those requirements reshape the entire pipeline.
Beyond Record Syncing
An agent reasoning over customer support data needs structured ticket records and the unstructured PDF attachments, Slack thread conversations, and Confluence documentation linked to those tickets. The integration layer must handle files and records from the same source in the same pipeline, with automatic metadata extraction (author, date, source, document type) for downstream retrieval.
This unstructured data requires processing steps that don't exist in traditional integrations: semantic chunking to break content into meaningful segments, embedding generation to convert text into vector representations, and vector storage for similarity search.
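Those steps can be sketched as a small Python pipeline. Here `embed` and `store` are placeholders for a real embedding model and vector database client, and the character-based chunker is deliberately naive; production pipelines chunk on semantic boundaries such as sentences and headings:

```python
def chunk_text(text, max_chars=200, overlap=20):
    """Naive fixed-size chunking with overlap between adjacent chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks

def index_document(text, metadata, embed, store):
    """Chunk, embed, and load one document into a vector store (sketch)."""
    for i, chunk in enumerate(chunk_text(text)):
        store.append({
            "vector": embed(chunk),
            "text": chunk,
            # Metadata (author, date, source) rides along with every chunk
            # so downstream retrieval can filter and attribute results.
            "metadata": {**metadata, "chunk": i},
        })
```

Attaching the extracted metadata to every chunk, not just the parent document, is what makes permission filtering and source attribution possible at retrieval time.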
A traditional pipeline validates data against a schema and performs create, read, update, and delete (CRUD) operations, with data discovery, mapping, transformation, and orchestration layered on top. An AI agent pipeline adds parsing documents, chunking content, generating embeddings, extracting metadata, and loading vectors into a database with indexing. Each additional step introduces technical requirements absent from conventional record syncing.
Agents also need data delivered differently. Instead of syncing to another SaaS tool, agents need data loaded into vector databases (Pinecone, Weaviate, Milvus, Chroma) with embeddings generated, or accessible through MCP servers for on-demand querying from AI development environments. The destination changes everything about how the pipeline is built, tested, and monitored.
Permission-Scoped Context and Self-Service Connection
An enterprise agent accessing customer support data must respect the source system's permissions. A manager sees all tickets, while a support rep sees only their assigned tickets. Without row-level and user-level access controls enforced at the integration layer, agents leak data across permission boundaries. Loading corporate data into a central vector store without access control lists (ACLs) gives anyone interacting with the agent access to the entire dataset. Authorization must happen before semantic search. It must filter which embeddings a user can access, not just which results are displayed.
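Filtering before search can be sketched as follows. The `acl` metadata field and dot-product scoring are illustrative stand-ins for a vector database's native metadata filters and similarity metric:

```python
def permitted_search(query_vector, index, user_groups, top_k=3):
    """Apply ACL filtering *before* similarity ranking, so embeddings the
    user cannot see never enter the candidate set at all."""
    candidates = [e for e in index if set(e["acl"]) & set(user_groups)]
    def score(entry):
        # Dot product as a stand-in for the database's similarity metric.
        return sum(a * b for a, b in zip(query_vector, entry["vector"]))
    return sorted(candidates, key=score, reverse=True)[:top_k]
```

The ordering matters: filtering after ranking still computes similarity against forbidden content, and a bug in the display layer then leaks it. Pre-filtering keeps restricted embeddings out of the search entirely.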
The end user also needs to connect their own tools without filing a support ticket or waiting for engineering. An embeddable widget that handles OAuth flows, credential storage, and source configuration lets customers self-service connect Notion, Google Drive, Slack, or Salesforce in minutes. That shift turns integration activation from an engineering task into a product feature and removes a bottleneck that otherwise grows linearly with every new customer onboarded.
What Does Purpose-Built Infrastructure Look Like?
The challenges add up to something bigger than any single fix can address. They point to a category of work that sits between the source systems and the agent frameworks consuming their data.
Context Engineering for AI Agents
Authentication across providers, schema normalization, data freshness, permission scoping, unstructured data handling, and vector database delivery collectively form a context engineering problem. Context engineering is the practice of preparing and managing data that AI agents use for reasoning and decision-making. Unlike traditional data engineering, which involves designing, building, and managing data infrastructure to collect, transform, and deliver data at scale, context engineering specializes in making data accessible and relevant for LLM consumption. It treats the context window as a constrained resource requiring systematic management of retrieval, memory systems, and tool integrations.
Building this infrastructure from scratch means solving every challenge (covered above) for every source. Purpose-built platforms solve it once.
Airbyte's Agent Engine
The infrastructure layer that context engineering requires already exists. Airbyte's Agent Engine provides 600+ source connectors with built-in authentication and rate limit handling, automatic schema normalization, incremental sync with CDC for data freshness, unified structured and unstructured data handling with automatic metadata extraction, row-level and user-level access controls, and delivery to vector databases.
The embeddable widget lets end users self-service connect their own SaaS tools. PyAirbyte allows AI development environments (Claude Desktop, Cursor, Cline) to manage pipelines through natural language. This reduces the custom integration surface area for every new data source.
What's the Fastest Path to Scalable SaaS Integrations for AI?
Every new source amplifies the complexity of every source before it. Teams that treat each integration as a one-off project discover this too late, after technical debt has consumed the engineering bandwidth meant for agent features.
Airbyte's Agent Engine provides the infrastructure layer so engineering teams focus on agent logic, retrieval quality, and user experience rather than data plumbing.
Talk to us to see how Airbyte powers AI agents with governed, sub-minute access to enterprise SaaS data.
Frequently Asked Questions
What is the difference between internal and customer-facing SaaS integrations?
Internal integrations connect your organization's own tools with a single set of credentials, while customer-facing integrations connect your product to your customers' tools, requiring multi-tenant isolation, per-customer credentials, and self-service activation. AI agent applications are typically customer-facing: each customer connects their own SaaS sources, and traditional workflow automation tools often fall short because each customer needs secure, isolated connections.
Should you build integrations in-house or use a platform?
Build in-house for one to three critical integrations where you need maximum control and your team has deep expertise in the specific APIs. Use a platform when you need to scale beyond that, because maintenance complexity across dozens of sources diverts engineering from your core product. Teams commonly underestimate ongoing maintenance costs by several multiples.
How many integrations do AI agent applications typically need?
A customer support agent typically needs 3-8 core sources (ticketing, knowledge base, CRM, communication tools), while an enterprise search agent could need 10-20+ sources covering every document repository, communication platform, and business application. The number grows with each customer because each customer uses a different tool stack.
What is Model Context Protocol (MCP) and how does it relate to integrations?
MCP is an open standard protocol that lets AI development environments (Claude Desktop, Cursor, Cline, Warp) interact with external data sources and tools through a consistent interface. For SaaS integrations, MCP servers allow agents to query and manage data pipelines through natural language rather than custom code.
How long does it take to build a single SaaS integration from scratch?
By common industry estimates, building, testing, and deploying a single integration takes around 470 hours, especially once you include auth, retries, monitoring, and ongoing API changes. The build-versus-buy decision tips toward a platform after the third or fourth connector, once the true weight of ongoing maintenance becomes clear.
Try the Agent Engine
We're building the future of agent data infrastructure. Be among the first to explore our new platform and get access to our latest features.
