How to Build Embedded Integrations in SaaS

Most teams pick their embedded integration architecture based on how fast they can ship the first connector. Then they spend the next two years paying for that decision in maintenance, credential failures, and schema drift. The architecture choice becomes even more consequential as AI agents shift the integration requirement from syncing records to delivering deep, permission-governed, multi-format context.

TL;DR

  • Embedded integrations let customers connect tools inside your UI across three layers: embedded UI (auth), embedded runtime (sync logic), and embedded marketplace (catalog).
  • Four approaches trade off control vs. speed differently: custom build, unified API, embedded iPaaS, and connector platforms.
  • Maintenance dominates total cost of ownership: the cost of credential failures, schema drift, rate limits, and API deprecations keeps accumulating long after the initial build is done.
  • AI agents require a different architecture than record syncing: provider-specific fields, unstructured docs, embeddings, sub-minute freshness, row-level permissions, and vector DB delivery.

What Are Embedded Integrations?

An embedded integration is integration functionality built into your SaaS product so customers can connect their tools without leaving your UI. The customer doesn't go to a separate platform, import CSVs, or configure webhooks manually.

Engineers commonly group three architectural layers under "embedded":

  • Embedded UI: A white-label auth widget or connection interface. The customer clicks "Connect Salesforce," completes OAuth, and the connection is live.
  • Embedded Runtime: Sync logic, data transformation, and error handling running on the provider's infrastructure but orchestrated by your product via API.
  • Embedded Marketplace: A catalog of available integrations customers can discover, configure, and activate from within your product.

Most embedded integration platforms provide all three. The choice of approach determines how much you build vs. buy.

Which Embedded Integration Architecture Fits Your Use Case?

The right choice depends on what kind of data your product needs and how much maintenance your team can absorb. The following table compares the four approaches across the dimensions that matter most in production.

| Dimension | Custom Build | Unified API | Embedded iPaaS | Connector Platform (Airbyte) |
|---|---|---|---|---|
| What you build | Everything: auth, mapping, sync, UI, monitoring | Application logic on top of normalized API | Workflow configuration using visual designer + pre-built connectors | Agent logic on top of replicated data |
| Data depth | Full (you control what you access) | Shallow (normalized common fields only) | Medium (full provider API available, but you build each workflow) | Deep (full replication preserving provider-specific fields) |
| Unstructured data | If you build it | No | Limited (depends on connector and workflow) | Yes (files + records in same connection, automatic metadata) |
| Auth management | You build OAuth, token refresh, credential storage | Provider handles all auth | Provider handles connector auth; you configure per-customer | Provider handles auth via embeddable widget |
| Customer self-service | You build the UI | Pre-built auth widget + linking component | White-label marketplace + configuration UI | Embeddable widget for data source connection |
| Time to first integration | 2–4 months per integration | Days per category | Weeks per integration | Hours to days per connector |
| Maintenance owner | You (100%) | Provider (connectors) + You (app logic) | Provider (connectors) + You (workflow logic) | Provider (connectors) + You (agent logic) |
| Data freshness | You control (build polling, webhooks, Change Data Capture (CDC)) | CRON polling (minutes to hours) | Depends on platform and connector | Incremental sync + CDC (sub-minute) |
| Permission scoping | You build it | Over-scoped by default (unified OAuth) | Integration-specific scopes | Row-level + user-level access control lists (ACLs) built in |
| Deployment | Your infrastructure | Provider cloud by default, with additional self-hosted, hybrid, and on-premises options supported by major vendors | Provider cloud (some offer private cloud) | Cloud, multi-cloud, on-prem, hybrid |
| Best for | Core differentiating integrations requiring full control | Broad, shallow integrations across many providers in one category | Complex, customer-configurable workflows across categories | AI agents needing deep, governed, multi-format context |

When custom build is worth the cost: Your integration is the product differentiator, and you have the engineering capacity to maintain it. Custom projects consistently take 2–4 months per integration, with actual timelines frequently reaching 2–3x initial estimates. Most products have one to three integrations worth building custom and dozens that aren't.

When unified APIs make sense: Your product needs to support many providers in the same category with standard data. A unified API gets you there in days per category rather than months per provider, though you're limited to the normalized fields the unified API exposes.
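To make the "normalized fields" limit concrete, here is a minimal sketch of what normalization does to a single CRM record. The common model, the stage mapping, and the custom field name are all hypothetical and do not reflect any vendor's actual schema; the point is only that fields outside the common model never reach your application.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical common CRM model; field names are illustrative only.
@dataclass
class NormalizedDeal:
    id: str
    name: str
    amount: float
    stage: str  # coarse, vendor-neutral label

# Hypothetical mapping from provider stage values to normalized labels.
STAGE_MAP = {
    "Negotiation/Review": "open",
    "Proposal/Price Quote": "open",
    "Closed Won": "won",
    "Closed Lost": "lost",
}

MAPPED_FIELDS = {"Id", "Name", "Amount", "StageName"}

def normalize(raw: dict) -> NormalizedDeal:
    """Map a provider record onto the common model; unmapped fields are dropped."""
    return NormalizedDeal(
        id=raw["Id"],
        name=raw["Name"],
        amount=float(raw.get("Amount") or 0),
        stage=STAGE_MAP.get(raw["StageName"], "open"),
    )

def dropped_fields(raw: dict) -> set[str]:
    """Provider fields that do not survive normalization."""
    return set(raw) - MAPPED_FIELDS

raw_opportunity = {
    "Id": "006xx0001",
    "Name": "Acme renewal",
    "Amount": 48000,
    "StageName": "Negotiation/Review",
    "Renewal_Risk__c": "High",          # illustrative custom field
    "NextStep": "Send revised quote",
}

deal = normalize(raw_opportunity)
print(deal.stage)                               # open
print(sorted(dropped_fields(raw_opportunity)))  # ['NextStep', 'Renewal_Risk__c']
```

The provider's own stage value and both extra fields are gone after normalization, which is acceptable for record syncing and a real loss for anything that needs provider-level detail.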

When embedded iPaaS fits: Your customers need configurable, multi-step workflows that vary by customer. Embedded iPaaS platforms also work well when you integrate across multiple application categories; unified APIs handle this poorly since each category requires a separate provider. The tradeoff is implementation time: each workflow requires configuration, testing, and ongoing maintenance as connectors update.

When a connector platform fits: Your product needs full-depth data replication with provider-specific fields, unstructured files, automatic embeddings, row-level permissions, and vector database delivery. A connector platform like Airbyte handles all of this in one pipeline, with the provider maintaining connectors and surfacing schema changes automatically. This is the architecture AI agents require: governed, multi-format context delivered where your agent can retrieve it, not a normalized subset of structured records.

What Does Maintenance Actually Cost?

Every approach claims to reduce maintenance. The actual burden breaks into six categories, and each approach shifts ownership differently.

| Maintenance Category | What Breaks | Custom Build | Unified API | Embedded iPaaS | Connector Platform |
|---|---|---|---|---|---|
| Credential lifecycle | OAuth tokens expire, refresh tokens rotate, API keys revoked | You handle everything | Provider handles | Provider handles | Provider handles |
| Schema drift | Provider adds/removes/renames fields across API versions | You detect and fix | Provider absorbs for normalized fields; custom fields unaffected (not in schema) | Provider updates connector; you update workflow if affected | Provider updates connector; schema changes surfaced automatically |
| Rate limit changes | Provider adjusts limits by plan tier or policy | You detect and handle | Provider handles within their calling patterns | Provider handles at connector level; you handle at workflow level | Provider handles |
| API version deprecation | Provider sunsets API versions (e.g., Salesforce SOAP → REST, HubSpot V1 → V3) | You migrate manually | Provider migrates | Provider migrates connector; you verify workflow compatibility | Provider migrates |
| New provider features | Provider ships new endpoints, objects, or capabilities | You add support manually | Available only if they fit normalized schema | Available if connector is updated and you build workflow | Available when connector is updated |
| Customer support for broken connections | Customer's integration fails; your team investigates | Full investigation burden on your team | Provider logs available; investigation split | Platform provides logs, alerts, debugging tools | Platform provides observability, tracing, metrics |
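For the rate-limit row, "you detect and handle" usually comes down to a retry policy. The sketch below is one common pattern rather than any platform's implementation: honor a numeric Retry-After header when the provider sends one, and otherwise fall back to capped exponential backoff with jitter. The function name and default values are assumptions, and the headers mapping is assumed to use the exact key "Retry-After".

```python
import random
from typing import Mapping, Optional

def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,   # assumed starting delay in seconds
    cap: float = 60.0,   # assumed upper bound for computed backoff
    rng: Optional[random.Random] = None,
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not retryable."""
    # Only rate-limit (429) and server-error (5xx) responses are retried here.
    if status != 429 and status < 500:
        return None
    # Retry-After may also be an HTTP date; this sketch handles only the
    # delay-in-seconds form and falls through to backoff otherwise.
    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        return float(retry_after)
    # Full jitter: pick a random delay up to the capped exponential bound.
    bound = min(cap, base * 2 ** attempt)
    return (rng or random).uniform(0, bound)

print(retry_delay(429, {"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay(404, {}, attempt=0))                    # None
```

When a provider changes its limits, a policy like this degrades into slower syncs instead of failed ones, which is why it matters who owns it in each approach.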

Schema Drift Is the Hidden Tax

Salesforce maintains a release schedule that regularly adds, renames, or deprecates fields. HubSpot migrated from V1 to V3 APIs, and this migration changed field naming conventions entirely; their marketing_emails stream migration note documents significant schema changes and removed fields that were previously present. Even "stable" providers make incremental schema changes that break mapping logic. The cumulative effect is a maintenance load that grows with every provider you support, whether you built the connector or not.
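A drift check can be as small as comparing each incoming record against the fields your mapping expects. The following is a minimal sketch with hypothetical field names; real connectors typically compare full JSON schemas, not single records.

```python
from __future__ import annotations

def detect_drift(expected: dict[str, str], record: dict) -> dict[str, list[str]]:
    """Compare a record against the field -> type-name mapping a sync expects."""
    observed = {key: type(value).__name__ for key, value in record.items()}
    return {
        # Fields the mapping relies on that the provider no longer sends.
        "missing": sorted(set(expected) - set(observed)),
        # Fields the provider sends that the mapping does not know about.
        "added": sorted(set(observed) - set(expected)),
        # Fields present on both sides whose Python type changed.
        "retyped": sorted(
            key for key in set(expected) & set(observed)
            if observed[key] != expected[key]
        ),
    }

# Hypothetical expectation written against an older API version.
expected = {"id": "str", "subject": "str", "created": "str"}
# Hypothetical record from a newer version: a rename and a type change.
record = {"id": "123", "name": "Spring promo", "created": 1714000000}

print(detect_drift(expected, record))
# {'missing': ['subject'], 'added': ['name'], 'retyped': ['created']}
```

A rename shows up as one missing field plus one added field, which is why drift of this kind tends to break mapping logic quietly instead of raising an obvious error.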

Credential Lifecycle Compounds at Scale

OAuth token lifetimes vary dramatically by provider. Salesforce uses an activity-based expiration model and departs from OAuth convention by omitting the expires_in parameter from token responses, so the token lifetime cannot be read from the response itself. HubSpot's token refresh responses may include a refresh token; when one is present, your system must always store and use it. When a customer's refresh token fails (revoked, expired session, changed permissions), the integration silently stops working.

At 10 customers, you handle this manually. At 1,000 customers across 15 providers, credential failures become a daily support ticket category. That daily volume is where the maintenance cost shifts from engineering time to product trust.
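The credential handling described in this section can be sketched as a single function that applies a token-endpoint response to a stored connection. This is a minimal sketch, not any provider's SDK: the Connection class, the fallback lifetime, and the needs_reauth flag are assumptions, while the access_token, refresh_token, and expires_in fields and the invalid_grant error code come from the OAuth 2.0 specification.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Assumption: conservative lifetime to use when the response states none.
DEFAULT_TTL_SECONDS = 15 * 60

@dataclass
class Connection:
    access_token: str
    refresh_token: str
    expires_at: float           # local bookkeeping, epoch seconds
    needs_reauth: bool = False  # surfaced to the customer instead of failing silently

def apply_refresh_response(
    conn: Connection, status: int, body: dict, now: Optional[float] = None
) -> Connection:
    """Update a stored connection from a token-endpoint refresh response."""
    now = time.time() if now is None else now
    # A rejected refresh token cannot be retried; the customer must reconnect.
    if status == 400 and body.get("error") == "invalid_grant":
        conn.needs_reauth = True
        return conn
    if status != 200:
        raise RuntimeError(f"unexpected token response: HTTP {status}")
    conn.access_token = body["access_token"]
    # Keep the old refresh token only if the provider did not send a new one.
    conn.refresh_token = body.get("refresh_token", conn.refresh_token)
    # Fall back to a local default when no lifetime is returned.
    conn.expires_at = now + float(body.get("expires_in", DEFAULT_TTL_SECONDS))
    conn.needs_reauth = False
    return conn

conn = Connection(access_token="a1", refresh_token="r1", expires_at=0.0)
apply_refresh_response(conn, 200, {"access_token": "a2", "refresh_token": "r2"}, now=1000.0)
print(conn.refresh_token, conn.expires_at)  # r2 1900.0
```

Flagging the connection as needing re-authentication turns a silent failure into something your product can show the customer before they file a ticket.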

What Changes When AI Agents Need the Data?

AI agents shift the integration requirement from record syncing to context delivery. The following table maps each agent-specific requirement against the four approaches.

| Agent Requirement | Why It Matters | Custom Build | Unified API | Embedded iPaaS | Connector Platform |
|---|---|---|---|---|---|
| Provider-specific fields | Agent needs actual Salesforce stage values, not normalized labels | Full access | Lowest common denominator (LCD) schema only | If workflow accesses full API | Full replication |
| Unstructured data (docs, messages, files) | Agent reasons over documents, not just records | If you build it | Not supported | Limited by connector | Files + records in same pipeline |
| Automatic embeddings + metadata | RAG pipeline needs vectors, not raw text | You build pipeline | Not supported | Not supported | Automatic generation |
| Sub-minute freshness | Stale data produces stale answers | If you build CDC | CRON polling | Platform-dependent | Incremental sync + CDC |
| Row-level permissions | Agent must only see what querying user can access | You build ACLs | Over-scoped OAuth | If connector supports | Built-in ACLs |
| Vector database delivery | Agent retrieves via semantic search | You build pipeline | Not supported | Not supported | Delivers to Pinecone, Weaviate, Milvus, Chroma |
| On-prem / data sovereignty | Enterprise won't send data to third-party cloud | Your infrastructure | Provider cloud only | Some offer private cloud | Deploy anywhere |

Consider an agent answering "What's the status of the Acme renewal?" That agent needs the deal record from the CRM (with provider-specific stage values), the latest proposal document from Google Drive, the relevant Slack thread, and verification that the user asking is authorized to see Acme's data.
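The permission step in that example can be pictured as a filter applied to retrieved context before it reaches the model. This is a minimal sketch under the assumption that each chunk carries the access list replicated from its source; the Chunk class and the principal strings are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    source: str
    text: str
    # Users and groups allowed to read this item in the source system.
    allowed_principals: frozenset[str]

def visible_to(chunks: list[Chunk], user_principals: set[str]) -> list[Chunk]:
    """Keep only chunks that share at least one principal with the asking user."""
    return [c for c in chunks if c.allowed_principals & user_principals]

retrieved = [
    Chunk("crm", "Acme renewal: stage Negotiation/Review", frozenset({"group:sales"})),
    Chunk("drive", "Acme proposal v3", frozenset({"user:dana", "user:lee"})),
    Chunk("slack", "#exec-deals thread about Acme pricing", frozenset({"group:exec"})),
]

# Hypothetical asking user: Dana, a member of the sales group.
context = visible_to(retrieved, {"user:dana", "group:sales"})
print([c.source for c in context])  # ['crm', 'drive']
```

The filter only works if the access lists were captured at ingest time, which is why permission handling is a property of the pipeline and not something the agent can add afterwards.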

Normalization discards provider-specific data at ingest time, and iPaaS platforms don't generate embeddings or deliver to vector databases. These aren't missing features a product update could add; they reflect architectural decisions about what the pipeline preserves and where it delivers. For AI agent applications, "integration" means governed, multi-format context delivery, and that demands a different architecture.

How Does Airbyte's Agent Engine Deliver Context for AI Agents?

Airbyte's Agent Engine provides embedded data infrastructure designed for the shift from record syncing to governed, multi-format context delivery for AI agents. The embeddable widget lets customers connect data sources through a white-label UI, but the pipeline behind it is purpose-built for agent context delivery:

  • 600+ connectors with full data replication (provider-specific fields preserved)
  • Structured records and unstructured files in the same connection with automatic metadata extraction
  • Embedding generation and delivery to vector databases (Pinecone, Weaviate, Milvus, Chroma)
  • Row-level and user-level access controls across all sources
  • Incremental sync with CDC for sub-minute replication
  • Deployment anywhere (cloud, multi-cloud, on-prem, hybrid)

Each of these capabilities maps directly to an agent requirement that traditional embedded integration platforms leave unaddressed. The difference between syncing a contact record and delivering governed, retrieval-ready context is the difference between a product that reads data and one an AI agent can reason over.
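Incremental sync, one of the capabilities listed above, is easiest to picture as a cursor that advances with each run. The sketch below illustrates the general cursor-based pattern only; it is not Airbyte's implementation and it does not cover log-based CDC. The function names, the state shape, and the updated_at field are assumptions.

```python
from typing import Callable, Iterable, Optional

Record = dict

def incremental_sync(
    read_since: Callable[[Optional[str]], Iterable[Record]],
    state: dict,
    emit: Callable[[Record], None],
    cursor_field: str = "updated_at",
) -> dict:
    """Emit records changed after the saved cursor and return the new state."""
    cursor = state.get("cursor")
    for record in read_since(cursor):
        emit(record)
        value = record[cursor_field]
        # Advance the cursor to the newest timestamp seen so far.
        if cursor is None or value > cursor:
            cursor = value
    return {"cursor": cursor}

# In-memory stand-in for a provider API. ISO-8601 UTC timestamps in one
# fixed format compare correctly as plain strings.
SOURCE = [
    {"id": 1, "updated_at": "2024-05-01T10:00:00Z"},
    {"id": 2, "updated_at": "2024-05-01T10:05:00Z"},
]

def read_since(cursor: Optional[str]) -> list:
    return [r for r in SOURCE if cursor is None or r["updated_at"] > cursor]

delivered: list = []
state = incremental_sync(read_since, {}, delivered.append)     # first run: full read
SOURCE.append({"id": 3, "updated_at": "2024-05-01T10:06:00Z"})
state = incremental_sync(read_since, state, delivered.append)  # second run: only id 3
print([r["id"] for r in delivered])  # [1, 2, 3]
print(state)                         # {'cursor': '2024-05-01T10:06:00Z'}
```

Because each run reads only what changed since the saved cursor, runs can be scheduled frequently without re-reading the full dataset, which is what makes short freshness windows practical.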

What's the Best Way to Build Embedded Integrations for Your Product?

Start with the use case. If your product syncs structured records across a single SaaS category, a unified API is the fastest path. If customers need configurable multi-step workflows, embedded iPaaS provides the flexibility. If one or two integrations are your core differentiator, build those custom and use a platform for the rest. 

If your product delivers AI agent context, the architecture changes entirely, and that requires context engineering infrastructure purpose-built for the job.

Get a demo to see how Airbyte's Agent Engine provides data infrastructure for AI Agents.

Frequently Asked Questions

What is the difference between embedded iPaaS and unified API?

Embedded iPaaS provides a visual workflow builder with pre-built connectors for creating complex, customer-configurable integrations across any software category. Unified APIs provide a single normalized interface per category (CRM, HRIS, Accounting) with a standard data model. Embedded iPaaS offers more flexibility and depth per integration; unified APIs offer more speed and breadth across providers in a category.

How long does it take to build an embedded integration?

Timelines range from hours (connector platform) to months (custom build), with the comparison table above showing specifics per approach. The hidden variable is maintenance: the initial build is often 20–30% of the total cost of ownership over two years.

What maintenance do embedded integrations require?

Maintenance grows with the number of providers you support, not the number of connectors you build. Schema drift, credential failures, and API deprecations compound across providers regardless of whether you or a platform owns the connector. The maintenance table above breaks down ownership by approach.

Can embedded integrations handle unstructured data?

Most embedded platforms are built for structured records. Processing documents, messages, or recordings alongside those records, and making that content retrievable for AI agents through embeddings and vector search, requires deeper infrastructure than traditional embedded platforms provide.

Do I need different integration infrastructure for AI agents?

The agent requirements table above maps the gap. Traditional platforms cover record syncing well. Agent context delivery requires a pipeline that preserves source fidelity, processes multiple formats, generates embeddings, enforces permissions, and delivers to vector databases as a single integrated system.
