What Is Agent Engine? A Guide for Engineering Teams

Agent Engine gives AI agents typed, authenticated read/write access to SaaS APIs via open-source Python connectors and hosted credential management.

Airbyte Engineering Team

March 6, 2026

The hardest part of building an AI agent isn't the agent. It's everything between the agent and the data: authentication flows, schema handling, token refresh, credential isolation per customer. That infrastructure consumes more development time than the reasoning logic it supports. A proof of concept with two integrations works fine, but a production app with twenty data sources and a hundred end users turns your team into an OAuth maintenance crew instead of an AI engineering team.

Agent Engine exists to take that infrastructure off your plate.

TL;DR

Agent Engine separates two concerns: open-source Python agent connectors for typed API access and a hosted Agent Engine platform for multi-tenant credential management.
Agent connectors work as standalone Python packages inside any framework (Pydantic AI, LangChain, MCP) without the Airbyte platform.
Hosted mode handles OAuth flows, token refresh, and per-customer credential isolation so your team doesn't build auth infrastructure per provider.
Start with open-source connectors for single-environment work, then add the platform when you need multi-customer authentication and data isolation.

What Is Agent Engine?

Agent Engine is Airbyte's platform for giving AI agents typed, authenticated read and write access to SaaS APIs. It wraps authentication, type safety, and error handling into two components that serve different stages of product development.

The first is a library of agent connectors: open-source Python packages that let AI agents call third-party APIs through strongly typed, well-documented tools. Each connector is a standalone package that you install with pip or uv and import into your Python app, an agent framework such as Pydantic AI or LangChain, or a Model Context Protocol (MCP) server.

The second is the Agent Engine platform, a subscription-based cloud service that manages credentials, customer isolation, and data replication for multi-tenant applications where each end user connects their own SaaS accounts. It handles OAuth (Open Authorization) flows, token refresh, and credential storage so your engineering team doesn't build this infrastructure per provider.

What Are Agent Connectors?

Agent connectors are open-source Python packages that let AI agents call SaaS APIs in real time and get back typed responses. Unlike Airbyte's data replication connectors, which move data from sources to destinations on a schedule, agent connectors run inside your application and return results directly as the agent reasons.

Dimension	Data Replication Connectors	Agent Connectors
Purpose	Batch ELT/ETL: build a full historical dataset in a warehouse or data lake	Operational AI: answer a question, fetch fresh data, or perform an action while an agent reasons
Topology	Source-to-destination pairing managed by the Airbyte platform	Standalone Python packages imported into your app or agent; no pairing or sync pipeline
Execution	Jobs orchestrated by the platform with schedules and state tracking	Runs inside your Python app or agent loop and returns results immediately
Data flow	Writes data into destinations; maintains state for incremental sync	Streams typed responses back to the caller and supports write actions (create, update). Optional Context Store replicates a data subset for search operations.
Platform required	Yes (Airbyte Cloud or Self-Managed)	No, works as regular Python packages. Platform optional for multi-customer credential management
Relationship	Complements agent connectors; use for building historical datasets	Complements replication connectors; use for on-demand agent data access

Each agent connector lives in the Airbyte Agent Connectors repository and ships with three things: a Python client whose typed methods are generated from Airbyte's connector definitions, connector-specific documentation covering supported operations and authentication requirements, and built-in input validation.

The typed interface matters because it reduces hallucination risk. When an agent calls a connector, it gets back typed response objects, not raw JSON that needs parsing and validation. The agent works with structured data that matches a defined schema, which also eliminates the engineering work of writing custom response parsers for each API.

Each connector exposes a unified interface through the execute method. It takes an entity (like "customers"), an action (like "list" or "create"), and optional parameters. Results stream back as typed Python objects, and the connector handles authentication, schema validation, and error handling consistently regardless of which SaaS API it wraps. Beyond read operations, connectors also support write actions, letting agents create records, update fields, and trigger workflows in connected systems.
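
To make the entity/action/parameters shape concrete, here is a minimal sketch built on a hypothetical in-memory stand-in rather than a real connector package. The class name StubConnector, the Customer fields, and the synchronous generator are all assumptions made for illustration; only the execute(entity, action, params) shape comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass(frozen=True)
class Customer:
    """Typed response object; the fields are illustrative."""
    id: str
    email: str


class StubConnector:
    """Hypothetical stand-in for an agent connector, backed by an in-memory list."""

    # (entity, action) pairs this stub supports.
    SUPPORTED = {("customers", "list"), ("customers", "create")}

    def __init__(self) -> None:
        self._rows = [Customer(id="cus_1", email="a@example.com")]

    def execute(self, entity: str, action: str,
                params: Optional[dict[str, Any]] = None) -> Iterator[Customer]:
        # Validate eagerly so an unsupported call fails before any work starts.
        if (entity, action) not in self.SUPPORTED:
            raise ValueError(f"unsupported operation: {entity}.{action}")
        return self._run(action, params or {})

    def _run(self, action: str, params: dict[str, Any]) -> Iterator[Customer]:
        # Generator: results are yielded one typed object at a time.
        if action == "create":
            row = Customer(id=f"cus_{len(self._rows) + 1}", email=params["email"])
            self._rows.append(row)
            yield row
        else:
            yield from self._rows


connector = StubConnector()
created = list(connector.execute("customers", "create", {"email": "b@example.com"}))
listed = list(connector.execute("customers", "list"))
```

Because reads and writes go through the same method, the calling code does not change shape when an agent moves from listing records to creating them.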

Airbyte publishes a growing library of agent connectors covering CRMs, marketing platforms, developer tools, support systems, and more, with new connectors added weekly. For the full list of available connectors and their documentation, see the agent connectors page.

Every connector shares a single interface, which means the real question isn't whether your agent can talk to these APIs. It's how you manage credentials when each of your customers needs their own connection.

How Do Agent Connectors Work?

Each agent connector supports two execution modes, and the choice depends on whether you are building for a single environment or managing data access for multiple end users.

Dimension	Open Source Mode	Hosted Mode
Credential storage	You store API credentials locally (e.g., .env file)	Airbyte Cloud stores credentials securely
Authentication	You provide credentials directly to the connector	You provide Airbyte client_id and client_secret; platform handles provider auth
Multi-customer support	You manage credential isolation yourself	Each customer gets an isolated environment with separate credentials, connectors, and data
Platform dependency	None (pure Python package)	Requires Agent Engine platform
Best for	Single-environment development, prototyping, personal tools	Multi-tenant applications where end users connect their own SaaS accounts
Setup complexity	Lower: install package, provide API key, call operations	Higher: set up platform account, enable connectors, implement authentication module

Open Source Mode

Install the connector package, provide API credentials, and call operations directly. A GitHub connector in open source mode looks like this:
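
As a rough illustration of that flow, the sketch below reads a credential from the environment and hands it directly to a connector. StubGithubConnector, the Issue fields, and the GITHUB_TOKEN variable name are hypothetical stand-ins, not the real package's API; check the connector's own documentation for the actual class and configuration names.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    number: int
    title: str


class StubGithubConnector:
    """Hypothetical stand-in; the real package's class names may differ."""

    def __init__(self, token: str) -> None:
        if not token:
            raise ValueError("a GitHub token is required")
        self._token = token  # held in process memory only

    def execute(self, entity: str, action: str, params: dict) -> list[Issue]:
        # A real connector would call the GitHub API here using self._token.
        if (entity, action) != ("issues", "list"):
            raise ValueError(f"unsupported operation: {entity}.{action}")
        return [Issue(number=1, title=f"Example issue in {params['repo']}")]


def connector_from_env(var: str = "GITHUB_TOKEN") -> StubGithubConnector:
    # Open source mode: you own the credential, e.g. loaded from a .env file.
    return StubGithubConnector(token=os.environ.get(var, ""))


os.environ.setdefault("GITHUB_TOKEN", "example-token")  # placeholder for the demo
connector = connector_from_env()
issues = connector.execute("issues", "list", {"repo": "octocat/hello-world"})
```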

You then register the connector's execute method as a tool in your agent framework. With Pydantic AI, this looks like:
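
Frameworks such as Pydantic AI build a tool's schema from a function's name, signature, and docstring, so registration usually amounts to wrapping execute in a plainly typed function. The sketch below is framework-agnostic and does not use Pydantic AI's API; StubConnector, make_tool, and the TOOLS dictionary are hypothetical names used only to show the wrapping step.

```python
import inspect
from typing import Any, Callable, Optional


class StubConnector:
    """Hypothetical stand-in that echoes the request it receives."""

    def execute(self, entity: str, action: str,
                params: Optional[dict[str, Any]] = None) -> list[dict[str, Any]]:
        return [{"entity": entity, "action": action, "params": params or {}}]


def make_tool(connector: StubConnector) -> Callable[..., list[dict[str, Any]]]:
    # The closure keeps the connector (and its credentials) out of the tool's
    # parameters, so the model only ever supplies entity, action, and params.
    def run_operation(entity: str, action: str,
                      params: Optional[dict[str, Any]] = None) -> list[dict[str, Any]]:
        """Run an action (for example "list") on an entity (for example "issues")."""
        return connector.execute(entity, action, params)

    return run_operation


tool = make_tool(StubConnector())
TOOLS = {tool.__name__: tool}  # what a framework's tool registry boils down to
tool_parameters = list(inspect.signature(tool).parameters)
```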

This is the fastest path from zero to a working agent-API integration.

Hosted Mode

When your application serves multiple end users, each with their own Salesforce account, their own Stripe keys, and their own HubSpot instance, managing credentials locally doesn't scale. In hosted mode, you provide your Airbyte client_id and client_secret plus an identifier for the end user. The Agent Engine platform stores each user's provider credentials securely and handles token refresh. For example, a Linear agent connector in hosted mode looks like this:
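
The sketch below models the division of responsibility rather than the real client: the application holds only its Airbyte client_id and client_secret plus a customer identifier, while a stand-in platform keeps the provider credential. HostedAuth, StubPlatform, and the customer_id field name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostedAuth:
    client_id: str      # your Airbyte application credentials
    client_secret: str
    customer_id: str    # hypothetical field name for the end-user identifier


class StubPlatform:
    """In-memory stand-in for the hosted platform's credential store."""

    def __init__(self) -> None:
        self._provider_tokens: dict[str, str] = {}

    def connect(self, customer_id: str, provider_token: str) -> None:
        # In the real flow this happens when the end user authorizes access.
        self._provider_tokens[customer_id] = provider_token

    def execute(self, auth: HostedAuth, entity: str, action: str) -> dict[str, str]:
        # The platform looks up the provider credential; the caller never sees it.
        if auth.customer_id not in self._provider_tokens:
            raise LookupError(f"no connection for customer {auth.customer_id!r}")
        return {"customer": auth.customer_id, "entity": entity, "action": action}


platform = StubPlatform()
platform.connect("customer-a", "provider-token-a")
auth = HostedAuth(client_id="app-id", client_secret="app-secret", customer_id="customer-a")
result = platform.execute(auth, "issues", "list")
```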

The difference between modes is where credentials live and who manages them. In hosted mode, API calls proxy through Airbyte rather than going directly to the vendor. This gives the platform a central point for audit trails and automatic token rotation. Sensitive API credentials never leave Airbyte's infrastructure.

That convenience comes with a tradeoff. The proxy adds a network hop. For most use cases the latency is negligible, but teams with strict performance requirements should test with their specific connectors before committing to hosted mode for latency-sensitive paths.
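
One way to run that test is to time the same operation through both paths and compare the distributions. The helper below is a generic sketch using only the standard library; the two lambdas are placeholders that simulate a direct call and a proxied call with sleeps, and you would replace them with real connector calls.

```python
import math
import statistics
import time
from typing import Callable


def measure_latency(call: Callable[[], object], runs: int = 20) -> dict[str, float]:
    """Time `call` repeatedly and report median and p95 in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {"median_ms": statistics.median(samples), "p95_ms": samples[p95_index]}


# Placeholders: swap these for a direct connector call and a hosted-mode call.
direct = measure_latency(lambda: time.sleep(0.001), runs=5)
proxied = measure_latency(lambda: time.sleep(0.003), runs=5)
overhead_ms = proxied["median_ms"] - direct["median_ms"]
```

Comparing p95 as well as the median matters here, because an extra network hop tends to show up in the tail before it shows up in the typical case.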

What Does the Agent Engine Platform Do?

The Agent Engine platform adds infrastructure that multi-tenant applications need but teams rarely want to build themselves. Customer isolation, managed authentication, and a context store each address a distinct production requirement.

Customer Isolation

In Agent Engine, a "customer" represents an end user of your service who connects their own data sources. Each customer gets an isolated environment with separate credentials, separate connector configurations, and separate data. When you build a customer support agent that accesses each customer's Zendesk instance, Agent Engine keeps Customer A's tickets completely isolated from Customer B's. The API uses the terms 'workspace' and 'external_customer' interchangeably to refer to a customer environment.

The platform enforces this isolation through a token hierarchy. Operator tokens provide organization-wide access with a 15-minute expiration. Scoped tokens restrict access to a single customer's data with a 20-minute expiration. Widget tokens add cross-origin resource sharing (CORS) protection for embeddable UI components. This three-tier system scopes each API call to exactly one customer's environment, and the short expiration windows limit exposure if a token is compromised.
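
The scoping and expiry rules can be modeled in a few lines. This is a toy model of the rules as described, not the platform's implementation: the Token class and can_access function are hypothetical, and widget tokens are left out because the text does not give their lifetime.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Lifetimes as stated in the text above.
TTL = {"operator": timedelta(minutes=15), "scoped": timedelta(minutes=20)}


@dataclass(frozen=True)
class Token:
    kind: str                       # "operator" or "scoped"
    issued_at: datetime
    customer: Optional[str] = None  # None for organization-wide operator tokens


def can_access(token: Token, customer: str, now: datetime) -> bool:
    if now >= token.issued_at + TTL[token.kind]:
        return False  # expired tokens grant nothing
    if token.kind == "operator":
        return True   # organization-wide access
    return token.customer == customer  # scoped: exactly one customer


t0 = datetime(2026, 3, 6, 12, 0, tzinfo=timezone.utc)
operator = Token("operator", issued_at=t0)
scoped = Token("scoped", issued_at=t0, customer="customer-a")
```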

Authentication Module

The platform provides a white-label UI component, the authentication module, that end users interact with to connect their SaaS accounts. You embed this in your application. When a user clicks "Connect Salesforce," the authentication module handles the OAuth flow, stores the resulting tokens securely, and makes the connection available to your agent.

Your backend generates widget tokens by calling the Airbyte API with an operator token, the customer's identifier, and an allowed origin for CORS protection. The module handles connector selection, OAuth redirects, callback handling, and credential validation. After authentication, the agent can execute operations against the user's connector immediately.

The authentication module ships as an npm package (@airbyte-embedded/airbyte-embedded-widget) and fires events your application can listen to, such as source_created or source_create_error. This lets you update your UI as users connect accounts. Your backend never touches OAuth tokens directly. The allowed_origin parameter on widget tokens provides CORS protection and must match your frontend's origin exactly, including port.
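
Because the origin must match exactly, it helps to derive it from your frontend URL in one place instead of typing it by hand. The sketch below does that with the standard library. The payload field customer_id and the helper names are hypothetical; only allowed_origin is taken from the text, and the endpoint itself is deliberately not shown.

```python
from urllib.parse import urlsplit


def normalize_origin(url: str) -> str:
    """Reduce a URL to scheme://host[:port], dropping any path or query."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a valid web origin: {url!r}")
    origin = f"{parts.scheme}://{parts.hostname}"
    if parts.port is not None:
        origin += f":{parts.port}"
    return origin


def widget_token_request(customer_id: str, frontend_url: str) -> dict[str, str]:
    # Field names other than allowed_origin are assumptions for illustration.
    return {"customer_id": customer_id, "allowed_origin": normalize_origin(frontend_url)}


def origin_allowed(allowed_origin: str, request_origin: str) -> bool:
    # Exact comparison: a different port or scheme is a different origin.
    return allowed_origin == request_origin


payload = widget_token_request("customer-a", "http://localhost:3000/settings")
```

A common failure during local development is a frontend served on one port and a token issued for another, which this exact comparison rejects.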

Context Store

The context store is Airbyte-managed storage that copies a subset of data from connected agent connectors for search actions. It selects data it considers relevant to search, maintains isolated data stores per source, and refreshes hourly. Agents can query the context store with sub-second latency, which avoids the processing time and cost of complex queries against live vendor APIs.

The context store populates automatically during initial setup. Initial population time depends on the volume of data in your connected sources, and you can't run search actions until the first full population completes. The refresh rate is fixed at hourly and can't be configured, but you can disable the context store entirely. When you disable it, Airbyte removes cached data from storage. Re-enabling requires a full re-sync from scratch.
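
The lifecycle described above behaves like a small state machine, which is worth keeping in mind when deciding how an agent should react to a search that is not yet available. The class below is a toy model with hypothetical names and an in-memory list; it mirrors the stated rules and is not the platform's implementation.

```python
from datetime import datetime, timedelta, timezone

REFRESH_INTERVAL = timedelta(hours=1)  # fixed, per the text above


class ContextStoreModel:
    """Toy model of the states: populating -> ready -> disabled -> populating."""

    def __init__(self) -> None:
        self.state = "populating"
        self.rows: list[str] = []

    def finish_population(self, rows: list[str]) -> None:
        self.rows = list(rows)
        self.state = "ready"

    def search(self, term: str) -> list[str]:
        # Search is unavailable until the first full population completes.
        if self.state != "ready":
            raise RuntimeError(f"search unavailable while store is {self.state}")
        return [row for row in self.rows if term.lower() in row.lower()]

    def disable(self) -> None:
        self.rows = []  # cached data is removed on disable
        self.state = "disabled"

    def enable(self) -> None:
        if self.state == "disabled":
            self.state = "populating"  # re-enabling starts from scratch


def next_refresh(last_refresh: datetime) -> datetime:
    return last_refresh + REFRESH_INTERVAL


store = ContextStoreModel()
store.finish_population(["Ticket: login fails", "Ticket: billing question"])
hits = store.search("billing")
```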

Engineers who have configured their own object storage can skip the context store and make that data available to agents through self-implemented tools instead.

How Does Agent Engine Work With MCP?

Agent Engine provides MCP servers for different use cases in agent development.

Connector MCP

The Connector MCP runs locally and exposes three tools to agents: list configured connectors, describe what a connector can do, and execute operations against data sources. You configure it with a YAML file that specifies which connectors to use and their credentials. It works with Claude Code and other MCP-compatible clients for interactive agent development.
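
The three tools form a discover, describe, execute sequence that an agent can walk without prior knowledge of any connector. The sketch below models that sequence with plain functions over a hypothetical registry; the function names, registry contents, and return shapes are assumptions, and no MCP SDK is involved.

```python
from typing import Any, Optional

# Hypothetical registry: connector name -> entity -> supported actions.
REGISTRY: dict[str, dict[str, list[str]]] = {
    "github": {"issues": ["list", "get"], "repositories": ["list"]},
    "linear": {"issues": ["list", "create"]},
}


def list_connectors() -> list[str]:
    """Tool 1: which connectors are configured?"""
    return sorted(REGISTRY)


def describe_connector(name: str) -> dict[str, list[str]]:
    """Tool 2: which entities and actions does a connector support?"""
    return REGISTRY[name]


def execute(name: str, entity: str, action: str,
            params: Optional[dict[str, Any]] = None) -> dict[str, Any]:
    """Tool 3: run one operation, rejecting anything the registry does not list."""
    if action not in REGISTRY.get(name, {}).get(entity, []):
        raise ValueError(f"{name} does not support {entity}.{action}")
    # A real server would call the connector here; this stub echoes the request.
    return {"connector": name, "entity": entity, "action": action, "params": params or {}}


available = list_connectors()
capabilities = describe_connector("linear")
outcome = execute("linear", "issues", "create", {"title": "Fix login"})
```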

PyAirbyte MCP

The PyAirbyte MCP server manages Airbyte data pipelines through AI assistants. It supports listing connectors, validating configurations, and running sync operations. Launch it with uvx --python=3.11 --from=airbyte@latest airbyte-mcp and connect it to any MCP-compatible client, including Claude Desktop, Cursor, Cline, Warp, and Claude Code. This server is marked as experimental, and its API may change between minor versions. Credentials are never exposed to the large language model (LLM): the server reads actual values from environment variables, while the LLM sees only variable names.
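
The name-versus-value split can be sketched as two views over the same configuration: one that the model is allowed to see, and one that is resolved server-side just before a call. The function names and the ${NAME} placeholder format below are hypothetical and are not taken from the PyAirbyte MCP server.

```python
from typing import Mapping


def model_view(refs: Mapping[str, str]) -> dict[str, str]:
    """What the LLM sees: field names mapped to variable names, never values."""
    return {field: f"${{{name}}}" for field, name in refs.items()}


def resolve(refs: Mapping[str, str], env: Mapping[str, str]) -> dict[str, str]:
    """What the server uses: real values looked up from the environment."""
    missing = [name for name in refs.values() if name not in env]
    if missing:
        raise KeyError(f"unset environment variables: {missing}")
    return {field: env[name] for field, name in refs.items()}


refs = {"token": "GITHUB_TOKEN"}
fake_env = {"GITHUB_TOKEN": "s3cret"}  # stands in for os.environ in this demo
shown_to_model = model_view(refs)
used_by_server = resolve(refs, fake_env)
```

Passing the environment in as a parameter keeps the resolution step easy to test without touching real secrets.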

Airbyte Knowledge MCP

The Airbyte Knowledge MCP connects AI agents to up-to-date information about Airbyte's features, APIs, and best practices. It provides semantic search over Airbyte's documentation, website, OpenAPI specs, YouTube content, and GitHub issues, discussions, and pull requests. The server is hosted at https://airbyte.mcp.kapa.ai with no local installation required.

MCP gives agents a protocol-level way to discover and call connectors, but the harder problem, managing which credentials each customer's agent is allowed to use, still requires the platform layer.

When Should You Use Agent Engine?

Agent connectors fit when your agent calls SaaS APIs during its reasoning loop and needs consistent behavior across providers.

Live API Access During Agent Reasoning

Your agent needs to query SaaS data mid-reasoning rather than work from a pre-loaded dataset. If an agent reads from a CRM system, billing platform, or analytics tool while deciding what to do next, agent connectors give it direct access with guardrails against malformed responses.

Frequent Integration Expansion

You're adding new SaaS integrations regularly and need your agent orchestration code to stay stable. Because every connector shares a common interface, adding a data source means installing a package, not rewriting framework integration code.

Multi-System Coordination in a Single Agent

A single agent coordinates across GitHub, Jira, Salesforce, or other systems in one reasoning loop. Each connector is a standard Python dependency, so composing them requires no adapter code or custom abstraction layers.

Add the platform when your application goes multi-tenant and you need managed credential infrastructure. If your use case is batch data replication to a warehouse or data lake, use Airbyte's data replication connectors instead. Agent connectors are for operational AI access with sub-second response times, not historical data consolidation.

How Do You Get Started With Agent Engine?

Start with the open-source connectors. Install a connector package, provide an API key, and call operations from your agent in minutes. The Python SDK tutorial walks through building a Pydantic AI agent with a GitHub connector from scratch. You create a project with uv, add the agent, and build a command-line chat interface for natural language data interaction. The MCP tutorial shows how to expose connectors to Claude Code through natural language.

When your application needs multi-customer credential management, add the Agent Engine platform. The hosted execution tutorial covers setting up platform credentials, creating customer-scoped tokens, and running connector operations through Airbyte's infrastructure. The sooner you validate the connector interface in a single-tenant prototype, the less rework you face when multi-tenant requirements arrive.

What's the Simplest Way to Productionize Agent Data Access?

The gap between a working agent prototype and a production multi-tenant application is almost entirely infrastructure. Agent Engine closes that gap by separating typed API access from multi-tenant credential management, so teams adopt each layer only when they need it.

Airbyte's Agent Engine gives engineering teams a path from single-environment prototyping to production multi-tenant deployments without rebuilding integration infrastructure at each stage.

Connect with an Airbyte expert to see how Airbyte powers production AI agents with reliable, permission-aware data.

Frequently Asked Questions

What is Agent Engine?

Agent Engine is Airbyte's platform for giving AI agents typed, authenticated read and write access to SaaS APIs. It includes open-source Python connectors that agents import and call directly, and a hosted platform that manages credentials and data replication for multi-tenant applications. The connectors work as standalone packages without the Airbyte platform, or through the platform for multi-customer credential management.

How are agent connectors different from Airbyte's data replication connectors?

Data replication connectors move large volumes of data from sources to destinations on a schedule. Agent connectors are lightweight Python packages that let AI agents call SaaS APIs on demand and return typed responses immediately. The two connector types complement each other.

Do I need the Agent Engine platform to use agent connectors?

No. Agent connectors are regular Python packages you install and use without any Airbyte platform. In open source mode, you provide API credentials directly and manage them yourself. The platform is needed when your application serves multiple end users who each connect their own SaaS accounts.

What SaaS tools does Agent Engine support?

Agent Engine provides a growing library of connectors for SaaS tools including Salesforce, HubSpot, Slack, Jira, Notion, Google Drive, Stripe, and GitHub, with new connectors added weekly. See the full list on the agent connectors page.

How does Agent Engine handle authentication for multiple users?

The Agent Engine platform provides a white-label authentication module you embed in your application. When an end user connects their SaaS account, the module handles the OAuth flow and stores tokens securely. Each customer gets an isolated environment with separate credentials and data, and your agent provides a customer identifier to scope operations to the correct environment.
