Building a HubSpot Agent Connector Using the HubSpot API

Build a production HubSpot agent connector with OAuth lifecycle management, rate-limit handling, multi-tenant isolation, and fresh CRM context.

Pedro Lopez

March 9, 2026

Building a HubSpot agent connector looks like a solved problem: create a private app, generate a token, call the endpoint, and feed the results to your LLM. You can pull contacts in an afternoon, but production requirements show up fast once you need to access customer HubSpot data across many accounts.

The agent connector code is the easy part; what breaks teams is the infrastructure required to keep calls reliable, fresh, and permissioned across every customer account.

TL;DR

Production HubSpot agent connectors require multi-tenant OAuth lifecycle management, per-tenant isolation, and reliable permissioning, not just API calls.
Rate limits vary by auth method and endpoint (especially Search), so agent connectors need queuing, backoff, and caching to stay responsive.
HubSpot API quirks (version fragmentation, internal property names/IDs) force schema normalization and mapping layers.
Fresh context at scale requires webhooks + incremental sync (not polling) and association-chaining to assemble complete CRM context.

What Does the HubSpot API Require?

Production constraints hide in the layers that a single-account prototype never exposes: authentication lifecycle, rate limit ceilings that vary by tenant, and an API surface area fragmented across multiple versions. Each layer compounds once you start serving customer data.

Authentication at Scale

For single-tenant use (your own HubSpot account), private app tokens work. For customer-facing agents accessing customer CRM data, OAuth 2.0 is required. Each customer authorizes your app through an OAuth flow that grants scoped access.

The authentication method you choose determines how far your agent connector can scale.

Dimension	Private App Token	OAuth 2.0
Setup complexity	Low (generate token in dashboard)	High (build OAuth flow, redirect URIs, consent screen)
Multi-tenant support	No (single account only)	Yes (each customer authorizes independently)
Token lifecycle	Static (manual rotation recommended every 180 days)	Access tokens expire in 30 minutes; refresh tokens persist until app uninstall
Scope control	Scopes set at creation, fixed	Scopes requested per install, granular with required and optional sets
Agent use case fit	Internal tools, prototypes, HubSpot MCP server	Customer-facing agents, production multi-tenant

The production complexity starts with the token lifecycle. HubSpot access tokens now expire in 30 minutes, and developers must parse expires_in dynamically from each response rather than hardcoding expiry periods. During a refresh, HubSpot may issue a new refresh token, and your system must detect and store it. Fail to update the refresh token, and every subsequent refresh attempt fails permanently for that customer.

Picture the failure scenario: an access token expires while a sync is mid-flight. With a 30-minute token window, this race condition is common. The sync fails with a 401. Your refresh job fires and gets new tokens, but the in-flight sync retries with the old refresh token, which may no longer be valid. The sync fails silently. The agent serves stale data for hours until someone notices.

With hundreds of customer connections, token refresh becomes a background infrastructure service: store tokens with encryption, track expiry per tenant, refresh proactively 2–5 minutes before expiration, handle concurrent refresh race conditions with distributed locking, and alert when refresh fails. At that point, you're maintaining a credential management system.
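The refresh logic described above can be sketched in a few lines. This is a minimal, in-memory illustration under stated assumptions, not a production credential service: `TokenManager` and `refresh_fn` are hypothetical names, `refresh_fn` is assumed to wrap the OAuth token request and return a dict with `access_token`, `expires_in`, and optionally a new `refresh_token`, and a per-tenant `threading.Lock` stands in for the distributed lock a multi-process deployment would need.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Dict

REFRESH_MARGIN_SECONDS = 180  # refresh about 3 minutes early, inside the 2-5 minute window


@dataclass
class TenantTokens:
    access_token: str
    refresh_token: str
    expires_at: float  # absolute epoch seconds, derived from expires_in


class TokenManager:
    """Hypothetical per-tenant token cache; a real system would encrypt and persist these."""

    def __init__(self, refresh_fn: Callable[[str], dict],
                 clock: Callable[[], float] = time.time):
        self._refresh_fn = refresh_fn  # assumed to call the OAuth token endpoint
        self._clock = clock
        self._tokens: Dict[str, TenantTokens] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._locks_guard = threading.Lock()

    def put(self, tenant_id: str, payload: dict) -> None:
        # Parse expires_in from the response instead of hardcoding 30 minutes.
        self._tokens[tenant_id] = TenantTokens(
            access_token=payload["access_token"],
            refresh_token=payload["refresh_token"],
            expires_at=self._clock() + int(payload["expires_in"]),
        )

    def _lock_for(self, tenant_id: str) -> threading.Lock:
        with self._locks_guard:
            return self._locks.setdefault(tenant_id, threading.Lock())

    def get_access_token(self, tenant_id: str) -> str:
        # One refresh at a time per tenant; other tenants are never blocked.
        with self._lock_for(tenant_id):
            tokens = self._tokens[tenant_id]
            if tokens.expires_at - self._clock() <= REFRESH_MARGIN_SECONDS:
                payload = self._refresh_fn(tokens.refresh_token)
                # Keep the old refresh token only if the response omits a new one.
                payload.setdefault("refresh_token", tokens.refresh_token)
                self.put(tenant_id, payload)
                tokens = self._tokens[tenant_id]
            return tokens.access_token
```

Because every caller goes through `get_access_token`, an in-flight sync that retries after a 401 picks up the newest stored tokens rather than the ones it loaded at startup.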

Scope management introduces its own product tradeoffs. HubSpot uses required scopes and optional scope parameters. Required scopes that include premium-tier features (like custom objects on Enterprise plans) block installation entirely for customers on lower tiers. Optional scopes allow installation to proceed with graceful degradation.

Over-scope, and customers reject the install; under-scope, and your agent can't access the data it needs. Worse, OAuth scopes are sticky per user once authorized. Adding new scopes later means every existing customer must disconnect and re-authorize your application.

Rate Limits That Vary by Everything

An agent querying a customer's CRM for deal context, associated contacts, company details, and recent activity can consume dozens of API calls per query. The limits that govern those calls depend on how your agent connector authenticates and what HubSpot tier the customer is on.

Auth Method / Tier	Per 10 Seconds	Daily Limit	Notes
Private app (Free/Starter)	100	250,000	Daily limit shared across all private apps in the account
Private app (Professional/Enterprise)	190	650,000–1,000,000	Higher ceiling but still shared
OAuth app (marketplace)	110 per account	Not enforced separately	Fixed at 110 regardless of customer tier; API Limit Increase add-on does not apply
CRM Search API	5 requests/second	Separate from general limits	Impacts agent queries that filter/search CRM objects

Multiply concurrent users across multiple customer accounts and rate limits become the binding constraint on your architecture. Production agent connectors need per-account rate tracking, request queuing, exponential backoff with jitter, and graceful degradation when limits are hit.
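As a rough illustration of exponential backoff with jitter, the sketch below retries a caller-supplied request function when it signals a rate limit. `RateLimited`, `backoff_delay`, and `call_with_backoff` are hypothetical names, and `retry_after` is an optional hint the caller may populate if the response carries one; nothing here is HubSpot-specific.

```python
import random
import time
from typing import Callable, Optional


class RateLimited(Exception):
    """Raised by the caller-supplied request function on an HTTP 429."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay: a random fraction of min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(request_fn: Callable[[], object], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random):
    for attempt in range(max_attempts):
        try:
            return request_fn()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error so callers can degrade
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = backoff_delay(attempt, rng=rng)
            sleep(delay)
```

The jitter matters in a multi-tenant setting: without it, many workers that hit the limit together would retry together and hit it again.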

Agents that filter or search CRM objects hit the CRM Search API's separate, more restrictive limit of 5 requests per second. If your agent's primary interaction pattern is search-based, this becomes the tightest ceiling. Search endpoints also enforce a hard cap of 10,000 total results. Any customer with a CRM larger than 10,000 records forces your agent connector into query partitioning strategies that add latency and code complexity to every search workflow.
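One common partitioning strategy is to split the query by a time range until each slice fits under the cap. The sketch below assumes a hypothetical `count_fn(start_ms, end_ms)` that reports how many records fall in a window (for example, from a search response's total count); each call to it costs a search request, so the split itself consumes quota.

```python
from typing import Callable, List, Tuple

SEARCH_RESULT_CAP = 10_000

Window = Tuple[int, int]  # [start_ms, end_ms) in epoch milliseconds


def partition_windows(start_ms: int, end_ms: int,
                      count_fn: Callable[[int, int], int],
                      cap: int = SEARCH_RESULT_CAP) -> List[Window]:
    """Split [start_ms, end_ms) until every window holds at most `cap` records."""
    total = count_fn(start_ms, end_ms)
    if total <= cap or end_ms - start_ms <= 1:
        # Either safe to page through, or the window cannot be split further.
        return [(start_ms, end_ms)]
    mid = (start_ms + end_ms) // 2
    return (partition_windows(start_ms, mid, count_fn, cap)
            + partition_windows(mid, end_ms, count_fn, cap))
```

Each returned window can then be paged through independently with a range filter on the chosen date property.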

API Surface Quirks

HubSpot's API spans multiple major versions simultaneously, and your agent connector will call across all of them:

V3 handles core CRM operations (contacts, deals, companies)
V4 is required for association labeling

V1 endpoints for Contact Lists and Marketing Email are being sunset on April 30, 2026. Every version deprecation becomes a migration project that touches your entire agent connector because workflows routinely cross version boundaries.

Internal property names don't match UI labels. The API returns firstname where the UI shows "First Name." The dealstage property returns internal stage IDs rather than human-readable labels like "Qualified to Buy." This requires a separate lookup table. Pipeline searches require numeric IDs, not names. Developers must maintain a mapping layer so the agent works with field names that actually mean something to users.
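A mapping layer of this kind can be quite small. The sketch below is illustrative only: `SAMPLE_PIPELINES` is hand-written sample data whose shape (pipelines containing stages with `id` and `label`) is an assumption to verify against the pipelines endpoint, and `PROPERTY_LABELS`, `build_stage_lookup`, and `humanize` are hypothetical names.

```python
from typing import Dict

# Hand-written sample; the real response shape is an assumption to verify.
SAMPLE_PIPELINES = {
    "results": [
        {"id": "default", "label": "Sales Pipeline", "stages": [
            {"id": "appointmentscheduled", "label": "Appointment Scheduled"},
            {"id": "qualifiedtobuy", "label": "Qualified to Buy"},
        ]},
    ]
}

# Internal property name -> label shown in the UI (hypothetical subset).
PROPERTY_LABELS = {"firstname": "First Name", "dealstage": "Deal Stage"}


def build_stage_lookup(pipelines: dict) -> Dict[str, str]:
    """Flatten every pipeline's stages into one {stage_id: label} table."""
    return {
        stage["id"]: stage["label"]
        for pipeline in pipelines["results"]
        for stage in pipeline["stages"]
    }


def humanize(record: dict, stage_lookup: Dict[str, str],
             property_labels: Dict[str, str] = PROPERTY_LABELS) -> dict:
    """Translate internal names and stage IDs; unknown values pass through unchanged."""
    out = {}
    for name, value in record.items():
        if name == "dealstage":
            value = stage_lookup.get(value, value)
        out[property_labels.get(name, name)] = value
    return out
```

In a multi-tenant connector the lookup would be built and cached per customer account, since pipelines and custom properties differ between accounts.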

Pagination is cursor-based using limit and after parameters, with a maximum of 200 records per page for the Search API and 100 for most other endpoints. Missing or incorrect pagination logic means your agent connector processes partial datasets silently. Community-documented bugs include duplicate rows and out-of-order results in some endpoints.
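A pagination loop that follows the cursor to the end and guards against duplicate rows might look like the sketch below. `fetch_page` is a hypothetical callable that performs one request with the given `after` and `limit` values; the `paging.next.after` response shape is an assumption to confirm against the endpoint you call.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str], int], dict],
             limit: int = 100) -> Iterator[dict]:
    """Yield every record once, following the cursor until no next page is reported."""
    after: Optional[str] = None
    seen_ids = set()
    while True:
        page = fetch_page(after, limit)
        for record in page.get("results", []):
            if record["id"] in seen_ids:
                continue  # skip rows repeated across pages
            seen_ids.add(record["id"])
            yield record
        after = page.get("paging", {}).get("next", {}).get("after")
        if not after:
            return  # no cursor means this was the last page
```

Stopping only when the cursor is absent, rather than when a page comes back short, avoids the silent partial-dataset failure described above.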

What Architecture Decisions Matter for Agent Connectors?

The API constraints above shape a set of architectural decisions that determine whether your agent connector survives contact with production workloads.

Multi-Tenant Credential Isolation

A single customer's expired token should never cascade into a system-wide failure. Each customer's OAuth tokens, sync configuration, and data routing must live in isolation: per-tenant credential storage (encrypted, with Row-Level Security (RLS) enforcement), per-tenant sync state tracking, and per-tenant error handling. Getting this isolation wrong means a single token expiry at 2 AM can cascade into stale data across your entire customer base.
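At the code level, per-tenant error handling can start as simply as making sure one tenant's exception never aborts the loop. A minimal sketch, where `sync_one` is a hypothetical callable that syncs a single tenant and raises on failure:

```python
from typing import Callable, Dict, Iterable


def sync_all(tenant_ids: Iterable[str],
             sync_one: Callable[[str], None]) -> Dict[str, str]:
    """Run each tenant's sync independently and record a per-tenant outcome."""
    status: Dict[str, str] = {}
    for tenant_id in tenant_ids:
        try:
            sync_one(tenant_id)
            status[tenant_id] = "ok"
        except Exception as exc:  # contain the failure to this tenant
            status[tenant_id] = f"failed: {exc}"
    return status
```

The returned status map is what an alerting job would inspect, so a failed tenant is visible instead of silently stale.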

Assembling Full CRM Context

An agent answering "What's the status of the Acme deal?" needs more than the deal record. It needs the associated contacts (who's involved), the linked company (account details), recent engagements (emails, calls, notes), and potentially ticket history. Assembling that into one view is what an account snapshot agent produces for a rep before a call. HubSpot stores these as separate objects linked through its Associations API.

Fetching complete context means chaining API calls. A single agent query looks like this behind the scenes:

<pre><code>1. GET /crm/v3/objects/deals/{id}: fetch the deal record
2. GET /crm/v4/objects/deals/{id}/associations/contacts: get associated contact IDs
3. POST /crm/v3/objects/contacts/batch/read: batch fetch each associated contact
4. GET /crm/v4/objects/deals/{id}/associations/companies: get linked company ID
5. GET /crm/v3/objects/companies/{id}: fetch company details
6. Repeat for engagements, line items, tickets</code></pre>

A single agent query generates 5–7 calls. The Associations API doesn't support multi-level queries, so you can't retrieve "tasks associated with contacts associated with a company" in a single call. Batch association reads are capped at 1,000 IDs per request, a limit HubSpot introduced in a February 2025 changelog where none had been enforced before. Multiply by concurrent users and the rate limit math gets tight fast.
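To make the call count concrete, here is a sketch of the first five steps of that chain. `get` and `post` are hypothetical transport callables that take a path (and a JSON body for `post`) and return parsed JSON; the `toObjectId` field and the `inputs` body shape are assumptions to verify against the API reference.

```python
from typing import Callable, List, Optional


def _associated_ids(get: Callable[[str], dict], deal_id: str, to_type: str) -> List[str]:
    page = get(f"/crm/v4/objects/deals/{deal_id}/associations/{to_type}")
    return [str(item["toObjectId"]) for item in page.get("results", [])]


def fetch_deal_context(deal_id: str,
                       get: Callable[[str], dict],
                       post: Callable[[str, dict], dict]) -> dict:
    """Assemble deal, contacts, and company: five sequential API calls."""
    deal = get(f"/crm/v3/objects/deals/{deal_id}")                      # call 1
    contact_ids = _associated_ids(get, deal_id, "contacts")             # call 2
    contacts: List[dict] = []
    if contact_ids:
        body = {"inputs": [{"id": cid} for cid in contact_ids]}
        contacts = post("/crm/v3/objects/contacts/batch/read", body).get("results", [])  # call 3
    company_ids = _associated_ids(get, deal_id, "companies")            # call 4
    company: Optional[dict] = None
    if company_ids:
        company = get(f"/crm/v3/objects/companies/{company_ids[0]}")    # call 5
    return {"deal": deal, "contacts": contacts, "company": company}
```

Every step waits on the previous one, so latency adds up along with quota; engagements, line items, and tickets would each add further calls.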

Caching association graphs and pre-fetching related objects during sync (rather than at query time) reduces latency and API consumption. Association relationships (IDs only) can tolerate longer cache time-to-live (TTL) values of 5–15 minutes, while object properties need shorter TTLs of 1–5 minutes. That caching layer is the difference between an agent that responds in seconds and one that chains live API calls while the user waits.
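The two TTL tiers can be expressed with one small cache class instantiated twice. This is an in-process sketch; `TTLCache` and the constant names are hypothetical, the TTL values are picked from inside the ranges above, and a real deployment would likely use a shared cache keyed by tenant.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple

ASSOCIATION_TTL_SECONDS = 10 * 60  # inside the 5-15 minute range for ID-only edges
PROPERTY_TTL_SECONDS = 2 * 60      # inside the 1-5 minute range for object properties


class TTLCache:
    """Minimal time-based cache; entries are reloaded once their TTL has passed."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no API call
        value = loader()   # expired or missing: one API call
        self._entries[key] = (now + self._ttl, value)
        return value
```

A key such as `(tenant_id, "deal", deal_id, "contacts")` keeps tenants separated within the same cache instance.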

Keeping Context Fresh Without Burning Quotas

Polling contacts, deals, companies, and tickets every five minutes across fifty customer accounts wastes thousands of API calls per hour on unchanged data. The right freshness strategy depends on what tradeoffs your system can absorb.
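The arithmetic behind that figure is easy to check. The helper below is a hypothetical back-of-the-envelope calculator; `pages_per_poll` is an assumed average, since each poll needs at least one request per object type and more when results span multiple pages.

```python
def polling_calls_per_hour(accounts: int, object_types: int,
                           interval_minutes: int, pages_per_poll: int = 1) -> int:
    """Lower-bound request count for fixed-interval polling."""
    polls_per_hour = 60 // interval_minutes
    return accounts * object_types * polls_per_hour * pages_per_poll


# 50 accounts x 4 object types x 12 polls per hour, one page each
floor = polling_calls_per_hour(accounts=50, object_types=4, interval_minutes=5)
```

With one page per poll the floor is 2,400 calls per hour; at an assumed three pages per poll it rises to 7,200, whether or not anything changed.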

Approach	Latency	API Cost	Complexity	Best For
Polling (5-min intervals)	5+ minutes	High (burns calls on unchanged data)	Low	Prototypes, low-volume use
Webhooks	Sub-minute	Low (events pushed, no polling)	Medium (HMAC-SHA256 signature validation, delivery failures)	Immediate event awareness
Incremental sync + Change Data Capture (CDC)	Sub-minute to minutes	Low (processes only changes)	High (cursor tracking, state management)	Production agents at scale

The production architecture combines webhooks and incremental sync: webhooks for immediate event awareness (subscribing to events like contact.creation and deal.propertyChange), incremental sync for catching anything webhooks miss.
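Accepting webhooks safely means validating the HMAC-SHA256 signature on each delivery before trusting it. The sketch below assumes the signed message is the request method, full URI, raw body, and timestamp concatenated in that order, with the result base64-encoded, and that deliveries older than five minutes are rejected; treat that layout as an assumption and confirm it against HubSpot's request validation documentation. The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_MS = 5 * 60 * 1000  # assumed replay window of five minutes


def expected_signature(client_secret: str, method: str, uri: str,
                       body: str, timestamp_ms: str) -> str:
    # Assumed message layout: method + uri + body + timestamp.
    message = f"{method}{uri}{body}{timestamp_ms}".encode("utf-8")
    digest = hmac.new(client_secret.encode("utf-8"), message, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_webhook(client_secret: str, method: str, uri: str, body: str,
                     timestamp_ms: str, signature: str,
                     now_ms: Optional[int] = None) -> bool:
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    if abs(now_ms - int(timestamp_ms)) > MAX_SKEW_MS:
        return False  # too old or too far in the future: possible replay
    expected = expected_signature(client_secret, method, uri, body, timestamp_ms)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)
```

The raw request body must be used exactly as received; re-serializing parsed JSON can change the bytes and break the comparison.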

HubSpot publishes no webhook delivery service-level agreements (SLAs), provides no mechanism for detecting missed events, and offers no failure notifications. Periodic batch reconciliation using the Search API with hs_lastmodifieddate filters is the only way to guarantee eventual consistency.
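A reconciliation pass boils down to a search request filtered on the last-modified property. The builder below is a sketch of such a request body; the `filterGroups`, `sorts`, and `GTE` field names reflect the Search API as understood here and should be checked against the reference, and the modified-date property name is left as a parameter because it may differ by object type.

```python
from typing import Optional


def build_reconciliation_search(since_ms: int, after: Optional[str] = None,
                                limit: int = 200,
                                modified_property: str = "hs_lastmodifieddate") -> dict:
    """Body for a search returning records modified at or after `since_ms`."""
    body = {
        "filterGroups": [{
            "filters": [{
                "propertyName": modified_property,
                "operator": "GTE",
                "value": str(since_ms),
            }]
        }],
        # Oldest first, so a stored cursor can advance monotonically.
        "sorts": [{"propertyName": modified_property, "direction": "ASCENDING"}],
        "limit": limit,
    }
    if after is not None:
        body["after"] = after  # cursor from the previous page
    return body
```

Sorting ascending by modification time lets the job resume from the last record it processed if it is interrupted partway through.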

Tracking sync state per object type per customer account, with atomic cursor updates, is the underlying context engineering problem most teams underestimate until they're debugging silent data staleness across dozens of tenants.
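One way to get atomic cursor updates is to write the synced records and the new cursor in the same database transaction. The sketch below uses SQLite from the standard library purely for illustration; the table layout, the `modified_ms` record field, and the `apply_fn` callback are all hypothetical.

```python
import sqlite3
from typing import Callable, List


def open_state(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sync_cursor (
               tenant_id   TEXT NOT NULL,
               object_type TEXT NOT NULL,
               cursor_ms   INTEGER NOT NULL,
               PRIMARY KEY (tenant_id, object_type))"""
    )
    return conn


def get_cursor(conn: sqlite3.Connection, tenant_id: str, object_type: str) -> int:
    row = conn.execute(
        "SELECT cursor_ms FROM sync_cursor WHERE tenant_id = ? AND object_type = ?",
        (tenant_id, object_type),
    ).fetchone()
    return row[0] if row else 0


def commit_batch(conn: sqlite3.Connection, tenant_id: str, object_type: str,
                 records: List[dict],
                 apply_fn: Callable[[sqlite3.Connection, List[dict]], None]) -> int:
    """Apply a batch and advance the cursor together, or do neither."""
    if not records:
        return get_cursor(conn, tenant_id, object_type)
    current = get_cursor(conn, tenant_id, object_type)
    new_cursor = max(current, max(r["modified_ms"] for r in records))
    with conn:  # commits on success, rolls back if apply_fn raises
        apply_fn(conn, records)
        conn.execute(
            "INSERT OR REPLACE INTO sync_cursor (tenant_id, object_type, cursor_ms) "
            "VALUES (?, ?, ?)",
            (tenant_id, object_type, new_cursor),
        )
    return new_cursor
```

If the write fails, the cursor stays where it was and the next run re-fetches the same window, which is the safe direction to fail in.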

MCP as a Shortcut (With Limits)

HubSpot now offers two MCP server options: a remote MCP server at mcp.hubspot.com that supports OAuth 2.0, and a local CLI-based Developer MCP server (@hubspot/mcp-server) configured with a private app token.

The remote server provides read-only access across CRM objects including contacts, companies, deals, tickets, and more. For internal use, either option is a fast path: connect through OAuth via the remote server or install the local package with a private app token, and Claude Desktop can search contacts and manage deals.

The remote MCP server's OAuth support lets individual users authorize access to their own HubSpot account, but it doesn't solve multi-tenant programmatic access where your application manages connections across many customer accounts simultaneously. Multi-tenant customer-facing agents where each user connects their own account through your OAuth flow need the full credential and sync infrastructure described above. That architectural gap is where the afternoon project becomes an ongoing engineering commitment.

When Does Purpose-Built Infrastructure Make Sense?

Custom agent connector code works at small scale, but the economics shift when you multiply maintenance across tenants, API versions, and HubSpot tier variations.

The Maintenance Multiplier

Building a HubSpot agent connector for one customer account is a manageable project. Maintaining it across fifty accounts, each on different HubSpot tiers with different rate limits, different custom properties, and different association structures, is a different category of work.

Every HubSpot API version deprecation becomes a maintenance task across all tenants. Token refresh failures at 2 AM become on-call incidents. A customer adding custom properties to their CRM breaks assumptions in the normalization layer that worked for every other account. Scope changes required by new agent features trigger re-authorization workflows across every tenant.

Custom code can do all of this. Whether agent connector maintenance should consume the engineering hours that could go toward agent logic, retrieval quality, and user-facing features is the real decision.

What Airbyte Agents Handles

Airbyte Agents includes a Python-based HubSpot agent connector with strongly typed tools for HubSpot interaction. The platform manages HubSpot-specific rate limits and schema normalization, mapping internal property names to standard models so your agent works with human-readable fields out of the box.

The platform also maintains relevant data subsets in Airbyte-managed storage, allowing search in under 0.5 seconds without repeated vendor API calls. Built-in row-level and user-level access controls enforce permission boundaries across tenants. OAuth lifecycle management and vector database delivery for RAG pipelines handle the infrastructure layer.

What's the Fastest Way to Ship a HubSpot Agent Connector?

The fastest path depends on the use case. For internal, single-tenant agents accessing your own HubSpot, HubSpot's MCP server or a private app token with the Python SDK gets you to a working demo in hours. For customer-facing agents accessing customer HubSpot data at scale, the OAuth lifecycle, rate limit management, schema normalization, and multi-tenant isolation work is infrastructure that either you build and maintain, or a platform handles for you.

Airbyte Agents provides this agent connector infrastructure, and its Context Store helps teams ship agent features rather than maintain data plumbing.

Get a demo to see how it works, or try Airbyte Agents today.

Frequently Asked Questions

Does HubSpot have an MCP server?

Yes. HubSpot offers a remote MCP server at mcp.hubspot.com with OAuth 2.0 support, and a local Developer MCP server (@hubspot/mcp-server) using a private app token. Both expose CRM objects such as contacts, companies, deals, and tickets to MCP-compatible clients like Claude Desktop (the remote server is read-only), but neither solves multi-tenant programmatic access for customer-facing agents.

What are HubSpot's API rate limits for agent applications?

OAuth apps get 110 requests per 10 seconds per account, fixed regardless of customer tier. Private apps vary from 100–190 requests per 10 seconds with daily limits from 250,000 to 1,000,000+. The CRM Search API has a separate 5 requests-per-second limit with a 10,000 result cap.

Can you use HubSpot API keys for new integrations?

No. HubSpot deprecated API keys as of November 30, 2022. All new integrations must use OAuth 2.0 (for customer-facing and multi-tenant) or private app access tokens (for single-tenant).

How do you keep HubSpot agent data fresh without exceeding rate limits?

Subscribe to HubSpot webhooks for immediate event notifications (contact.creation, deal.propertyChange) instead of polling. Combine that with incremental sync using hs_lastmodifieddate filters to catch events webhooks miss. Periodic batch reconciliation is required to guarantee eventual consistency.

What is the difference between HubSpot's V3 and V4 APIs?

V3 covers core CRM operations for contacts, deals, and companies. V4 is required for association labeling between CRM objects. Most production connectors call endpoints across both versions in a single workflow.

