DataGOL accelerates AI Agent adoption in enterprises using Airbyte

Company Size: Startup (10-50 employees)
Region: North America
Industry: Agents and Data Platform / SaaS
Sources: HubSpot, Salesforce, NetSuite, Postgres, MySQL, MSSQL, Redshift, Snowflake, S3, REST APIs, Fellow, custom CRMs, Databricks, BigQuery, IBM DB2
Destination: DataGOL AI Native Lakehouse
Tech Stack
- Source(s): HubSpot, Salesforce, NetSuite, Postgres, SQL databases, Redshift, Snowflake, S3, custom REST APIs
- Destination(s): DataGOL Lakehouse
- ELT: Airbyte Open Source, which eliminates the need to manage ingestion infrastructure built on Airflow, dbt, and Apache Spark
- Transformation: Managed Apache Spark
- Agentic Harness: Long-Running agents, Context Management, Safe Code Execution, Agent-to-Agent Communication.
- Supported file formats: Iceberg, Parquet, JSON, Avro, Databases
- Database: DataGOL AI Native Lakehouse
Key Results
- Made it easier to prepare AI-ready data for building custom agents
- Replaced custom bash scripts, Python jobs, and Java code with a single, centralized data ingestion platform
- Eliminated custom scheduling and retry logic by leveraging Airbyte's built-in pipeline orchestration
- Gained full observability into every pipeline run, including success/failure tracking and schema change detection
- Unlocked access to 600+ connectors, dramatically reducing the time needed to onboard new client data sources
- Accelerated custom connector development from days to hours using Airbyte's Connector Builder and SDK
- Simplified deployment to a single Kubernetes package, removing the overhead of managing multiple codebases
- Chose Airbyte over Fivetran for its open source flexibility, ease of setup, and connector extensibility
About DataGOL
DataGOL is an AI Native Data and Agents Platform.
DataGOL provides the infrastructure to launch AI agents, combining the modern data stack, context management, and agentic capabilities into a single platform that enables organizations to deploy AI internally and build AI-native products externally.
Custom scripts and cron jobs strain a growing AI platform
For a company whose value proposition depends on making client data AI-ready, reliable data ingestion is the foundation of everything. Before Airbyte, that foundation was fragile.
"We need all of our clients' data clean, transformed, and easy to access so we can power downstream AI pipelines like RAG and text-to-SQL," says Abhinandan, Senior Data Engineer at DataGOL. "Before Airbyte, our ingestion logic was spread across bash scripts, Airflow jobs, and custom code. Every new client made that harder to manage."
Each client brought a different mix of CRMs and proprietary APIs, and each required its own bespoke ingestion logic. Some code lived in Java, some in Python in a separate project, and scheduling was held together with cron jobs.
"We had ingestion code in Java, scheduling scripts in Python, cron jobs holding it all together," says Jyotish Bora, VP of Engineering at DataGOL. "There was no single orchestration layer. Every new client meant more code to maintain, more things that could break."
The system worked, but it demanded constant attention, made debugging difficult, and slowed down client onboarding as the company scaled.
Open source flexibility tips the scales toward Airbyte
When DataGOL began evaluating replacements, they needed broad connector coverage, REST API support, and the flexibility to build and extend custom connectors.
"Open source was the biggest factor," says Jyotish Bora. "We could see every connector that was available and update them to fit our needs."
The team evaluated Fivetran alongside Airbyte but quickly moved on. "Fivetran didn't have an open source version, and we couldn't even get a single data source connected during evaluation," Aasim recalls. "With Airbyte, we spun up a Docker container and were ingesting data within minutes."
DataGOL chose to self-host Airbyte Core on Kubernetes, giving them:
- Access to 600+ pre-built connectors for popular sources like HubSpot, Salesforce, and NetSuite
- The Connector Builder for YAML-based, no-code connector creation
- The Airbyte SDK for fully custom Python connectors
- Built-in scheduling, retry logic, and pipeline orchestration
- A centralized dashboard for monitoring all pipeline activity
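To make the "built-in scheduling, retry logic" point concrete, here is a minimal sketch of the kind of retry wrapper a team typically hand-writes around each ingestion job when no orchestration layer exists. It is not Airbyte's implementation and not DataGOL's original code; `run_with_retries`, `flaky_sync`, and the backoff values are hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retries(
    job: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `job`, retrying with exponential backoff on any exception."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical job that fails twice before succeeding.
attempts = {"count": 0}
delays: List[float] = []


def flaky_sync() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("source temporarily unavailable")
    return "synced"


# Record the delays instead of actually sleeping, so the demo runs instantly.
result = run_with_retries(flaky_sync, sleep=delays.append)
print(result, delays)
```

Multiply this by every source, add scheduling and failure tracking, and the maintenance cost described earlier becomes clear. A platform that handles these concerns centrally removes that code from each integration.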
New connectors in hours, not days
One of Airbyte's biggest benefits for DataGOL has been the ability to rapidly build connectors for clients with niche or proprietary data sources. For standard REST APIs, the team uses Airbyte's YAML-based Connector Builder. For more complex integrations, they write Python using the SDK.
"Clients often have one-off APIs or hand us data in S3," Jyotish explains. "One client used Fellow for meeting transcripts, which didn't have a connector. We built a custom one with the Airbyte SDK and had it running quickly."
"The patterns are consistent and the abstraction layer is well-defined, so standing up a new connector takes hours, not days," says Jyotish.
"Our main motivation was the time we were spending building and testing custom connectors," adds Aasim. "With Airbyte, REST-based ingestion is fully covered. If something is missing, we build it with the SDK and move on."
Rather than spending days writing and testing bespoke data scripts for each new client, the team can stand up connectors rapidly and focus engineering effort on the AI features that differentiate the platform.
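The "consistent pattern" the team describes for REST sources usually comes down to the same loop: request a page, emit its records, follow the pagination token until none remains. The sketch below shows that loop in plain Python. It deliberately does not use the Airbyte SDK or the Connector Builder's YAML, so every name in it (`read_records`, `fetch_page`, the `records` and `next_page` response fields, `FAKE_API`) is an assumption made for illustration rather than a real API.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def read_records(
    fetch_page: Callable[[Optional[str]], Page],
) -> Iterator[Dict[str, Any]]:
    """Yield every record from a token-paginated API.

    `fetch_page(token)` is a hypothetical callable that returns a dict with
    a "records" list and a "next_page" token (None on the last page).
    """
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page["records"]
        token = page.get("next_page")
        if token is None:
            break


# In-memory stand-in for a client's API, keyed by pagination token.
FAKE_API: Dict[Optional[str], Page] = {
    None: {"records": [{"id": 1}, {"id": 2}], "next_page": "p2"},
    "p2": {"records": [{"id": 3}], "next_page": None},
}

records = list(read_records(lambda token: FAKE_API[token]))
print(records)
```

Because the loop is the same for most REST sources, only the request details and the pagination fields change per client, which is what makes a declarative or SDK-based connector quick to stand up.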
One dashboard, full visibility, zero cron jobs
Moving to Airbyte significantly simplified ingestion from REST API sources. DataGOL relies on Airbyte's scalable ingestion and its support for multiple destination file formats. Airbyte's sync modes, including full refresh and incremental append, let DataGOL customers use time travel and data snapshots in DataGOL's platform. Airbyte also took over the operational layer for REST API ingestion that DataGOL had been maintaining manually: scheduling ingestion pipelines, retries, failure tracking, and so on.
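The difference between full refresh and incremental append, and why append-style syncs lend themselves to snapshots, can be illustrated with a small in-memory sketch. This is a simplified model under stated assumptions, not Airbyte's or DataGOL's implementation: `Destination`, `full_refresh_overwrite`, `incremental_append`, and the `updated_at` cursor field are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


@dataclass
class Destination:
    """Hypothetical in-memory stand-in for a destination table."""
    rows: List[Record] = field(default_factory=list)
    cursor: Optional[str] = None  # highest cursor value synced so far


def full_refresh_overwrite(dest: Destination, source_rows: List[Record]) -> None:
    """Replace the destination contents with the current source contents."""
    dest.rows = list(source_rows)


def incremental_append(
    dest: Destination, source_rows: List[Record], cursor_field: str
) -> int:
    """Append only rows newer than the saved cursor; return how many were added."""
    new_rows = [
        row for row in source_rows
        if dest.cursor is None or row[cursor_field] > dest.cursor
    ]
    dest.rows.extend(new_rows)
    if new_rows:
        dest.cursor = max(row[cursor_field] for row in new_rows)
    return len(new_rows)


source = [
    {"id": 1, "status": "open", "updated_at": "2024-01-01"},
    {"id": 2, "status": "open", "updated_at": "2024-01-02"},
]
dest = Destination()
first = incremental_append(dest, source, "updated_at")

# Record 1 changes in the source; only that change is picked up next time.
source[0] = {"id": 1, "status": "closed", "updated_at": "2024-01-03"}
second = incremental_append(dest, source, "updated_at")

# Earlier versions stay in the destination, so history can be reconstructed.
history_for_id_1 = [row["status"] for row in dest.rows if row["id"] == 1]
print(first, second, history_for_id_1)
```

In this model the appended rows preserve each earlier version of a record, which is the property that snapshot and time-travel features build on. A full refresh, by contrast, leaves only the latest state.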
"We have visibility into every pipeline run now: what succeeded, what failed, and why," says Abhinandan. "We can track everything from one dashboard and fix issues proactively instead of chasing problems across different codebases."
"Everything ships as a single package in our Kubernetes cluster. We deploy Airbyte once and scheduling, retries, monitoring are all managed. Before, we were writing all of that logic ourselves," says Jyotish.
The standardization extends to how the team builds and maintains integrations. Every connector follows the same well-defined pattern, which means any engineer on the team can pick up, debug, or extend another's work without ramping up on a one-off script.
"Everything follows the same pattern now, so it's far easier to maintain and debug," says Abhinandan. "New engineers can pick up any connector and immediately know how it works."
The data backbone for an AI-first platform
For DataGOL, Airbyte is not just a data integration tool. It is the first step in every customer engagement.
"Whenever we onboard a new business, the first thing we notice is that their data is everywhere," Jyotish reflects. "Airbyte is step one. We ingest from all their different sources, and that becomes the foundation for everything we build on top."
"Airbyte is a mature platform with a wide connector ecosystem," says Abhinandan. "Once you learn the patterns, pulling data from any new source becomes straightforward."
With Airbyte as their ingestion backbone, DataGOL continues to expand across industries, confident that no matter what combination of data sources a new client brings, they have a reliable and extensible way to get that data into the hands of AI.


