Data glossary

What is a Data Catalog?

A Data Catalog is a centralized store where all your metadata (data about your data) is made searchable.

Think of it as a Google Search for your internal metadata. This is vital because, with Data Lakes and other data stores, you want the ability to search for your data. Data is growing exponentially, with 90% of the world’s data reportedly generated in the last two years alone, and it is hard to keep track of this volume over time. A data catalog addresses the problem of handling fast-growing data internally.
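
To make the "search for your metadata" idea concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how any particular catalog product works. The names DatasetEntry and DataCatalog, the metadata fields, and the example locations are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata about one dataset (hypothetical fields, for illustration only)."""
    name: str
    location: str
    owner: str
    description: str = ""
    tags: list[str] = field(default_factory=list)


class DataCatalog:
    """Toy in-memory catalog: it stores metadata only, never the data itself."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        # Later registrations with the same name overwrite earlier ones.
        self._entries[entry.name] = entry

    def search(self, query: str) -> list[DatasetEntry]:
        """Return entries whose name, description, owner, or tags contain the query."""
        needle = query.lower()
        results = []
        for entry in self._entries.values():
            haystack = " ".join(
                [entry.name, entry.description, entry.owner, *entry.tags]
            ).lower()
            if needle in haystack:
                results.append(entry)
        return results


# Hypothetical datasets living in different stores.
catalog = DataCatalog()
catalog.register(DatasetEntry(
    "orders", "s3://lake/sales/orders/", "sales-team",
    "Raw order events", ["sales", "raw"],
))
catalog.register(DatasetEntry(
    "customers", "postgres://crm/public.customers", "crm-team",
    "Customer master data", ["crm", "pii"],
))

matches = [entry.name for entry in catalog.search("sales")]
```

The point of the sketch is that a search runs against descriptions, owners, and tags rather than against the data, so one query can find datasets regardless of which store holds them. A real catalog would add persistence, automated metadata ingestion, and a proper search index.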

The origins of the Data Catalog are described in a paper published in 2017 about a Data Context Service, which makes for an interesting read. See also the Awesome Data Discovery and Observability list on GitHub for an extensive list of existing tools.
