Data Glossary 🧠

What is Apache Parquet?

Last updated Sep 7, 2022

Apache Parquet is a free and open-source, column-oriented Data Lake File Format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most data processing frameworks in the Hadoop ecosystem.
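To make "column-oriented" concrete, here is a minimal sketch in plain Python. It is only a toy illustration of the layout idea, not Parquet's actual on-disk format or API; the sample records and the `to_columns` helper are hypothetical names introduced for this example.

```python
# Toy illustration of row-oriented vs. column-oriented layout.
# This is NOT how Parquet encodes data on disk; the records and
# the helper below are hypothetical and exist only for illustration.

rows = [
    {"id": 1, "city": "Bern", "temp_c": 12.5},
    {"id": 2, "city": "Oslo", "temp_c": 7.0},
    {"id": 3, "city": "Lima", "temp_c": 19.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one field reads only that field's values,
# instead of walking through every complete record.
avg_temp = sum(columns["temp_c"]) / len(columns["temp_c"])
print(avg_temp)
```

Keeping all values of one field together is what lets a columnar format skip the fields a query does not need, and values of the same type stored side by side generally compress well.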

Read more about how to build a Data Lake on top of it in our Data Lake and Lakehouse Guide.