Exploring PostgreSQL Data Types: A Comprehensive Guide

February 23, 2024
20 min read

Postgres is one of the most trusted open-source relational database management systems. One of the primary reasons behind its adoption by several organizations is its rich set of native data types. The built-in Postgres library offers several general-purpose data types that help you determine the location and preferred format for storing your data in tables. From a multitude of Postgres data types, take a look at a few commonly used ones.

Five Major Postgres Data Types

Understanding Postgres data types helps you organize and manage your data better. It is especially useful when you have massive amounts of data and need to find coherent insights from them quickly. Here are a few important data types offered by Postgres.

Numeric Data Types

Numeric data types under Postgres encompass a range of options for storing whole numbers, arbitrary precision numbers, floating-point for decimals, and serial types.

For whole numbers, Postgres provides smallint, integer, and bigint types, each with different ranges. The integer type is the most commonly used data type due to its balanced attributes of range, storage size, and performance. The smallint is typically reserved for scenarios where disk space conservation is crucial, while the bigint is employed when the integer’s type range is insufficient.

Arbitrary precision numbers under the numeric type are ideal for storing precise values, such as monetary amounts. Numeric calculations yield exact results wherever possible, though they tend to be slower when compared to integer types. Precision refers to the total significant digits in a number, including both sides of the decimal point. The scale indicates the count of decimal digits to the right of the decimal point. For example, the number 46.7132 has a precision of 6 and a scale of 4. 

In addition to the regular numeric values, PostgreSQL incorporates special values, Infinity, -Infinity, and NaN, from the IEEE 754 standard. These represent positive infinity, negative infinity, and “not-a-number,” respectively. You must remember to specify these values in quotes while using them in SQL commands. The infinity values can be denoted as inf and -inf, while the NaN denotes undefined or nonsensical results in calculations.

The floating-point data type represents the shortest precise decimal form when the output is in text form. The decimal value produced is closer to the true stored binary value than any other representable value in the same binary precision. There are two data types under the floating-point: real and double precision. For float8 values, which are double precision, the output will use 17 significant decimal points at most. On the other hand, float4 values (real) will use a maximum of nine decimal digits. 

Additionally, there are serial types under numeric data types. However, they are not true data types but serve as a shorthand for creating unique identifier columns.

Character Data Types

There are two primary character data types in Postgres: character varying(n) and character(n), where n is a positive integer. All these types can store strings up to n characters (not bytes) in length. If you attempt to store a longer string in a column of these types, it will result in an error. However, the string will be truncated to the maximum length if you have extra spaces as characters.

Additionally, PostgreSQL introduces a text type that can store strings of any length. Although it is not a part of the SQL Standard, this data type is supported by several other SQL database management systems. The text data type is native to Postgres and is used for most built-in functions operating on strings.

The varchar type is an alias for character varying, while the bpchar (with a length specifier) and char represent character type. When you are specifying the length for varchar or char, the value of n must be greater than zero and not exceed 10,485,760. If you use bpchar or varchar without a length specifier, it will accept strings of any length. Similarly, if you input a char without a length specifier, it will be equivalent to character(1).

In PostgreSQL, there are two additional fixed-length character types. They are primarily intended for internal system catalogs rather than general-purpose usage. The name type is designed to store identifiers. The “char” type (used in quotes) can store a single ASCII character as it utilizes only one byte of storage.

Binary Data Types

In PostgreSQL, binary strings are sequences of octets (or bytes). They can be distinguished from character strings in several ways. 

Binary strings allow you to store octets of value zero and other non-printable octets, typically those outside the decimal range 32 to 126. This contrasts with character strings, which do not allow zero octets or any other octet values or sequences considered invalid by your database’s selected character set encoding. Binary strings are suitable for storing data and are considered as “raw bytes,” while character strings are appropriate for storing text.

The binary data type of Postgres is represented by bytea. It supports two formats for input and output. The hex format encodes your binary data as two hexadecimal digits per byte. The entire string is preceded by the sequence “\x.” The escape format is the traditional Postgres bytea type format. It represents a binary string as a sequence of ASCII characters. Since hex is compatible with a wide range of external applications and protocols and is faster to convert than the escape format, it is widely used.

Date/Time Data Types

PostgreSQL provides comprehensive support for SQL date and time data types. Dates are calculated according to the Gregorian calendar, even for years predating its introduction. Date and time inputs are accepted in various formats, including ISO 8601, SQL-compatible, traditional Postgres, and others. It is important to note that the day appears before the month if the DMY field order has been specified in SQL and Postgres. Otherwise, the month appears before the date by default. You must also enclose any data or time literal input in single quotes, similar to text strings.

The data types time, timestamp, and interval accept an optional precision value, “p,” specifying the number of fractional digits retained in the seconds field. The default precision has no explicit bound but ranges from 0 to 6. For the interval type, there is an additional option to restrict the set of stored fields by specifying one of the following phrases:

Compared to other SQL standard-compliant databases, you get greater flexibility in handling date/time input in Postgres. The SQL standard dictates that if you are using the query timestamp, it is equivalent to using a timestamp without time zone. This behavior is also adopted by Postgres. Additionally, timestampz is an accepted abbreviation for timestamp with timezone, which is also a PostgreSQL extension.

For the data type timestamp with time zone, the internally stored value is always in Universal Coordinated Time (UTC). It is also known as Greenwich Mean Time. If you input a value explicitly mentioning the specific time zone, it gets converted to UTC through the appropriate offset for that time zone. If you do not state a time zone in the input string, Postgres assumes the time zone through your system’s TimeZone parameter. Subsequently, it converts the time zone to UTC using the corresponding offset.

Boolean Data Type

PostgreSQL adheres to the standard SQL type boolean, which represents true or false values. Boolean constants in SQL queries can be represented by keywords TRUE, FALSE, and NULL. The input function for the boolean data type in Postgres accepts the following string representation for the “true” state:

  • true
  • yes
  • on
  • 1

The string representations for the “false” state are:

  • false
  • no
  • off
  • 0

It is important to note that while Postgres accepts these representations, the keywords TRUE and FALSE are the preferred, SQL-compliant method for writing boolean constants.

How to Make Better Use of Postgres Data Types?

Once you have understood the different Postgres data types, it is time to apply them to your datasets. Inserting the available data under the right column and type is crucial, as it is much easier and faster to perform in-depth analysis on an organized dataset. First, you must consolidate your data from different sources before loading it into PostgreSQL. To make this process simple and quick, you can establish a data pipeline with Airbyte.

Airbyte is one of the most well-known data integration and replication platforms today. In its 350 pre-built connectors library, you can easily find a dedicated connector for Postgres. Airbyte allows you to extract data from multiple sources and bring it all together under a single data pipeline. Thus, you can configure the source and destination in just two simple steps and directly load your dataset in PostgreSQL.

When you are establishing the data pipeline, you do not have to write a single line of code. You can be assured of the security of your data, as Airbyte cannot view or store any data as it moves through the pipeline. The platform also helps you update your Postgres dataset through its Change Data Capture (CDC) abilities. You have to establish an incremental sync frequency, and only the changes that occur in your dataset at the source will be moved to Postgres.

Once your large datasets have moved into PostgreSQL, examine data types, you can run queries and fix errors, if any. You can further organize your data types, filter them, and perform better analysis to get accurate results.

Wrapping Up

If you are looking to move your data into PostgreSQL, it is important to be acquainted with some of the platform’s widely used and essential data types. Apart from the Numeric, Character, Date/Time, Boolean, and Binary Data Types, there are many diverse data types that can come in handy. Postgres offers Monetary, Geometry, Full Text Search, Composite, JSON Types, and so much more. You can explore the different Postgres data types here.

Limitless data movement with free Alpha and Beta connectors
Introducing: our Free Connector Program
The data movement infrastructure for the modern data teams.
Try a 14-day free trial