Postgres Array data type
Manage collections of elements using arrays
In Postgres, the ARRAY
data type is used to store and manipulate collections of elements in a single column. An array can have variable length and one or more dimensions, but must hold elements of the same data type. Postgres provides a variety of functions and operators for working with arrays.
Arrays are particularly useful when dealing with multiple values that are logically related. For instance, they can store a list of phone numbers for a contact, product categories for an e-commerce item, or even multi-dimensional data for scientific or analytical computations.
Storage and syntax
Arrays in Postgres are declared by specifying the element type followed by square brackets. For example,
INTEGER[]
defines an array of integers.TEXT[][]
defines a two-dimensional array of text values.NUMERIC[3]
defines an array of three numeric values. However, note that Postgres doesn't enforce the specified size of an array.
Array literals in Postgres are written within curly braces {}
and separated by commas. For instance,
- An array of integers might look like
{1, 2, 3}
. - Multidimensional arrays use nested curly braces, like
{{1, 2, 3}, {4, 5, 6}}
.
The ARRAY
constructor syntax can also be used to create arrays. For example,
ARRAY[1, 2, 3]
creates an array of integers.ARRAY[[1, 2, 3], [4, 5, 6]]
creates a two-dimensional array.
Example usage
Consider the case of maintaining a product catalog for an online store. The same product may belong to multiple categories. For example, an iPad could be tagged as 'Electronics', 'Computer', or 'Mobile'. In this case, we can use an array to store the categories for each product.
First, let's create a products
table with some sample data:
The units_sold
column is a two-dimensional array that stores the number of units sold for each product. The first dimension represents the year, and the second dimension represents the quarter.
Now, we can access the values in the array column categories
, and use it in our queries. For example, the query below finds products belonging to the Electronics
category.
Note that the ANY
operator checks if the value specified exists in the array.
This query returns the following result:
Other examples
Indexing arrays
Elements in an array can be accessed by their index. Postgres arrays are 1-based, meaning indexing starts at 1.
For example, to get the first category of each product:
This query returns the following result:
Multiple elements can be accessed using the SLICE
operator. For example, to get the first three categories of each product:
This query returns the following result:
Multidimensional arrays can be accessed using multiple indices. For example, to get the number of units sold in the last quarter of the first year for each product, we can use the query:
This query returns the following:
Modifying arrays
Array values can be modified using functions or by directly indexing into the array. You can change specific elements of an array, add or remove elements, or even replace the entire array.
For example, the query below replaces the Audio
category across all products with Sound
.
This query returns the following result:
Array functions and operators
Postgres provides a variety of functions and operators for working with arrays. You can find the full list of functions and operators in the Postgres documentation.
We'll look at some commonly used functions below.
Length of an array
We can query the number of categories each product has been tagged with:
This query returns the following result:
The array_length
function returns the length of the array in the specified dimension. In this case, we specified the first dimension, which is the number of categories for each product.
Expanding an array into rows
We can use the unnest
function to expand an array into rows. For example, to get the number of laptops sold in each quarter, we can use the query:
This query returns the following result:
We could use the output of unnest
to calculate the total number of units sold for each product; for example:
This query returns the following result:
Concatenating arrays
We can concatenate two arrays using the ||
operator. For example, the query below produces a list of all categories across all products.
This query returns the following result:
Aggregating values into an array
We can use the array_agg
function to produce an array from a set of rows. For example, to get a list of all products that are in the Electronics
category, we can use the query:
This query returns the following result:
Additional considerations
-
Performance and UX: While arrays provide flexibility, they can be less performant than normalized data structures for large datasets. Compared to a set of rows, arrays can also be more tedious to work with for complex queries.
-
Indexing: Postgres lets you create indexes on array elements for faster searches. Specifically, an inverted index like
GIN
creates an entry for each element in the array. This allows for fast lookups but can be expensive to maintain for large arrays. -
No type enforcement: Postgres supports defining the size of an array or the number of dimensions in the schema. However, Postgres does not enforce these definitions. For example, the query below works successfully:
It is therefore up to the application to ensure data integrity.
Resources
Need help?
Join our Discord Server to ask questions or see what others are doing with Neon. Users on paid plans can open a support ticket from the console. For more detail, see Getting Support.