Cross-shard SELECT

A cross-shard SELECT query doesn't have one or has several sharding keys, which requires it to be executed by all shards. PgDog can perform it in parallel, assembling the results from each shard automatically. This makes it a powerful scatter/gather engine, with data nodes powered by PostgreSQL.

How it works

When PgDog receives a SELECT query with no (or multiple) sharding keys, it connects to all databases and sends the query to all of them in parallel.

If the result needs post-processing, e.g., to support sorting or aggregation, it will buffer the rows in memory and perform the necessary operations. Otherwise, PgDog will stream the rows directly to the client.

Predicate push-down

PgDog pushes all filtering, sorting and aggregation statements to the database. If the query is correctly constructed, the shards will return very few rows, allowing searches of vast quantities of data without causing out-of-memory errors or latency issues in the proxy.

Supported features

The SQL language allows for powerful data filtration and manipulation. While we aim to support most operations, currently, support for some cross-shard operations is limited as documented below:

Operation	Supported	Limitations
`SELECT` columns		None.
`ORDER BY`		Columns must be part of the returned tuples. See sorting.
`DISTINCT` / `DISTINCT BY`		Columns must be part of the returned tuples.
`GROUP BY`		Columns must be part of the returned tuples. See aggregates.
CTEs		CTE must refer to data located on the same shard.
Window functions		Not currently supported.
Subqueries		Subqueries must refer to data located on the same shard.

Sorting

If the query contains an ORDER BY clause, PgDog can sort the rows returned from all shards automatically. This works by buffering data returned from all servers and sorting it in memory.

Currently, two forms of the ORDER BY SQL syntax are supported:

Syntax	Example	Notes
Order by column name	`ORDER BY id, email`	The column must be present in the returned tuples.
Order by column position	`ORDER BY 1, 2`	The column is referenced by its position in the returned tuples.

Sorting by multiple columns is supported, including opposing sorting directions, for example:

SELECT id, email, created_at FROM users
ORDER BY
    1 ASC,
    created_at DESC

Note that columns in the ORDER BY clause are retrieved from the table. PgDog cannot sort by columns it doesn't receive from the databases.

Functions

PgDog currently doesn't support sorting results using a function, for example:

SELECT email FROM users ORDER BY coalesce(email, '')

To make this work, we need to implement all SQL functions inside PgDog. This is on the roadmap, but not currently a priority since the query can be easily rewritten to execute the function inside the database:

SELECT
    coalesce(email, '') -- executed by Postgres
FROM users
ORDER BY 1 -- sorted by both Postgres and PgDog

As a general rule, it's better to perform all data manipulations inside the target clause (SELECT [...]) and use the numbering notation in the ORDER BY clause to refer to the desired sorting order, e.g.:

SELECT
    id,
    email,
    first_name || last_name,
    substring(first_name FROM 1 FOR 5) AS short
FROM users ORDER BY
    3 ASC, -- first_name || last_name
    1 DESC -- id

ORMs

ORMs like SQLAlchemy or ActiveRecord, more often than not, will write queries that work with PgDog out of the box. This is because they tend to fetch entire rows and use fully-qualified names in all parts of the statement, including the ORDER BY clause.

For example, this is what a first Rails/ActiveRecord query looks like:

SELECT * FROM users ORDER BY users.id LIMIT 1

The users.id column is present in the returned row, so PgDog can read it and sort the rows in the desired order.

Aggregates

Aggregates are transformative functions: instead of returning rows as-is, they return calculated summaries, like a sum or a count. Many aggregate functions are cumulative: the final value can be calculated from partial results returned by each shard.

If an aggregate function is supported (see list of supported functions below), this is handled by PgDog automatically:

Aggregate functions	Supported	Notes
`count`, `count(*)`		Works for most data types.
`max`, `min`, `avg`, `sum`		Works for most data types.
`stddev`, `variance`		Works for most data types.
`percentile_disc`, `percentile_cont`		Not currently supported and very expensive to calculate on large datasets.
`*_agg`		Not currently supported.
`json_*`		Not currently supported.

Aggregate functions have to appear in the target clause of the statement (SELECT [...]), and can also be combined with sorting, for example:

SELECT COUNT(*), is_admin
FROM users
GROUP BY 2
ORDER BY 1 DESC

`HAVING` clause

The HAVING clause requires additional filtering of the results and is not currently supported. See #695 for more details.

Rewriting queries

For some aggregate functions to work as expected, each shard may need to return columns and intermediate calculations not present in the original query.

For example, to get an average of a column, we need to fetch the row count, multiply it by the avg of the column on each shard, and divide it by the total count of rows on all shards.

If the count function isn't present in the query, PgDog will automatically rewrite the query to add it. This allows queries, like the following example, to just work without modifications:

OriginalRewritten

SELECT avg(price) FROM orders;

SELECT avg(price), count(price) FROM orders;

PgDog automatically removes the count column from the returned rows, so applications don't have to handle this complexity.

The following aggregate functions take advantage of this feature:

avg
stddev
variance

Configuration

This feature is enabled by default and requires no additional configuration.

Supported data types

The following table lists the data types supported by PgDog for ordering and aggregation. Since Postgres clients can request results in either text or binary format, each one must be handled separately:

Data type	Sorting	Binary format
`BIGINT`, `INTEGER`, `SMALLINT`
`VARCHAR`, `TEXT`
`NUMERIC`
`REAL`, `DOUBLE PRECISION`
`INTERVAL`		No
`TIMESTAMP`, `TIMESTAMPTZ`		No
`UUID`
`VECTOR`	Only by L2

pgvector data types

VECTOR type doesn't have a fixed OID in Postgres because it comes from an extension (pgvector). We infer it from the <-> operator used in the ORDER BY clause.