A comparative explanation of recursion in PostgreSQL
One of PostgreSQL features that I had the most trouble getting
my head around was the use of WITH RECURSION
, particularly due
to the fact that this type of recursion doesn’t resemble any of
the examples of this concept that I had previously encountered.
Even Postgres' documentation aknowledges this:
Strictly speaking, this process is iteration not recursion, but RECURSIVE is the terminology chosen by the SQL standards committee.
Let’s look at an example to see what we are talking about.
The following code takes the table people
and lists all the
descendants of ‘Frank Sinatra’ using the father
field.
It starts by selecting all of Sinatra’s children and then all
of his children’s children and so on until it reaches the point
where there are no more results.
WITH RECURSIVE descendants_of(name, father) AS (
SELECT name, father
FROM people
WHERE father = 'Frank Sinatra'
UNION ALL
SELECT list.name, list.father
FROM descendants_of AS working_table,
people AS list
WHERE list.father = working_table.name
)
SELECT name, father FROM descendants_of;
PostgreSQL does this by creating a temporary working table (descendants_of
), populating it with the results of the first select and then using it to filter the results against the people
table.
This might not seem like the traditional recursive function that we all know, but in a closer look, we can find some parallels with a recursive tail call. Let’s transform this code to Javascript to examine some similarities:
// SELECT name, father FROM people WHERE father = 'Frank Sinatra'
with(
people,
people.filter( row => row.father === 'Frank Sinatra'),
people.filter( row => row.father === 'Frank Sinatra')
)
function with(list, accumulator, workingTable) {
//WHERE list.father = accumulator.name
let children = list.filter(row => row.father === workingTable.name);
// this check is implicit in WITH RECURSIVE
if (children.length === 0) return accumulator;
// recursive call to traverse the list
return with(list, accumulator.concat(children), children);
}
Here, Javascript’s filter()
is akin to SQL’s WHERE
while concat()
acts as a replacement of UNION ALL
.
The above function performs a recursive call and passes
itself the complete ‘table’ of family relationships as
first argument and, as second argument, the rows of the
table (or the objects in the array) which meet the
condition row.father == workingTable.name
.
Both pieces of code make use of the same conceptual understanding of recursion, comprised of (i) a data structure akin to a linked list and (ii) an accumulator, which stores the results in succesive iterations.
If we look at the manual, we read that this expression uses two tables:
For UNION (but not UNION ALL), discard duplicate rows and rows that duplicate any previous result row. Include all remaining rows in the result of the recursive query, and also place them in a temporary intermediate table.
In the above code, the children
variable acts as an intermediate
table in which the result of the query over the working table
is stored. In the next iteration, the intermediate table, as its
names indicates, becomes the working table.