Recursive Postgres Query That Finds Descendants of a Node in a Tree
Voilà, the recursive query we are going disect in this snippet:
WITH RECURSIVE tree AS (
SELECT id, name, parent_id FROM nodes WHERE id = 25
UNION ALL
SELECT nodes.id, nodes.name, nodes.parent_id FROM nodes, tree WHERE nodes.parent_id = tree.id)
Here we define a virtual table called tree:
WITH RECURSIVE tree
This SQL finds the root node and initializes the recursive function:
SELECT id, name, parent_id FROM nodes WHERE id = 25
The next part recursively fetches the descendants of the root node:
SELECT nodes.id, nodes.name, nodes.parent_id FROM nodes, tree WHERE nodes.parent_id = tree.id)
This SQL query fetches the root node and all its descendants from the tree:
SELECT * FROM tree;
ActiveRecord / Rails
This example shows how to use CTEs in ActiveRecord to find the parent and suborganizations in a conglomerate:
class Organization
belongs_to :parent, class_name: 'Organization', required: false
has_many :children, class_name: 'Organization', foreign_key: :parent_id
scope :roots, -> { where(parent_id: nil) }
def ancestors(id, columns=Organization.column_names)
cols = columns.join(', ')
child_id = ActiveRecord::Base.connection.quote(id)
Organization.from <<~SQL
(WITH RECURSIVE org_tree(#{cols}, path) AS (
SELECT
#{cols}, ARRAY[id]
FROM
organizations
WHERE
id = #{child_id}
UNION ALL
SELECT
#{columns.map { |col| "organizations.#{col}" }.join(', ') }, path || organizations.id
FROM
org_tree
JOIN
organizations ON organizations.id = org_tree.parent_id
WHERE NOT
organizations.id = ANY(path) -- avoid infinite loops where e.g. parent = self
) SELECT #{cols} FROM org_tree WHERE id != #{child_id}) as organizations
SQL
end
def descendants(id, columns=Organization.column_names)
cols = columns.join(', ')
parent_id = ActiveRecord::Base.connection.quote(id)
Organization.from <<~SQL
(WITH RECURSIVE org_tree(#{cols}, path) AS (
SELECT
#{cols}, ARRAY[id]
FROM
organizations
WHERE
id = #{parent_id}
UNION ALL
SELECT
#{columns.map { |col| "organizations.#{col}" }.join(', ') }, path || organizations.id
FROM
org_tree
JOIN
organizations ON organizations.parent_id = org_tree.id
WHERE NOT
organizations.id = ANY(path) -- avoid infinite loops where e.g. parent = self
) SELECT #{cols} FROM org_tree WHERE id != #{parent_id}) as organizations
SQL
end
end
# Find parent companies
Organization.find(1).ancestors
# Find suborganizations
Organization.find(1).descendants
Details
- Use UNION instead of UNION ALL if you have duplicates, or if you’re not worried about performance when Postgres has to scan for duplicates.
- WITH Queries in Postgres
- Recursive queries can loop indefinitely if e.g. self.parent = self. If this happens, use the ANY() function to check if path has been visited once already. Alternatively, use validations to make sure circular references don’t happen, and hope for the best.
- You can use Arel’s recursive feature to generate recursive CTE queries with Arel.
- Rails CTE gem: https://github.com/christianhellsten/cte-rails