descendants snippets

Recursive Postgres Query That Finds Descendants of a Node in a Tree

Tagged recursive, tree, parent, descendants, child, postgres, cte  Languages sql

The recursive query:

WITH RECURSIVE tree AS (
  SELECT id, name, parent_id FROM nodes WHERE id = 25
  UNION ALL
  SELECT nodes.id, nodes.name, nodes.parent_id FROM nodes, tree WHERE nodes.parent_id = tree.id)

This is where we define a virtual table called tree:

WITH RECURSIVE tree

This SQL finds the root node and initializes the recursive function:

SELECT id, name, parent_id FROM nodes WHERE id = 25

The next part recursively fetches the descendants of the specified node:

SELECT nodes.id, nodes.name, nodes.parent_id FROM nodes, tree WHERE nodes.parent_id = tree.id)

This SQL query shows the descendants, and the parent which has an ID of 25:

SELECT * FROM tree;

In Ruby you could use this code to generate your SQL:

def tree_sql(opts={})
    table = opts.fetch(:table) # e.g. 'categories'
    cols = opts.fetch(:cols) # e.g. %w(id name)
    <<-SQL
      WITH RECURSIVE #{table}_tree AS (
        -- initialize
        SELECT
          #{cols.join(', ')}, 0 as n_depth
        FROM
          #{table}
        WHERE
          -- Use bind variables and ? here if you want
          id = #{parent_id}
        UNION ALL
          -- iterate
          SELECT
            #{cols.map { |col| "#{table}.#{col}" }.join(', ')}, n_depth +1
          FROM
            #{table}, #{table}_tree
          WHERE
            #{table}.parent_id = #{table}_tree.id)
    SQL
  end

In Rails you probably want the method to return an ActiveRecord::Relation object so you can chain e.g. pagination, order, and limit method calls. One way of doing that is shown here:

class User
  def descendants
      ids_query = User.select('users.id') # return only ids
         .joins(:memberships)
        .group('users.id') # avoid duplicates
        .where("memberships.organization_id in (#{tree_sql})", organization).to_sql
      User.where("id in (#{ids_query})")
  end
end

Details: * Use UNION instead of UNION ALL if you have duplicates, or if you're not worried about performance when Postgres has to scan for duplicates. * WITH Queries in Postgres * Recursive queries can loop indefinitely if e.g. self.parent = self. If this happens, use the ANY() function to check if path has been visited once already. Alternatively, use validations to make sure circular references don't happen, and hope for the best. * Use Arel's recursive feature to generate recursive CTE queries with Arel.