Working with hierarchical data structures in Rails

enum.

yield parent_product parent_product = parent_product.

parent end end.

to_a.

reverse + [self] endendOkay, maybe a little bit more fancy than necessary there.

Still.

This is all well and good, and for a single Product this is absolutely fine.

The downside here is that this approach runs a single query for every Product instance in the chain.

If your hierarchy is five levels deep, that’s five queries.

Again — for one product, that’s okay, but if you start wanting to do this on, say, a product list… you run into trouble real quick — you’re suddenly doing hundreds or thousands of queries.

Did I say this was the easy one?This is where I got stuck.

Reluctantly, I took the problem to Stack Overflow.

The answer I got there put me almost all the way to solving it — in essence, you can tell Rails where to preload the data from, like an includes, and recursively run that at each level of the hierarchy.

Now, instead of running 500 queries for your product list, you’re only running one per layer of hierarchy.

The code I came out with in the end is a monkey-patch for ActiveRecord::Relation that adds in four hierarchical preload methods.

You can find it in this gist, which for the sake of brevity I will not include here.

This approach cleans up the resulting code a bunch as well: instead of chained calls to includes, preloading a whole chain can look like this instead:Product.

all.

deep_preload_chain(:parent)This also works for referenced tables: perhaps you have an Order model, which has_many :products.

You can create a list of orders, with product parent chains preloaded, with this:Order.

all.

deep_preload_reference_chain(products: :parent)That’s got upwards traversal covered.

What about the other way?Traversing DownThere was no pretty way to do this.

There really wasn’t.

Moving down the tree (i.

e.

to find a list of children of a certain node) is a more difficult proposition than moving up: when you’re moving up, there’s only ever one parent; moving down, there can be multiple children on each parent node.

ActiveRecord’s preloader didn’t like that, despite my efforts to coax it to my way of thinking.

So, I looked elsewhere.

MySQL 8.

0+ has a wonderful new feature (that other databases have had rather longer, I should add) of recursive CTE’s.

These let you specify an ‘anchor point’ — in our case, that’d be the node whose children we want to find — and then recurse down the tree, finding all children of the previous node and putting everything together with a UNION.

This is where I went with this side of the problem.

Putting together a recursive CTE for the product model we had earlier looks something like this:WITH RECURSIVE CTE (id, parent_id) AS ( SELECT id, parent_id FROM products WHERE parent_id = PARENTID UNION ALL SELECT p.

id, p.

parent_id FROM products p INNER JOIN CTE ON p.

parent_id = CTE.

id)SELECT * FROM CTE;That query sends you back a list of record IDs; all of those IDs are records that — somewhere along the line — have record PARENTID in their parent chain.

This list you can simply pass back into Product.

where to get an ActiveRecord::Relation out of it.

I put the raw SQL for this query in lib/queries, so my code looks like this:def children query = File.

read(Rails.

root.

join('lib/queries/cte.

sql')) query = query.

gsub('PARENTID', id.

to_s) i = ActiveRecord::Base.

connection.

execute(query).

to_a.

map(&:first) Product.

where(id: i)endAnd that’s where I stopped.

With those two fundamental operations, I’ve been able to do everything I’ve needed to do for my app — there may be more operations necessary for a fully functional data structure, but given the effort these two took… that’s a job for another day.

When those operations are actually necessary.

.. More details

Leave a Reply