Rails association collections know two ways to determine their size:
- Through a dumb count as soon as you query the collection for its size.
This doesn’t require any additional work but puts quite a load on the database as soon as you have a large list of objects and want to know the size of an associated collection for each object. For 100 posts with comments, this approach would query the database 100 times withSELECT count(*) AS count_all FROMcommentsWHERE (comments.post_id = 1234). This might be okay for single objects which are queried from time to time but not for collections that need to be displayed frequently. - Through counter caches that are a bit of a pain in the ass to set up correctly and gave me the fishy impression that they are easily corrupted.
Now, I admit that I didn’t spend too long investigating the implementation of the counter cache but after fiddling around with it for an hour before I finally found out how to properly initialize the cache in my migration and after discovering that the cache can only be changed relative to its current content, I left with a bad feeling.
All this hassle is absolutely unavoidable if you need maximum performance. In this article I’ll use blogposts and comments as an example, though only because this is familiar to many people. A real blog with a bunch of posts on its index page, receiving any considerable amount of hits per day is a bad candidate for the trick I describe here.
In raw SQL, if you want to find out the amount of associated rows in a different table, you use a JOIN, combined with COUNT() and GROUP BY, like this:
FROM posts
LEFT OUTER JOIN comments ON posts.id = comments.post_id
GROUP BY posts.id
Admittedly, this is slower than a counter cache but is not as difficult to set up, doesn’t risk errors due to a corrupt cache and if you chose your database keys wisely it is a pretty nice compromise.
When you ask an association collection for its size(), rails checks for the presence of a counter cache attribute in the current ActiveRecord object. If it finds one, it uses its content to return the collection size, otherwise the database is queried.
The catch is now, that in the above example, we inserted a perfectly valid counter cache column into our posts without specifying caching in the association declaration in the class. ActiveRecord inserts that columns content into a comments_count attribute in our Posts but since it doesn’t know exactly what to do with it, it doesn’t cast it into an integer but leaves it in there as a String. That makes size(), or to be more precise, the count_records() method trip with a “String can’t be coerced into Fixnum” error.
To fix this, I wrote an extension for the has_many Association:
def count_records
count = if has_cached_counter?
@owner.send(:read_attribute, cached_counter_attribute_name).to_i
elsif @reflection.options[:counter_sql]
@reflection.klass.count_by_sql(@counter_sql)
else
@reflection.klass.count(:conditions => @counter_sql, :include => @reflection.options[:include])
end
@target = [] and loaded if count == 0
if @reflection.options[:limit]
count = [ @reflection.options[:limit], count ].min
end
return count
end
end
Uh, well, yeah, the only change here is the .to_i at the end of line 4 but hey, what did you expect?
Save that in lib/optional_join_counter.rb and extend your association with has_many :whatevers, :extend => OptionalJoinCounter
Now, to get back to the example, imagine we want to use this trick to count our comments. The above SQL can either be written by hand, or with Rails finder options:
:select => ‘posts.*, count(comments.id) as comments_count’,
:joins => ‘LEFT JOIN comments ON posts.id = comments.post_id’,
:group => ‘posts.id’
Leave a comment