<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jan Varwig &#187; joins</title>
	<atom:link href="http://jan.varwig.org/archive/tag/joins/feed" rel="self" type="application/rss+xml" />
	<link>http://jan.varwig.org</link>
	<description>Somewhere between Hello World and HAL9000</description>
	<lastBuildDate>Sat, 03 Dec 2011 00:15:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Rails: Determine the association collection size through joins</title>
		<link>http://jan.varwig.org/archive/rails-association-collection-size-though-joins?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=rails-association-collection-size-though-joins</link>
		<comments>http://jan.varwig.org/archive/rails-association-collection-size-though-joins#comments</comments>
		<pubDate>Sun, 01 Jun 2008 15:25:21 +0000</pubDate>
		<dc:creator>Jan</dc:creator>
				<category><![CDATA[on Rails]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[activerecord]]></category>
		<category><![CDATA[associations]]></category>
		<category><![CDATA[cache]]></category>
		<category><![CDATA[joins]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ror]]></category>
		<category><![CDATA[rubyonrails]]></category>
		<category><![CDATA[sql]]></category>

		<guid isPermaLink="false">http://jan.varwig.org/?p=82</guid>
		<description><![CDATA[Rails association collections know two ways to determine their size: Through a dumb count as soon as you query the collection for its size. This doesn&#8217;t require any additional work but puts quite a load on the database as soon as you have a large list of objects and want to know the size of [...]]]></description>
			<content:encoded><![CDATA[<p>Rails association collections know two ways to determine their size:</p>

<ol>
<li><strong>Through a dumb count as soon as you query the collection for its size.</strong><br />
This doesn&#8217;t require any additional work but puts quite a load on the database as soon as you have a large list of objects and want to know the size of an associated collection for each object. For 100 posts with comments, this approach would query the database 100 times with <code>SELECT count(*) AS count_all FROM comments WHERE (comments.post_id = 1234)</code>. This might be okay for single objects which are queried from time to time but not for collections that need to be displayed frequently.</li>
<li><strong>Through counter caches that are a bit of a pain in the ass to set up correctly and gave me the fishy impression that they are easily corrupted.</strong><br />
Now, I admit that I didn&#8217;t spend too long investigating the implementation of the counter cache. But it took me an hour of fiddling to find out how to properly initialize the cache in my migration, and after discovering that the cache can only be changed relative to its current content, I was left with a bad feeling.<br />
All this hassle is absolutely unavoidable if you need maximum performance. In this article I&#8217;ll use blog posts and comments as an example, though only because this is familiar to many people. A real blog with a bunch of posts on its index page, receiving any considerable number of hits per day, is a bad candidate for the trick I describe here.</li>
</ol>

<p>In raw SQL, if you want to find out the number of associated rows in a different table, you use a <code>JOIN</code>, combined with <code>COUNT()</code> and <code>GROUP BY</code>, like this:</p>

<div class="dean_ch" style="white-space: wrap;"><span class="kw1">SELECT</span> posts.*, COUNT<span class="br0">&#40;</span>comments.id<span class="br0">&#41;</span> <span class="kw1">AS</span> comments_count<br />
<span class="kw1">FROM</span> posts<br />
<span class="kw1">LEFT</span> <span class="kw1">OUTER</span> <span class="kw1">JOIN</span> comments <span class="kw1">ON</span> posts.id = comments.post_id<br />
<span class="kw1">GROUP</span> <span class="kw1">BY</span> posts.id</div>
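<p>If the query is hard to picture, here is a rough plain-Ruby sketch of what it computes. The arrays of hashes are hypothetical stand-ins for the <code>posts</code> and <code>comments</code> tables, not anything Rails hands you:</p>

```ruby
# Hypothetical in-memory stand-ins for the posts and comments tables.
posts = [
  { :id => 1, :title => "First" },
  { :id => 2, :title => "Second" },
  { :id => 3, :title => "No comments yet" }
]
comments = [
  { :id => 10, :post_id => 1 },
  { :id => 11, :post_id => 1 },
  { :id => 12, :post_id => 2 }
]

# GROUP BY posts.id with COUNT(comments.id): a single pass over the comments.
counts = Hash.new(0)
comments.each { |comment| counts[comment[:post_id]] += 1 }

# LEFT OUTER JOIN: every post is kept, posts without comments get a count of 0.
posts_with_counts = posts.map do |post|
  post.merge(:comments_count => counts[post[:id]])
end

posts_with_counts.each do |post|
  puts "#{post[:title]}: #{post[:comments_count]}"
end
```

<p>The point of the outer join shows up in the third post: it has no comments but still appears in the result, with a count of zero.</p>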

<p>Admittedly, this is slower than a counter cache, but it is not as difficult to set up, doesn&#8217;t risk errors due to a corrupt cache, and if you choose your database keys wisely it is a pretty nice compromise.
When you ask an association collection for its <code>size()</code>, Rails checks for the presence of a counter cache attribute in the current ActiveRecord object. If it finds one, it uses its content to return the collection size; otherwise the database is queried.</p>

<p>The catch is that in the above example we inserted a perfectly valid counter cache column into our posts without specifying caching in the association declaration in the class. ActiveRecord puts that column&#8217;s content into a <code>comments_count</code> attribute on each Post, but since it doesn&#8217;t know exactly what to do with it, it doesn&#8217;t cast it to an integer and leaves it in there as a String. That makes <code>size()</code>, or to be more precise the <code>count_records()</code> method, trip with a &#8220;String can&#8217;t be coerced into Fixnum&#8221; error.</p>
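<p>The error itself is easy to reproduce outside of Rails. A minimal sketch, assuming the failure comes from the uncast count being added to an Integer somewhere in the size calculation (the variable names are made up for illustration, and current Rubies say Integer where older ones said Fixnum):</p>

```ruby
# A count as it might come back from the joined query: a String, not an Integer.
raw_count = "3"

# Adding a String to an Integer raises a TypeError.
begin
  total = 0 + raw_count
rescue TypeError => e
  error_message = e.message
  puts error_message
end

# Converting with to_i first makes the arithmetic work.
fixed_count = 0 + raw_count.to_i
puts fixed_count
```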

<p>To fix this, I wrote an extension for the <code>has_many</code> association:</p>

<div class="dean_ch" style="white-space: wrap;"><span class="kw1">module</span> OptionalJoinCounter<br />
&nbsp; <span class="kw1">def</span> count_records<br />
&nbsp; &nbsp; count = <span class="kw1">if</span> has_cached_counter?<br />
&nbsp; &nbsp; &nbsp; <span class="re1">@owner</span>.<span class="me1">send</span><span class="br0">&#40;</span><span class="re3">:read_attribute</span>, cached_counter_attribute_name<span class="br0">&#41;</span>.<span class="me1">to_i</span><br />
&nbsp; &nbsp; <span class="kw1">elsif</span> <span class="re1">@reflection</span>.<span class="me1">options</span><span class="br0">&#91;</span><span class="re3">:counter_sql</span><span class="br0">&#93;</span><br />
&nbsp; &nbsp; &nbsp; <span class="re1">@reflection</span>.<span class="me1">klass</span>.<span class="me1">count_by_sql</span><span class="br0">&#40;</span>@counter_sql<span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">else</span><br />
&nbsp; &nbsp; &nbsp; <span class="re1">@reflection</span>.<span class="me1">klass</span>.<span class="me1">count</span><span class="br0">&#40;</span><span class="re3">:conditions</span> =&gt; <span class="re1">@counter_sql</span>, <span class="re3">:include</span> =&gt; <span class="re1">@reflection</span>.<span class="me1">options</span><span class="br0">&#91;</span><span class="re3">:include</span><span class="br0">&#93;</span><span class="br0">&#41;</span><br />
&nbsp; &nbsp; <span class="kw1">end</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span class="re1">@target</span> = <span class="br0">&#91;</span><span class="br0">&#93;</span> <span class="kw1">and</span> loaded <span class="kw1">if</span> count == <span class="nu0">0</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span class="kw1">if</span> <span class="re1">@reflection</span>.<span class="me1">options</span><span class="br0">&#91;</span><span class="re3">:limit</span><span class="br0">&#93;</span><br />
&nbsp; &nbsp; &nbsp; count = <span class="br0">&#91;</span> <span class="re1">@reflection</span>.<span class="me1">options</span><span class="br0">&#91;</span><span class="re3">:limit</span><span class="br0">&#93;</span>, count <span class="br0">&#93;</span>.<span class="me1">min</span><br />
&nbsp; &nbsp; <span class="kw1">end</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span class="kw2">return</span> count<br />
&nbsp; <span class="kw1">end</span><br />
<span class="kw1">end</span></div>

<p>Uh, well, yeah, the only change here is the <code>.to_i</code> at the end of line 4 but hey, what did you expect?</p>

<p>Save that in <code>lib/optional_join_counter.rb</code> and extend your association with <code>has_many :whatevers, :extend =&gt; OptionalJoinCounter</code>.</p>
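<p>In case extending a single object with a module looks like magic, here is the idea in plain Ruby. <code>FakeAssociation</code> and <code>CastsCountToInteger</code> are hypothetical names for this sketch only, not Rails classes, and unlike the module above (which replaces the whole method) this one just wraps the original via <code>super</code>:</p>

```ruby
# Hypothetical stand-in for an association proxy; not a Rails class.
class FakeAssociation
  def count_records
    "3" # uncast value, as it would arrive from the joined query
  end
end

# Same idea as OptionalJoinCounter: make sure the count is an Integer.
module CastsCountToInteger
  def count_records
    super.to_i
  end
end

plain = FakeAssociation.new
extended = FakeAssociation.new
extended.extend(CastsCountToInteger) # affects this one object only

puts plain.count_records.inspect
puts extended.count_records.inspect
```

<p>Only the extended object gets the new behaviour, so other associations that don&#8217;t use the join trick stay untouched.</p>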

<p>Now, to get back to the example, imagine we want to use this trick to count our comments.
The above SQL can either be written by hand, or with Rails finder options:</p>

<div class="dean_ch" style="white-space: wrap;">Post.<span class="me1">find</span> <span class="re3">:all</span>,<br />
&nbsp; <span class="re3">:select</span> =&gt; <span class="st0">&#39;posts.*, count(comments.id) as comments_count&#39;</span>,<br />
&nbsp; <span class="re3">:joins</span> =&gt; <span class="st0">&#39;LEFT JOIN comments ON posts.id = comments.post_id&#39;</span>,<br />
&nbsp; <span class="re3">:group</span> =&gt; <span class="st0">&#39;posts.id&#39;</span></div>
]]></content:encoded>
			<wfw:commentRss>http://jan.varwig.org/archive/rails-association-collection-size-though-joins/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

