Space Vatican

Ramblings of a curious coder

First, Foremost and [0]

This doesn’t work (the something field will not be updated):

post.comments.first.something = true
post.comments.first.save

but this does

post.comments[0].something = true
post.comments[0].save

as does

post.comments.each {|c| puts c.id}
post.comments.first.something = true
post.comments.first.save

All three of these would have worked in Rails 2.0.x and earlier versions. So what changed?

Quacks like a duck but breathes fire

As you may know, post.comments looks an awful lot like an array but isn’t an array. It’s an association proxy. It defines its own methods for things like finding objects in the database, a count method that runs an SQL count, and so on. When you ask it to do something that can only really be done by having the Ruby objects in memory, it loads the objects from the database into an actual Ruby array and passes the method call on to that (this all happens via method_missing). So far, business as usual: Rails has been like this for a long, long time. In particular, until recently, calling first or last on an association would load the array and then call first or last on that array.
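To make the delegation concrete, here is a minimal sketch in plain Ruby. It is a toy, not the Rails implementation: FakeTable and LazyProxy are hypothetical names invented for this illustration, and the "queries" are just symbols pushed onto an array.

```ruby
# Toy stand-in for an association proxy; this is not Rails code.
# FakeTable and LazyProxy are hypothetical names used only for illustration.
class FakeTable
  attr_reader :queries

  def initialize(rows)
    @rows = rows
    @queries = []
  end

  # Pretend "SELECT * ..." that returns fresh copies of every row.
  def select_all
    @queries << :select_all
    @rows.map(&:dup)
  end

  # Pretend "SELECT COUNT(*) ...".
  def select_count
    @queries << :select_count
    @rows.size
  end
end

class LazyProxy
  def initialize(table)
    @table = table
    @target = nil # the real Array, filled in on demand
  end

  def loaded?
    !@target.nil?
  end

  # Answered by the "database" without loading any objects.
  def count
    @table.select_count
  end

  # Anything else needs the objects in memory: load once, then delegate.
  def method_missing(name, *args, &block)
    @target ||= @table.select_all
    if @target.respond_to?(name)
      @target.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    Array.method_defined?(name) || super
  end
end

table = FakeTable.new(%w[first second third])
comments = LazyProxy.new(table)

comments.count          # one count query, nothing loaded
comments.loaded?        # still false
comments.map(&:upcase)  # triggers the load, then Array#map runs
comments[0]             # served from the loaded array, no new query
```

The point of the sketch is only the shape: a handful of methods answered directly, and everything else forwarded to a lazily loaded array.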

This sort of depends on your problem domain, but a lot of the time loading the entire array just to look at the first or last element of it is wasteful. You’ve always been able to do some_association.find :first (and as of 2.1 some_association.find :last) but that flows a little less easily off the tongue and of course doesn’t play nicely if you pass your association to some code that thinks it’s just working with an array. So a few months ago changes were made to make first and last load just that one item from the database[1]. Of course if the target array is already loaded then it just returns the first item from that array.

What that ends up meaning is that in the first example I gave, each call to post.comments.first runs its own query and returns a different object, i.e. the one that we call save on is not the one we made the change to, so the change is never written. The second and third examples work purely because they force the array to be loaded, which in turn means that calls to first no longer hit the database in that way[2].
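The identity problem can be reproduced without a database. The following is a rough sketch under stated assumptions: FakeComment and FakeComments are hypothetical stand-ins written for this post, and they only mimic the "fresh object per call until loaded" behaviour described above.

```ruby
# Toy model of the behaviour described above; FakeComment and FakeComments
# are hypothetical names, not Rails classes.
FakeComment = Struct.new(:id, :something)

class FakeComments
  def initialize(rows)
    @rows = rows      # stands in for the database table
    @target = nil     # the in-memory array, once loaded
  end

  # Unloaded: build a fresh object per call, as a one-row query would.
  # Loaded: hand back the object already sitting in the array.
  def first
    return @target.first if @target
    row = @rows.first
    row && FakeComment.new(*row)
  end

  def [](index)
    load_target[index]
  end

  def each(&block)
    load_target.each(&block)
  end

  private

  def load_target
    @target ||= @rows.map { |row| FakeComment.new(*row) }
  end
end

comments = FakeComments.new([[1, false], [2, false]])

a = comments.first
b = comments.first
a.equal?(b)          # false: two separate objects for the same row
a.something = true
b.something          # still false, so saving b would write nothing new

comments[0]          # forces the whole array to load
comments.first.equal?(comments.first) # true from here on
```

In real code, assigning the result of first to a local variable and using that one object for both the change and the save sidesteps the problem whether or not the association is loaded.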

Of course if you’re doing things right your unit tests would catch this sort of thing, but it’s still likely to leave you scratching your head a little (I certainly recall spending a few minutes looking at code very similar to the first example and wondering why it no longer worked). Slightly more subtle are the performance problems: for example, if you were reading several attributes of somethings.first one after another, you’d be hitting the database each time to load somethings.first again.
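A quick way to picture the cost is to count the queries. This is a toy sketch, not a measurement of Rails: CountingAssociation is a hypothetical class that behaves like an association that never gets loaded, so every call to first counts as one query.

```ruby
# Toy query counter; CountingAssociation is a hypothetical stand-in,
# not a Rails class.
class CountingAssociation
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Mimics an unloaded association: every call is a fresh one-row query.
  def first
    @query_count += 1
    @rows.first.dup
  end
end

somethings = CountingAssociation.new([{ name: "a", colour: "red", size: 3 }])

# One query per attribute read.
%i[name colour size].each { |attr| somethings.first[attr] }
somethings.query_count # 3

# Fetch once, then read attributes from the local object.
record = somethings.first
%i[name colour size].each { |attr| record[attr] }
somethings.query_count # 4: only one more query for all three reads
```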

I’m not sure what to think about this sort of thing. There is a perfectly sound rationale for doing this, but it introduces little ifs, buts and maybes into the illusion that association proxies behave like Arrays. As far as performance goes, the implications vary. For big associations it can be a huge win; other times, loading 1 object instead of 3 will make little to no difference. In other places I do genuinely want to load the whole array, but I’d rather write first than [0] if I’m accessing the first element. Maybe the example I gave is a little artificial, maybe not, but at the end of the day, treating first as a synonym for [0] is a habit that is hard to break.


[1] I’m simplifying things quite considerably here: there are a number of edge cases which that code has to tread around quite carefully, including unsaved parent objects, unsaved child objects, custom finder SQL, etc.

[2] While I’ve concentrated on first, everything I’ve said here applies to last too. In a way it’s slightly worse in that (at least for me) the difference in comfort between writing last and [-1] is greater than the difference between first and [0].