Space Vatican

Ramblings of a curious coder

RailsConf Presentation Code

The code from our presentation is now available here. It’s worth a look even if you weren’t at our presentation.

Knock yourselves out!

Thoughts on Jeremy’s Keynote

I really enjoyed Jeremy Kemper’s talk on Wednesday. It was the sort of talk that has you itching to run home and try out what you’ve seen. All good stuff. For those of you who weren’t there, Jeremy was talking about performance.

The key point is that it’s all about the user experience: how fast do our users think the app is? Part of it is your Ruby code (and Jeremy had plenty to say about that, with GC tips, profiling tips etc…) but a huge chunk is the network. A common trick is to bundle up your assets: what with the limit on the number of concurrent loads from a domain and the part that latency plays, loading 1 medium or biggish javascript file (or stylesheet or whatever) is almost always preferable to loading 5 small files. With 2.1, Rails makes this easy and will bundle up your js for you, but there’s another trick you can play.

You could stick assets like your javascript files on some sort of content distribution network, close to your users wherever they are. This would be an inordinate amount of effort to go to, though, just to host a few javascript files. Luckily, Google has done that for you with their Ajax Libraries API. They are hosting copies of common javascript frameworks, including prototype, scriptaculous, dojo, jquery and mootools. You get to use their content distribution network and can assume they’ve got all the caching and compression stuff done right. The libraries are all versioned too (ie you say ‘give me prototype 1.6.0.1’), so no worries about that.

There is of course another advantage: instead of your browser caching a copy of prototype.js from every site that uses it, your users will only be caching it once (per version). Chances are when they come to your site it will already be in the cache.

Jeremy also mentioned the importance of putting yourself in your users’ shoes and seeing what your webapp is like when viewed through a slow or high(er) latency connection. I described one way of doing this a few months ago and it really can be quite the eye-opener.

Ways in Which the Console Tricks You

IRB, and by extension Rails’ script/console, is an incredibly useful tool. Being able to explore your objects or an API without having to write and compile a test program each time is absolutely priceless. There are however a few cases where the console can lead you astray.

When you evaluate something in the console it will display the result of calling inspect on whatever your statement(s) returned. Inspect is just a function so it can lie to you, be selective about what it shows or even have side effects.

Prior to Rails 2.0 the inspect function on ActiveRecord::Base dumped pretty much everything known about the object. This could get a bit cumbersome quite easily. For example if an association was loaded then the dump of all those objects would be included in the output taking up pages and pages and adding very little value. Starting with Rails 2.0 only the value of each attribute is listed:

>> Conversation.find :first
=> #<Conversation id: 2, title: nil, created_at: "2008-07-22 00:12:24">

Rails doesn’t list all attributes however: it knows which columns exist on that table and uses that information to display only those attributes. This often confuses people who are using :select to add attributes, as it looks as though the extra attributes aren’t there. Oh ye of little faith who do not believe until they have seen!

>> c = Conversation.find :first, :select => "*, 'hello world' as bonus_attribute"
=> #<Conversation id: 2, title: nil, created_at: "2008-07-22 00:12:24">
>> c.bonus_attribute
=> "hello world"

The extra piggybacked attribute was there all along: ActiveRecord’s implementation of inspect just chose not to show it to you. In theory an object’s implementation of inspect could do anything it wants. Usually it will be trying to be helpful, but that’s not to say it won’t mislead you.
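To see the same effect without Rails, here is a minimal sketch in plain Ruby. The Row class and its KNOWN_COLUMNS constant are made up for illustration (this is not how ActiveRecord is implemented); the point is only that inspect is an ordinary method that can choose what to show.

```ruby
# Hypothetical stand-in for a model; nothing here is ActiveRecord.
class Row
  # Assumption: the "table" only has these two columns.
  KNOWN_COLUMNS = [:id, :title]

  def initialize(attributes)
    @attributes = attributes
  end

  # Read any attribute that was loaded, known column or not.
  def [](name)
    @attributes[name]
  end

  # Only report the attributes listed in KNOWN_COLUMNS.
  def inspect
    shown = KNOWN_COLUMNS.map { |column| "#{column}: #{@attributes[column].inspect}" }
    "#<Row #{shown.join(', ')}>"
  end
end

row = Row.new(:id => 2, :title => nil, :bonus_attribute => 'hello world')
row.inspect            #=> "#<Row id: 2, title: nil>"
row[:bonus_attribute]  #=> "hello world"
```

The bonus attribute never shows up in the console output, yet it is sitting in the object the whole time.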

Side effects

Every now and again the following pops up on one of the Rails mailing lists or IRC channels:

>> conversation.messages
=> [#<Message id: 1, body: nil, conversation_id: 2, created_at: "2008-07-22 00:12:30">]

ZOMG! Just doing that loaded the association (or all the objects in the named scope), so doing conversation.messages.find… will load the collection before doing the find. What a waste! If only it were actually true.

Association proxies are slippery things. You may have already noticed that they claim to be arrays, look like arrays and yet clearly aren’t as they have a whole bunch of extra methods (which aren’t singleton methods). A full discussion of association proxies is something there’s no time for today, but the basic idea is that they implement specific methods like find (do a find that is scoped to the association) or count (does a SELECT COUNT(*) FROM …). If anything else happens (ie if method_missing is hit) then they load the association and propagate the called method to the target array.

When I typed in conversation.messages the relevant methods were called and an instance of an association proxy was returned. IRB then wanted to display it and so called inspect. Association proxies don’t implement that method and so the collection was loaded and inspect called on that. In other words conversation.messages doesn’t actually load the collection, it’s what IRB did with it afterwards that did. In a standalone script or program this wouldn’t happen at all. Phew, Rails isn’t stupid.

There’s a little trick you can use in cases like this when you want to stop irb inspecting something, either because inspecting it will have a side effect such as the one described above or because the output from inspect would be very long:
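The mechanism is easy to imitate in plain Ruby. The LazyCollection class below is a hypothetical toy, not Rails’ association proxy: it assumes a loader block that stands in for the database query, implements one method itself (loaded?) and forwards everything else, inspect included, to the loaded array via method_missing.

```ruby
# A toy lazy proxy. It inherits from BasicObject so that almost nothing
# (not even inspect) is defined on it and everything hits method_missing.
class LazyCollection < BasicObject
  def initialize(&loader)
    @loader = loader   # assumption: the block stands in for the SQL query
    @target = nil
    @loaded = false
  end

  # Implemented on the proxy itself, so calling it loads nothing.
  def loaded?
    @loaded
  end

  # Anything else loads the collection and is passed on to the real array.
  def method_missing(name, *args, &block)
    unless @loaded
      @target = @loader.call
      @loaded = true
    end
    @target.__send__(name, *args, &block)
  end
end

messages = LazyCollection.new { [:message_1, :message_2] }
messages.loaded?  #=> false
messages.inspect  #=> "[:message_1, :message_2]"
messages.loaded?  #=> true
```

Run as a script, creating the proxy loads nothing. Typed into IRB, the first line alone would trigger the load, because IRB calls inspect on the result.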

x = (1..100000).to_a; false

Instead of printing several pages of the numbers from 1 to 100000 this just prints false. When irb evaluates that line the last statement is false (you could put anything you want there) and so the return value from that is what irb calls inspect on.

Fun with local variables

Try the following snippet:

eval("a=1")
puts a

If you paste it into IRB it works just fine, but if you paste it into a script and run that then it won’t work (you’ll get an undefined local variable or method ‘a’ error). Local variable scope is one of those things that are just a little bit fiddly in ruby. For example local variables created by eval are only visible from subsequent calls to eval. The big difference here is that ruby loads and parses your script in one go whereas irb does it line by line. The pickaxe has some more discussion on this.
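If you want eval’d locals to stick around in a script, one approach that works on current Rubies is to evaluate against an explicit Binding object and keep reusing it. This is a sketch: the fresh_binding helper and the variable name sneaky are made up for the example, and Binding#local_variable_defined? needs Ruby 2.1 or later.

```ruby
# Helper that hands back a new, empty scope to evaluate code in.
def fresh_binding
  binding
end

scope = fresh_binding

# The local variable is created inside the binding...
eval("sneaky = 1", scope)

# ...and later evals against the same binding can see it.
result = eval("sneaky + 1", scope)          #=> 2
known  = scope.local_variable_defined?(:sneaky)  #=> true

# The surrounding script never saw an assignment to sneaky when it was
# parsed, so here it is neither a local variable nor a method.
visible_outside = defined?(sneaky)          #=> nil
```

As far as I understand it this is roughly why IRB behaves differently: it evaluates each line you type against one long-lived binding, so locals accumulate there.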

Modules Aren’t Just for Instance Methods

Modules are great for sharing bits of code across models, controllers etc. The very basic case, where you just want to share some instance methods, is dead easy: just write the methods and include the module. The module gets inserted in your class’ list of ancestors and so ruby will look inside it when it needs to find a method. If you’re using rails you might also want to share validations on models, filters on controllers and so on. But first something a little simpler.

Using a module to add class methods is a bit more fiddly than sharing instance methods. Of course you can just include the module inside class << self, but often we want to add both class and instance methods. Requiring two modules and two includes when one cannot exist without the other is a bit messy and repetitive (not to mention error-prone). Luckily, whenever a module is included, a callback is called, passing whatever did the including as an argument. Instead of relying on the programmer to do that second include we can do it ourselves from this callback. This is a fairly standard pattern:

module MyModule
  module ClassMethods
    def a_class_method
    end
  end

  def an_instance_method
  end

  def self.included(base)
    base.extend ClassMethods
  end
end

Our included method is dead simple: we just want to add our module of class methods. We need to use extend here because include would add them as instance methods, whereas we want to add them as singleton methods on the class, i.e. class methods. If you include this module then you’ll gain the methods in MyModule::ClassMethods as class methods. Naming the module ClassMethods is just for ease of reading; you could call it anything you want.

We could be doing a lot more though. It’s important to understand that things like has_many, validates_format_of and so on are just methods. They don’t always look like it (which is of course one of the things that makes ruby great for building DSLs), but they definitely are. Laying to one side issues of protected or private methods,
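Here is the same pattern with some actual behaviour filled in, as a small sketch you can run outside Rails. The Countable module and the Widget class are invented for the example.

```ruby
module Countable
  # These end up as class methods on whatever includes Countable.
  module ClassMethods
    def registered_count
      @registered_count ||= 0
    end

    def note_registration
      @registered_count = registered_count + 1
    end
  end

  # Called by ruby with the including class as its argument.
  def self.included(base)
    base.extend ClassMethods
  end

  # A regular instance method, shared via include as usual.
  def register
    self.class.note_registration
    self
  end
end

class Widget
  include Countable   # one include gives us both kinds of method
end

Widget.new.register
Widget.registered_count                       #=> 1
Widget.new.respond_to?(:registered_count)     #=> false
Widget.singleton_class.include?(Countable::ClassMethods)  #=> true
```

The last line shows where the class methods went: ClassMethods sits in the ancestors of Widget’s singleton class, not of Widget itself.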

class Person < ActiveRecord::Base
  validates_presence_of :email
end

is the same as

class Person < ActiveRecord::Base
  self.validates_presence_of :email
end

or

class Person < ActiveRecord::Base; end
Person.validates_presence_of :email

This is of course true of all those other things (named_scope, before_filter, verify etc…) that you frequently see in models or controllers: they’re just methods you can call on classes. What a happy coincidence that the included callback gives you a class! This means that we can package up sets of validations or associations or anything like that into a module. For example, we could have an Auditable module. Things that are auditable have a polymorphic audit_trails association that tracks whenever the object was modified.

module Auditable
  def update_with_audit(*args)
    returning update_without_audit(*args) do |rows_changed|
      if rows_changed > 0
        audit_trails.create(...)
      end
    end
  end

  def self.included(base)
    base.has_many :audit_trails, :as => :auditable
    base.alias_method_chain :update, :audit
  end
end

In our included method we use has_many to create the association and we use alias_method_chain to hook the update method (which is used whenever you update a row). Sometimes it’s nice to use instance_eval on base to make things look a little neater, for example

module FooBarish
  def self.included(base)
    base.instance_eval do
      validates_presence_of :foo
      validates_presence_of :bar
    end
  end
end

Whenever you include this module the 2 validations will be defined.
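If you want to try this without a database to hand, the sketch below fakes the class-level macro in plain Ruby. TinyModel, requires_presence_of and required_fields are hypothetical stand-ins for ActiveRecord::Base and validates_presence_of, not real Rails API; the included hook is the part that carries over.

```ruby
# Hypothetical base class with a tiny class-level macro; not ActiveRecord.
class TinyModel
  # Each class keeps its own list in a class-level instance variable.
  def self.required_fields
    @required_fields ||= []
  end

  # The "macro": just a class method that records the field name.
  def self.requires_presence_of(name)
    required_fields << name
  end

  def initialize(attributes = {})
    @attributes = attributes
  end

  def valid?
    self.class.required_fields.all? { |field| !@attributes[field].nil? }
  end
end

module FooBarish
  def self.included(base)
    # Inside instance_eval, self is base, so the macros read just as
    # they would in the class body.
    base.instance_eval do
      requires_presence_of :foo
      requires_presence_of :bar
    end
  end
end

class Thing < TinyModel
  include FooBarish
end

Thing.required_fields                     #=> [:foo, :bar]
Thing.new(:foo => 1).valid?               #=> false
Thing.new(:foo => 1, :bar => 2).valid?    #=> true
TinyModel.required_fields                 #=> []
```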

You could play similar tricks with sets of filters that you want certain controllers to have and all sorts of things. That said, don’t go overboard, sometimes regular old subclassing is exactly the right thing, but happily modules will keep you going even when it’s not.
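Going back to the Auditable module for a moment: the hooking half of it can also be tried in plain Ruby. This sketch assumes alias_method_chain :update, :audit boils down to the two alias_method calls shown (that is my reading of it, so treat it as an approximation), and the Record class with its in-memory audit_trail array is invented in place of a real model and association.

```ruby
# Hypothetical stand-in for a model with an update method.
class Record
  attr_reader :audit_trail

  def initialize
    @audit_trail = []
  end

  # Pretend to write to the database and report how many rows changed.
  def update(changes)
    changes.empty? ? 0 : 1
  end
end

module Auditable
  def update_with_audit(*args)
    rows_changed = update_without_audit(*args)
    audit_trail << args.first if rows_changed > 0
    rows_changed
  end

  def self.included(base)
    base.class_eval do
      # Keep the original under a new name, then point update at our wrapper.
      alias_method :update_without_audit, :update
      alias_method :update, :update_with_audit
    end
  end
end

class Record
  include Auditable
end

record = Record.new
record.update({})               #=> 0, nothing audited
record.update(:title => 'hi')   #=> 1
record.audit_trail.size         #=> 1
```

Note that the include has to come after update is defined, otherwise there is nothing to alias yet.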

Fun With Class Variables

Class variables are a slightly fiddly bit of ruby. They don’t always quite behave the way you expect. ActiveSupport bundles up a large number of helpers to deal with the various cases. They’re used extensively throughout the framework where the ability to control how things set at the framework level ripple down to subclasses (ie your models and controllers) is important to get right, but they can be pretty handy in your own apps too.

The Basics

If you’re at all familiar with ruby you’ll have heard of @@. @@ is a bit odd because it creates variables that are visible both from the instance and the class. The value is also shared with subclasses. ActiveSupport adds cattr_accessor which creates the obvious accessor methods and initializes the value to nil. The accessors are created on both the class and instances of it (the instance writer is optional).

class Foo
  cattr_accessor :bar
end

Foo.bar #=> nil
Foo.bar= '123'
Foo.new.bar #=> '123'
Foo.new.bar = 'abc'
Foo.bar #=> 'abc'

If we create a subclass, the value is shared:

class Derived < Foo
end
Foo.bar= '123'
Derived.bar #=> '123'
Derived.bar = 'abc'
Foo.bar #=> 'abc'
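You don’t need ActiveSupport to see the sharing. The sketch below spells the accessors out by hand on a made-up Setting class; I’m assuming this is close enough to what cattr_accessor generates for the purposes of the demonstration.

```ruby
class Setting
  @@bar = nil

  # Class-level reader and writer for the class variable.
  def self.bar
    @@bar
  end

  def self.bar=(value)
    @@bar = value
  end

  # Instances can see the very same variable.
  def bar
    @@bar
  end
end

class ChildSetting < Setting
end

Setting.bar = '123'
ChildSetting.bar          #=> '123'
ChildSetting.bar = 'abc'
Setting.bar               #=> 'abc'
Setting.new.bar           #=> 'abc'

# The subclass has no @@bar of its own, it is using its parent's.
ChildSetting.class_variables(false)  #=> []
```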

Other slightly odd things can happen:

class Base
  def self.bar
    @@bar
  end

  def self.bar=(value)
    @@bar = value
  end
end

class Derived < Base
  cattr_accessor :bar
end
class Derived2 < Base
  cattr_accessor :bar
end

Derived.bar = '123'
Derived2.bar = 'abc'
Derived.bar #=> '123'

So now the subclasses have independent class variables named @@bar. However if you set Base.bar before the Derived and Derived2 classes are created [1] then everyone will share the base class’ value as before. To summarize, it’s rather fiddly and often unintuitive. It also does not allow the fairly common pattern of having the base class set a default value that subclasses can override. If you don’t know about this then it can lead to odd situations. You’ll have code that works fine in dev mode, but not in production (since in development classes are trashed and recreated between requests, which obviously interacts with all this) or tests that pass when run individually but fail when you run several of them at the same time.

Classes are objects

Classes are no exception in ruby, like most things (everything?) they are objects, and like objects they can have instance variables. These instance variables are just normal instance variables: they don’t have the odd scoping rules that @@ variables have. Base classes and derived classes have completely independent values.

class Base
  @bar = '123'
  class << self
    def bar
      @bar
    end

    def bar= value
      @bar = value
    end
  end
end

class Derived < Base; end
Base.bar #=> '123'
Derived.bar #=> nil
Derived.bar = 'abc'
Base.bar #=> '123'

Less prone to unwanted surprises, but not without shortfalls as you cannot make a default from a base class propagate down (without resorting to tricks with self.inherited).
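For the curious, the self.inherited trick can look something like this. It is only a sketch with invented names (Shape, sides): the hook copies the parent’s current value into each new subclass at the moment the subclass is created.

```ruby
class Shape
  @sides = 4

  class << self
    # Plain accessors for the class-level instance variable.
    attr_accessor :sides

    # Ruby calls this on the parent whenever a subclass is created.
    def inherited(subclass)
      super
      subclass.sides = sides   # hand our current value down as the default
    end
  end
end

class Square < Shape
end

class Triangle < Shape
  self.sides = 3   # override without touching anyone else
end

Square.sides    #=> 4
Triangle.sides  #=> 3
Shape.sides     #=> 4
```

The default is copied rather than shared, so changing Shape.sides later will not reach subclasses that already exist.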

Inheritable Tricks

ActiveSupport adds class_inheritable_accessor. This provides something closer in behaviour to what you might expect: base classes can provide defaults and subclasses inherit those defaults and can overwrite them without affecting other subclasses or the base class.

class Base
  class_inheritable_accessor :value
  self.value = '123'
end

class Derived < Base; end
class Derived2 < Base; end

Derived.value #=> '123'
Derived.value = 'abc'
Base.value #=> '123'
Derived2.value #=> '123'

So far so good. We overrode the value on Derived and didn’t perturb anyone else. Unfortunately there are some drawbacks:

Base.value = 'xyz'
Derived2.value #=> '123'

Oh dear. Even though Derived2 never overrode any values, the change we made to Base didn’t propagate. In other words the default values are baked into the subclass at the time that it is created. This is down to the way that class_inheritable_accessor is implemented: the classes have an instance variable @inheritable_attributes (like we saw above) that is a hash with all the attributes. The accessor methods just pull the values in and out of the hash. When a class is subclassed the subclass gets a copy of @inheritable_attributes. Once this has happened, nothing links the base class’ attributes with the subclass. The copying happens via the inherited callback. A consequence of this is that if you override inherited on an ActiveRecord class, a controller etc… without calling super, all hell will break loose (since class_inheritable_accessor attributes will not be propagated).
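The following sketch imitates that mechanism in plain Ruby so you can watch both effects: values baked in at subclass time, and the damage done by an inherited that forgets super. It is an approximation written from the description above, not ActiveSupport’s code, and Root, Child, Careless and Orphan are made-up class names.

```ruby
class Root
  @inheritable_attributes = { :value => '123' }

  class << self
    def inheritable_attributes
      @inheritable_attributes ||= {}
    end

    # The accessors just read and write the hash.
    def value
      inheritable_attributes[:value]
    end

    def value=(new_value)
      inheritable_attributes[:value] = new_value
    end

    # Each new subclass gets its own copy of the hash.
    def inherited(subclass)
      super
      subclass.instance_variable_set(:@inheritable_attributes, inheritable_attributes.dup)
    end
  end
end

class Child < Root
end

class Careless < Root
  def self.inherited(subclass)
    # Oops: no call to super, so Root's copying never happens.
  end
end

class Orphan < Careless
end

Root.value = 'xyz'
Child.value     #=> '123', copied before Root changed
Careless.value  #=> '123'
Orphan.value    #=> nil, nothing was ever copied down
```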

Bags of tricks

ActiveSupport also provides class_inheritable_array and class_inheritable_hash. They both use class_inheritable_accessor as their underlying mechanism. When you set a class_inheritable_array or a class_inheritable_hash you are actually concatenating (or merging) with the value inherited from the superclass.

class Base
  class_inheritable_hash :attrs
  self.attrs = {:name => 'Fred'}
end

class Derived < Base
  self.attrs = {:export => 'Pain'}
end
Derived.attrs #=> {:name => 'Fred', :export => 'Pain'}

These aren’t particularly magic but are a handy shortcut.

Delegation for the nation

ActiveSupport’s final trick is superclass_delegating_accessor, added in Rails 2.0.1. At first it appears very similar to class_inheritable_accessor:

class Base
  superclass_delegating_accessor :properties
  self.properties = []
end

class Derived < Base; end

This time however we can do this:

Base.properties = [:useless]
Derived.properties #=> [:useless]

superclass_delegating_accessor creates regular instance variables in the class. The interesting bit is the reader method: it looks at the current class and checks if the appropriate instance variable is defined. If so, it returns it; if not it calls super (ie gets the superclass’ instance variable) and so on. It stops when it reaches the class that created the superclass_delegating_accessor.
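A hand-rolled reader along those lines might look like the sketch below. It follows the description rather than the ActiveSupport source, asks the superclass explicitly instead of using super, and uses invented class names (Node, Leaf).

```ruby
class Node
  class << self
    # The writer sets an instance variable on whichever class it is called on.
    attr_writer :properties

    # The reader uses our own value if we have one, otherwise asks the parent.
    def properties
      return @properties if instance_variable_defined?(:@properties)
      superclass.properties if superclass.respond_to?(:properties)
    end
  end

  self.properties = []
end

class Leaf < Node
end

Node.properties = [:useless]
Leaf.properties           #=> [:useless], read through to Node

Leaf.properties = [:own]
Leaf.properties           #=> [:own]
Node.properties           #=> [:useless]
```

Because the lookup happens on every read, a change to the base class is seen by any subclass that hasn’t set its own value.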

Unfortunately it doesn’t always behave as you would expect: in place modifications will propagate upwards.

Derived.properties << :derived
Base.properties #=> [:useless, :derived]

Which is just horrible. The example is slightly less contrived when dealing with objects which you tend to modify in place. ActiveResource ran into this with its site property (an instance of URI) and rolls its own thing specially for this property that freezes things so that subclasses don’t mess with their parents.
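One way to guard against this, in the same spirit as freezing, is sketched below. To be clear, this is not ActiveResource’s actual code: Resource, SubResource and tags are made-up names, and the reader is my own hand-written delegating reader.

```ruby
class Resource
  class << self
    attr_writer :tags

    # Use our own value if set, otherwise fall back to the superclass.
    def tags
      return @tags if instance_variable_defined?(:@tags)
      superclass.tags if superclass.respond_to?(:tags)
    end
  end

  # Freezing the shared default turns accidental in-place edits into errors.
  self.tags = [:useless].freeze
end

class SubResource < Resource
end

mutation_blocked = false
begin
  SubResource.tags << :derived   # would have modified Resource's array
rescue RuntimeError              # FrozenError is a RuntimeError on newer Rubies
  mutation_blocked = true
end

# The safe route: build a new array and assign it to the subclass.
SubResource.tags = Resource.tags + [:derived]

mutation_blocked   #=> true
SubResource.tags   #=> [:useless, :derived]
Resource.tags      #=> [:useless]
```

It doesn’t make in-place modification work, it just makes the mistake loud instead of silent.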

Unfortunately it’s a decidedly quirky corner of ruby. There are several different options with subtly different semantics and no clear winner, and when they go wrong they go wrong in subtle ways that are easy to overlook. Be careful!

[1] The important bit is actually when Derived and Derived2 create their @@bar variable. cattr_accessor does this for you so in this case the variable is created at the same time the class is.