Parametrised to the max

July 18th, 2008

If you’ve played with Rails for more than about 3 minutes you’ll know that the params hash can itself contain hashes. If you use form_for/fields_for then you’re used to things like params[:user] containing the fields pertaining to the user. But it can also contain arrays, arrays of hashes and so on.

Fundamentally html forms don’t know about any sort of structured data. All they know about is name-value pairs. Rails tacks some conventions onto parameter names which it uses to express some structure.

A lot of the time you don’t really care about how this happens. You use the helpers, rails generates its magic form element names and more magic on the other side stuffs that into your params hash. Every now and again you need to head off the beaten track, and there it’s useful to understand how things all fit together (eg if you are generating data to submit via javascript).

Boring cases & parlour tricks

First off we’ll need a way of trying out things. You could of course just mock up a form with the appropriate input elements but this would massively slow down the process of trying out stuff which is never a good thing. Instead we can tap into what Rails uses. Just open up a script/console prompt:

  ActionController::AbstractRequest.parse_query_parameters "name=fred&phone=0123456789"
    => {"name"=>"fred", "phone"=>"0123456789"}

Just stick name=value pairs (joined with &) into a string and pass it to the above function. What you get back is what would be in the params hash in your controller. From now on, for the sake of brevity I’ll just write parse instead of ActionController::AbstractRequest.parse_query_parameters.

If you’ve ever looked at the html your app generates you will have seen form inputs with names like user[name] (as generated by the text_field or form_for helpers). This notation indicates that the user parameter should be a hash, and that here we’re talking about the name key in that hash:

  parse "user[name]=fred&user[phone]=0123456789"
    => {"user"=>{"name"=>"fred", "phone"=>"0123456789"}}

The other basic structure is an array. By default, repeated parameters with the same name are ignored and only the first appearance counts.

  parse "name=fred&name=bob&name=henry"
    => {"name"=>"fred"}

If the name ends with [] then instead of discarding extra occurrences they are accumulated in an array:

  parse "aliases[]=fred&aliases[]=bob&aliases[]=dark+prince&things[]=foo"
    => {"aliases"=>["fred", "bob", "dark prince"], "things"=>["foo"]}

One use case might be a set of checkboxes which indicate which folders a user has access to. Just create all your checkboxes with the name user[accessible_folder_ids][] and with the ids of the folders as values, for example:

  <% @folders.each do |f| %>
    <%= check_box_tag 'user[accessible_folder_ids][]', f.id %> <%= h f.name %> <br/>
  <% end %>

Stick this in a form and when you submit it, params[:user][:accessible_folder_ids] will be an array containing the ids of the folders the user can play with.

Nested parameters for dummies

We can nest hashes if we want. For example perhaps a user has an associated model such as an address:

  parse "user[name]=fred&user[address][town]=cambridge&
         user[address][line1]=4+Station+road"
    => {"user"=>{"name"=>"fred", "address"=>{"line1"=>"4 Station road", "town"=>"cambridge"}}}

A member of a hash can of course be an array. To do this you just append [] to the parameter name, as with a top level array parameter:

  parse "user[aliases][]=fred&user[aliases][]=bob&user[aliases][]=dark+prince"
    => {"user"=>{"aliases"=>["fred", "bob", "dark prince"]}}
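To make the conventions so far concrete, here is a rough sketch of a parser covering only the cases seen up to this point: plain names, hash keys and a trailing []. It is emphatically not how Rails implements it; sketch_parse and everything inside it are names made up for illustration, and it deliberately ignores arrays of hashes.

```ruby
require 'cgi'

# Illustrative only: NOT Rails' parser. Handles "name", "user[name]" and a
# trailing "[]" (array); arrays of hashes are out of scope for this sketch.
def sketch_parse(query)
  result = {}
  query.split('&').each do |pair|
    raw_name, raw_value = pair.split('=', 2)
    name  = CGI.unescape(raw_name)
    value = CGI.unescape(raw_value.to_s) # "+" becomes a space

    # "user[aliases][]" => ["user", "aliases", "[]"]
    keys = name.scan(/[^\[\]]+|\[\]/)
    is_array = keys.last == '[]'
    keys.pop if is_array

    *path, last = keys
    # Walk (and create) the intermediate hashes
    container = path.inject(result) { |hash, key| hash[key] ||= {} }

    if is_array
      (container[last] ||= []) << value
    else
      # Mirror the behaviour described above: the first occurrence wins
      container[last] = value unless container.key?(last)
    end
  end
  result
end

p sketch_parse("user[name]=fred&user[aliases][]=bob&user[aliases][]=dark+prince")
```

The point is only that each pair of brackets means "go one level down", and an empty pair at the end means "append to an array".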

The array can be several levels down: we can improve a previous example of a user with an address by saying that the address should have a lines array containing the lines from the address. Start with user[address][lines] and then just append [] to indicate the parameter should be an array:

  parse "user[name]=fred&user[address][town]=cambridge&user[address][lines][]=Random+house&
         user[address][lines][]=4+Station+road&user[address][lines][]=By+the+station"
    => {"user"=> {
      "name"=>"fred",
      "address"=>{ 
        "lines"=>["Random house", "4 Station road", "By the station"],
        "town"=>"cambridge"
    }}}

There’s another case in which nesting like this can be useful. Suppose for example that we have a list of users and we want the user to be able to change as many of them as they want with one edit operation. If we use the id of the records as the first key then this is easy:

  parse "users[1][name]=fred&users[1][email]=fred@example.com&
         users[2][name]=bob&users[2][email]=bob@example.com"
    =>{"users"=>{
      "1"=>{"name"=>"fred", "email"=>"fred@example.com"}, 
      "2"=>{"name"=>"bob", "email"=>"bob@example.com"
}}}

The corresponding controller code is similarly simple

  params[:users].each do |id, new_attributes|
    User.find(id).update_attributes new_attributes
  end

(Handling errors and invalid entries when editing multiple elements is an interesting problem in itself and is left as an exercise to the reader). If you are using the Rails form helpers, the :index option does precisely this:

  helper.text_field "person", "name"
    => '<input id="person_name" name="person[name]" size="30" type="text" />'
  helper.text_field "person", "name", "index" => 1
    => '<input id="person_1_name" name="person[1][name]" size="30" type="text" />'

Editing multiple objects

An interesting case is a form allowing us to create several models, for example to add several users to a mailing list. To do this we use parameters of the form users[][name]. This says that users is an array and we’re pushing a hash with key name onto it. So for example

  parse "users[][name]=fred&users[][email]=fred@example.com"
    => {"users"=>[{"name"=>"fred", "email"=>"fred@example.com"}]}

So how does rails know when one record is finished and the next starts? Simple: if we’ve already had a users[][name] parameter and we see a new one then we can usually assume that the new one must belong to a new user:

  parse "users[][name]=fred&users[][email]=fred@example.com&
         users[][name]=bob&users[][email]=bob@example.com"
    => {"users"=>[{"name"=>"fred", "email"=>"fred@example.com"}, 
                  {"name"=>"bob", "email"=>"bob@example.com"}]}
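One way to picture that rule is the following sketch. It takes the key/value pairs for one users[] array in the order they appeared in the form and starts a new record whenever a key repeats. group_records is a hypothetical helper for illustration, not something Rails exposes.

```ruby
# Illustrative only: groups [key, value] pairs (in form order) into records,
# starting a new record as soon as a key we have already seen shows up again.
def group_records(pairs)
  records = []
  pairs.each do |key, value|
    records << {} if records.empty? || records.last.key?(key)
    records.last[key] = value
  end
  records
end

pairs = [
  ["name", "fred"], ["email", "fred@example.com"],
  ["name", "bob"],  ["email", "bob@example.com"]
]
p group_records(pairs)
```

This also shows why field order matters: if a record is missing a field, nothing tells the parser which record the next field belongs to.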

To continue with a previous example, a user might have several addresses:

  parse "user[name]=fred&user[addresses][][line]=24+bob+street&user[addresses][][town]=cambridge&
         user[addresses][][line]=1+market+square&user[addresses][][town]=bedford"
    => {"user"=>{"name"=>"fred", "addresses"=>[
        {"line"=>"24 bob street", "town"=>"cambridge"}, 
        {"line"=>"1 market square", "town"=>"bedford"}]
}}

Turtles all the way down

Hashes can be nested as much as you want:

  parse "users[a][b][c][d]=fred"
    => {"users"=>{"a"=>{"b"=>{"c"=>{"d"=>"fred"}}}}}

Arrays can’t be nested: you can’t, for example, have a parameter which is an array of arrays. You can have a hash with an array parameter or an array of hashes but in general there can only be one level of ‘arrayness’.

To an extent this can be sidestepped, for example instead of an array of users you can have a hash of users keyed by id. If your data structure is genuinely just an array you can always make it into a hash that is keyed by array index. For example if we had a form displaying a list of users and each user has a name and a list of aliases then the following query string parses into what we want:

  parse "users[1][name]=fred&users[1][aliases][]=joker&users[1][aliases][]=the+bat&
         users[2][name]=bob&users[2][aliases][]=bobbo&users[2][aliases][]=bobster"
  =>{"users"=>{
    "1"=>{"name"=>"fred", "aliases"=>["joker", "the bat"]},
    "2"=>{"name"=>"bob", "aliases"=>["bobbo", "bobster"]}
}}
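If you do key by array index, the controller ends up with a hash whose keys are strings, so getting back to an ordered array takes one extra step. A minimal sketch (hash_to_ordered_array is a made-up name, and it assumes every key is an integer in string form):

```ruby
# Turns {"2" => b, "1" => a} into [a, b], ordering by the numeric value of the
# keys. Sorting the raw strings would put "10" before "2".
def hash_to_ordered_array(indexed)
  indexed.sort_by { |index, _| index.to_i }.map { |_, value| value }
end

users = {
  "10" => {"name" => "henry"},
  "2"  => {"name" => "bob"},
  "1"  => {"name" => "fred"}
}
p hash_to_ordered_array(users)
```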

Where it can go wrong

Consider the following query string:

  user[aliases][]=fred&user[aliases][name]=fred

This can’t work: user[aliases][] indicates that user has an aliases attribute that’s an array, but user[aliases][name] indicates that user has an aliases attribute that’s a hash. There are other ways in which this can happen, but the result is the same:

  TypeError: Conflicting types for parameter containers

I think some people occasionally mix up :include and :joins (or possibly don’t know of the existence of :joins).

:include is for loading associations. Before Rails 2.1 it always left outer joined the appropriate tables; starting with Rails 2.1 it will either load them with a separate query or join the appropriate tables.

:joins is for joining (duh). It either takes an sql fragment (eg “INNER JOIN foos on foos.id = foo_id”) or association names in a variety of ways and joins the relevant tables. For example:

:joins => :products
:joins => [:products, :customers]
:joins => [:products, {:customers => [:friends, :foes]}]
:joins => [:products, {:customers => {:friends => :parents}}]
:joins => {:order => {:customer => {:address => :some_other_association}}}

and so on. You can nest these as deeply as you want (although bear in mind that if you end up requiring a 23 way join you may want to rethink your strategy). The same notation for nested associations is used by :include. The one difference is that :joins creates inner joins for you (if you desperately need those outer joins, you can always use the string form of :joins, but you will have to write the sql fragment explicitly).

If what you want is to use attributes from the joined tables for sorting or in your conditions then both :include and :joins will work since they both cause the relevant tables to be joined. But :include then does a lot more work massaging the results that come back from the database, instantiating lots of activerecord objects, gluing together all the relationships in the appropriate manner. If all you wanted was to order or filter results based on some of the joined attributes then this work is wasted.

Just to show that I’m not making things up, I performed the following unscientific test:

Benchmark.bm(7) do |x|
  x.report("include:") do
    Customer.find :all, :include => :customer_detail, 
                 :conditions => "customer_details.date_of_birth >= '1980-01-1'"
  end  
  x.report("joins:") do 
    Customer.find :all, :joins => :customer_detail,
                 :conditions => "customer_details.date_of_birth >= '1980-01-1'"
  end
end

Some customers have provided extra details which we store in a separate table, and we want to get all customers born in 1980 or later.

The results:

             user     system      total        real
include  1.310000   0.040000   1.350000 (  1.532910)
joins    0.350000   0.040000   0.390000 (  0.454001)

In short, don’t use :include unless you will actually be accessing those associations and want to avoid the hit caused by loading them from the database one by one.

Berlin, here we come!

June 10th, 2008

Hot off the presses: the Railsconf Europe proposal submitted by my colleague Paul Butcher and myself has been accepted! Join us in Berlin to hear about the nifty stuff we've been up to. I can't wait!

Dealing with concurrency

June 8th, 2008

Sometimes you don’t want a user doing two things at once (or two users doing something to the same third party at once). Dealing with this sort of issue is quite fiddly: such problems are easy to overlook and hard to track down when they happen, as they tend to be heavily timing dependent[4]. There are a few tricks in the rails toolbox for dealing with them.

When I was at school the teacher used to dish out ‘bon points’ when you did something good. You could later trade those in for stickers and stuff. In a world very far removed from French primary school classrooms of 1990, the teacher might have written a webapp allowing you to do this (probably complete with an obnoxious facebook app where you could show everyone how many magic stars you have).

Optimism

The people table has a magic_stars column, and we want people to be able to spend their magic stars on various perks. If someone has been a good boy, you might want to use the credit method:

def credit
  self.magic_stars += 1
  save!
end

You need to be a little bit careful, as there’s a race condition here: suppose two people try to give someone a magic star at the same time

Person 1                 Person 2
loads child (3 stars)    loads child (3 stars)
adds 1                   adds 1
saves (4 stars)
                         saves (4 stars)

The 2 requests execute in separate mongrels (or mod_rails listeners etc…) and are completely oblivious to each other and will happily overwrite the change made by the other one. Rails’ optimistic locking will help us here: Just add an integer lock_version column to the model (don’t forget to make it default to 0) and the second ‘bad’ save will raise ActiveRecord::StaleObjectError. We can rewrite our credit method like this:

def credit
  self.magic_stars += 1
  save! # StaleObjectError can only be raised when we actually save
rescue ActiveRecord::StaleObjectError
  reload
  retry
end

So how does optimistic locking work? The key (unsurprisingly) is in the lock_version column. Assume that we loaded a person object with a lock_version of 4 and we now want to save it. We increment our lock version to 5. If no one else has touched this object then in the database it will still have a lock_version of 4.

Rails appends “WHERE lock_version = 4” to the update query, and then examines the number of rows that were updated (the database driver will return this). If we get 1 then we know everything went ok. If 0 rows were updated then we know that it must have been because someone touched that row and so we raise StaleObjectError.
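The mechanism is small enough to mimic in plain Ruby. The sketch below uses a hash as a stand-in for the people table and a home-made error class in place of ActiveRecord::StaleObjectError; all the names in it (update_where_version, save_copy and so on) are invented for illustration and none of this is Rails code.

```ruby
# Stand-in for ActiveRecord::StaleObjectError
class SketchStaleObjectError < StandardError; end

# Mimics "UPDATE people SET ... WHERE id = ? AND lock_version = ?"
# and returns the number of rows affected (0 or 1).
def update_where_version(table, id, new_stars, expected_version)
  row = table[id]
  return 0 unless row && row[:lock_version] == expected_version
  row[:magic_stars]  = new_stars
  row[:lock_version] = expected_version + 1
  1
end

# Saves an in-memory copy of a row, raising if someone else got there first.
def save_copy(table, id, copy)
  affected = update_where_version(table, id, copy[:magic_stars], copy[:lock_version])
  raise SketchStaleObjectError, "row #{id} is stale" if affected.zero?
  copy[:lock_version] += 1
end

def credit_twice_concurrently
  table  = { 1 => { :magic_stars => 3, :lock_version => 0 } }
  first  = table[1].dup # both "requests" load the same row
  second = table[1].dup
  first[:magic_stars]  += 1
  second[:magic_stars] += 1

  save_copy(table, 1, first)    # succeeds, lock_version becomes 1
  begin
    save_copy(table, 1, second) # stale: still expects lock_version 0
  rescue SketchStaleObjectError
    second = table[1].dup       # reload
    second[:magic_stars] += 1
    save_copy(table, 1, second) # succeeds this time
  end
  table[1]
end

p credit_twice_concurrently
```

Both credits survive: the second save is rejected once, the copy is reloaded and the increment is applied to the fresh value.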

Optimistic locking is nice because we’re not sitting on a database lock at any point, so we won’t hold up anyone else.

Optimism only gets you so far

Let’s move on to another function: allowing children to swap their magic stars for some sort of perk. We need to check that they’ve got enough stars and, if they do, debit the appropriate amount and give them their perk. This is one of those classic times where you want a transaction: you don’t want a classroom full of 6 year olds screaming because you took their stars but didn’t give them their reward. You might write

def buy treat
  transaction do  
    if magic_stars >= treat.cost
      self.magic_stars -= treat.cost
      self.treats << treat
      save!
    end
  end
end

If something bad happens halfway through then the transaction[1] will roll back all the changes together.

There’s a big problem with this code. Suppose the child tries to buy two things in very quick succession. They’ve got 10 stars to begin with and each item costs 5. If the timing is right, then things will look like this:

Connection 1                Connection 2
start transaction
check number of stars       start transaction
set number of stars to 5    check number of stars
add the treat               set number of stars to 5
save                        add the treat
                            save
commit the transaction      commit the transaction

The 2 connections walk all over each other. In this case the children will be happy: both connections will set the remaining number of stars to 5, but both items will have been bought!

So why didn’t our optimistic locking help? Because of transaction isolation: in most databases by default your transaction won’t see changes made by other, uncommitted transactions. In fact you won’t see any changes made after your transaction was started. So when rails tries to save, as far as it can tell the row hasn’t changed and the save raises no errors.

Pessimism

The name optimistic locking suggests that there is an alternative, and there is: ask the database to get an actual lock for you. You can do this in two ways in rails: you can pass :lock => true to a finder (instead of passing true you can pass an sql fragment if you want to change the type of lock acquired) or you can call the lock! method (which is effectively a reload with :lock => true). It’s important to note that a lock is held as long as the current transaction. In particular if you aren’t inside a transaction then nothing happens, or in other words

def do_something_with_lock
  lock!
  do_something
end

doesn’t accomplish anything at all. Inside a transaction however, you’ll get an exclusive lock and no other transaction will be able to get a lock for that row. We can rewrite our buy method to look like

def buy treat
  transaction do
    lock!
    ...
  end
end

The lock stops another connection from fiddling with that person at the same time. It’s just a row lock, so it won’t stop you editing a different person at the same time[2]. If someone called our credit method from above at the same time that would be ok, as updating the customer row to change the amount of stars they have also requires an exclusive lock, so we’re in no danger here.

We can use these locks in more general ways, even if we don’t want to actually lock the customer row. For example if you have a constraint like a person may only borrow a certain number of books then you pretty much have to do it in ruby which renders you vulnerable to the various race conditions mentioned before.

You’re not modifying any rows (just adding one to a join table), so optimistic locking is no good. You can however lock the person row[3]. You’re not actually changing it at all, just using the database as a synchronisation method. We can repackage this up as something we can use all over our app:

def exclusive
  transaction do
    lock!
    yield
  end
end

Then in your app you can do things like

  some_customer.exclusive do
    #some task 
  end

It should be an easy exercise for the reader to put this in a form where any ActiveRecord model has this exclusive method.
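For what it’s worth, one possible shape for that exercise is a module. The sketch below only relies on the object having transaction and lock! instance methods, which is what the code above already uses; the FakeRecord class is purely a stand-in so that the example runs without Rails, and mixing the module into ActiveRecord::Base (shown in a comment) is my assumption about how you would wire it up, not something tested here.

```ruby
module Exclusive
  # Runs the block inside a transaction while holding a row lock on self.
  def exclusive
    transaction do
      lock!
      yield
    end
  end
end

# Assumption: in a real app you would mix it in with something like
#   ActiveRecord::Base.send :include, Exclusive
# which is not exercised in this sketch.

# Fake record so the sketch runs without Rails; it just records call order.
class FakeRecord
  include Exclusive
  attr_reader :events

  def initialize
    @events = []
  end

  def transaction
    @events << :begin
    yield
    @events << :commit
  end

  def lock!
    @events << :lock
  end
end

record = FakeRecord.new
record.exclusive { record.events << :work }
p record.events
```

The recorded order shows the property that matters: the lock is taken inside the transaction and before the block runs.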

[1]People sometimes ask about the difference between Person.transaction, the instance method transaction and ActiveRecord::Base.transaction. There is no difference between the first two – the instance method just calls self.class.transaction. The transaction method on Person, ActiveRecord::Base or some other model class only differ if the models use a different database connection. Most of the time, the three can be used interchangeably.

[2]If you use a database without row locks (ie mysql with myisam tables) you’ll probably get something horrible like a table lock.

[3]Do be careful about deadlocks though.

[4]The sleep function can be rather helpful when you’re trying to force particular bits of code to overlap with each other in particular ways.
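As a rough illustration of that last footnote, here is the load / add 1 / save race reproduced with two plain Ruby threads instead of two requests. The sleep sits between the load and the save so that the two threads overlap; credit_in_parallel is a made-up name and the Mutex stands in for a database lock, so treat it as an analogy rather than a Rails recipe.

```ruby
require 'thread'

# Two threads doing load / add 1 / save on a shared counter. The sleep widens
# the window between load and save so that the two overlap.
def credit_in_parallel(lock = nil)
  stars = 3
  work = lambda do
    loaded = stars     # load
    sleep 0.2          # force the overlap
    stars = loaded + 1 # save
  end
  threads = 2.times.map do
    Thread.new { lock ? lock.synchronize(&work) : work.call }
  end
  threads.each(&:join)
  stars
end

puts credit_in_parallel            # one of the two credits gets lost
puts credit_in_parallel(Mutex.new) # serialised, so both credits are kept
```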

Squeeze your pipes

May 29th, 2008

It’s very easy to merrily write a web application without realising that all those images and ajax calls you’re using make it rather sluggish when you’re not just connecting to localhost. Even when you’ve uploaded it to your production or test servers chances are you’ll have pretty good bandwidth and latency between you and those servers so it can be hard to see just what it will be like for an end user who is less well endowed in the broadband department (or even still on dialup).

It’s not just about size

Very often it’s not just about how many bytes a second you could be transferring; the amount of latency is also a critical component of what the app feels like to the end user. Luckily we can simulate both.

The piece of kit we need is a traffic shaper, a piece of software that sits in between you and the end server and modulates the flow of packets. If you’re using linux or Mac OS X you’ve probably already got everything you need. I’m a mac nerd so I’ll concentrate on that.

Luckily it’s pretty damn easy on the mac as ipfw has all the necessary bits (from 10.4 upwards). The first thing we need to set up is a pipe. To quote the man page, “A pipe emulates a link with given bandwidth, propagation delay, queue size and packet loss rate”.


  sudo ipfw pipe 1 config bw 300Kbit/s
  sudo ipfw pipe 2 config bw 500Kbit/s delay 100

This creates 2 pipes: one with a bandwidth of 300Kbit/s, and the other with a bandwidth of 500Kbit/s and a propagation delay of 100ms (ie when a packet goes into that pipe it won’t come out the other end for 100ms)
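To get a feel for what those numbers mean, a back-of-the-envelope estimate is delay plus size divided by bandwidth. The sketch below is only that: it ignores TCP handshakes, slow start and protocol overhead, it assumes “Kbit” means 1000 bits, and the 150,000 byte page is a figure picked purely for illustration.

```ruby
# Rough estimate: time = propagation delay + size / bandwidth.
# Assumptions: "Kbit" is 1000 bits; round trips and protocol overhead ignored.
def transfer_seconds(bytes, kbit_per_s, delay_ms = 0)
  (delay_ms / 1000.0) + (bytes * 8) / (kbit_per_s * 1000.0)
end

page_bytes = 150_000 # hypothetical page weight, images included

puts transfer_seconds(page_bytes, 300)      # through pipe 1
puts transfer_seconds(page_bytes, 500, 100) # through pipe 2
```

That works out at about 4 seconds through the first pipe and about 2.5 seconds through the second. In practice a page made of many small requests pays the delay many times over, which is why latency can matter more than raw bandwidth.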

Having set up the pipes, you then need to add firewall rules that send traffic through them. For example if you’ve got your mongrel running on port 3000 you can run


  sudo ipfw add pipe 2 tcp from any to any src-port 3000
  sudo ipfw add pipe 2 tcp from any to any dst-port 3000

to send traffic to/from port 3000 through pipe number 2. When you’re done with your testing or if you mess up, run


  sudo ipfw show

to show all the firewall rules you’ve created. The output will look a little like


00300    72    18982 pipe 2 tcp from any 4500 to any
00400    24     1896 pipe 2 tcp from any to any dst-port 4500
65535 58931 55276029 allow ip from any to any

The first column is the rule number (so 300 and 400 in my case). To remove the rules just run sudo ipfw delete rule-number. This isn’t the sort of thing I’d even begin to worry about at the beginning of the project, but worth keeping at the back of your mind.

This might not hold true for your vitriolic rants about your ex-girlfriend and similar, but blogging technical content is great. It’s good for the community in general when there are good howtos, explanations etc… and I’m sure a lot of people like the feel of being read by many people, but it’s not just that.

It’s often said that a true test of whether you understand something is whether you can explain it to someone else. Even if you were to erase a blog post the second you finished it, chances are you will have learnt something. I always do, even if I’m writing about something I already know well. Putting something in the form of a coherent piece of writing pushes me to explore all the edge cases and ramifications which, even if I was aware of them, I hadn’t fully thought through before.

So go and write something interesting and do yourself and the community some good. Pick something interesting you worked on recently, a common question or misunderstanding, anything will do!

Reload me, Reload me not

May 28th, 2008

Rails’ development[1] mode automatic reloading is pretty nifty. It would really suck having to restart the server every time you made a change, in terms of the seconds wasted each time, the manual pressing of buttons you’d have to do and the 5 minutes wasted here and there when you forgot to restart. There’s no point reloading rails itself though, so all that stuff stays around. No point reloading plugins either really. I should probably point out that reloading is a misnomer: it implies things are actively loaded (which they aren’t). What actually happens is, roughly, that Rails forgets that it has seen Customer and loaded customer.rb, so when it next hits Customer it goes off down the const_missing chain and loads your file again.

Reloading plugins could be useful if you were working on a plugin or trying to diagnose a problem with one, but maybe that’s an edge case. That is, until your plugin starts to contain something with a long-lived reference to one of your application’s classes (or an instance thereof). A classic example is model classes with associations going over the plugin/app boundary. A quick trip to the console shows the problem:

I’ve got a teeny tiny app with 2 models: messages and conversations (think forums): conversations have many messages.

>> c = Conversation.find :first
=> #<Conversation id: 1, title: "hello world", created_at: "2008-05-28 09:03:58", 
updated_at: "2008-05-28 09:03:58">
>> c.messages
=> []
>> reload!
Reloading...
=> true
>> c.messages
NoMethodError: undefined method `messages' for #<Conversation:0x1849ccc>

reload! makes Rails do its class reloading (a useful trick in itself if you’re fiddling around at the console and want to see some changes). But what happened? We had a perfectly good instance of conversation and then it got trashed! Lets take a closer look:

>> first_class = Conversation
=> Conversation(id: integer, title: string, created_at: datetime, updated_at: datetime)
>> first_class.object_id
=> 13302014
>> first_class.instance_methods - ActiveRecord::Base.instance_methods
=> ["message_ids", "message_ids=", ... ]
>> reload!
Reloading...
=> true
>> first_class.instance_methods - ActiveRecord::Base.instance_methods
=> []
>> Conversation.object_id
=> 13772244

So we stash away the conversation class. It’s got an object id, the methods we would expect, all good. Then we reload! and all the methods are gone. If you ask for Conversation again you get a different class! This is how class reloading of ActiveRecord classes works: the old class is gutted and the constant removed, but we can’t stop people hanging onto references to the old class [3](incidentally if you ever get an incomprehensible message about methods not existing when you can see the method definitions right in front of you then you’ve probably run into a variant of this).
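The same effect can be reproduced without Rails. The sketch below mimics the behaviour described above (gut the old class, remove the constant, define a fresh class under the same name); load_conversation and forget_conversation are invented names and this is an imitation of the idea, not the code Rails runs.

```ruby
# "Loads" a Conversation class, the way loading conversation.rb would.
def load_conversation
  klass = Class.new do
    def messages
      []
    end
  end
  Object.const_set(:Conversation, klass)
end

# Mimics the reload described above: gut the old class, forget the constant.
def forget_conversation
  old = Object.const_get(:Conversation)
  old.send(:remove_method, :messages)
  Object.send(:remove_const, :Conversation)
end

load_conversation
old_class    = Conversation
old_instance = Conversation.new

forget_conversation
load_conversation

p old_class.equal?(Conversation)          # false: the constant now points at a new class
p old_instance.respond_to?(:messages)     # false: the instance belongs to the gutted class
p Conversation.new.respond_to?(:messages) # true: fresh instances are fine
```

Anything that stashed old_class or old_instance before the reload is now holding a husk, which is exactly the situation a plugin can get itself into.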

So if you’ve got a plugin holding onto a reference to some class, this is what is likely to happen to it, nonsensical messages about methods not existing when they really should. Maddeningly, the first time you load a page it will load fine, but refresh it and it’s gone! Your tests will pass too, and if you’re even half sane then your app dying even though the tests pass will give you a bad feeling. If you’re lucky the only bad thing will be that changes won’t be noticed until a restart, but even that’s quite annoying (especially if you didn’t expect it – application classes are reloaded after all !).

The way out

The obvious way out is to reload the plugin as well[4]. The set of things that are loaded only once is controlled by Dependencies.load_once_paths [5](except for rails itself, that’s special) and by default plugin lib directories are added to it. As long as you load via the rails dependency mechanisms, any constant loaded from a path not in that array will be reloaded after a request. This is why a ruby style require of application classes is usually a bad thing: it can stop rails from reloading some of your classes which leads to the problems seen above.

In my dummy app I’ve got

>> Dependencies.load_once_paths
=> ["/Users/fred/empty_app/vendor/plugins/dummy_plugin/lib"]

There’s an easy way out: in the plugin’s init.rb just stick [2]

Dependencies.load_once_paths.delete(File.expand_path(File.dirname(__FILE__))+'/lib')

If we boot up our sample app again:

>> Dependencies.load_once_paths
=> []

Job done!

[1]None of this applies if config.cache_classes is true (for example in production mode).

[2]If you load plugins from non standard locations, you may have to fiddle with that: the string you delete must exactly match what rails put in Dependencies.load_once_paths, that it corresponds to the same location on disk is not enough.

[3]This does mean I was lying slightly when I said this could be handy if you’re working on a plugin: if what your plugin is doing is extending some rails base class with a module, then this won’t help. Rails won’t have stripped the methods from the module (so nothing will break), but ActionController (for example) will still have the old module included in it and so won’t see the changes.

[4]Another lie. Aren’t I naughty? The classes from the plugin’s lib folder will be reloaded as needed but the plugin’s init.rb won’t be run again (which in a way is the reason for the previous caveat: we never include the new module into ActionController. Rerunning init.rb would be messy though, apart from anything else it wouldn’t unwind the changes we made, so if we removed some methods from our module they would still be in ActionController).

[5]Dependencies is moving to the ActiveSupport namespace, so replace every occurrence of Dependencies with ActiveSupport::Dependencies if you are running > 2.1.

Conditional RJS explained

May 26th, 2008

RJS is one of those nifty little things that catches your eye when you first start off using Rails. You very quickly get to add ajax and whizzy effects without having to peer into the somewhat messy world of javascript. Some would argue that it’s a bit of a crutch and you’re better off casting it away and just writing the javascript yourself. To an extent I agree, but I do think it has its place (and also its shortcomings). One of the problems with RJS is that it creates a convenient illusion and it can occasionally catch you out unless you understand that illusion and its limitations.

A question I frequently see is along the lines of ‘I want my rjs to do x only if y is not already on the page’. For example if some element is on the page you want to highlight it, if not you want to create it. People often come up with something like this as their first attempt:

  if page['some_div'].visible
    page['some_div'].replace :partial => 'some_partial'
  else
    page.insert_html :after, 'something_else', :partial => 'header'
  end

It’s important to understand that this cannot possibly ever work, not in a month of mondays or without destroying the fabric of space time. Until you understand this (and I’ll try to explain) you haven’t understood rjs and you will be confused. More generally if the conditions you are interested in are client side (is this div there, is x in the text box etc…) then they can’t be handled by a plain old if in your rjs.

RJS in 5 minutes

RJS allows you to write ruby instead of javascript. Unfortunately browsers don’t support ruby as a scripting language, and so that ruby is transformed into javascript. The way it works (simplifying horribly) is that an instance of JavascriptGenerator (the page object you use in an rjs file or a render :update block) responds to methods by outputting the appropriate javascript (or another proxy object).

For example page['some_div'] creates a proxy for the dom element with that id. If you call a special method on it (like render, visual_effect, insert, [] etc…) then rails will generate the appropriate bit of javascript. If you call anything else, then rails assumes you wanted to generate that javascript function call and does that for you (it converts the arguments for you, so strings are escaped properly, hashes are converted to json objects etc…).
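The “anything else becomes a javascript call” part can be imitated in a few lines of plain Ruby with method_missing. This is a toy to show the idea, not the real generator: TinyGenerator and TinyElementProxy are invented names, there are no special methods, and arguments are simply run through to_json.

```ruby
require 'json'

# Toy element proxy: any method called on it is emitted as a JavaScript call.
class TinyElementProxy
  def initialize(lines, id)
    @lines = lines
    @id = id
  end

  def method_missing(name, *args)
    js_args = args.map { |arg| arg.to_json }.join(', ')
    @lines << "$(#{@id.to_json}).#{name}(#{js_args});"
    self
  end
end

# Toy page object: page['id'] hands back a proxy, to_s gives the JavaScript.
class TinyGenerator
  def initialize
    @lines = []
  end

  def [](id)
    TinyElementProxy.new(@lines, id)
  end

  def to_s
    @lines.join("\n")
  end
end

page = TinyGenerator.new
page['customers'].setStyle(:display => 'inline')
page['header'].hide
puts page
```

Note that nothing here ever runs in a browser: the Ruby side only ever builds up a string of javascript.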

So, some examples:

  page.select('.alert').each {|element| element.hide()}
  page['customers'].setStyle(:display => :inline, :padding => '1ex')
  page['header'].replace :partial => 'header'
  page['customers']['style']['display']='none'

these generate the following javascript:

$$(".alert").each(function(value, index) { value.hide(); });
$("customers").setStyle({"padding": "1ex", "display": "inline"});
$("header").replace("Contents of the partial");
$("customers").style.display = "none";

However you can’t write

page['customers'].select("li:not('.active')").each {|element| element.hide }

(hide all children of $('customers') that are list elements without the class active) because the proxy doesn’t know how to proxy it. Not everything works completely seamlessly, so if something doesn’t work there’s a chance that RJS just doesn’t understand what you’re trying to do. You can normally work your way out of it, in this case

page['customers'].select("li:not('.active')").each(
    page.literal 'function(element){element.hide()}')

will do the trick (page.literal dumps a literal bit of javascript), the question is whether it’s worth the effort or whether you’d be better off writing raw javascript instead of trying to bludgeon RJS into doing what you want.

R is for Ruby

Your rjs file is just normal ruby, so you can do things like

  if @divs_to_hide
    page.hide(@divs_to_hide)
  end
  if @element_to_remove
    page[@element_to_remove].remove()
  end

Which brings us to the main point: since rjs is just regular ruby (there’s just that special local variable), the if is a regular ruby if. It’s evaluated server-side, and so it’s just plain not possible for it to refer in any meaningful way to something client side.
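To see this in isolation, here is a sketch with a plain array standing in for page (so no Rails involved; generate_js is a made-up name). The Ruby if is decided while the javascript is being generated, and the output contains no trace of the condition.

```ruby
# Stand-in for an rjs template: collects lines of JavaScript in an array.
def generate_js(element_to_remove)
  page = []
  if element_to_remove # plain Ruby if, decided on the server
    page << "$(#{element_to_remove.inspect}).remove();"
  end
  page.join("\n")
end

puts generate_js('sidebar') # a single remove() call, no condition in sight
puts generate_js(nil)       # nothing at all
```

By the time the browser sees anything, the decision has already been made using only what the server knew.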

A new hope

There is a way out though. What we want is a javascript if – so just generate one!

  page << "if($('some_div').visible()){"
  page['some_div'].replace :partial => 'some_partial'
  page << '}else{'
  page.insert_html :after, 'something_else', :partial => 'header'
  page << '}'

which generates

if($('some_div').visible()){
  $("some_div").replace("Contents of your partial here");
}else{
  new Insertion.After("something_else", "Contents of your other partial here");
}

Not very pretty [1], but it gets the job done. page << inserts into the generated javascript the string you pass; we’ve just wrapped our other bits of javascript with some if statements and associated punctuation. Both partials are rendered, stuffed into the javascript and sent to the browser (since we can only choose what to use client side), so this wouldn’t be very efficient if you were deciding which of 2 enormous bits of html to insert.
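If it helps to see the two kinds of if side by side, here is a toy sketch in plain ruby. ToyPage and ElementProxy are made-up stand-ins, not the Rails implementation: they only mimic page << (append raw javascript) and page['id'] (record a method call as javascript). The ruby if is settled on the server before anything is sent, while the javascript if is just text that the browser evaluates later.

```ruby
require 'json'

# Toy stand-in for the RJS page object (hypothetical, not the Rails code).
class ToyPage
  def initialize
    @lines = []
  end

  # Like page <<: append a raw piece of javascript untouched.
  def <<(javascript)
    @lines << javascript
    self
  end

  # Like page['id']: hand back a proxy that turns method calls into javascript.
  def [](id)
    ElementProxy.new(self, id)
  end

  def to_s
    @lines.join("\n")
  end

  class ElementProxy
    def initialize(page, id)
      @page, @id = page, id
    end

    # Any method called on the proxy becomes a javascript call on $("id").
    def method_missing(name, *args)
      arguments = args.map { |arg| arg.to_json }.join(', ')
      @page << "$(#{@id.to_json}).#{name}(#{arguments});"
    end

    def respond_to_missing?(name, include_private = false)
      true
    end
  end
end

show_notice = false                        # a server-side value

page = ToyPage.new
page['notice'].hide if show_notice         # ruby if: decided on the server
page << "if($('some_div').visible()){"     # javascript if: decided in the browser
page['some_div'].hide
page << '}'

puts page
```

Since show_notice is false, nothing about 'notice' ever reaches the browser, whereas the javascript if and its body are always sent.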

Another trick

It occurred to me recently that there’s another trick we can play as long as you can express your condition as the existence of an element specified by a css selector:

page.select('#foo').each do |element| 
  element.highlight
  page.alert('Foo is on the page')
end

This will highlight the element with id foo and display a message if that element exists (I could have used page[‘foo’] in the block if I had wanted to).

As described above, with select we can iterate over a collection and this is just a special case. The collection of items with id foo should contain precisely 1 or 0 elements, so iterating over it is the same as testing whether #foo exists. You can of course put anything you want in this block. The same caveats about all your partials etc… being rendered whether they are used or not still apply and there is the additional limitation that you can’t express things like “do x if y doesn’t exist”.

Hopefully that’s cleared things up a bit. If you want to know more about the magic behind RJS, take a look at prototype_helper.rb in action_view/helpers, it’s where all this stuff lives.

[1] There’s a plugin that tidies this up.

Confused by sync.rb ?

May 18th, 2008

I needed a reader/writer lock the other day and followed the trail to ruby's Sync class (why this gets to call itself Sync/Synchronizer as opposed to all the other synchronisation primitives is beyond me). The documentation isn't exactly enlightening either. I'm sure there's all sorts of clever stuff to do with upgrading a lock you already hold and other stuff like that, but if you're just interested in the boring case all you need is

  lock = Sync.new

  lock.synchronize(Sync::EX) do
    #do something that requires a writer (exclusive) lock
  end

  lock.synchronize(Sync::SH) do
    #do something that requires a reader (shared) lock
  end

One of the nice things about Rails is that it makes it really easy to get off the ground with Ajax, thanks to the link_to_remote helper (and all the other functions that share the same options like remote_function, form_remote_tag, observe_field etc... [1]). The hard work is actually done by prototype but now is not the time to worry about that.

The question that comes up over and over again is how do you add extra parameters, and this is where the :with option comes into play. This is all in the docs, but it's a little on the terse side. First things first though: if the parameters are known to you at the time when you call link_to_remote there's no need to use :with. Just pretend it's a link_to and do

  link_to_remote 'Click',  
        :url => {:action => 'foo', :some_param => 42, :something_else => 'hello world'}

That of course is the boring case. The interesting case happens when you want to submit some javascript variables or some input elements. Take a step back and look at what remote_function actually generates. It looks a little like this

  new Ajax.Request('/dummy/foo', {asynchronous:true, evalScripts:true, 
           parameters:'authenticity_token=' + encodeURIComponent('snip')})

The interesting thing here is the parameters option. It either takes a javascript object or a query string. By default all you'll get here is the authenticity token (part of Rails' CSRF protection), and if you've turned that off you'll get nothing. Everything you stick here gets passed as a parameter to your action, and in a nutshell rails sticks the value of your :with option here.

There's another easy case before we get stuck in. If you just want to submit the contents of the form then the :submit option is your best friend. Just pass it the id of a form, and rails will set the parameters to Form.serialize('some_form') and all the inputs from that form will get sent over with the ajax request.

:with => 'fancy pants'

In the most general case, you can pass whatever you want to :with as long as it evaluates to a valid query string (or if you don't have forgery protection turned on any javascript object will do [2]). Object.toQueryString is your friend here, it will turn any javascript object into a query string and takes care of escaping anything. Your other important friend is Form.Element.serialize. Given a form element or id it chucks back an appropriate piece of query string. It's even available as a method on extended elements, so instead of Form.Element.serialize('some_field') you can write $('some_field').serialize().

So if you've got a form element with id message you can create an ajax link that submits it with

  link_to_remote 'Click me', :url => {:action => 'foo'}, :with =>"$('message').serialize()"

If you had 2 elements you can just string them together (remember that all we're doing here is building up a query string):

  link_to_remote 'Click me', :url => {:action => 'foo'}, 
                 :with =>"$('message').serialize() + '&' + $('comment').serialize()"

Personally I hate all that messing around concatenating strings and ampersands, so I would usually just write

  link_to_remote 'Click me', :url => {:action => 'foo'}, 
     :with =>"$H({message: $F('message'), comment: $F('comment'), 
        fromUser:prompt('Tell me something')}).toQueryString()"

What we've done here is build up a prototype hash (that's the $H({...}) bit) and then called toQueryString on it, which does exactly what it says. $F is another prototype helper that, given an id, returns the value of the associated input element. The last part shows that we can do anything we want really. Here I'm asking the user to provide some text, but it could be some pre-existing javascript variable or anything you want.

If you decide not to use the various prototype serialisation helpers then you do need to be a little careful: it's up to you to ensure that what needs to be escaped is escaped. A handy function here is encodeURIComponent:

  link_to_remote 'Click me', :url => {:action => 'foo'}, 
     :with =>"'thing='+encodeURIComponent('I am a nasty & funky string')"

This ensures the ampersand is encoded rather than messing up everything. You can of course combine that with some of the other tricks:

  link_to_remote 'Click me', :url => {:action => 'foo'}, 
     :with =>"'thing='+encodeURIComponent(someFunctionReturningAString())"
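To see why the escaping matters, here is a rough sketch in plain ruby of what happens on the receiving end. parse_query is a deliberately naive, made-up parser (not what Rails uses), and URI.encode_www_form_component is only standing in for encodeURIComponent (it writes spaces as + rather than %20, but the ampersand is handled the same way).

```ruby
require 'uri'

# Naive query string parser (hypothetical helper): split on &,
# then on the first =, then unescape both halves.
def parse_query(query)
  query.split('&').map do |pair|
    name, value = pair.split('=', 2)
    [URI.decode_www_form_component(name), URI.decode_www_form_component(value.to_s)]
  end
end

nasty = 'I am a nasty & funky string'

# Without escaping, the ampersand is read as a parameter separator.
unescaped = parse_query('thing=' + nasty)

# With escaping, the whole string arrives as a single value.
escaped = parse_query('thing=' + URI.encode_www_form_component(nasty))

p unescaped
p escaped
```

The unescaped version comes out as two parameters: a truncated thing and a bogus one named " funky string" with an empty value. The escaped version gives a single thing parameter with the full text.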

observe_field is a naughty boy

Everything I've said holds true for observe_field, but observe_field allows you to take some short cuts. If what you pass to :with doesn't look like a valid query string or interesting javascript expression then rails assumes you're just specifying what name you want the parameter to be submitted as, so

  observe_field 'some_field', :with => 'q'

expands to

  observe_field 'some_field', :with => '"q="+value'

This is executed in a context where value is the new value of the form element. Can you spot the problem yet? You need to return a valid query string and value is just the value of the field. By some good fortune you get away with this right until you type an ampersand: it's not escaped and so your query string is mangled (this has since been fixed, so edge and rails 2.1 are ok). If you're on an older version you should use something like

  observe_field 'some_field', :with => '"q="+encodeURIComponent(value)'

[1]The canonical function is actually remote_function, everyone else just uses its output. For example form_remote_tag is basically just a normal form whose onSubmit is the output of remote_function. As far as the docs go though, all the goodies are under link_to_remote.

[2]Prototype is smart enough to know that if you pass an object as the parameters option it should call Object.toQueryString on it, so for example parameters: {a: 2+2, b: "hello"} would do the right thing. The reason forgery protection makes a difference is that rails assumes that your :with option evaluates to a query string and appends to that authenticity_token=xxx.

Recently a fair few people on rubyonrails-talk have been getting strange error messages about config.time_zone or referring to funny versions of gems (like rails 2.0.2.9216). This rather suggests that they were using edge versions of rails (since config.time_zone is an edge thing, and rails 2.0.2.9216 means 'the edge gem from revision 9216'). These edge gems come from gems.rubyonrails.org. But why were so many people suddenly doing this, all at the same time?

It turns out that in its infinite wisdom Aptana silently adds gems.rubyonrails.org to your gem sources. If you remove gems.rubyonrails.org, Aptana will put it back. Don't ask me why, it's madness as far as I can tell. So you get edge rails gems installed, thus the rails command now refers to edge rails and generates you an environment.rb suitable for edge rails (including the config.time_zone line). The gem version specified in environment.rb is still 2.0.2 though, so your app loads 2.0.2 which doesn't know about the edge features. Failure.

So, to fix this:

  • Uninstall all the edge gems (the ones with versions like 2.0.2.9216; the last number may vary)
  • Remove gems.rubyonrails.org from your gem sources: gem sources -r http://gems.rubyonrails.org
  • (Optional) get rid of Aptana, since it will keep adding gems.rubyonrails.org (or wait for the version that fixes this)

You might also want to recreate your app, as the environment.rb that edge rails created for you will not work with rails 2.0.2.

Rails makes setting up your associations dead easy and very concise. There's a whole bunch of options that can be set, but as long as you are following Rails' conventions you very rarely need to use them. If you're new to rails you might not even know they exist and get away with it 99% of the time. One case where you do need them is when you've got more than one association pointing at the same table.

If you wrote an application tracking sales between individuals (let's call it railsbay) you might have a table called sales and in that table you'd need to track the buyer and the seller. Both buyer and seller should be objects from the users table, and we've got two columns that refer to them: buyer_id and seller_id. Before we tackle what we actually want to do, let's take a step back. This would all be really easy if there was only one user associated with a sale. So let's take a look at what happens inside when we write

class Sale < ActiveRecord::Base
  belongs_to :user
end

When you do this [1], ActiveRecord derives the class name by taking :user and camelizing it to get User, the assumed class name (if this was a has_many, it would also singularize, so users -> User). If you specify a :class_name option, that takes priority. The foreign key (the database column containing the id of the associated user, in this case user_id) is derived by taking the association name (user) and adding _id [2]. It could be overridden with :foreign_key. The first parameter (:user) is the name of the association, so in particular it is the name of the method you call to get the value of the association. All this means is that we could also have written

class Sale < ActiveRecord::Base
  belongs_to :user, :class_name => 'User', :foreign_key => 'user_id'
end

It's rather verbose, so we don't. If you didn't like your users (that's not really a good idea) you can say

class Sale < ActiveRecord::Base
  belongs_to :nasty_user, :class_name => 'User', :foreign_key => 'user_id'
end

and now you need to access the association as sale.nasty_user instead of sale.user.
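The naming rules for belongs_to can be sketched in a few lines of plain ruby. This is not ActiveRecord's code: camelize and belongs_to_defaults are made-up helpers that only reproduce the defaults described above (class name from the camelized association name, foreign key from the association name plus _id, explicit options win).

```ruby
# Hypothetical helper: turn nasty_user into NastyUser.
def camelize(name)
  name.to_s.split('_').map { |part| part.capitalize }.join
end

# Hypothetical helper: the options belongs_to would end up with
# once the defaults have been filled in.
def belongs_to_defaults(association, options = {})
  {
    :class_name  => options[:class_name]  || camelize(association),
    :foreign_key => options[:foreign_key] || "#{association}_id"
  }
end

plain  = belongs_to_defaults(:user)
nasty  = belongs_to_defaults(:nasty_user, :class_name => 'User', :foreign_key => 'user_id')
buyer  = belongs_to_defaults(:buyer, :class_name => 'User')

p plain
p nasty
p buyer
```

The last call shows the useful bit: with only :class_name given, the foreign key still defaults to buyer_id because it follows the association name, not the class.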

We can now return to the original problem. We've got 2 users to associate, a buyer and a seller, so the associations might as well use those names. Our foreign keys are buyer_id and seller_id, and in both cases the association will refer to a User, so we need :class_name => 'User'

class Sale < ActiveRecord::Base
  belongs_to :buyer, :class_name => 'User', :foreign_key => 'buyer_id'
  belongs_to :seller, :class_name => 'User', :foreign_key => 'seller_id'
end

You don't actually need the foreign key in this case (since, from rails 2.0, the default is association name + _id), so this simplifies down to

class Sale < ActiveRecord::Base
  belongs_to :buyer, :class_name => 'User'
  belongs_to :seller, :class_name => 'User'
end

Easy! Now we just need to do the other side of the association. A user will have many sales as a seller and many sales as a buyer, we'll call the associations sales and purchases. Like before, the association name determines the method you call to get/set an association. ActiveRecord assumes the defaults in much the same way, so from

class User < ActiveRecord::Base
  has_many :sales
end

ActiveRecord infers that the class is Sale (overridable by the :class_name option). The foreign key inference is different. Since we're talking about creating associations on the User class, ActiveRecord assumes that database columns pointing at this table are called user_id (underscore the class name and add _id [3]). So the above is equivalent to

class User < ActiveRecord::Base
  has_many :sales, :class_name => 'Sale', :foreign_key => 'user_id'
end
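Here is the has_many side of the inference as a plain ruby sketch. Again these are made-up helpers, not ActiveRecord's code, and the singularizing is a crude 'chop off the trailing s' that is only good enough for this example. The point is that the class name comes from the association name while the foreign key comes from the class that owns the association.

```ruby
# Hypothetical helpers, for illustration only.
def camelize(name)
  name.to_s.split('_').map { |part| part.capitalize }.join
end

def underscore(class_name)
  class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Crude singularize: just drop a trailing s (an assumption, the real
# inflector knows far more rules than this).
def naive_singularize(name)
  name.to_s.sub(/s\z/, '')
end

def has_many_defaults(owner_class_name, association, options = {})
  {
    :class_name  => options[:class_name]  || camelize(naive_singularize(association)),
    :foreign_key => options[:foreign_key] || "#{underscore(owner_class_name)}_id"
  }
end

sales     = has_many_defaults('User', :sales)
purchases = has_many_defaults('User', :purchases)

p sales
p purchases
```

For :purchases the defaults come out as class Purchase and key user_id, neither of which exists in railsbay, which is why that association needs both options spelled out.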

Our associations are called sales and purchases, both refer to things of class Sale and the foreign keys are buyer_id and seller_id, so our User class should look like

class User < ActiveRecord::Base
  has_many :purchases, :class_name => 'Sale', :foreign_key => 'buyer_id'
  has_many :sales, :class_name => 'Sale', :foreign_key => 'seller_id'
end

If you want you can drop the :class_name option from the sales association, it's not needed. There we go, you are now an association master! The same ideas apply in the other cases where the defaults aren't right. Always make the association name unique (try and come up with a good name, it will really help the readability of your code), and then set the :foreign_key and :class_name options appropriately. I haven't talked about has_many :through at all, but that's because Josh Susser already has a lovely writeup of that over here.

[1]This is a slight lie. ActiveRecord does very little when you call belongs_to, has_many etc... all these things are computed when they are needed. That distinction isn't really important here.

[2]ActiveRecord used to base this on the :class_name option, ie the default key was (basically) the lowercased class name followed by _id. This changed in rails 2.0

[3]There's a bit more to this in cases where your models are in modules and so on. The gory details are in Inflector#foreign_key

For a long time javascript was something I avoided like the plague, usually only ever handling it with the rubber gloves that rails provides. On a few occasions I had written some raw javascript myself and quite frankly I felt dirty afterwards and started hitting the bottle rather hard. I was writing clumsy procedural code, verbose, without unit tests and working round all sorts of quirks in the language. Sometimes it was the only choice but whenever I could avoid it, I did.

This year I've been doing quite a lot of work in javascript, as part of a supercool project at work. However I am not an alcoholic nor a gibbering wreck. So what happened? In a single word, prototype (note that a lot of what I'm saying probably applies to similar libraries like jQuery, dojo etc... The important stuff isn't really in the details).

Getting Prototypical

Now I'd heard of prototype before. It was that thing with the $ thing for getting dom elements by id that you had to include for the whizz-bang scriptaculous magic or for using ajax. Thanks to my blind raging hatred of javascript I'd never looked any closer. What a mistake! There is so much more to prototype, something which I discovered thanks to the Bungee Book which is a lovely piece of work (Seriously, stop reading this and go and buy it. It's great (apart from the index)).

Fun in the Browser

Prototype brings two unrelated (at least at the conceptual level) things together. The first is a whole pile of things that make the browser environment more palatable. The $ function is but the tip of the iceberg here. Some of my favourites include

  • $$ (and the Selector class that powers it): find all the elements in the document matching a CSS rule. The great thing here is that it knows a whole pile of tricks, even if your browser doesn't, so you can go from the basic $$('p.alert') (all paragraphs with class alert) to things that are a little more fun:

    $$('.sources .source:not(.disabled)')
    
    ( all elements inside something with class 'sources' that have the class 'source' but not the class disabled).

  • down, next, up etc... Navigating the dom by hand is the biggest pain in the arse since Vlad the Impaler, but these functions make it oh so easy. The up function gives you the parent element, down gives you the first descendant and so on (and of course you can say 'give me the 3rd such element'). That in itself is a modest improvement in comfort. But the killer blow is that you can pass those lovely css selectors and say things like

    foo.down('.source:not(.disabled)')
    
    which does exactly what it should: keep going down until you find something that matches that selector. Now think about how you'd do that by hand!

  • Event handling: prototype smooths over all the differences between browsers and makes it really easy to have your own custom events, which is great for having page elements react to changes without coupling things too tightly.

  • Style manipulation: getStyle (gets computed styles), setStyle, hasClassName, removeClassName, addClassName are just terrific. Before

    foo.style.left = '8px';
    foo.style.right = '8px';
    foo.style.top = '8px';
    
    After
    foo.setStyle({left: '8px', right: '8px', top: '8px'})
    

There's loads more of this sort of stuff, the list above is just what sprung to mind. All the way prototype hides the differences between browsers.

Inner Beauty

Great stuff, but it doesn't address the issues of Javascript's builtin classes and the language itself. In a way I think this is where prototype really shines, because after a while I couldn't feel that feeling of self-loathing anymore. I was actually enjoying writing javascript. Part of it is down to the fact that beneath the horribleness of the 90s there's actually some cool stuff in javascript: closures, functions are objects and so on. The trouble is I'd never noticed because I was so busy vomiting profusely at the rest of it. Prototype made me aware of the nice stuff and tidies up the dog turds (and if you ignore everything I've written but remember that javascript has these functional features hiding inside then that's probably the most important thing).

Enough waffling. What does prototype give you in this department?

  • Enumerable. A beast of a module. Mixed in to Array by default, it lets you use each, inject, map almost as if you were writing ruby (there's some punctuation clumsiness, but that's not prototype's fault). If that sounds awfully similar to ruby's Enumerable that's because it is
  • Class. Write classes, subclass them = win
  • method binding
  • Loads of little utilities like Object.isFunction that really should have been there all along
  • Your sanity back.

It's really a joy the first time you write javascript using all this. Seriously it brought tears to my eyes.

Feeling Testy?

The one bit missing from my workflow was unit testing. I started off playing with jsunit and while it worked fine I wasn't really enjoying it. There just seemed to be too much machinery involved, and some things seemed needlessly complicated. For example if you want to run a test suite in the browser you have to open the testRunner page, then click on the choose file button on that page, navigate to your test file and hit run. And because of that you can't just set a firebug breakpoint. I ended up loading the test file itself in the browser and calling the setup, test and teardown functions from the firebug console. Yuck.

I know a lot of this is built for flexibility, but it was just completely overkill for what I needed. It turns out that prototype itself has a handy little unit testing framework that was much closer to what I was looking for. There's even the javascript_test plugin which makes all that fit nicely into your rails app (although it would seem that javascript_test hasn't seen much love - I've ended up ripping out most of its innards and replacing them with the latest stuff from prototype). If you want to run a test suite in your browser, just open the test file! A rake task lets you run them all in one go:

Started tests in Firefox
.......................
Finished in 23 seconds.
116 tests, 545 assertions, 0 failures, 0 errors

Finally, peace. Maintainable, structured, nonverbose javascript code, unit tested and running off a CI server. Who would have expected that!

:locals and string keys

May 3rd, 2008

If you ever do this

  render :partial => 'some_partial', :locals => {'foo' => 'bar'}

don't. String keys in :locals have been deprecated for a while and are no longer supported in edge (and thus in the not too distant 2.1), so go ahead and change them now and you'll save yourself some grief when you migrate to 2.1.

When things break it's always nice when they break in an obvious way. In this case foo not being defined at all in the partial would make it immediately obvious that something has changed with :locals: you'd check the api docs, maybe google around a bit and see that string keys aren't accepted any more. Unfortunately that's not what happens. If you do use a string key like this, foo will still be defined, it will just be nil. You probably won't suspect an api change and will spend some time hunting up and down as to why the variable you're passing in :locals is nil (at least that's what I did - hopefully you won't have to do the same now!).

It's always the butler

May 1st, 2008

I have a confession to make: I really like hospital based tv series: House, ER, Scrubs, Green Wing. It's all good. Of the lot, House is the one I've been watching most recently and in many ways a typical episode is basically a medical murder mystery. Understanding a bug or working out some unexplained behaviour is basically the same thing, it's just that this time you're the hardened police officer on the case!

This episode opens up with a strange scene:

>> s = Shift.find :first
=> #<Shift id: 1, start: "2008-04-30 22:19:31", created_at: "2008-04-30 22:19:32", 
updated_at: "2008-04-30 22:19:32">
>> s.start
=> true
>> s.created_at
=> Wed Apr 30 22:19:32 BST 2008
>> s.start
=> Wed Apr 30 22:19:31 BST 2008

Something is clearly wrong, and as the title sequence starts up the viewer knows there's funny business afoot. In the grand tradition of TV series the next scene doesn't seem hugely related.

You can't be serious

A controller with a 'start' action was claiming that there was no such action, systematically. This controller had always had such an action and had not been changed in months, I was merely passing it while working on another issue. It was getting late so I called it a day. The next morning, with no code changes, it was fine so I moved on to other things, pushing that niggling feeling you get when you don't understand a bug to the back of my mind. Then at lunch it stopped working, again without apparent reason.

Time to tackle the problem head on. I fired up my trusty debugger and went stepping through ActionController. When a request comes in and the controller and action have been identified, ActionController (broadly speaking) will try to find a method of that name, a template of the same name or hit method missing. Now there's a bunch of methods that shouldn't be exposed as actions so what AC does is subtract from your controller's public methods all those public methods defined in ActionController::Base (which by definition pretty much can't be actions). For some reason ActionController::Base seemed to have a method called start and so 'start' was on the list of method names that could not be actions. Searching through the source revealed no definition of a start method, but interrogation of Kernel reveals that it defined a start method. I fired up script/console to do some more poking, and interestingly enough, no start method in Kernel. It should be running the same code as the app on the same machine, same everything and yet there's a difference.
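The subtraction described above is easy to reproduce in plain ruby. This is only a sketch under assumptions: FakeBase plays the part of ActionController::Base, DebuggerMethods stands in for whatever adds a start method further up the chain, and action_names is a made-up helper rather than anything from ActionController.

```ruby
# Hypothetical stand-in for the extra methods that appear when the debugger is loaded.
module DebuggerMethods
  def start
    true
  end
end

# Hypothetical stand-in for ActionController::Base.
class FakeBase
  def process; end
end

class ShiftsController < FakeBase
  def start; end
  def stop; end
end

# Candidate actions: the controller's public methods minus the base class's.
def action_names(controller, base)
  (controller.public_instance_methods - base.public_instance_methods).map { |m| m.to_s }.sort
end

without_debugger = action_names(ShiftsController, FakeBase)

# Now the base class gains a start method from elsewhere...
FakeBase.send(:include, DebuggerMethods)
with_debugger = action_names(ShiftsController, FakeBase)

p without_debugger
p with_debugger
```

Once the base class responds to start, the name is subtracted away and the controller's perfectly good start action drops off the list.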

There's definitely some methodicalness to tracking down a problem. At some point though you've got to switch off the guidance computer and just use the Force. If you're short on Jedi powers (as I am) then you've got to settle for second best which is to have an idea. It suddenly occurred to me that the difference was the debugger. When I'd been stepping through the ActionController code it was obviously on but when I'd been pissing about in script/console it hadn't been. So I had a look at the source for ruby-debug and sure enough it defines Kernel#start. So the seemingly random factor that was causing the action to fade in and out of existence was whether or not I was debugging something else!

Dénouement

At this point the problem I started with is sort of solved, a quick experiment shows that under the debugger it exhibits the anomalous behaviour and without the debugger it's fine. But why does the second call to start work?

When you get an ActiveRecord class it doesn't have your attribute accessors defined. They come into existence the first time you try and use them (via method_missing). So the first time we call start, we're calling the Kernel#start method that ruby-debug created. When we call created_at, method_missing gets hit, which goes and generates all the accessors, overriding the start method from Kernel with one that reads the start attribute. Then we call start again and this time we get the freshly generated method.
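The whole opening scene can be re-enacted without Rails. In this sketch FakeRecord is a made-up class that generates its accessors lazily from method_missing, roughly in the spirit of what is described above, and the Kernel#start defined here is only a stand-in for the one ruby-debug adds (I'm assuming it returns true, as in the console session).

```ruby
# Stand-in for the method the debugger adds to Kernel (assumed to return true).
module Kernel
  def start
    true
  end
end

# Hypothetical record class: accessors are generated on the first miss.
class FakeRecord
  ATTRIBUTES = [:start, :created_at]

  def initialize(values)
    @values = values
  end

  def method_missing(name, *args)
    return super unless ATTRIBUTES.include?(name)
    # Generate every accessor in one go, then answer the original call.
    ATTRIBUTES.each do |attribute|
      self.class.send(:define_method, attribute) { @values[attribute] }
    end
    send(name)
  end

  def respond_to_missing?(name, include_private = false)
    ATTRIBUTES.include?(name) || super
  end
end

shift = FakeRecord.new(:start => '2008-04-30 22:19:31', :created_at => '2008-04-30 22:19:32')

first = shift.start     # Kernel#start answers, so method_missing never fires
shift.created_at        # a genuine miss: all the accessors get generated
second = shift.start    # the generated accessor now sits in front of Kernel#start

p first
p second
```

The first call never reaches method_missing because ruby finds start further up the ancestor chain; only a call to an attribute nobody else defines triggers the generation.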

So there, mystery solved. The credits start rolling. I think this was a pretty good episode, who would have thought it was your trusted lieutenant who'd been stirring up trouble all along!

Epilogue: I now stick

Kernel.send :undef_method, :start if Kernel.respond_to? :start

in a file in config/initializers and order is restored to the galaxy.