We recently upgraded to delayed_job 3.0 and immediately started seeing some major memory leaks in our app: in the delayed_job workers, passenger instances and even standalone scripts which don't use delayed_job at all. In the end I tracked it down to a bug in YAML.load.

Out of the box, YAML support in ruby 1.9 can be provided by one of two backends: syck and psych. Syck is an older implementation based around a C library that is no longer supported, whereas psych uses the newer and supported libyaml. The default backend is psych, but earlier versions of delayed_job did not work with psych, and so forced the yaml engine to syck (which doesn't have this bug). When we upgraded to 3.0 they had fixed their problems with psych and so we (unintentionally) started using psych. Unfortunately the version of psych that comes with ruby 1.9.2 has a memory leak in YAML.load. If YAML::ENGINE.yamler is 'psych' and Psych::VERSION is 1.0.0 then you are using an affected version.
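A quick way to check whether a given process is affected is sketched below. The helper name leaky_yaml_backend? is made up for illustration, and since YAML::ENGINE only exists on ruby 1.9 the script guards for it and assumes psych on other rubies.

```ruby
require 'yaml'

# Hypothetical helper (not part of any library): true when the backend and
# version match the affected combination described above.
def leaky_yaml_backend?(engine, psych_version)
  engine.to_s == 'psych' && psych_version.to_s == '1.0.0'
end

# YAML::ENGINE is specific to ruby 1.9, so fall back to 'psych' elsewhere.
engine  = defined?(YAML::ENGINE) ? YAML::ENGINE.yamler : 'psych'
version = defined?(Psych::VERSION) ? Psych::VERSION : nil

puts "YAML backend: #{engine} #{version}"
puts "affected: #{leaky_yaml_backend?(engine, version)}"
```

Running this from script/runner (or inside a delayed_job worker) tells you which backend that particular process ended up with, which is not always the one you expect.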

In particular this means that each time you load a model with serialised attributes, you leak memory. One of our very frequently used models has some serialized columns so that was why we were leaking. Delayed job obviously does a lot of yaml loading and so its workers were haemorrhaging memory.

Plugging the leaks

It took a bit of work to narrow down the leaks we were seeing to yaml, but once that was done it turned out a few people had already written about this, notably over at nerdd.dk. I am somewhat amazed that knowledge of this issue is not more widespread. The issue is perhaps clouded by the fact that if libyaml isn't available when ruby is built, ruby will just skip building psych (in which case syck is the only backend). Ruby 1.9.3 has a fixed version of psych, but disappointingly currently available versions of 1.9.2 (currently p290) still have this bug, 18 months after the release of 1.9.2.

Luckily there is a gem version of psych, however using it can be a bit fiddly if (as most rails apps do) you use bundler. Bundler loads psych early on in its setup process so you can't just stick psych in your Gemfile - both versions end up being loaded, which causes an ugly mess.

nerdd.dk has a series of posts about how they tackled the various issues. In the end what I did was:

  • set up config/setup_load_paths.rb to keep passenger happy:

      require 'rubygems'
      gem 'psych'
      require 'bundler'
      Bundler.setup

  • edited config/boot.rb to do gem 'psych' just after require 'rubygems'
  • hacked the stub executable for bundle to also have gem 'psych' after rubygems is loaded

Request specs and Authlogic

December 5th, 2011

I was writing some rspec request specs the other day and was curious to notice that

before(:each) do
  activate_authlogic
  UserSession.create(@user)
end

wasn't working: the code under test didn't think the user was logged in at all. The same code works fine for controller specs and the authlogic documentation asserts that functional and integration tests should behave identically in this respect. Everything should just work.

Searching around revealed quite a few people with the same problem, all being told "don't do this, have your user login via your login form". While obviously actually testing the login form is important, to do so for every single request spec that involves a logged in user seems over the top. Taken to its extreme, every single request spec involving a logged in user should have the user signing up, editing their preferences and generally replaying the entirety of the history involved in going from a non user of the site to an active user in the desired state. Also, I was curious as to what was going on.

Cookie issues

When you call UserSession.create it validates the credentials provided (which is trivial when the arguments you pass are an actual user) and then stores the user's persistence token in a cookie (called user_credentials by default). It expects to have access to a controller like object that it can use for accessing cookies, request parameters / headers and so on.

Authlogic::TestCase expects there to be a controller hanging around, which in a controller spec / functional test there is (setup hooks create the controller, request & response ahead of time). In an integration spec on the other hand the controller/request do not initially exist - Rails doesn't know what you're going to request. Because of this, Authlogic thinks you're testing from a model spec and so sets its cookies on a mock controller object. The Rails integration stack doesn't know anything about this mock controller object so when you make a request the authentication cookies never reach your controller(s).

In integration tests a thing called an integration session is in charge of maintaining state between requests, such as cookies, so that's where Authlogic needs to be setting its cookies.

Solution

I would have liked to be able to just use the integration session as Authlogic's controller, but that didn't quite work out. Authlogic expects the controller to have a request from which it can slurp stuff like the remote_ip but at the start of a request spec the request doesn't exist yet. In the end I did this

 cookies['user_credentials'] = "#{@user.persistence_token}::#{@user.send(@user.class.primary_key)}"

The cookies[]= method falls through (via method_missing) to the integration session object, so this cookie is stored in the right place. It's a bit yucky in that it makes assumptions about what the format of the cookie value should be, but I can live with that. You may need to change the name of the cookie - Authlogic sets that based on the name of the user class and that sort of stuff, so it will vary from app to app. If you're using capybara rather than webrat to make your requests then you can use Capybara.current_session.driver.browser.set_cookie to set the cookie.
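To make the assumed cookie format explicit, here is a minimal sketch that builds the value outside of Rails. FakeUser is a stand-in for the real user model and authlogic_cookie_value is a hypothetical helper name; the "token::id" layout is simply the one used in the line above, not something guaranteed by Authlogic across versions.

```ruby
# Stand-in for an ActiveRecord user model, for illustration only.
FakeUser = Struct.new(:id, :persistence_token) do
  def self.primary_key
    'id'
  end
end

# Hypothetical helper: joins the persistence token and the primary key
# value with "::", the same way the one-liner above does.
def authlogic_cookie_value(user)
  "#{user.persistence_token}::#{user.send(user.class.primary_key)}"
end

user = FakeUser.new(42, 'abc123')
puts authlogic_cookie_value(user)
```

Pulling this into a spec support method keeps the format assumption in one place, so there is only one line to change if the cookie layout ever does.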

Dressipi is hiring

August 24th, 2011

Dressipi (where I work) is looking to hire a junior-ish software engineer to work on the guts of our various web applications, recommendation systems and mobile applications. We're a rails shop but we are above all looking for bright minds that can pick up new technologies with ease, so prior knowledge of rails is not required.

If you're interested drop us a line at jobs ]at[ dressipi.com with a covering letter & CV or any questions you might have.

Spork and parallel_tests

July 30th, 2011

What's more terrifying than being stabbed in the face with a spork? Being stabbed in the face with 2 sporks!

In these days where even laptops are dual or quad-core parallel_tests is a great tool for using all of those cores when running specs. Spork is another great tool for avoiding the overhead of loading rails, gems etc. when running tests.

Out of the box spork and parallel_tests don't work together. When spork receives a request to run specs it aborts its current run, so only one of your parallel_spec instances actually gets to run. In addition parallel_spec's trick of setting an environment variable to indicate which test environment should be used doesn't work, because the runners forked by spork don't inherit the environment parallel_spec sets up.

Fortunately there are at least two ways to get around this. I assume that you've already got parallel_tests up and running with the required extra databases, changes to database.yml etc.

Method 1: Multiple sporks

Grab the right version of parallel_tests

Make sure you've got parallel_tests 0.5 or later (Previous versions deal with the .opts file in a different way)

Gentlemen, start your sporks

Run as many sporks as you need. Run the first one on port 8989 (the default), the second on port 8991, the third on 8992 and so on. You should set TEST_ENV_NUMBER in each one, so your invocations should look like

TEST_ENV_NUMBER='' spork -p 8989
TEST_ENV_NUMBER='2' spork -p 8991
TEST_ENV_NUMBER='3' spork -p 8992

and so on

Setup parallel_spec.opts

Edit spec/parallel_spec.opts so that it looks like

--drb
--drb-port <%= 8989 + ENV['TEST_ENV_NUMBER'].to_i %>
--format progress
--format ParallelSpecs::SpecRuntimeLogger --out tmp/parallel_profile.log

Profit:

Run parallel_spec. When rspec is invoked it passes the .opts file through erb so the first instance (TEST_ENV_NUMBER='') will have --drb-port set to 8989, the second to 8991 and so on. Each rspec will connect to a different spork instance that has the correct TEST_ENV_NUMBER setup.
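The port arithmetic can be checked in isolation. This sketch renders a cut-down version of the .opts template with plain ERB; render_opts and the env_number local are names invented for the example (the real file reads ENV['TEST_ENV_NUMBER'] directly).

```ruby
require 'erb'

# Cut-down version of spec/parallel_spec.opts, using a local variable
# instead of ENV so that it can be rendered for several values in one go.
OPTS_TEMPLATE = "--drb\n--drb-port <%= 8989 + env_number.to_i %>\n"

# Hypothetical helper: renders the template for one TEST_ENV_NUMBER value.
def render_opts(env_number)
  ERB.new(OPTS_TEMPLATE).result(binding)
end

['', '2', '3'].each do |n|
  puts "TEST_ENV_NUMBER=#{n.inspect} -> #{render_opts(n).lines.last.strip}"
end
```

Because ''.to_i is 0, the first instance lands on the default port and port 8990 is never used, which is why the second spork has to listen on 8991.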

Method 2: Single hacked spork

A disadvantage of method 1 is that you have multiple instances of spork sitting around, consuming memory. It requires extra keystrokes to ensure each spork instance is restarted properly and more cpu cycles are consumed every time you restart spork.

Use my spork

My fork of spork doesn't prohibit running multiple spec runs concurrently and allows environment data parallel_tests requires to be passed in. I've only modified the forking strategy.

Setup script/spec

#!/usr/bin/env ruby
require 'rspec'
require 'rspec/autorun'

module RSpec
  module Core
    class DRbCommandLine
      def run(err, out)
        begin
          DRb.start_service("druby://localhost:0")
        rescue SocketError, Errno::EADDRNOTAVAIL
          DRb.start_service("druby://:0")
        end
        spec_server = DRbObject.new_with_uri("druby://127.0.0.1:#{drb_port}")
        spec_server.run(@options.drb_argv, err, out, 'TEST_ENV_NUMBER' => ENV['TEST_ENV_NUMBER'])
      end
    end
  end
end

This ensures that when parallel_spec runs script/spec, the test environment number is passed through to the child.

Make sure the right database is used

When spork forks a child with a specific TEST_ENV_NUMBER we need to ensure that the child connects to the correct database. This can be done with a Spork.each_run block in your spec_helper.rb. The app I've been working with uses mongomapper as well as Active Record, so my each_run block looks like this.

if !ENV['TEST_ENV_NUMBER'].blank?
  # reconnect to your AR & mongo dbs
  db_config = YAML::load(ERB.new(File.read(Rails.root.join("config", "mongo.yml"))).result)
  if db_config[Rails.env]
    MongoMapper.setup db_config, Rails.env
  end
  ActiveRecord::Base.configurations = DressipiCom::Application.instance.config.database_configuration
  ActiveRecord::Base.establish_connection
end

Profit

bundle exec spork and run parallel_spec as normal (things won't work if the 'normal' version of the spork binary is run). Your spork will fork multiple instances which will pick up TEST_ENV_NUMBER and reconfigure their database settings as needed.

Showing view spec exceptions

April 19th, 2011

When an exception is raised during rendering of a view it is wrapped inside an ActionView::Template::Error which is then reraised.

This leads to unhelpful failure messages in view specs, along the lines of

contact/index.html.haml renders a form to send feedback
     Failure/Error: it "renders a form to send feedback" do render
     ActionView::Template::Error
     # ./app/views/contact/index.html.haml:19
     # ./spec/views/contact/index.html.haml_spec.rb:10:in `block (2 levels) in '

This little monkey patch

module RSpec
  module Core
    class Example
      def set_exception(exception)
        if exception.is_a?(ActionView::Template::Error)
          @exception ||= exception.original_exception
        else
          @exception ||= exception
        end
      end
    end
  end
end

ensures that the original exception is the one that is displayed.

Unhelpfully helpful

April 6th, 2011

I loathe helper :all. What's the point of nicely divided helper modules if they're all injected into the same namespace, overwriting each other at random? I like to write reusable helpers that individual controllers can use by implementing their version of a foo method. I see no reason why two completely unrelated helper modules couldn't both have a display_help_text method. helper :all prevents both of these and creates situations where you can accidentally overwrite a helper method in a completely different part of your application.
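The overwriting is just ordinary module inclusion at work, as this plain ruby sketch shows. The helper modules and FakeView are invented for the example; in a real app the order in which helpers get mixed in is not something you normally control.

```ruby
module ProductsHelper
  def display_help_text
    'help for products'
  end
end

module OrdersHelper
  def display_help_text
    'help for orders'
  end
end

# Stand-in for a view that has had every helper mixed in, which is
# roughly what helper :all arranges.
class FakeView
  include ProductsHelper
  include OrdersHelper
end

# The module included last sits first in the ancestor chain, so its
# method silently wins.
puts FakeView.new.display_help_text
```

Nothing warns you about the clash: the products views simply start showing the orders help text.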

Obviously not everyone agrees with me because Rails continues to move in the other direction. Once upon a time, a bare controller only pulled in its default helper (ie Bar::FooController gets Bar::FooHelper) and the helpers for its ancestors. Later, the default application skeleton added helper :all to ApplicationController but this was still an opt in scheme. With Rails 3 it is an opt out scheme: you have to put clear_helpers in a controller to clear out the helpers Rails includes automatically for it.

View Specs

Rspec tries to do the right thing - if you poke into view_example_group.rb in rspec-rails you can see it determining the default helper class and including that and ApplicationHelper into the view object. Unfortunately it also includes controller._helpers. This controller isn't an instance of the controller class that will use the view - it's an instance of ActionView::TestCase::TestController, so has Rails' default of the full set of helpers, irrespective of how the controller in the app is configured.

The first step is of course calling clear_helpers on ActionView::TestCase::TestController but if your application has helper modules that are used by multiple controllers then some of your view specs may start to fail. You could of course add those helpers to the view from your before(:each) blocks but that is a slightly nasty piece of duplication: you've already specified once which helpers should be used. More dangerously you could develop a situation where your view specs pass because they specify one set of helpers but your app fails because your controllers specify a different set.

To remedy this I've got the following in one of my apps:

module ViewHelpersSpecHelper
  extend ActiveSupport::Concern

  included do
    if metadata[:type] == :view
      name_parts = metadata[:example_group][:description].split('/')
      name_parts.pop

      controller_name = "#{name_parts.join('/').camelize}"
      unless controller_name =~ /\wMailer/
        controller_name += 'Controller'
      end

      helpers = controller_name.constantize._helpers
      before do
        view.singleton_class.class_eval do
          include helpers unless included_modules.include?(helpers)
        end
      end
    end
  end
end

This works out what controller class the view will normally be used with, for admin/foo/bar.html.haml this would be Admin::FooController and uses the helpers that controller would use normally.

A call to config.include(ViewHelpersSpecHelper) in my RSpec.configure block ensures that this module gets included everywhere relevant.
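The name derivation is the only subtle part, so here it is on its own as a sketch that runs without Rails. simple_camelize is a simplified stand-in for ActiveSupport's camelize and controller_name_for is a hypothetical helper; both only cover the straightforward paths used here.

```ruby
# Simplified stand-in for ActiveSupport's String#camelize, enough for
# paths such as "admin/foo" or "user_mailer".
def simple_camelize(path)
  path.split('/').map { |part| part.split('_').map(&:capitalize).join }.join('::')
end

# Hypothetical helper mirroring the logic in the module above: drop the
# template file name, camelize the rest, and append "Controller" unless
# the result looks like a mailer.
def controller_name_for(view_path)
  parts = view_path.split('/')
  parts.pop
  name = simple_camelize(parts.join('/'))
  name += 'Controller' unless name =~ /\wMailer/
  name
end

puts controller_name_for('admin/foo/bar.html.haml')
puts controller_name_for('user_mailer/welcome.html.haml')
```

In the real module the resulting string is then passed to constantize, so a view directory with no matching controller will raise a NameError when the example group is defined.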

Cells

This starts to break down for partials that are rendered by multiple different controllers or for shared ones that don't have a natural controller they can be tested with. This is perhaps the one thing that helper :all doesn't make worse, since it does mean that a given view is always rendered with the same set of helper functions available. If you do have such units of view and helper code, the cells library is well worth a look.

Range#include? changed in ruby 1.9. In ruby 1.8, testing whether a value was included in a range meant testing that it was greater than the start of the range and smaller than the end. For non-numeric ranges, ruby 1.9 will instead iterate through the values in the range, checking whether any of them is equal to the value being tested.

The justification is that this is needed for some special cases. The example given in the link above is a range of objects wrapping integers, where #succ is defined to add 2 to the wrapped integer and comparison is defined by comparing the wrapped integer. If we write f(x) for the object wrapping x then you would have f(1) <= f(2) and f(2) <= f(3), but f(2) is not in the sequence generated by calling #succ repeatedly on f(1), since that yields only objects wrapping odd numbers.
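That special case is easy to reproduce. The OddStep class below is my own reconstruction of the kind of object described (a wrapped integer whose succ adds 2), not the code from the original discussion.

```ruby
# Wraps an integer; succ skips to the next value two along, while
# comparison looks only at the wrapped integer.
class OddStep
  include Comparable
  attr_reader :n

  def initialize(n)
    @n = n
  end

  def succ
    OddStep.new(n + 2)
  end

  def <=>(other)
    n <=> other.n
  end
end

range = OddStep.new(1)..OddStep.new(7)

# Comparison only: f(1) <= f(2) <= f(7).
puts range.cover?(OddStep.new(2))
# Iteration visits the objects wrapping 1, 3, 5 and 7, so f(2) is never seen.
puts range.include?(OddStep.new(2))
puts range.include?(OddStep.new(3))
```

The first and last calls agree with intuition; it is only for values that fall between the steps that the two methods disagree.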

While more correct in this sense, this is obviously a lot slower. It's also not immune from a different class of issues. As Xavier Noria pointed out on the rails-core list, it's easy enough to construct ranges with an infinite number of elements in them, for example. Calling include? on them would hang forever on 1.9.

In the real world, my main concern is the speed. In our app we had a validates_inclusion_of on dates with a 100 year date range. On 1.8 this was instant, but on 1.9 calling include? on that range was taking around 130ms (on a 2.66GHz Core i7).

Luckily the old include? logic is available as cover?. As of 3.0.5 Rails will use cover? rather than include? for validates_inclusion_of. Until then you could drop something like this in an initializer.

if RUBY_VERSION =~ /1\.9/
  module ActiveModel
    # == Active Model Inclusion Validator
    module Validations
      class InclusionValidator < EachValidator
        def validate_each(record, attribute, value)
          match = if options[:in].is_a?(Range)
            options[:in].cover?(value)
          else
            options[:in].include?(value)
          end

          unless match
            record.errors.add(attribute, :inclusion, options.except(:in).merge!(:value => value))
          end
        end
      end
    end
  end
end
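To get a feel for the difference on your own machine, a rough comparison on a 100 year date range might look like this. The dates and iteration count are arbitrary choices of mine, and the timings will vary with ruby version and hardware.

```ruby
require 'date'
require 'benchmark'

range = Date.new(1920, 1, 1)..Date.new(2020, 1, 1)
probe = Date.new(2019, 12, 31) # near the end, so iteration has to walk most of the range

# Both agree on the answer for a date range...
puts range.cover?(probe)
puts range.include?(probe)

# ...but they get there very differently.
Benchmark.bm(8) do |x|
  x.report('cover?')   { 20.times { range.cover?(probe) } }
  x.report('include?') { 20.times { range.include?(probe) } }
end
```

cover? does two comparisons regardless of the size of the range, so its cost should not depend on where the probe date falls.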

Get them here. I had a great time there - well worth going!

A colleague and I recently updated an app to rails 2.3.4 and found ourselves with a slightly sticky performance problem - slow app response times, with time seemingly disappearing nowhere. New Relic also showed that requests were queueing inside mongrels, which shouldn't have been a problem since there were plenty of mongrels for the traffic being received at the time.

Our friendly Engineyard support engineer pointed us to a ticket on the Rails lighthouse describing a rather similar situation encountered during the RC phase of rails 2.3: Active Record based sessions causing trouble because they weren't returning their connection to the connection pool [1]. Connection pooling isn't new in 2.3 but the move to rack changed the order in which things like sessions, query caching and the connection pool are set up.

This ticket was marked as resolved, however looking at the fix it became clear why we were still experiencing the problem:

if ActionController::Base.session_store == ActiveRecord::SessionStore
  configuration.middleware.insert_before :"ActiveRecord::SessionStore", ActiveRecord::ConnectionAdapters::ConnectionManagement
  configuration.middleware.insert_before :"ActiveRecord::SessionStore", ActiveRecord::QueryCache
else
  configuration.middleware.use ActiveRecord::ConnectionAdapters::ConnectionManagement
  configuration.middleware.use ActiveRecord::QueryCache
end

(Source here)

Since our session store isn't ActiveRecord::SessionStore, the code to put the database related middleware in the right place in the chain isn't executed. I'm not entirely sure what to do about this - the rails initializer cannot be expected to know the details of what the session store is doing (unless part of the session store interface was a uses_active_record? flag) but as it is this is very sneaky and would have taken a while to find if we hadn't been pointed in the right direction.


[1] As of Rails 2.2, Active Record has a concept of a connection pool: a given rails instance won't use more than a certain number of connections (5 by default) even if that rails instance was a jruby based instance with hundreds of threads. If you need a connection from the pool and they are all in use then you have to wait until one becomes available. Normally you are blissfully unaware of this - rails marks your connection as no longer used when your action has finished processing.
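The waiting behaviour is the crux of the problem, so here is a toy pool that shows it. TinyPool is entirely made up for illustration; Active Record's real pool is more involved (it has timeouts and tracks which thread owns which connection).

```ruby
require 'thread'

# Toy pool for illustration only, not Active Record's implementation.
class TinyPool
  def initialize(size = 5)
    @available = Queue.new
    size.times { |i| @available << "connection-#{i}" }
  end

  # Blocks until a connection is free, like a request stuck waiting.
  def checkout
    @available.pop
  end

  def checkin(connection)
    @available << connection
  end

  def free
    @available.size
  end
end

pool = TinyPool.new(5)
held = Array.new(5) { pool.checkout } # nobody returns their connection
puts "free connections: #{pool.free}"

waiter = Thread.new { pool.checkout } # a sixth caller has to wait
sleep 0.1
puts "sixth caller is waiting: #{waiter.status == 'sleep'}"

pool.checkin(held.pop)                # one connection is finally returned
puts "sixth caller got: #{waiter.value}"
```

If the code holding the first five connections never checks them back in, the sixth caller waits forever, which from the outside looks exactly like time disappearing nowhere.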

Fun with ruby http clients

April 13th, 2009

Quite a few people have written about the performance failings of Net::HTTP, but until recently, to be honest, I never really cared a lot. Most of my http request needs have been fairly meagre, often not much more than hitting a url and checking the result code.

I've been playing with couchdb recently, and so my app does a fair amount of http requests. I've been using RelaxDB which uses net/http, so Net::HTTP's performance has started to matter.

Net::HTTP is not the only game in town. I spent some time recently playing with rfuzz, eventmachine and taf2-curb and came to largely the same conclusion as Paul Dix.

Leaning on a mature library such as libcurl gives taf2-curb a huge advantage. While eventmachine was on par speed wise, neither of the 2 http clients it includes is a complete implementation of the HTTP protocol. For example HttpClient will tell the remote server that it speaks HTTP/1.1, yet it does not support chunked encoding (a mandatory part of the spec). HttpClient2 does understand chunked encoding, but doesn't let you set headers or a body on the request. Fine for just pinging a url, but not up to the task of working with couchdb. Something to do with couchdb's chunked encoding also seemed to confuse rfuzz.

taf2-curb does the job very nicely. On my dumb benchmark, 1000 requests for a static html page hosted on the same machine (ie we're pretty much only testing overhead) the numbers are:

require 'benchmark'
require 'net/http'
require 'curb'

Benchmark.bmbm(5) do |x|
  x.report 'net/http' do
    u = URI.parse('http://docs.local/')
    1000.times {Net::HTTP.get u}
  end

  x.report 'curb' do
    1000.times do
      c = Curl::Easy.new 'http://docs.local/'
      c.perform
    end
  end
end
               user     system      total        real
net/http   0.560000   0.270000   0.830000 (  1.065960)
curb       0.310000   0.170000   0.480000 (  0.696188)

At the other extreme, these numbers correspond to ~1 meg of data pulled from couchdb (benchmark code the same apart from the urls, and I did 100 iterations rather than 1000).

               user     system      total        real
net/http  17.400000   8.900000  26.300000 (  32.067821)
curb       0.700000   1.300000   2.000000 (  29.586022)

curb comes up squarely on top. Another thing of note during this test is cpu usage (as you might expect from the difference in user time). With Net::HTTP the ruby process running this was taking up 60-70% (on a 2.4GHz core duo), with curb it used around 5% of cpu.

The commit to switch RelaxDB from net/http to taf2-curb is here for those interested - really very straightforward stuff. There may well be more to be had by fiddling with libcurl options, I haven't tried yet.

If you work with designers or are getting to grips with a new codebase you may find yourself frequently answering the question 'What template generates this bit of html?'. This smidgen of code adds comments to your html with the path to the templates used. For example the template

Hello
 <%= render :partial => 'some_partial' -%>

might generate

<!-- TEMPLATE: views/dummy/index.html.erb -->
Hello
<!-- TEMPLATE: views/dummy/_some_partial.erb -->
Here is a partial
<!-- ENDTEMPLATE: views/dummy/_some_partial.erb -->
<!-- ENDTEMPLATE: views/dummy/index.html.erb -->

You can grab the code from github. The original idea came from this thread on rails-talk.
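The wrapping itself is simple. This is not the plugin's code, just a sketch of the idea with a hypothetical annotate_template helper that takes a template path and its rendered output.

```ruby
# Hypothetical helper: surrounds rendered output with comments naming
# the template that produced it.
def annotate_template(path, rendered)
  "<!-- TEMPLATE: #{path} -->\n#{rendered.chomp}\n<!-- ENDTEMPLATE: #{path} -->\n"
end

# The partial is rendered (and annotated) first, then embedded in the
# output of the outer template, which is annotated in turn.
partial = annotate_template('views/dummy/_some_partial.erb', "Here is a partial\n")
page    = annotate_template('views/dummy/index.html.erb', "Hello\n#{partial}")

puts page
```

The interesting work in the real thing is hooking this into template rendering, and only doing it for formats where an html comment is harmless.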

It's a good thing when tests run quickly. You get a shorter feedback cycle, you're unlikely to move on to something else while your tests are running. Another good thing is ruby-debug: a fast debugger for ruby, worth its weight in gold when you need it. As of Rails 2.1 ruby-debug is always loaded and active during tests.

Having the debugger there does have some overhead. Not much, but it's still there. You're incurring that overhead whether your tests need it or not. On your CI server, for example, it's pure wasted time. Even running locally it's frequently unnecessary - most of the time when I run the tests I'm just running the tests, not using the debugger to try to understand a test failure.

So, without further ado I present this:

unless ENV['DEBUGGER']
  Debugger.stop if defined?(Debugger)
end

Stick it in your test_helper.rb, just below where it says

require 'test_help'

Run your tests again, and hey presto, faster! It does vary from app to app: on some of the smaller apps I have the difference is pretty marginal, on the bigger apps I've got gains of up to 20-30%. I expect that on the smaller ones loading the Rails environment, setting up fixtures etc... dominates actual test runtime. And the day you do need to run the tests under the debugger it's as easy as

rake DEBUGGER=true

if you're using rake to run your tests or

DEBUGGER=true ruby -Itest test/unit/some_test.rb

if you're running an individual test (this might change according to what shell you use). In fact most of the time you won't even need to do this, as calling debugger calls Debugger.start for you.

When cache_classes gets you down

December 28th, 2008

As I've mentioned before, in Rails 2.2 if config.cache_classes is true (which it is in production) then all of your models, controllers etc are loaded up as part of application initialization. This is great if you're firing up a server that should actually be handling requests as it avoids all sorts of nasty race conditions to do with requiring files and defining classes in ruby. It's not so great if you're running a small script, cron job or something like a migration.

The reasons are two-fold. First off, it makes startup slower. This is irrelevant when starting up a web server because it is just getting out of the way stuff you would do anyway but it is time wasted if you've just got a single threaded script that doesn't care about dependency race conditions (and probably only touches one or two models anyway), especially if the script only takes a second or two to run once the environment is loaded. Secondly it can break stuff if models or controllers try and look at the database when the class is loaded. One example of this is ActiveScaffold. For those of you not familiar with it, ActiveScaffold helps you create adminy CRUD style interfaces. A very basic controller might look like this:

class ProductsController < ApplicationController
  active_scaffold
end

When the controller is loaded ActiveScaffold sees that it's working on the ProductsController, infers that the corresponding model is Product and goes off to find out what columns that model has. Imagine you've just added that model and controller and it's time to deploy on your production machine. At this point the products table doesn't exist (since the migration hasn't run yet). Since you're running in your production environment when the rake task causes rails to be loaded it will load your application's classes, including ProductsController which will try and introspect the products table. Fail.

There are two workarounds I can think of. One is a separate rails environment: define an environment with cache_classes set to false but that still uses the production database. Instead of running your migrations in the production environment run them in the production_without_cache_classes environment. This works but can be a bit annoying, especially for adhoc scripts and stuff like that and is even more of a pain if you have multiple productiony environments (eg a staging environment).

Another is to edit your production.rb so that it reads

if defined?(DONT_CACHE_CLASSES) && DONT_CACHE_CLASSES
  config.cache_classes = false
else
  config.cache_classes = true
end

Then for any script which you want to run without class caching turned on stick DONT_CACHE_CLASSES=true at the top (before you require config/environment). If you want to extend this to your rake files then edit the Rakefile at the top level of the app. This feels slightly neater to me as I don't have to remember to fiddle with the environment when running the script.

Another variation upon this is to use an environment variable instead of a constant, which would allow you to do things such as rake db:migrate CACHE_CLASSES=false or similar (although of the default tasks I can't think of one you'd run in production where you would want class caching to be on).
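A sketch of the environment variable version follows. The cache_classes? helper and the CACHE_CLASSES variable name are just the ones suggested above, and the env argument exists only so the logic can be exercised without touching the real environment.

```ruby
# Hypothetical helper: class caching stays on unless CACHE_CLASSES=false
# was passed in the environment.
def cache_classes?(env = ENV)
  env['CACHE_CLASSES'] != 'false'
end

# In production.rb this would become something like:
#   config.cache_classes = cache_classes?
puts cache_classes?({})
puts cache_classes?('CACHE_CLASSES' => 'false')
```

Rake copies NAME=value arguments into ENV before running tasks, so rake db:migrate CACHE_CLASSES=false needs no extra plumbing.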

This is far from beautiful but appears to get the job done for now.

Dates, params and you

December 3rd, 2008

A not particularly nice area of Rails is the date and time helpers. 3 popups just isn't a very nice bit of user interface. It's a lot of clicks when you want to change dates and most people can't reason in their head about just the date. It's far easier to pick a date from a calendar type view. Still, the helpers rails provides are fine for that quick and dirty date input.

Based on the questions on the mailing list about this, the thing that trips people up is that, unlike other attributes you might typically have, dates and times are not representable by a single input control. Instead you have several, one for each component (year, month, day etc...). So in particular, there is no single value in your params hash with your date or time. Exactly what is in your params hash depends on whether you're using select_date or date_select (if you're entering a datetime, select_datetime or datetime_select).

These are to each other as text_field_tag is to text_field: date_select is expecting to hook up to an attribute of an instance variable (or if you use form_for or fields_for an attribute of the corresponding object) whereas select_date isn't. However unlike the other pairs of functions like text_field_tag/text_field, select/select_tag these two send very different parameters through to your controller.

select_date is perhaps the easiest to understand. It will result in a hash (by default it is named "date", but you can override this with the :prefix option) with keys like year, month, day. You can then put those together to get an instance of Date or Time. For example the following in your view

<%= select_date Date::today, :prefix => 'start' %>

will result in a params hash like this:

{:start => {:year => '2008', :month => '11', :day => '22'}}

As I said, there is nothing in the params hash that is the actual value. You have to put it together yourself, for example

my_date = Date::civil(params[:start][:year].to_i, params[:start][:month].to_i, 
                      params[:start][:day].to_i)

A bit more work than your average parameter, but there's nothing mysterious going on here. Under the hood, select_date is also quite boring: it's just calling select_year, select_month, select_day with appropriate options and concatenating the result. A consequence of that is that if you want some odd combination (eg just months and seconds) you can just do that concatenation work yourself. One interesting thing about those subhelpers is that the first parameter you give them can be one of two things:

  • an integer in which case the corresponding day/month/year is displayed (eg 3 for March)
  • something like a Date or DateTime in which case the relevant date component is extracted from it

date_select is where the fun is. Here the expectation is that there is a model object we will want to update and we want to be able to do

my_object.update_attributes params[:my_object]

However update_attributes just wants to set attributes. If you pass it {'foo' => 'bar'} it will try and call the method foo= passing bar as a parameter. For a date input that is made up of these multiple parameters this is clearly a problem. What solves this is something called multiparameter assignment. If there are parameters whose name is in a certain format, then instead of just trying to call the appropriate accessor Rails will gather the related parameters, feed them through a transformation function (for example Time.mktime or Date::new[1]) and then set the appropriate attribute.

The format used is as follows: all the related parameters start with the name of the attribute which lets Rails know they are related. Next Rails needs to know in what order to pass them to the transformation function and whether a typecast is needed. If your view contained

<%= date_select 'product', 'release_date' %>

Then your parameters hash would look like

{:product => 
        {'release_date(1i)' => '2008', 'release_date(2i)' => '11', 'release_date(3i)' => '22'}}

Rails can look at this and see that this is to do with the release_date attribute. It's a date column, so rails knows to use Date::civil. The suffixes tell rails that 2008 is the first parameter to Date::civil and is an integer, that 11 is the second parameter and so on. Rails constructs the value using Date::civil(2008,11,22) and assigns that to release_date.
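To make the mechanics concrete, here is a much simplified sketch of that gathering step. assemble_date is a hypothetical helper, not Rails' implementation, which also deals with times, time zones, blank values and error reporting.

```ruby
require 'date'

# Hypothetical helper: collects the "attribute(Nx)" entries, typecasts
# each according to its suffix, orders them by position and hands them
# to Date.civil.
def assemble_date(params, attribute)
  parts = params.each_with_object({}) do |(key, value), acc|
    match = /\A#{Regexp.escape(attribute)}\((\d+)([if])?\)\z/.match(key)
    next unless match

    acc[match[1].to_i] = case match[2]
                         when 'i' then value.to_i
                         when 'f' then value.to_f
                         else value
                         end
  end
  Date.civil(*parts.sort.map(&:last))
end

product_params = {
  'name'             => 'Widget',
  'release_date(2i)' => '11',
  'release_date(1i)' => '2008',
  'release_date(3i)' => '22'
}

puts assemble_date(product_params, 'release_date')
```

The position number is what makes the order of the keys in the hash irrelevant, which matters because nothing guarantees the order in which form parameters arrive.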

If you don't intend to pass the parameters to update_attributes (or other functions with that syntax such as the new or create methods on an ActiveRecord class) there's not a lot of point in putting up with the scary parameter names, although you can of course construct the date yourself with

Date::civil(params[:product]['release_date(1i)'].to_i,
            params[:product]['release_date(2i)'].to_i,
            params[:product]['release_date(3i)'].to_i)

You might as well just use select_date and have readable parameter names though.

So, to sum up: use date_select or datetime_select when creating/updating ActiveRecord objects, but select_date or select_datetime for just a general purpose date input. As a closing tip, with select_datetime you can use the :use_hidden option, in which case hidden form inputs are generated instead of select boxes.[2]


[1] There's a bit more to this. For one, the range of times representable by a Time object is limited on most platforms (since it's commonly a 32-bit number of seconds since an epoch). Rails has some conversion code that will try to create an instance of Time but if necessary will fall back and create a DateTime object. Secondly, there's some cleverness to do with interpreting the user's input with respect to the correct time zone.

[2] This is (I think) a slight misuse. The intent of the :use_hidden option is that it is the mechanism by which :discard_day and friends work.

with_options for fun and profit

November 25th, 2008

Active Support has a nifty little helper that can cut down on repetition. A lot of things in Rails, like validations, associations, named_scopes and routes, take a hash of options as their final parameter. There are times when you use many of these with some common options, for example

class Customer < ActiveRecord::Base
  validates_presence_of :phone_number, :if => :extended_signup
  validates_presence_of :job_title, :if => :extended_signup
  validates_presence_of :job_industry, :if => :extended_signup
end

or maybe you have a bunch of associations which all share an association proxy extension module and some settings

class Customer < ActiveRecord::Base
  has_many :foos, :extend => MyModule, :order => "updated_at desc", :conditions => {:active => true}
  has_many :bars, :extend => MyModule, :order => "updated_at desc"
  has_many :things, :extend => MyModule, :order => "updated_at desc"
end

Enter with_options!

class Customer < ActiveRecord::Base
  with_options :extend => MyModule, :order => "updated_at desc" do |options|
    options.has_many :foos, :conditions => {:active => true}
    options.has_many :bars
    options.has_many :things
  end
end

Any time you've got a bunch of method calls taking some common options, with_options can help.

So how does it work on the inside? All the with_options method actually does is yield a special object to its block - all the craftiness is in that object. What we want that object to do is forward method calls to the object we're really interested in (in this case the Customer class) adding the options before it does so.

As with many such proxy objects we undefine just about every method and just implement method_missing. The implementation of method_missing inspects the arguments and merges any options present with the common set defined by the call to with_options (so individual methods can take extra options or override the common ones) before passing them onto the "real" object.
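A stripped down version of that idea fits in a few lines of plain Ruby. This is only a sketch: OptionsProxy and FakeModel are names invented for the example, and inheriting from BasicObject stands in for undefining methods by hand (BasicObject needs Ruby 1.9 or later). It is not the Active Support source.

```ruby
# Illustrative proxy: forwards every call to the target, merging the
# common options into the trailing options hash first.
class OptionsProxy < BasicObject
  def initialize(target, common_options)
    @target = target
    @common_options = common_options
  end

  def method_missing(name, *args, &block)
    # Pull off the caller's options hash if there is one...
    options = args.last.is_a?(::Hash) ? args.pop : {}
    # ...and let it override or extend the common options.
    @target.__send__(name, *args, @common_options.merge(options), &block)
  end
end

# Stand-in for a model class so the example runs without Rails.
class FakeModel
  def self.declared
    @declared ||= []
  end

  def self.has_many(name, options = {})
    declared << [name, options]
  end

  def self.with_options(common)
    yield OptionsProxy.new(self, common)
  end
end

FakeModel.with_options(:order => "updated_at desc") do |options|
  options.has_many :foos, :conditions => {:active => true}
  options.has_many :bars, :order => "name"   # overrides the common :order
end
```

After this runs, FakeModel.declared holds one entry per has_many call, each with the common options merged in (and :bars keeping its own :order).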

Originally a limitation was that the last argument just had to be a hash, so for example if you had a procedural named_scope then with_options couldn't work. Luckily a recent commit rectifies this: if the thing you're trying to merge with is a proc, then with_options will replace it with a new proc that merges the common options with the result of the call to the original proc. If you're not on edge you'll have to wait for 2.3 in order to get this.
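The proc handling can be sketched like this in plain Ruby. merge_common is a hypothetical helper written for this post, not the code from the commit, but it shows the shape of the trick: wrap the original proc in a new one that merges the common options into whatever the original returns.

```ruby
# Hypothetical helper: merges common options into either a plain hash
# or a proc that returns a hash when called.
def merge_common(common, options)
  if options.respond_to?(:call)
    # Defer the merge until the proc is actually invoked.
    lambda { |*args| common.merge(options.call(*args)) }
  else
    common.merge(options)
  end
end

recent = lambda { |days| {:conditions => {:age => days}} }
wrapped = merge_common({:order => "updated_at desc"}, recent)

result = wrapped.call(7)
# result contains both :order (from the common options) and
# :conditions (from the original proc)
```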

While both the examples I gave showed using with_options on what is essentially model class configuration it is by no means limited to that. You could use it for that sort of configuration on your own classes or just inside a regular method - anytime you are making several method calls on the same object with a hash of options at the end.