If you're writing an iOS app that uses Core Data then you may well want to ship it with an initial database (which potentially gets over the air updates later on).

On iOS, CoreData stores always use sqlite3 as their backend. You could create a sqlite database directly, but you'd have to reverse engineer the way apple uses sqlite, ensure that you use the same name manging for table and column names, generate the same meta data used for persistent store migration etc. Too brittle for my liking.

Luckily both RubyCocoa and MacRuby allow you to access the Core Data framework from a ruby script. The former is bundled with Mac OS X since 10.5 (you will need to use the system ruby). RubyCocoa has some clunkier syntax because it doesn't have the benefit of the extensions to ruby, however at the moment MacRuby doesn't quite work with Active Record, which is where I was getting my seed data from. If your data is coming from else where, this may not be a problem. Other than the slight syntax differences around the handling of Objective-C's sort-of-named-arguments the code is the same

The RubyCocoa version looks like this

require 'rubygems'
require 'osx/cocoa'

OSX.require_framework 'CoreData'

class CoreDataStore
  def create_entity name, props={}, relationships={}
    entity = OSX::NSEntityDescription.insertNewObjectForEntityForName_inManagedObjectContext(name, context)
    props.each do |k,v|
      entity.setValue_forKey v, k
    end
    relationships.each do |k, objects|
      collection = entity.mutableSetValueForKey(k)
      objects.each {|o| collection.addObject o}
    end
    entity
  end

  def initialize(data_store_path, mom_path)
    @data_store_path = data_store_path
    @mom_path = mom_path
  end

  def context
    @context ||= OSX::NSManagedObjectContext.alloc.init.tap do |context|
      model = OSX::NSManagedObjectModel.alloc.initWithContentsOfURL(
          OSX::NSURL.fileURLWithPath(@mom_path))
      coordinator = OSX::NSPersistentStoreCoordinator.alloc.initWithManagedObjectModel(model)

      result, error = coordinator.addPersistentStoreWithType_configuration_URL_options_error(
         OSX::NSSQLiteStoreType, nil, OSX::NSURL.fileURLWithPath(@data_store_path), nil)
      if !result
        raise "Add persistent store failed: #{error.description}"
      end
      context.setPersistentStoreCoordinator coordinator
    end
  end

  def save
    res, error = context.save_
    if !res
      raise "Save failed: #{error.description}"
    end
    res
  end
end

Having done this, you use it like this

store = CoreDataStore.new('seedData.sqlite', 'yourmodel.mom')
blog = store.create_entity 'Blog', 'title' => 'Hello world', 'body' => "it's a fine day"
store.create_entity 'Comment', 'body' => 'I Agree', 'blog' => blog
store.save

The too arguments are the path to the file you want to store the data in, and the path to your model file.

Your Mom's a data model

You may be wondering what a .mom file is. In XCode you work with a .xcdatamodel file. When you build your app, this is compiled down into a .mom file. You can also do it your self by running /Developer/usr/bin/momc mymodel.xcdatamodel mymodel.mom

We recently upgraded to delayed_job 3.0 and immediately started seeing some major memory leaks in our app, in the delayed job workers, passenger instances and even standalone scripts which don't even use delayed job. In the end I tracked it down to a bug in YAML.load

Out of the box YAML support can be provided by 1 of 2 backends in ruby 1.9 : syck and psych. Syck is an older implementation based around a no longer support C library, whereas psych uses the newer and supported libYAML. The default backend is psych, but earlier version of delayed_job did work with psych, and so were forcing the yaml engine to syck (which doesn't have this bug). When we upgraded to 3.0 they fixed their problems with psych and so we (unintentionally) started used psych. Unfortunately the version of psych that comes with ruby 1.9 has a memory leak in YAML.load.

In particular this means that each time you load a model with serialised attributes, you leak memory. One of our very frequently used models has some serialized columns so that was why we were leaking. Delayed job obviously does a lot of yaml loading and so its workers were haemorrhaging memory.

Plugging the leaks

It took a bit of work to narrow down the leaks we were seeing to yaml but once that was done it turn out a few people have already written about this, notably over at nerdd.dk but I am somewhat amazed that knowledge of this issue is not more widespread. The issue is perhaps clouded by the fact that if libyaml isn't available when ruby is built ruby will just skip building psych (in which case syck is the only backend). Ruby 1.9.3 has a fixed version of psych, but disappointingly currently available versions of 1.9.2 (currently p290) still have this bug, 18 months after the release of 1.9.2.

Luckily there is a gem version of psych, however using it can be a bit fiddly if (as most rails apps do) you use bundler. Bundler loads psych early on its its setup process so you can't just stick psych in your Gemfile - both versions end up being loaded which causes an ugly mess.

nerdd.dk has a series of posts about how they tacked the various issues. In the end what I did was

  • set up config/setup_load_paths.rb to keep passenger happy: require 'rubygems' gem 'psych' require 'bundler' Bundler.setup
  • edit config/boot.rb to do gem 'psych' just after require 'rubygems'
  • hacked the stub executable for bundle to also have gem 'psych' after ruby gems is loaded

I was looking at moving an application to ruby 1.9.3 and was getting some strange syntax errors along the lines of "syntax error, unexpected keyword_do_block" on code that was working fine on 1.9.2. I spend quite a few minutes staring at the code which looked completely benign.

It turns out the ruby 1.9.2 is a bit too permissive: it allows you to write an extra comma after your argument list but before the do that marks the start of your block.

  some_method arg1, arg2, do
    ...
  end

ruby 1.9.3 on the other hand won't accept this.

Everynow and again I wind up rewriting a ruby performance hotspot in C. It happens infrequently enough that I always forget the C api for passing a block implemented in C to some ruby code. Hopefully writing this down will help me remember this in the future. Today, I wanted to call find_each on a class, using a C function as the block.

My code ends up looking like this:

static VALUE call_find_each(VALUE klass){
    return rb_funcall(klass, rb_intern("find_each"),0);
}

static VALUE block_method(VALUE yielded_object, VALUE context){
    /* do work here */
}

static VALUE some_method(VALUE self){
    /*
      set klass to the class we want to call find_each on
      set ctx  to a pointer to some context 
    */
    rb_iterate(call_find_each, klass, RUBY_METHOD_FUNC(block_method), (VALUE)ctx);
}

block_method is the C function that implements my block. some_method is the function that wishes to call find_each on klass passing the block.

As I understand it rb_iterate takes the 3rd argument you give it and sets things up so that the next ruby method call made will use that method as a block. The last argument passed to rb_iterate is just a context pointer. Although it is typed as VALUE it doesn't have to be a ruby object, rb_iterate just passes it to your block method. While in my code it's just a pointer to a C struct, in the most general case it's probably wiser for it to be a ruby object: in theory the method you're passing the block to could be converting it to a proc and saving it for later use - you wouldn't know when it was safe to deallocate your context object.

Now that rb_iterate has your block sorted out, the first 2 parameters come into play. The first is a function pointer and the second is a VALUE. Once rb_iterate has done its setup work it just calls your function pointer passing that value. In my example all I need to pass is the class on which to call find_each, but you could for example pass a ruby array that would contain arguments to pass to find_each. In the common case that you just want to call each on a collection, ruby provides the rb_each method which does exactly that:

    rb_iterate(rb_each, some_collection, RUBY_METHOD_FUNC(block_method), (VALUE)ctx);

would do some_collection.each using block_method as the block.

Dressipi is hiring

December 19th, 2011

Dressipi is looking for a mid to senior level developer to join their development team. If you love solving problems, teasing meaning out of large volumes of data and have a passion for writing well designed and tested code we'd love to hear from you. We develop recommendation systems that make shopping simpler and looking good effortless, using a combination of knowledgeable stylists and clever technology.

We're looking for:

  • a seasoned software engineer, with at least a year's experience of Ruby on Rails
  • good knowledge of MySQL (exposure to other databases/datastores a plus)
  • able to hit the ground running with a large and growing Rails 3 application.
  • TDD/BDD experience

Some experience of recommendation systems or machine learning techniques would be useful but by no means required. Frontend skills (HTML, CSS, jQuery) appreciated but not required.

You'll join a small, dynamic team to work on all the technology that powers Dressipi, from consumer facing web applications to heavy backend calculations and mobile apps. No knowledge (or interest in) fashion or clothes required - our delightful stylists have all the clothing expertise we need.

As well as working with the rest of the development team you will be working closely with the stylists, encoding their unique knowledge and understanding of clothing into Dressipi's algorithms.

You will take part in the full lifecycle of the product from understanding the business needs, to deploying the solution and analysing the resulting data.

Dressipi tackles problems in a wide range of areas, including recommendation systems, text extraction, visual recognition, crowd sourcing and more.

Request specs and Authlogic

December 5th, 2011

I was writing some rspec request specs the other day and was curious to notice that

before(:each) do
  activate_authlogic
  UserSession.create(@user)
end

wasn't working: the code under test didn't think the user was logged in at all. The same code works fine for controller specs and the authlogic documentation asserts that functional and integration tests should behave identically in this respect. Everything should just work.

Searching around revealed quite a few people with the same problem, all being told "don't do this, have your user login via your login form". While obviously actually testing the login form is important, to do so for every single request spec that involves a logged in user seems over the top. Taken to its extreme, every single request spec involving a logged in user should have the user signing up, editing their preferences and generally replaying the entirety of the history involving in going from a non user of the site to an active user in the desired state. Also, I was curious as to what was going on.

Cookie issues

When you call UserSession.create it validates the credentials provided (which is trivial when the arguments you pass are an actual user) and then stores the user's persistence token in a cookie (called user_credentials by default). It expects to have access to a controller like object that it can use for accessing cookies, request parameters / headers and so on.

AuthLogic::TestCase expects there to be a controller handing around, which in a controller spec / functional test there is (setup hooks create the controller, request & response ahead of time). In an integration spec on the hand the controller/request do not initially exist - Rails doesn't know what you're going to request. Because of this, Authlogic thinks you're testing from a model spec and so sets its cookies on a mock controller object. The Rails integration stack doesn't know anything about this mock controller object so when you make a request the authentication cookies never reach your controller(s).

In integration tests a thing called an integration session is in charge of maintaining state between requests, such as cookies, so that's where Authlogic needs to be setting its cookies.

Solution

I would have liked to be able to just use the integration session as Authlogic's controller, but that didn't quite work out. Authlogic expects the controller to have a request from which it can slurp stuff like the remote_ip but at the start of a request spec the request doesn't exist yet. In the end I did this

 cookies['user_credentials'] = "#{@user.persistence_token}::#{@user.send(@user.class.primary_key)}"

The cookies[]= method falls through (via method_missing) to the integration session object, so this cookie is stored in the right place. It's a bit yucky in that it makes assumptions about what the format of cookie value should be, but I can live with that. You may need to change the name of the cookie - Authlogic sets that based on the name of the user class and that sort of stuff so will vary from app to app. If you're using capybara rather than webrat to make your requests then you can use Capybara.currentsession.driver.browser.setcookie to set the cookie.

Dressipi is hiring

August 24th, 2011

Dressipi (where I work) is looking to hire a junior-ish software engineer to work on the guts of our various web applications, recommendation systems and mobile applications. We're a rails shop but we are above all looking for bright minds that can pick up new technologies with ease, so prior knowledge of rails is not required.

If you're interested drop us a line at jobs ]at[ dressipi.com with a covering letter & CV or any questions you might have.

NSData and RubyCocoa on 10.7

August 3rd, 2011

For some reason code like

OSX::NSData.dataWithBytes_length(data, data.length)

segfaults on lion. Others have noted that this only occurs when the data contains bytes with the high bit set. This smells like something trying to interpret the string in some encoding when the string is in fact arbitrary binary data.

I'd love to use MacRuby instead of RubyCocoa, but unfortunately MacRuby doesn't seem to be able to handle Active Record at the moment, so I can't use it (yet). I haven't had time to delve properly into how RubyCocoa converts between ruby and Objective-C objects but I was able to hack around by using ruby inline

require 'inline'
class CFDataGenerator
  begin 
    inline do |builder|
      builder.include "<CoreFoundation/CoreFoundation.h>"
      builder.c_raw_singleton "
      
      static VALUE from_string(int argc, VALUE *args){
        CFDataRef d = CFDataCreate(NULL, RSTRING_PTR(args[0]), RSTRING_LEN(args[0]));    
        
        VALUE result;
        char type = '@';
        ocdata_to_rbobj(Qnil,&type, (const void *)&d, &result, false) ;
        return result;
      }
    "
    end
  end
end

Then you can use CFDataGenerator.from_string to convert ruby strings into nsdata instances. Remember to release the instance when you're done!

Spork and parallel_tests

July 30th, 2011

What's more terrifying than being stabbed in the face with a spork? Being stabbed in the face with 2 sporks!

In these days where even laptops are dual or quad-core parallel_tests is a great tool for using all of those cores when running specs. Spork is another great tool for avoiding the overhead of loading rails, gems etc. when running tests.

Out of the box spork and parallel_tests don't work together. When spork receives a request to run specs it aborts its current run, so only one of your parallel_spec instances actually gets to run. In addition parallel_spec's trick of setting an environment variable to indicate which test environment should be used doesn't work, because the runners forked by spork don't inherit the environment parallel_spec sets up.

Fortunately there are at least two ways to get around this. I assume that you've already got parallel_tests up and running with the required extra databases, changes to database.yml etc.

Method 1: Multiple sporks

Grab the right version of parallel_tests

Make sure you've got parallel_tests 0.5 or later (Previous versions deal with the .opts file in a different way)

Gentlemen, start your sporks

Run as many sporks as you need. Run the first one on port 8989 (the default), the second on port 8991, the third on 8992 and so on. You should set TEST_ENV_NUMBER in each one, so your invocations should look like

TEST_ENV_NUMBER='' spork -p 8989
TEST_ENV_NUMBER='2' spork -p 8991
TEST_ENV_NUMBER='3' spork -p 8992

and so on

Setup parallel_spec.opts

Edit spec/parallel_spec.opts so that it looks like

--drb
--drb-port <%= 8989 + ENV['TEST_ENV_NUMBER'].to_i %>
--format progress
--format ParallelSpecs::SpecRuntimeLogger --out tmp/parallel_profile.log

Profit:

Run parallel_spec. When rspec is invoked it passes the .opts file through erb so the first instance (TEST_ENV_NUMBER='') will have --drb-port set to 8989, the second to 8991 and so on. Each rspec will connect to a different spork instance that has the correct TEST_ENV_NUMBER setup.

Method 2: Single hacked spork

A disadvantage to method 1 is that you have multiple instances of spork sitting around, consuming memory. It requires extra keystrokes to ensure each spork instance is restarted properly and more cpu cycles are consumed everytime you restart spork.

Use my spork

My fork of spork doesn't prohibit running multiple spec runs concurrently and allows environment data parallel_tests requires to be passed in. I've only modified the forking strategy.

Setup script/spec

#!/usr/bin/env ruby
require 'rspec'
require 'rspec/autorun'

module RSpec
  module Core
    class DRbCommandLine
      def run(err, out)
        begin
          DRb.start_service("druby://localhost:0")
        rescue SocketError, Errno::EADDRNOTAVAIL
          DRb.start_service("druby://:0")
        end
        spec_server = DRbObject.new_with_uri("druby://127.0.0.1:#{drb_port}")
        spec_server.run(@options.drb_argv, err, out, 'TEST_ENV_NUMBER' => ENV['TEST_ENV_NUMBER'])
      end
    end
  end
end

This ensures that when parallel_spec runs script/spec, the test environment number is passed through to the chikd

Make sure the right database is used

When spork forks a child with a specific TEST_ENV_NUMBER we need to ensure that the child connects to the correct database. This can be done with a Spork.each_run block in your spec_helper.rb. The app I've been working with uses mongomapper as well as Active Record, so my each_run block looks like this.

  if !ENV['TEST_ENV_NUMBER'].blank?
  #reconnect to your AR & mongo dbs
    db_config = YAML::load(ERB.new(File.read(Rails.root.join("config", "mongo.yml"))).result)
    if db_config[Rails.env]
      MongoMapper.setup db_config, Rails.env
    end
    ActiveRecord::Base.configurations = DressipiCom::Application.instance.config.database_configuration
    ActiveRecord::Base.establish_connection
  end

Profit

bundle exec spork and run parallel_spec as normal (things won't work if the 'normal' version of the spork binary is run). Your spork will fork multiple instances which will pick up TEST_ENV_NUMBER and reconfigure their database settings as needed.

Sporking with akephalos

July 3rd, 2011

Spork is a test server for saving on environment startup time when running rspec or cucumber. Akephalos is an awesome capybara driver that wraps the htmlunit library. Because htmlunit is a java library, akephalos spawns a jruby process and then uses DRb to allow non-jruby instances to get at the htmlunit goodness.

If you try and combine the two, you get a somewhat uninviting error message:

Fs-ElephantBook-2:dressipi fred$ cucumber --drb features/user_signs_up.feature 
Using the default profile...
Disabling profiles...
.F---------------------------

(::) failed steps (::)

0x00000083df8904 is not id value (RangeError)
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:375:in `_id2ref'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:375:in `to_obj'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1405:in `to_obj'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1713:in `to_obj'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:613:in `recv_request'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:908:in `recv_request'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1533:in `init_with_client'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1545:in `setup_message'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1497:in `perform'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1592:in `block (2 levels) in main_loop'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1588:in `loop'
(druby://127.0.0.1:8990) /Users/fred/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/drb/drb.rb:1588:in `block in main_loop'
(druby://127.0.0.1:60779) -e:1
./features/step_definitions/authentication_steps.rb:31:in `block (2 levels) in '
./features/step_definitions/authentication_steps.rb:30:in `each_pair'
./features/step_definitions/authentication_steps.rb:30:in `/^I sign up with$/'
features/user_signs_up.feature:10:in `When I sign up with'

Which a shame as spork would be especially useful here because akephalos can take quite a few seconds to launch its jruby process.

DRb

First let's step back and remind ourselves how DRb works. DRb (short for distributed ruby) is a distributed object system for ruby. In the simplest of cases, when you call a method on a remote object, your arguments are marshalled and transmitted over the wire. In the remote process, the arguments are unpacked, the method called and then the result is marshalled and sent back to you.

Obviously this doesn't always work - some objects (procs, IO streams etc.) can't be dumped. A very common example would be if you want to call a method and pass a block. So, in addition to the previously described scheme, DRb also allows making proxies for objects. Instead of getting or passing a marshalled dump of an object it uses a proxy that knows the process the object has come from (a uri like druby://127.0.0.1:8990) and something identifying the object in that process. By default it uses the object_id for the object, which DRb can then use to find the actual object. This can require some careful coding, as if in the remote process the object appears unused it might get garbage collected. If this happens then when you try to call a method on it DRb won't be able to turn the object_id back into an actual object and you get an exception like the one above.

When you run code such as

some_drb_object.do_foo { }

Then there are actually DRb calls happening in both directions. The client asks the server process to run do_foo, but because there is a block passed to the method DRb wraps up that block in a proxy object and sends it to the server along with the arguments. When do_foo calls yield, a DRb call happens in the reverse direction in order to invoke the block.

Akephalos and Spork

Back to the original problem. When you run some specs or cucumber features with spork, spork forks off a new instance and then uses DRb to run your specs/features in that process. It's the combination of that use of DRb with akephalos' use of DRb that causes the problem. After digging around I traced the exception raised to this line in akephalos:

page.find(selector).each { |node| nodes << Node.new(self, node) }

page is a DRb object from akephalos's jruby process that wraps the htmlunit representation of the page. A little more digging revealed that the invalid id was in fact referring to the block passed to each. You may have noticed that the error is being raised from druby://127.0.0.1:8990, ie from spork's parent process.

There was already a DRb server running in spork parent process ( druby://127.0.0.1:8990 ), and because our test running process was forked from that process, when DRb packaged up that block into a proxy object it gave the uri for that server as the uri to use when making calls to the block. When each yielded to the block, instead of the method call going to the process that called each, it went to the spork parent process. Because the call went to the wrong process, the object id passed is entirely meaningless.

Fixing it

I'm not entirely sure of the best way to fix this - akephalos is blissfuly ignorant of this aspect of its environment. I was able to get my cucumber features running with both spork and akephalos by doing this in my env.rb

Spork.prefork do
  ... #my usual Spork.prefork stuff
  Akephalos::RemoteClient.manager #force akephalos to be started by the parent process
                                  #so that we don't keep starting it again and again
end

Spork.each_run do
  Thread.current['DRb'] = { 'server' => DRb::DRbServer.new }
end

This ensures that when DRb is constructing a proxy object it uses the uri for the freshly constructed DRbServer as the uri for any proxy objects it passes, and so block invocations and the like go to the correct process.

You'll all have seen lines like this in your log files:

Completed 200 OK in 14ms (Views: 0.2ms | ActiveRecord: 2.7ms)

While obviously a very high level view of your application's performance, it's still a good place to start when you're wondering where the time has gone. If you're using a different datastore or if external services (solr, sphinx, external API calls) are important contributors to your application's performance you might want time spent in those services listed too.

Prior to Rails 3, it was hard to add information to this summary without making a mess of things. What was reported on was just hardcoded into Action Controller. Things are a lot easier in Rails 3.

New in Rails 3 is ActiveSupport::Notifications. The idea here is to decouple things producing performance data and things consuming them.

ActiveSupport::Notifications.instrument('some_task.my_app', data) do
  #do some processing
end

will record how long it took to do the task, and then yield that data to anyone interested (instances of subclasses of ActiveSupport::LogSubscriber). Action Controller's process action is wrapped by a call to instrument and one particular subscriber to that is one that writes what Action Controller has been up toto the app's log file. If you wanted to analyze your app's data other than by eyeballing the log files you could write your own class that could subscribe to these notifications and do whatever processing you wanted, instead of writing something to parse your log files.

A simple example

Suppose we want the time we spend query solr to be logged. Following the way Active Record does things, you need to do three things: instrument the action we wish to monitor using ActiveSupport::Notifications, record the information in a subscriber and then plug the resulting data into Action Controller.

At its barest, the instrumentation might look like this

module Solr
  class Connection
    def send_with_instrumentation(request)
      ActiveSupport::Notifications.instrument("send_query.solr") do
        send_without_instrumentation(request)
      end
    end
    alias_method_chain :send, :instrumentation
  end
end

Here send is a funnel through which all queries to solr flow.

The log subscriber might look a little like this:

module Solr
  class LogSubscriber < ActiveSupport::LogSubscriber
    def self.runtime=(value)
      Thread.current["solr_runtime"] = value
    end

    def self.runtime
      Thread.current["solr_runtime"] ||= 0
    end

    def self.reset_runtime
      rt, self.runtime = runtime, 0
      rt
    end

    def send_query(event)
      self.class.runtime += event.duration
    end
  end
end

Solr::LogSubscriber.attach_to :solr

The call to attach_to means that whenever a foo.solr event is fired, our subscriber's foo method will be called. The only thing we do in our subscriber is keep track of how much time has been spent, using the runtime and runtime= method. reset_runtime resets the counter and returns the time accumulated so far.

The last thing we need to do is use the hooks that Action Controller helpfully provides. The following is lifted pretty much line for line from Active Record[1]:

module Solr
  module Railties
    module ControllerRuntime
      extend ActiveSupport::Concern

    protected

      attr_internal :solr_runtime_before_render
      attr_internal :solr_runtime_during_render

      def cleanup_view_runtime
        self.solr_runtime_before_render = Solr::LogSubscriber.reset_runtime
        runtime = super
        self.solr_runtime_during_render = Solr::LogSubscriber.reset_runtime
        runtime - solr_runtime_during_render
      end

      def append_info_to_payload(payload)
        super
        payload[:solr_runtime] = (solr_runtime_before_render || 0) + 
                                 (solr_runtime_during_render || 0) +
                                 Solr::LogSubscriber.reset_runtime
      end

      module ClassMethods
        def log_process_action(payload)
          messages, solr_runtime = super, payload[:solr_runtime]
          messages << ("Solr: %.1fms" % solr_runtime) if solr_runtime
          messages
        end
      end
    end
  end
end
ActiveSupport.on_load(:action_controller) do
   include Solr::Railties::ControllerRuntime
end

cleanup_view_runtime is used to prevent double counting of time spent performing database queries etc. as both database time and view rendering time. The call to super in cleanup_view_runtime will wind up actually rendering the view, so looking at the solr time before and after this tells us how much time we need to deduct from the view render time (which is what the call to super returns).

append_info_to_payload is where we record how much time has been used in total, which is the sum of solr time before, during and after rendering the view.

log_process_action (a class method) acts on a payload hash and produces an array of components such as 'ActiveRecord: 32.2ms'.

Finally we add an on_load hook that will include this module in ActionController::Base.

Put all this together and you should see stuff like

Completed 200 OK in 17ms (Views: 10.8ms | ActiveRecord: 0.3ms | Solr: 3.7ms)

in your logs.


[1] At the time of writing rails 3.0.7 actually does this incorrectly because it doesn't consider time after rendering, or when no view was rendered at all

Showing view spec exceptions

April 19th, 2011

When an exception is raised during rendering of a view it is wrapped inside an ActionView::Template::Error which is then reraised.

This leads to unhelpful failure messages in view specs, along the lines of

contact/index.html.haml renders a form to send feedback
     Failure/Error: it "renders a form to send feedback" do render
     ActionView::Template::Error
     # ./app/views/contact/index.html.haml:19
     # ./spec/views/contact/index.html.haml_spec.rb:10:in `block (2 levels) in '

This little monkey patch

module RSpec
  module Core
    class Example
      def set_exception(exception)
        if exception.is_a?(ActionView::Template::Error)
          @exception ||= exception.original_exception
        else
          @exception ||= exception
        end
      end
    end
  end
end

ensures that the original exception is the one that is displayed.

Unhelpfully helpful

April 6th, 2011

I loathe helper :all. What's the point of nicely divided helper modules if they're all injected into same namespace, overwriting each other at random? I like to write reusable helpers that individual controllers can use by implementing their version of a foo method. I see no reason why two completely unrelated helper modules couldn't both has a display_help_text method. helper :all prevents both of these and creates situations where you can accidentally overwrite a helper method in a completely different part of your application.

Obviously not everyone agrees with me because Rails continues to move in the other direction. Once upon a time, a bare controller only pulled in its default helper (ie Bar::FooController gets Bar::FooHelper) and the helpers for its ancestors. Later, the default application skeleton added helper :all to ApplicationController but this was still an opt in scheme. With Rails 3 it is an opt out scheme: you have to put clear_helpers in a controller to clear out the helpers Rails includes automatically for it.

View Specs

Rspec tries to do the right thing - if you poke into view_example_group.rb in rspec-rails you can see it determining the default helper class and including that and ApplicationHelper into the view object. Unforunately it also includes controller._helpers. This controller isn't an instance of the controller class that will use the view - it's an instance of ActionView::TestCase::TestController, so has Rails' default of the full set of helpers, irrespective of how the controller in the app is configured.

The first step is of course calling clear_helpers on ActionView::TestCase::TestController but if your application has helper modules that are used by multiple controllers then some of your view specs may start to fail. You could of course add those helpers to the view from your before(:each) blocks but that is a slightly nasty piece of duplication: you've already specified once which helpers should be used. More dangerously you could develop a situation where your view specs pass because they specify one set of helpers but your app fails because your controllers specify a different set.

To remedy this I've got the following in one of my apps:

module ViewHelpersSpecHelper
  extend ActiveSupport::Concern

  included do
    if metadata[:type] == :view
      name_parts = metadata[:example_group][:description].split('/')
      name_parts.pop

      controller_name = "#{name_parts.join('/').camelize}"
      unless controller_name =~ /\wMailer/
        controller_name += 'Controller'
      end

      helpers = controller_name.constantize._helpers
      before do
        view.singleton_class.class_eval do
          include helpers unless included_modules.include?(helpers)
        end
      end
    end
  end
end

This works out what controller class the view will normally be used with, for admin/foo/bar.html.haml this would be Admin::FooController and uses the helpers that controller would use normally.

A call to config.include(ViewHelpersSpecHelper) in my RSpec.configure block ensures that this module gets included everywhere relevant.

Cells

This starts to break down for partials that are rendered by multiple different controllers or for shared ones that don't have a natural controller they can be tested with. This is perhaps the one thing that helper :all doesn't make worse, since it does mean that a given view is always rendered with the same set of helper functions available. If you do have such units of view and helper code, the cells library is well worth a look.

Range#include? changed in ruby 1.9 . In ruby 1.8, testing whether a value was included in a range meant testing that it was greater than the start of the range and smaller than the end. Ruby 1.9 will iterate through all values in the range, checking whether any of them are equal to the value being tested.

The justification is that this is needed for some special cases. The example given in the link above is a range of objects wrapping integers, where #succ is defined to add 2 to the wrapped integer and comparison is defined by comparing the wrapped integer. If we write f(x) for the object wrapping x then you would have f(1) <= f(2) and f(2) <= f(3), but f(2) is not in the sequence generated by calling #succ repeatedly on f(1), since that yields only objects wrapping odd numbers.

While more correct in this sense, this is obviously a lot slower. It's also not immune from a different class of issues. As Xavier Noria pointed out on the rails-core list, it's easy enough to construct ranges with an infinite amount of elements in them for example. Calling include on them would hang forever on 1.9.

In the real world, my main concern is the speed. In our app we had a validates_inclusion_of on dates with a 100 year date range. On 1.8 this was instant, but on 1.9 calling include? on that range was taking around 130ms (on a 2.66Ghz Core i7).

Luckily the old include? logic is available as cover?. As of 3.0.5 Rails will use cover? rather than include? for validates_inclusion_of, until then you could drop something like this in an initializer.

if RUBY_VERSION =~ /1\.9/
  module ActiveModel
    # == Active Model Inclusion Validator
    module Validations
      class InclusionValidator < EachValidator
        def validate_each(record, attribute, value)
          match = if options[:in].is_a?(Range)
            options[:in].cover?(value)
          else
            options[:in].include?(value)
          end

          unless match
            record.errors.add(attribute, :inclusion, options.except(:in).merge!(:value => value))
          end
        end
      end
    end
  end
end

Get them here. I had a great time there - well worth going!