Hot off the presses: the RailsConf Europe proposal submitted by my colleague Paul Butcher and me has been accepted! Join us in Berlin to hear about the nifty stuff we’ve been up to. I can’t wait!
Dealing With Concurrency
Sometimes you don’t want a user doing two things at once (or two users doing something to the same third party at once). Dealing with this sort of issue is quite fiddly: such problems are easy to overlook and hard to track down when they happen, as they tend to be heavily timing dependent[4]. There are a few tricks in the Rails toolbox for dealing with them.
There’s a related issue that this post is not about: suppose two users bring up the edit form for some object. They each make unrelated changes to the form and save, one after the other. The second user will squash the changes made by the first user. Oops. I’m not concerned about that here: those two save requests could be minutes apart, whereas this post is only about what happens when two requests are actually concurrent.
One way of solving that problem is for editable objects to have an edited_by association and to disallow edits while someone else is in the process of editing. Another way is to get clever about how you apply changes to attributes. While not directly relevant, the things outlined here may still be useful. For example, if you go down the route of only allowing the user named in the edited_by association to edit the record, then you need to be able to reliably set that field even if two users click the edit button at the same time.
And now, our feature presentation
When I was at school the teacher used to dish out ‘bon points’ when you did something good. You could later trade those in for stickers and stuff. In a world very far removed from French primary school classrooms of 1990, the teacher might have written a webapp allowing you to do this (probably complete with an obnoxious Facebook app where you could show everyone how many magic stars you have).
Optimism
The people table has a magic_stars column, and we want people to be able to spend their magic stars on various perks. If someone has been a good boy, you might want to use the credit method:
You need to be a little bit careful, as there’s a race condition here: suppose two people try to give someone a magic star at the same time:
| Person 1 | Person 2 |
|---|---|
| loads child (3 stars) | loads child (3 stars) |
| adds 1 | adds 1 |
| saves (4 stars) | |
| | saves (4 stars) |
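The interleaving in the table can be replayed in plain Ruby. This is only a sketch: PEOPLE is an in-memory stand-in for the people table, and load_person and save_person are made-up helpers, not ActiveRecord calls.

```ruby
# In-memory stand-in for the people table: id => attributes.
PEOPLE = { 1 => { magic_stars: 3 } }

# Hypothetical helpers: load hands back a private copy of the row and
# save writes the whole copy back, like a naive read-modify-write.
def load_person(id)
  PEOPLE.fetch(id).dup
end

def save_person(id, attrs)
  PEOPLE[id] = attrs.dup
end

# Both requests load the child before either of them saves.
copy_a = load_person(1)
copy_b = load_person(1)

copy_a[:magic_stars] += 1
copy_b[:magic_stars] += 1

save_person(1, copy_a)
save_person(1, copy_b) # overwrites the first save with the same value

puts PEOPLE[1][:magic_stars] # prints 4, although two stars were awarded
```
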
The two requests execute in separate Mongrels (or mod_rails listeners etc…), are completely oblivious to each other and will each happily overwrite the change made by the other. Rails’ optimistic locking will help us here: just add an integer lock_version column to the model (don’t forget to make it default to 0) and the second ‘bad’ save will raise ActiveRecord::StaleObjectError. We can rewrite our credit method like this:
So how does optimistic locking work? The key (unsurprisingly) is in the lock_version column. Assume that we loaded a person object with a lock_version of 4 and we now want to save it. We increment our lock version to 5. If no one else has touched this object then in the database it will still have a lock_version of 4.
Rails adds a “lock_version = 4” condition to the WHERE clause of the update query, and then examines the number of rows that were updated (the database driver will return this). If we get 1 then we know everything went ok. If 0 rows were updated then we know that someone else must have touched that row in the meantime, and so Rails raises StaleObjectError.
Optimistic locking is nice because we’re not sitting on a database lock at any point[5], so we won’t hold up anyone else. The most obvious limitation is that it only protects you if you are actually modifying a row (more on that later).
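To make the mechanism concrete, here is a plain-Ruby imitation of the version check. Nothing in it is Rails code: FakeTable, save! and run_demo are made up for the illustration, and StaleObjectError is a local stand-in for ActiveRecord::StaleObjectError. The recovery strategy shown (reload and reapply) is one reasonable choice, not necessarily what your app should do.

```ruby
# Local stand-in for ActiveRecord::StaleObjectError so this runs without Rails.
class StaleObjectError < StandardError; end

# A one-row "table"; the row carries a lock_version like the column in the text.
class FakeTable
  def initialize
    @rows = { 1 => { magic_stars: 3, lock_version: 0 } }
  end

  def find(id)
    @rows.fetch(id).dup
  end

  # Imitates UPDATE ... WHERE id = ? AND lock_version = ?
  # and returns the number of rows affected (0 or 1).
  def update(id, attrs, expected_version)
    row = @rows[id]
    return 0 unless row && row[:lock_version] == expected_version
    @rows[id] = attrs.merge(lock_version: expected_version + 1)
    1
  end
end

# Save using the version we loaded; zero affected rows means someone beat us to it.
def save!(table, id, attrs)
  affected = table.update(id, attrs, attrs[:lock_version])
  raise StaleObjectError, "row #{id} changed underneath us" if affected.zero?
end

def run_demo
  table = FakeTable.new
  conflicts = 0
  first = table.find(1)
  second = table.find(1)            # both copies see lock_version 0
  first[:magic_stars] += 1
  second[:magic_stars] += 1
  save!(table, 1, first)            # wins: lock_version becomes 1
  begin
    save!(table, 1, second)         # stale: still expects lock_version 0
  rescue StaleObjectError
    conflicts += 1
    fresh = table.find(1)           # reload and reapply the change
    fresh[:magic_stars] += 1
    save!(table, 1, fresh)
  end
  table.find(1).merge(conflicts: conflicts)
end

p run_demo
```

The row ends up with 5 stars, a lock_version of 2 and exactly one detected conflict, instead of silently losing a star.
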
Pessimism
The name optimistic locking suggests that there is an alternative, and there is: ask the database to get an actual lock for you. You can do this in two ways in Rails: you can pass :lock => true to a finder (instead of passing true you can pass an SQL fragment if you want to change the type of lock acquired) or you can call the lock! method (which is effectively a reload with :lock => true). It’s important to note that a lock is only held for as long as the current transaction[1]. In particular, if you aren’t inside a transaction then nothing happens; in other words, asking for a lock outside of a transaction doesn’t accomplish anything at all. Inside a transaction however, you’ll get an exclusive lock and no other transaction will be able to get a lock for that row. We can rewrite our buy method to look like
The lock stops another connection from fiddling with that person at the same time. It’s just a row lock, so it won’t stop you editing a different person at the same time[2]. If someone called our credit method from above at the same time that would be ok, as updating the customer row to change the amount of stars they have also requires an exclusive lock, so we’re in no danger here.
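As an analogy only (no database involved), the effect of the row lock can be imitated in plain Ruby with a Mutex standing in for the lock and threads standing in for concurrent requests. FakePerson, run_purchases and the body of buy are all made up for this sketch.

```ruby
# Stand-in for a person row; the Mutex plays the part of the row lock
# that the database would hold until the transaction ends.
class FakePerson
  attr_reader :magic_stars

  def initialize(stars)
    @magic_stars = stars
    @row_lock = Mutex.new
  end

  # Hypothetical buy: check the balance and deduct while holding the lock,
  # so nobody else can act on a balance we are about to change.
  def buy(cost)
    @row_lock.synchronize do
      return false if @magic_stars < cost
      sleep 0.001 # widens the window in which an unlocked version would race
      @magic_stars -= cost
      true
    end
  end
end

# Fire off several concurrent purchases and count how many succeeded.
def run_purchases(person, attempts, cost)
  threads = Array.new(attempts) { Thread.new { person.buy(cost) } }
  threads.map(&:value).count(true)
end

person = FakePerson.new(10)
successes = run_purchases(person, 10, 3)
puts "#{successes} purchases, #{person.magic_stars} stars left"
```

With 10 stars and a price of 3, exactly three purchases go through and one star is left, however the threads are scheduled. Without the lock, several threads could pass the balance check before any of them deducts.
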
We can use these locks in more general ways, even when the customer row isn’t the thing we actually want to change. For example, if you have a constraint like ‘a person may only borrow a certain number of books’ then you pretty much have to enforce it in Ruby, which leaves you vulnerable to the race conditions mentioned before.
You’re not modifying any rows (just adding one to a join table), so optimistic locking is no good. You can however lock the person row[3]. You’re not actually changing it at all, just using the database as a synchronisation method. We can repackage this up as something we can use all over our app:
Then in your app you can do things like
It should be an easy exercise for the reader to put this in a form where any ActiveRecord model has this exclusive method.
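One possible shape for that exercise is a small mixin. This is a sketch rather than the original code: the module name Exclusive is my own, and FakeModel exists only so the example runs without Rails. The module relies on nothing but the instance methods transaction and lock! described above.

```ruby
# Mixin sketch: only needs the host object to respond to
# transaction (taking a block) and lock!, as ActiveRecord models do.
module Exclusive
  def exclusive
    transaction do
      lock!      # take the row lock; it is held until the transaction ends
      yield self
    end
  end
end

# Stand-in for a model so the sketch runs without Rails; it records calls.
class FakeModel
  include Exclusive
  attr_reader :events

  def initialize
    @events = []
  end

  def transaction
    @events << :begin
    result = yield
    @events << :commit
    result
  end

  def lock!
    @events << :lock
    self
  end
end

model = FakeModel.new
model.exclusive { |m| m.events << :work }
p model.events # prints [:begin, :lock, :work, :commit]
```

In a real app you would include the module into your models (or into ActiveRecord::Base) instead of FakeModel. Later versions of Rails also ship a built-in with_lock method that does much the same thing.
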
[1]People sometimes ask about the difference between Person.transaction, the instance method transaction and ActiveRecord::Base.transaction. There is no difference between the first two: the instance method just calls self.class.transaction. The transaction methods on Person, ActiveRecord::Base or some other model class differ only if the models use different database connections. Most of the time, the three can be used interchangeably.
[2]If you use a database without row locks (i.e. MySQL with MyISAM tables) you’ll probably get something horrible like a table lock.
[3]Do be careful about deadlocks though.
[4]The sleep function can be rather helpful when you’re trying to force particular bits of code to overlap with each other in particular ways.
[5]Other than, of course, the implicit locks that happen when you update a row.
Squeeze Your Pipes
It’s very easy to merrily write a web application without realising that all those images and ajax calls you’re using make it rather sluggish when you’re not just connecting to localhost. Even when you’ve uploaded it to your production or test servers, chances are you’ll have pretty good bandwidth and latency between you and those servers, so it can be hard to see just what it will be like for an end user who is less well endowed in the broadband department (or even still on dialup).
It’s not just about size
Very often it’s not just about how many bytes a second you could be transferring: the amount of latency is also a critical component of what the app feels like to the end user. Luckily we can simulate both.
The piece of kit we need is a traffic shaper, a piece of software that sits in between you and the end server and modulates the flow of packets. If you’re using Linux or Mac OS X you’ve probably already got everything you need. I’m a Mac nerd so I’ll concentrate on that.
Luckily it’s pretty damn easy on the Mac as ipfw has all the necessary bits (from 10.4 upwards). The first thing we need to set up is a pipe. To quote the man page, “A pipe emulates a link with given bandwidth, propagation delay, queue size and packet loss rate”.
sudo ipfw pipe 1 config bw 300Kbit/s
sudo ipfw pipe 2 config bw 500Kbit/s delay 100
This creates two pipes: one with a bandwidth of 300Kbit/s, and the other with a bandwidth of 500Kbit/s and a propagation delay of 100ms (i.e. when a packet goes into that pipe it won’t come out the other end for 100ms).
Having set up the pipes, you then need to add firewall rules that send traffic through them. For example if you’ve got your mongrel running on port 3000 you can run
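To get a feel for what those numbers mean, here is a back-of-the-envelope estimate in Ruby. It is a deliberately crude model of my own: it assumes 1 Kbit is 1000 bits, that the delay applies in each direction (so a round trip costs twice the delay), and it ignores TCP slow start, parallel connections and caching. The page size and the number of round trips are invented for the example.

```ruby
# Crude estimate: raw transfer time plus latency for each round trip.
# delay_ms is the one-way delay, so every round trip pays it twice.
def estimated_seconds(bytes, kbit_per_s, delay_ms: 0, round_trips: 1)
  transfer = (bytes * 8) / (kbit_per_s * 1000.0)
  latency = round_trips * 2 * delay_ms / 1000.0
  transfer + latency
end

# A hypothetical 150,000 byte page through a pipe like pipe 1 (300Kbit/s, no delay).
puts estimated_seconds(150_000, 300).round(2)

# The same page through a pipe like pipe 2 (500Kbit/s, 100ms delay),
# assuming it takes 10 sequential round trips to fetch everything.
puts estimated_seconds(150_000, 500, delay_ms: 100, round_trips: 10).round(2)
```

Under these assumptions the first estimate comes out at about 4 seconds and the second at about 4.4 seconds: the pipe with more bandwidth ends up slower once latency is taken into account.
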
sudo ipfw add pipe 2 tcp from any to any src-port 3000
sudo ipfw add pipe 2 tcp from any to any dst-port 3000
to send traffic to/from port 3000 through pipe number 2. When you’re done with your testing or if you mess up, run
sudo ipfw show
to show all the firewall rules you’ve created. The output will look a little like
00300 72 18982 pipe 2 tcp from any 4500 to any
00400 24 1896 pipe 2 tcp from any to any dst-port 4500
65535 58931 55276029 allow ip from any to any
The first column is the rule number (so 300 and 400 in my case). To remove the rules just run
sudo ipfw delete rule-number
This isn’t the sort of thing I’d even begin to worry about at the beginning of the project, but it is worth keeping at the back of your mind.
Reload Me, Reload Me Not
Rails’ development[1] mode automatic reloading is pretty nifty. It would really suck having to restart the server every time you made a change: there are the seconds wasted each time, the manual pressing of buttons you’d have to do and the 5 minutes wasted here and there when you forgot to restart. I should probably point out that reloading is a misnomer: it implies things are actively loaded (which they aren’t). What actually happens is, roughly, that Rails forgets that it has seen Customer and loaded customer.rb, so the next time it hits Customer it goes off down the const_missing chain and loads your file again. There’s no point reloading Rails itself though, so all that stuff stays around. No point reloading plugins either really.
Reloading a plugin could be useful if you were working on it or trying to diagnose a problem with it, but maybe that’s an edge case. That’s until your plugin starts to contain something with a long-lived reference to one of your application’s classes (or an instance thereof). A classic example is model classes with associations going over the plugin/app boundary. A quick trip to the console shows the problem:
I’ve got a teeny tiny app with 2 models: messages and conversations (think forums): conversations have many messages.
reload! makes Rails do its class reloading (a useful trick in itself if you’re fiddling around at the console and want to see some changes). But what happened? We had a perfectly good instance of conversation and then it got trashed! Let’s take a closer look:
So we stash away the Conversation class. It’s got an object id, the methods we would expect, all good. Then we reload! and all the methods are gone. If you ask for Conversation again you get a different class! This is how class reloading of ActiveRecord classes works: the old class is gutted and the constant removed, but we can’t stop people hanging onto references to the old class[3] (incidentally, if you ever get an incomprehensible message about methods not existing when you can see the method definitions right in front of you then you’ve probably run into a variant of this).
So if you’ve got a plugin holding onto a reference to some class, this is what is likely to happen to it: nonsensical messages about methods not existing when they really should. Maddeningly, the first time you load a page it will load fine, but refresh it and it’s gone! Your tests will pass too, and if you’re even half sane then your app dying even though the tests pass will give you a bad feeling. If you’re lucky the only bad thing will be that changes won’t be noticed until a restart, but even that’s quite annoying (especially if you didn’t expect it: application classes are reloaded, after all!).
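The forget-and-load-again cycle can be imitated in a few lines of plain Ruby. This is a toy, not what Rails does internally: the App module, DEFINITIONS (standing in for files on disk) and forget are all made up, and unlike Rails the sketch doesn’t strip the methods from the old class.

```ruby
module App
  # Stand-in for files on disk: constant name => code that defines the class.
  DEFINITIONS = {
    Conversation: -> { Class.new { def subject; "hello"; end } }
  }

  # Like the const_missing hook: define the class the first time it is referenced.
  def self.const_missing(name)
    builder = DEFINITIONS[name]
    return super unless builder
    const_set(name, builder.call)
  end

  # "Reloading" is really forgetting: drop the constant so that the next
  # reference goes through const_missing again.
  def self.forget(name)
    remove_const(name) if const_defined?(name, false)
  end
end

old_class = App::Conversation   # first reference "loads" the class
App.forget(:Conversation)
new_class = App::Conversation   # loaded again: a brand new class object

puts old_class.equal?(new_class)     # prints false
puts old_class.new.is_a?(new_class)  # prints false
```

Anything that kept hold of old_class (or an instance of it) is now talking to a class that the rest of the app no longer knows about, which is exactly the position a plugin finds itself in.
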
The way out
The obvious way out is to reload the plugin as well[4]. The set of things that are loaded only once is controlled by Dependencies.load_once_paths[5] (except for Rails itself, that’s special) and by default plugin lib directories are added to it. As long as you load via the Rails dependency mechanisms, any constant loaded from a path not in that array will be reloaded after a request. This is why a Ruby-style require of application classes is usually a bad thing: it can stop Rails from reloading some of your classes, which leads to the problems seen above.
In my dummy app I’ve got
There’s an easy way out: in the plugin’s init.rb just stick [2]
If we boot up our sample app again:
Job done!
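The one thing that can trip you up with this approach is that removing the path is a plain string comparison (see [2]). A small stand-in shows the difference. The array here is an ordinary Ruby array playing the part of Dependencies.load_once_paths, the plugin path is invented, stop_loading_once is a made-up helper, and the example assumes Unix-style paths.

```ruby
# Hypothetical helper: remove a directory from a load-once list.
# Array#delete compares strings, and returns nil when nothing matched.
def stop_loading_once(paths, dir)
  paths.delete(dir)
end

# Plain array standing in for Dependencies.load_once_paths (invented path).
load_once_paths = ["/app/vendor/plugins/magic_stars/lib"]

# Same directory on disk, spelt differently: nothing is removed.
p stop_loading_once(load_once_paths, "/app/vendor/plugins/../plugins/magic_stars/lib")

# Normalising the path first produces a string that does match.
normalised = File.expand_path("/app/vendor/plugins/../plugins/magic_stars/lib")
p stop_loading_once(load_once_paths, normalised)
p load_once_paths # prints []
```

Normalising only helps if the stored string is itself in normalised form, so it is worth printing the real array in your own app to see what you need to match.
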
[1]None of this applies if config.cache_classes is true (for example in production mode).
[2]If you load plugins from non-standard locations, you may have to fiddle with that: the string you delete must exactly match what Rails put in Dependencies.load_once_paths. That it corresponds to the same location on disk is not enough.
[3]This does mean I was lying slightly when I said this could be handy if you’re working on a plugin: if what your plugin is doing is extending some rails base class with a module, then this won’t help. Rails won’t have stripped the methods from the module (so nothing will break), but ActionController (for example) will still have the old module included in it and so won’t see the changes.
[4]Another lie. Aren’t I naughty? The classes from the plugin’s lib folder will be reloaded as needed but the plugin’s init.rb won’t be run again (which in a way is the reason for the previous caveat: we never include the new module into ActionController). Rerunning init.rb would be messy though: apart from anything else it wouldn’t unwind the changes we made, so if we removed some methods from our module they would still be in ActionController.
[5]Dependencies is moving to the ActiveSupport namespace, so replace every occurrence of Dependencies with ActiveSupport::Dependencies if you are running a version later than 2.1.
Blogging Is Good for the Soul
This might not hold true for your vitriolic rants about your ex-girlfriend and similar, but blogging technical content is great. It’s good for the community in general when there are good howtos, explanations etc… and I’m sure a lot of people like the feel of being read by many people, but it’s not just that.
It’s often said that a true test of whether you understand something is whether you can explain it to someone else. Even if you were to erase a blog post the second you finished it, chances are you will have learnt something. I always do, even if I’m writing about something I already know well. Putting something in the form of a coherent piece of writing pushes me to explore all the edge cases and ramifications which, even if I was aware of them, I hadn’t fully thought through before.
So go and write something interesting and do yourself and the community some good. Pick something interesting you worked on recently, a common question or misunderstanding, anything will do!