Space Vatican

Ramblings of a curious coder

Who’s Afraid of the Big Bad Lock?

Unless you’ve been living under a rock for the past few years you will have heard of C-ruby’s GIL/GVL, the Global Interpreter/VM Lock (I prefer to think of it as the Giant VM Lock). While your ruby threads may sit on top of native pthreads, this big bad lock stops more than one thread from running ruby code at the same time. Allegedly one of the main reasons behind this was to protect non-threadsafe ruby extensions and also to shield us from the horrors of threading. Personally I feel that a lot of 3rd party gems needed updating for 1.9 anyway (particularly for encoding related issues), so it would have been a good opportunity to make that change. The complexities of threading, locking etc. could be handled by providing higher level abstractions over them (actors etc.).

True concurrency isn’t completely dead though. Ruby can already release the GVL when a thread is blocked on IO and if you are writing a C extension you can release the lock too. The mysql2 gem does this: clearly there is no point in holding onto the GVL when you’re just waiting on mysql to return results. Similarly Eric Hodel recently submitted a patch to the zlib extension so that the lock is released while zlib is doing its thing. This obviously doesn’t make mysql queries or zlib run any faster individually but it means you can run many in parallel and that these operations don’t block other unrelated threads. When even laptops have hyperthreaded quad-core processors, this is a good thing.
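To make the point about blocked threads concrete, here is a minimal sketch. It uses `sleep` as a stand-in for a blocking call such as a database query (an assumption for illustration; `sleep` releases the GVL in much the same way as waiting on IO), and the helper names are made up for this example.

```ruby
require "benchmark"

# Hypothetical helper: run `count` blocking waits one after another
# and return the elapsed wall-clock time in seconds.
def timed_serial_waits(count, seconds)
  Benchmark.realtime { count.times { sleep(seconds) } }
end

# Hypothetical helper: run the same waits, one per thread.
# Each thread gives up the GVL while it sleeps, so the waits overlap.
def timed_threaded_waits(count, seconds)
  Benchmark.realtime do
    threads = Array.new(count) { Thread.new { sleep(seconds) } }
    threads.each(&:join)
  end
end

puts format("serial:   %.2fs", timed_serial_waits(4, 0.1))
puts format("threaded: %.2fs", timed_threaded_waits(4, 0.1))
```

On C-ruby the threaded version should take roughly as long as a single wait rather than four of them. Swap the `sleep` for a CPU-bound loop in pure ruby and the advantage should largely disappear, because only one thread can hold the lock at a time.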

Goodbye Mephisto, Hello Octopress

For a long time, this blog was hosted on an ageing mephisto install. I took some time this weekend to migrate to octopress. It’s much more current and is more modern in many ways (a responsive layout, nicer syntax highlighting options, integration with things like gist and jsfiddle being just a few). These days a lot of my blogging is done on the train, so I was writing in a text editor and then pasting into mephisto, which was needlessly convoluted.

IAM Roles

It’s not uncommon to want to make requests to AWS from your servers. For example you might be using S3 to store user uploads, pushing custom metrics into CloudWatch or accessing one of the datastores such as DynamoDB or SimpleDB. All of these require authentication, and getting your access key / secret key onto your instances has historically been a little messy.

Using Multi Search With Tire

New in elasticsearch 0.19 is the ability to batch searches together via the multi search api. As far as I know this doesn’t result in elasticsearch doing any less work (unlike sphinx’s multi query support, which is able to work out when your batch contains common sub expressions and the like) but it does cut out a lot of latency.

I was motivated to poke at this by some work I was doing at dressipi: a particular page required nearly 20 (similar but distinct) search queries to be run. The individual queries were quick, around 10-15ms each, but repeated 20 times this was adding up to around 250ms (and this was on my dev machine, so very low latency between my app and elasticsearch). Once I reimplemented it to use multisearch it was down to about 30-40ms for the same batch of 20 queries.
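For a feel of what gets batched, here is a rough sketch of the request body the multi search api takes: newline-delimited JSON, with a header line and a query line for each search. It goes through the raw format rather than Tire, and the index name, the queries and the `msearch_body` helper are all made up for illustration.

```ruby
require "json"

# Hypothetical helper: build a multi search request body.
# Each search contributes two lines: a header naming the index,
# then the search definition itself.
def msearch_body(index, searches)
  lines = searches.flat_map do |search|
    [JSON.generate({ "index" => index }), JSON.generate(search)]
  end
  lines.join("\n") + "\n" # the body has to end with a newline
end

# Illustrative searches only; the field and values are invented.
searches = [
  { "query" => { "term" => { "colour" => "red" } } },
  { "query" => { "term" => { "colour" => "blue" } } }
]

puts msearch_body("garments", searches)
```

The whole body goes to the `_msearch` endpoint in a single request and the responses come back in the same order as the searches, which is why the saving is one of round trips rather than work done by elasticsearch.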

Slides From My LRUG Elasticsearch Talk

I spoke on elasticsearch at lrug on June 11. You can download the slides I used if you want. It's always slightly nerve-wracking to stand up and speak in front of a crowded room but I usually enjoy it once I get going and yesterday was no exception. Nice to see a good sized crowd despite the competing attentions of the football and wwdc!