At Dressipi we use rspec request specs for our integration testing and these days we use capybara rather than webrat. A lot of our specs use the built in rack-test driver but a fair few use capybara-webkit (I’d like to look at poltergeist but haven’t quite got round to it yet).
Database connection management has long been an issue with tools such as capybara or selenium. You want to wrap your tests in transactions because that makes them much faster: you don’t have to cleanup the database after each spec and you can load content shared by a bunch of specs once and once only. But transactions being problems: capybara runs your app in a separate thread to your spec, so your app a separate database connection. Transactions are a per-connection thing so the transaction wrapping your spec have no effect on the connection used by the app thread: changes made by the app aren’t rolled back. On top of that by default transactions can’t see uncommitted changes from other transactions, so your app thread can’t see seed data your spec code has created. If you change the transaction isolation level then connections can see uncommitted transactions, however you’ll very soon end up with deadlocks in your code due to the locks acquired by the separate transactions.
Typically you do very little database access from your request specs, just insert a record here or check some state there, so for a while I used a bit of a hack to allow the spec code to run code inside the app thread. This was pretty messy.
Sharing is good but only if you behave
A little while ago (I don’t know who claims credit for this), the following bit of code started doing the rounds
1 2 3 4 5 6 7 8 |
|
It’s included as part of cucumber-rails as the SharedConnectionStrategy
. What this does it fairly simple. Normally the connection
method locates a fresh connection for your thread via Rails’ connection pool. This monkeypatch allows you to force it to always return the same connection, so both your spec code and the capybara app use the same connection, despite being on separate threads. As a result the transaction begun at the start of spec encompasses any changes made by the app thread. Your spec code can make changes visible by the app thread and you spec code can see any changes made by the app.
If you’re keen on your threading you’ll probably smell a rat: things like database connections typically aren’t meant to be used simultaneously by multiple threads simultaneously.
In ruby 1.8 and in the early days of 1.9 you’d usually get away with this: database queries called into C code to do its querying and then returned ruby results. Calling into C code blocks the entire VM and so you’d never actually run 2 queries simultaneously.
The mysql2 gem is much smarter - it uses rb_thread_blocking_region
to escape from ruby’s GVL. If you abuse this and actually run 2 queries against the same connection simultaneously then you’ll start getting errors such as Mysql2::Error: This connection is still waiting for a result
. I’m not a postgres user but given that the pg gem uses the async api it’s probably subject to similar issues: you’re not supposed to call PQsendQuery
until you’ve called PQgetResult
enough times to get all of the results.
It’s quite easy to get away with this- we ran with that patch for a long time with no issues. I suspect that this is down to a lot of tests inherently building in serialization: your spec code spends most of its time waiting for capybara finder calls to complete. Those calls only complete when capybara finds the element on the page, which will usually mean that the page has loaded, ie the thread serving your app is now idle. As a result there won’t be much in the way of database queries from the app until you call a method like visit
or click
, so you can access the database from your spec without fear.
More complicated web apps widen the window of opportunity substantially: for example the main page load might trigger several ajax calls in the background. We started writing request specs against code like this and they would fail unpredictably.
Mike Perham suggested using this instead:
1 2 3 4 5 6 7 8 9 |
|
When ActiveRecord::Base.connection
is called the first time, @@shared_connection
is nil, so what is returned is a ConnectionPool::Wrapper
. This gets assigned to @@shared_connection
so all subsequent calls to connection
return the ConnectionPool::Wrapper
. This is an object that uses method_missing
to forward everything to the connection it wraps, but only allowing one thread access at a time.
We use spork, so
1
|
|
line goes in our Spork.each_run
block - if you do it before then the shared connection ends up being created in the spork parent process which will cause no end of trouble.
This really feels to me like it should work - I can’t fault the logic, yet we were still getting the occasional random Mysql2::Error: This connection is still waiting for a result
error when running our request specs.
Reusing the wheel
Active Record already has a connection pool, so what could go wrong with just setting that pool size to 1 in database.yml? Surely that will ensure that we’re not using multiple connections simultaneously. The only issue here is that, for performance reasons, once a thread checks out a connection it can’t be used by another thread until you check it back in (unlike the connection pool wrapper that checks things in and out for you around each method call). In a normal rails app, the ActiveRecord::ConnectionAdapters::ConnectionManagement
middleware checks connections back in for you after the request, but clearly that doesn’t apply to your spec code.
There are 2 connections that need to be managed (assuming you app doesn’t spawn threads of its own): the one used by your spec code and the one used by the app. Normally the app code would be handled by the aforementioned middleware but this is deactivated in specs (if request.env['rack.test']
is true) because it’s normally not desirable in tests: if you are using the standard rails integration tests then everything executes in the same thread and so checking the connection back in at the end of the request would actually mess things up. There did’nt seem to be a good way of overriding this behaviour, so I ended up just sticking this in test.rb
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
|
This is a straight up copy of the original middleware except that it checks connections in even if rack.test
is set.
The other connection is used by your spec code. I haven’t come up with anything better than calling ActiveRecord::Base.clear_active_connections!
(aliased to something shorter to type) before calling visit
, click
etc. This could possibly be improved by magically calling it when the appropriate capybara driver methods are called. If you forget to do this then your spec code will hog the database connection and the capybara thread will hang and eventually raise an exception saying that it couldn’t checkout a connection.
We’ve been using this setup for a little while now and (fingers crossed) random request spec failures seem to be a thing of the past.