Archive for April, 2008

Using migrations to transform data

Monday, April 28th, 2008

Usually I use migrations to just add and remove columns. But today I needed to modify the database schema of a live app and should thus preserve the data already present. Although not hard implementing it revealed some pitfalls.

The task at hand was to augment the users in the system with roles. We had a column in the Members table which made a member a manager or not(boolean). This information is now extracted to a roles table (we are using RoleRequirement). It is now possible to assign many different roles to each member.

The migration I wrote at first looked something like this:


def self.up
r = Role.new(:name => "manager"
r.save
remove_column :members, :manager
managers = Member.find_all_by_manager(true)
managers.each do |manager|
manager.roles << r
manager.save
end
end

However this migration failed on the manager.save, complaining that the column manager on the table members no longer existed. And it didn’t of course. The column had just been removed! I figured that the attributes in the manager objects still contained the :manager => true key-value pair. So I deleted that but the error persisted. Which makes sense when I thought about it. The migration loaded the Rails framework with all inferred classes. One of these classes was the Member class which contained at the time of loading still a manager field. Trying to save the member without one of the attributes causes an error. Reloading framework seemed a bit overkill as well.

In the end the solution was, as most of the time, deceptively simple. Remove the column at the end of the migration!


def self.up
r = Role.new(:name => "manager")
r.save
managers = Member.find_all_by_manager(true)
managers.each do |manager|
manager.roles << r
manager.save
end
remove_column :members, :manager
end

But how do you test your migration? I took a stab at this in the form of an RSpec test:


describe Member, "should migrate properly" do
it "should migrate correctly from a column manager to a role manager" do

`rake db:drop`
`rake db:create`
`rake db:migrate VERSION=59`
m1 = Member.new(:manager => true)
m1.save_with_validation(false)
m2 = Member.new(:manager => true)
m2.save_with_validation(false)
m3 = Member.new(:manager => true)
m3.save_with_validation(false)
`rake db:migrate VERSION=60`
Member.find(m1.id).has_role ?("manager").should be_true
Member.find(m2.id).has_role?("manager").should be_true
Member.find(m3.id).has_role?("manager").should be_true
end
end

But this does not work yielding a MySQL timeout. This has to do with the fact that all tests are run in transaction mode preventing other database changes during the tests. This particular test waits for the lock to be released which never happens and thus times out.

I have yet to figure out how to do this, but I think this is really important! I would love to be able to test my migrations as these migrations manipulate live data which can not be lost. Any suggestions are welcome.

BackgroundRB versus Beanstalkd

Thursday, April 24th, 2008

This seems to be a recurring problem. Sending a lot of emails from Rails is still a bit murky. You can’t really use ActiveMailer for this because when you are trying to send 12.000 emails the request does not return a respond until all the 12.000 emails are sent. This leads to confused users who click the button again, timeouts etc etc. A general mess. You could use a cron job but that does not make that much sense as you would expect the emails being send as you press a button and not when a cron job comes along. Again confused users.

What you really want is to be able to background tasks and be done with it! For this there are many options. Unfortunately. But I believe the two candidates worth looking at are:

Beanstalkd is being used on Facebook(need I say more?). BackgroundRB enjoys more attention and has recently been rewritten and thus presumably increased in quality. The first is not as tightly coupled with Rails as the latter but that can be amended with a plugin.

Then there is the matter of compiling beanstalkd… That was no fun at all. Well maybe I made it too hard for my self and I did not exactly knew how to manipulate ‘make’. But I wanted the latest version(0.11) of beanstalkd and MacPorts only had 0.9 in stock. So I downloaded the source, followed the readme and hit make. Of course that didn’t fly. I needed to point the makefile to the correct lib directory. Long story short, I did install beanstalkd through port and took the makefile from that install and patched the makefile in the source download. Here is the patched makefile. This worked like a charm! However this made me a bit concerned how up to date the packages on our Debian staging server were. It turned out that I needed to install both the dependencies and the actual beanstalkd from source. I’d prefer gems then.

For more information on beanstalkd I gladly refer to this blog post as I decided after this ordeal that I would go forward with backgroundrb. This decision was mainly based on the better logging abilities of backgroundrb and the from source install as opposed to some gems.

A parting thought; look at the footer of backgroundrb’s website. Notice the link called Ginger Ninja? Click it. I think it’s at least weird and perhaps doggy.