ActiveRecord nitpick – no constant iteration over select results

I need to iterate over a big set of data, and I want to do it without sucking the whole thing into memory. Unfortunately, as far as I can tell, ActiveRecord loads the entire result set into memory before handing it back, which is a pity because it could probably be made to do something like this

Foo.find_and_loop(:all, :conditions => '....') do |res|

without too much trouble. You’d need some fancy interaction between AR and the connection adapters so that the loop body runs at a low level, once per fetched row, but it should be possible. All the databases that I looked at let you operate that way in their C APIs.

Doubtless, this is something “opinionated”, and of use to only a small percentage of Rails users, so it’s probably not worth submitting a bug report, but it’s a minor annoyance all the same.
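In the meantime, one workaround is to page through the results instead of streaming them. This is not the cursor-level loop described above, just a rough sketch of the paging idea: it issues several queries and keeps only one page in memory at a time. The helper name each_in_pages and the fetch_page callable are made up for illustration. In a real app, fetch_page would presumably wrap something like Foo.find(:all, :limit => ..., :offset => ...) with a stable :order, but here it reads from a plain array so the sketch runs without a database.

```ruby
# Hypothetical helper (not part of ActiveRecord): walks a large result set
# one page at a time. `fetch_page` is any callable that takes (limit, offset)
# and returns an Array of rows.
def each_in_pages(fetch_page, page_size = 1000)
  offset = 0
  loop do
    rows = fetch_page.call(page_size, offset)
    break if rows.empty?
    rows.each { |row| yield row }
    # A short page means the result set is exhausted.
    break if rows.size < page_size
    offset += page_size
  end
end

# Stand-in data source so the sketch runs without a database.
TABLE = (1..2500).to_a
pages_fetched = 0
fetch = lambda do |limit, offset|
  pages_fetched += 1
  TABLE[offset, limit] || []
end

sum = 0
each_in_pages(fetch, 1000) { |row| sum += row }
```

Only page_size rows are held at once, at the cost of one query per page.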

Caching “streamed” content in Rails

In a previous article, I illustrated one way of generating streaming content programmatically.

The application that uses it is still slow to send the content, though, because it has to talk to the database, process the data, and so on. The ideal solution would be to cache the output. However, the normal Rails caching solutions don’t work, because the return value is a Proc, and if you try to write that to disk, you just get the equivalent of proc_object.inspect, which doesn’t do me much good.

However, I figured out a clever way of running the exact same controller method “standalone”, and stashing the results to disk. I placed this file in scripts/cacher:

#!/usr/bin/env ruby

ENV['RAILS_ENV'] = ARGV.first || ENV['RAILS_ENV'] || 'development'

require File.dirname(__FILE__) + '/../config/boot'
require "#{RAILS_ROOT}/config/environment"
require 'console_app'

app.get '/mycontroller/big_results_method'

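resp = app.controller.response

# The response body is a Proc that expects (response, output);
# call it with an open file as the output to stash the results on disk.
File.open("#{RAILS_ROOT}/public/cached_big_results.csv", 'w') do |out|
  resp.body.call(resp, out)
end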

I think it could be improved, because it seems to be running the DB query twice – once during the actual app.get, and once again when the Proc is called. But, in any case, it works, and now I can call it from a cron job.
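To make the problem and the workaround a bit more concrete, here is a small standalone sketch. The Proc is a stand-in for the streaming body, not the real controller output, and it assumes the body is called with (response, output) as in the script above. A StringIO plays the part of the cache file.

```ruby
require 'stringio'

# Stand-in for the Proc that a streaming action hands back as its body.
# The (response, output) signature is an assumption carried over from the script.
body = Proc.new do |response, output|
  output.write("id,name\n")
  3.times { |i| output.write("#{i},row#{i}\n") }
end

# Naive caching: stringifying the Proc stores its inspect-style description,
# not the content it would generate.
naive = body.to_s

# Workaround: call the Proc yourself and pass any object that responds to
# write (a File in the real script, a StringIO here).
buffer = StringIO.new
body.call(nil, buffer)
cached = buffer.string
```

After this runs, naive holds a string beginning with "#<Proc:", while cached holds the generated CSV text.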