How Does An ORM Work?
13 Aug 2012
One of the things all of us use regularly and (most?) of us have no idea how it actually works is an
Object Relation Mapper or ORM. I think it is crazy that a piece that we rely on and couldn't work without anymore is so foreign to most of us. So I decided to change that.
And end of this wordy piece I want you to know how a ORM works. Of course you won't know all the details, like how some ORMs seem to be able to do more with the database that you even knew was possible. But some basic should be cleared up by then.
On DataMapper
For this, I decided to deep dive into
DataMapper. DataMapper is an alternative to ActiveRecord, but with less magic, which is great if you are trying to read something.
A basic DM model looks something like this:
class Duck
include DataMapper::Resource
property :id, Serial
property :name, String
property :hungry, Boolean
end
DataMapper::Resource is the base include that enables all the magic DataMapper functionality. Like in ActiveRecord, where you inherit from
ActiveRecord::Base.
The
property method is defined in
DataMapper::Model::Property.
This method does a lot of stuff! In fact, it spans from
line 27 to line 99! First up, it determines the class based on the type we provided in the class definition:
klass = DataMapper::Property.determine_class(type)
So if we define
property :name, String, it will determine that we want a String! Mind bending, isn't it? Afterwards, it will instantiate the class (in our case
DataMapper::Property::String). Then we get the now defined properties in the given repository (basically a "different" DB connection) and add our property to it. Easy.
Now we have some filler code until the end, with the exception of these two babies:
create_reader_for(property)
create_writer_for(property)
This (shockingly!) creates the readers and writers for the property we defined.
So what we know by now is that DataMapper keeps a list of all properties defined for a given class, so it can query, search, create etc. based on these values.
Querying DataMapper
So we can create a model, but that is really now all that a ORM does. With a ORM, we can of course query for data, and that is what makes it really interesting! So let's get to it!
Querying is done in
DataMapper::Model. Lets take an example query:
Duck.all
We find all on
lines 341 - 348.
def all(query = Undefined)
if query.equal?(Undefined) || (query.kind_of?(Hash) && query.empty?)
# TODO: after adding Enumerable methods to Model, try to return self here
new_collection(self.query.dup)
else
new_collection(scoped_query(query))
end
end
So we are generating a new collection with the given query. Cool.
Model#new_collection makes a new collection (
line 749 for the lazy ones). D'uh!
So what happens when we generate a new collection? This:
def initialize(query, resources = nil)
raise "#{self.class}#new with a block is deprecated" if block_given?
@query = query
@identity_map = IdentityMap.new
@removed = Set.new
super()
# TODO: change LazyArray to not use a load proc at all
remove_instance_variable(:@load_with_proc)
set(resources) if resources
end
(
On Github)
You might now realize: Nothing happens! No database connection is made, nothing! In fact, DataMapper doesn't actually perform the database call until really, really necessary! It just returns a "placeholder" collection if you want that has the information needed to actually perform the call.
One way to trigger that call would be
#first:
def first(*args)
first_arg = args.first
last_arg = args.last
limit_specified = first_arg.kind_of?(Integer)
with_query = (last_arg.kind_of?(Hash) && !last_arg.empty?) || last_arg.kind_of?(Query)
limit = limit_specified ? first_arg : 1
query = with_query ? last_arg : {}
query = self.query.slice(0, limit).update(query)
if limit_specified
all(query)
else
query.repository.read(query).first
end
end
Quite a bit of code here, luckily we can ignore most of it. For example the first few lines are just applicable if we actually pass in some options (which we won't do for now). The first really interesting line is line 16. Here we actually call
query.repository.read(query).first.
Here we call
repository#read. So lets see this in action.
def read(query)
fields = query.fields
types = fields.map { |property| property.dump_class }
statement, bind_values = select_statement(query)
records = []
with_connection do |connection|
command = connection.create_command(statement)
command.set_types(types)
# Handle different splat semantics for nil on 1.8 and 1.9
reader = if bind_values
command.execute_reader(*bind_values)
else
command.execute_reader
end
begin
while reader.next!
records << Hash[ fields.zip(reader.values) ]
end
ensure
reader.close
end
end
records
end
This is a lot simpler than it looks. It takes the query, constructs a select query and then executes it. Simple as that! In the end, it maps the values it got back from the Database into "real" objects and returns the given records to us.
So That's It
Lets review all that. So an ORM needs a structure to tell it how the database tables should be mapped to the object. DataMapper does this with the
DataMapper::Model#property method, where ActiveRecord reads this data from a schema file. In the end, both automatically generate setters and getters for the properties.
Then we have a layer that stores the queries, the results, etc. and constructs the right, database agnostic data so the query can be fulfilled.
And the third layer, where I really glossed over very shortly, is the database adapter (Do you want to know some more about the database adapter themselves? Then contact me
on Twitter or in the comments and tell me about it!). The database adapter knows how the query should be run in the database and knows how to transform the results back into the objects.
And that's the way we query a database!
If you liked this post, you should follow me on Twitter and signup for the RSS feed!