featureenvy

The blog of a thinkerer.
By @featureenvy

All rights reserved.

How Does An ORM Work?

One of the things all of us use regularly and (most?) of us have no idea how it actually works is an Object Relation Mapper or ORM. I think it is crazy that a piece that we rely on and couldn't work without anymore is so foreign to most of us. So I decided to change that. And end of this wordy piece I want you to know how a ORM works. Of course you won't know all the details, like how some ORMs seem to be able to do more with the database that you even knew was possible. But some basic should be cleared up by then.

On DataMapper

For this, I decided to deep dive into DataMapper. DataMapper is an alternative to ActiveRecord, but with less magic, which is great if you are trying to read something. A basic DM model looks something like this:
class Duck
  include DataMapper::Resource

  property :id, Serial
  property :name, String
  property :hungry, Boolean
end
DataMapper::Resource is the base include that enables all the magic DataMapper functionality. Like in ActiveRecord, where you inherit from ActiveRecord::Base. The property method is defined in DataMapper::Model::Property. This method does a lot of stuff! In fact, it spans from line 27 to line 99! First up, it determines the class based on the type we provided in the class definition:
klass = DataMapper::Property.determine_class(type)
So if we define property :name, String, it will determine that we want a String! Mind bending, isn't it? Afterwards, it will instantiate the class (in our case DataMapper::Property::String). Then we get the now defined properties in the given repository (basically a "different" DB connection) and add our property to it. Easy. Now we have some filler code until the end, with the exception of these two babies:
create_reader_for(property)
create_writer_for(property)
This (shockingly!) creates the readers and writers for the property we defined. So what we know by now is that DataMapper keeps a list of all properties defined for a given class, so it can query, search, create etc. based on these values.

Querying DataMapper

So we can create a model, but that is really now all that a ORM does. With a ORM, we can of course query for data, and that is what makes it really interesting! So let's get to it! Querying is done in DataMapper::Model. Lets take an example query:
Duck.all
We find all on lines 341 - 348.
def all(query = Undefined)
  if query.equal?(Undefined) || (query.kind_of?(Hash) && query.empty?)
    # TODO: after adding Enumerable methods to Model, try to return self here
    new_collection(self.query.dup)
  else
    new_collection(scoped_query(query))
  end
end
So we are generating a new collection with the given query. Cool. Model#new_collection makes a new collection (line 749 for the lazy ones). D'uh! So what happens when we generate a new collection? This:
    def initialize(query, resources = nil)
      raise "#{self.class}#new with a block is deprecated" if block_given?

      @query        = query
      @identity_map = IdentityMap.new
      @removed      = Set.new

      super()

      # TODO: change LazyArray to not use a load proc at all
      remove_instance_variable(:@load_with_proc)

      set(resources) if resources
    end
(On Github) You might now realize: Nothing happens! No database connection is made, nothing! In fact, DataMapper doesn't actually perform the database call until really, really necessary! It just returns a "placeholder" collection if you want that has the information needed to actually perform the call. One way to trigger that call would be #first:
    def first(*args)
      first_arg = args.first
      last_arg  = args.last

      limit_specified = first_arg.kind_of?(Integer)
      with_query      = (last_arg.kind_of?(Hash) && !last_arg.empty?) || last_arg.kind_of?(Query)

      limit = limit_specified ? first_arg : 1
      query = with_query      ? last_arg  : {}

      query = self.query.slice(0, limit).update(query)

      if limit_specified
        all(query)
      else
        query.repository.read(query).first
      end
    end
Quite a bit of code here, luckily we can ignore most of it. For example the first few lines are just applicable if we actually pass in some options (which we won't do for now). The first really interesting line is line 16. Here we actually call query.repository.read(query).first. Here we call repository#read. So lets see this in action.
      def read(query)
        fields = query.fields
        types  = fields.map { |property| property.dump_class }

        statement, bind_values = select_statement(query)

        records = []

        with_connection do |connection|
          command = connection.create_command(statement)
          command.set_types(types)

          # Handle different splat semantics for nil on 1.8 and 1.9
          reader = if bind_values
            command.execute_reader(*bind_values)
          else
            command.execute_reader
          end

          begin
            while reader.next!
              records << Hash[ fields.zip(reader.values) ]
            end
          ensure
            reader.close
          end
        end

        records
      end
This is a lot simpler than it looks. It takes the query, constructs a select query and then executes it. Simple as that! In the end, it maps the values it got back from the Database into "real" objects and returns the given records to us.

So That's It

Lets review all that. So an ORM needs a structure to tell it how the database tables should be mapped to the object. DataMapper does this with the DataMapper::Model#property method, where ActiveRecord reads this data from a schema file. In the end, both automatically generate setters and getters for the properties. Then we have a layer that stores the queries, the results, etc. and constructs the right, database agnostic data so the query can be fulfilled. And the third layer, where I really glossed over very shortly, is the database adapter (Do you want to know some more about the database adapter themselves? Then contact me on Twitter or in the comments and tell me about it!). The database adapter knows how the query should be run in the database and knows how to transform the results back into the objects. And that's the way we query a database! If you liked this post, you should follow me on Twitter and signup for the RSS feed!

Why Having A Vision Is More Important Than Anything Else

Do you know why we advance slower than we have some decades ago? Do you know that we went from the planning stage to the moon in 11 years? That is if you consider the time from the creation of NASA to the time we actually landed on the moon. But I think the most important part was the speech that President Kennedy gave*. He said
First, I believe that this nation should commit itself to achieving the goal, before this decade is out, of landing a man on the Moon and returning him safely to the Earth
Which is, as someone from that time would have put it: "Thy be insane?". Or something like that. Now we know better, of course, because the US had this vision. So how will having a vision help you? * Actually, the most important thing was probably the Germans. Because both the Soviet Union and the United States took, after World War II, rocket engineers from Germany to kick-start their endeavors. Yes, the famous rocket we all saw flying around were based on Nazi technology. So yeah…

So, A Vision, Any Vision?

Of course, not any vision will do. For example, saying "I will be one of the people in my circle that some people will know" isn't a good vision. So isn't "I will be the biggest site on the internet in 3 months". The first one isn't really something measurable, and, honestly, lame. And the second one isn't really possible to achieve. What makes a good vision? A good vision is measurable. For example, "I will have 2000 subscribers by the end of the year" is simple to understand, measurable, and we always know where we stand. But it isn't a vision yet, it is just a goal. A vision should be something that can never really be reached, but motivates us to continue working. A vision is something that we can strive for, something that is measurable, and something that is barely reachable. So, a good vision could be:
We will become the number one payment processor in the world with customers that just love us!
Ok, the love part is a bit hard to measure, but it is a good guideline. And I'm sure we could find some grad students who would love to study, well, love. Should we give this customer a refund? Without a vision, we don't know, and to some we may give, to others we won't, which will confuse the customers even more. But with that vision? The action is clear!

A Good Vision Is A Good Guide

A vision isn't just for suits, or for people who want to sell something to suits (read: a company). No matter what your project is, I will bet that having a vision for it will help you immensely. For one, we won't have to ask if we should show an error message, or if it won't annoy users more if we actually tell them that they are stupid. If the vision is to build the most robust, secure system, then yes, we should show a general error message. If the software should be the most usable piece of software ever written, then we have to redesign it so that this error can't possibly happen. If we write a command line utility for Linux, we don't even have to care about that. :) A good vision answers probably 90% or more of all the small questions we have while developing a new project.

Sit Around, It's Story Time!

For example just recently I started on a new project. The first thing I did was asking for a vision. And they just repeated: "Oh yeah, build this basic thing here, we can try to analyze the rest later, it will be fine." So I started implementing it. And lo and behold, it wasn't what they wanted. It showed error messages that were clear to us, but not to the expected user. It required things which seemed obvious to me, but which weren't actually needed. At all. Even though it technically did everything they asked, it just didn't match with their vision. And yes, by now I do have a vision for the project. Most people actually have a vision in mind, but they somehow don't want to spell it out. It's crazy!

Vision == Motivation

A vision keeps us motivated. That's all. Lets go to the next paragraph. Oh, you want some more details? Ok. Well, it's simple. We can code, and code, and code, but we will never "get there", we will never "get closer to the finished product". We will always be stuck in a rut. Why? We reach the next goal, so our boss just figures out a new one for us. It sucks! We don't see the end of the tunnel. But a vision changes that! With a vision, we suddenly have something to strive for. The goals suddenly aren't arbitrary points that the boss sets for us, but are actually real points that bring us closer to the ultimate goal. Which is really what we want. Because then, at the end of the day, we can say: "Yeah, I got closer to the vision, and people are even more happy!"

Steve Jobs, A Visionary

One of the things that I think is just amazing about Steve Jobs is that he always had a vision. Maybe his private vision was to be a jerk and still be considered a successful, beloved business man, and if it was, he would have succeeded. But even more amazing are his visions for the company. For example, before the iPhone, no one thought this will be possible. And Steve Jobs just said: "It is, do it!". Or as Steve (being a lot more eloquent than I am) would have put it:
Do we have what it takes to establish a third category of products? The bar is pretty high. It has to be far better at doing some key things. We think we have the goods. Our most advanced technology in a magical and revolutionary device at an unbelievable price.
Now that's a vision! If you liked this post, you should follow me on Twitter and signup for the RSS feed!

What you should know about SOLID

Did you always wonder what that whole SOLID thing is all about? Why everyone is shouting at you for violating the Law of Demeter (you won't go to jail for that one, by the way)? And did you talk about soccer when people mentioned the Liskov substitution principle and looked at you confused? Then read on! I always like going to the SOLID wikipedia page because SOLID is a mnemonic acronym. SOLID stands for Single responsibility, Open-closed, Liskov substitution, Interface segregation and Dependency inversion. Makes sense yet? Lets briefly go over each one.

Single Responsibility Principle

The Single Responsibility Principle says that every class should have one job, and one job alone, but do it really, really well. If you want to know if your class is violating SRP, describe what the class does. If you use and or or in the sentence, it is a strong sign that you violate the SRP principle. Let's look at an example of a duck:
class Duck
  attr_accessible :name

  def eat
    @food -= 1
    remove_food
  end

  def swim
  end

  def remove_food
  end
end
A very simple class that does one thing: Keep track of a duck (in a pond for example). The problem with this example is that if we want to describe it, we would probably say the following: It keeps track of a duck and keeps track of the food. Yes, the dreaded and. So how would we rewrite this example? One way to encapsulate this better is to remove the food in a Food class and not manage the food supplies in the Duck itself. SRP also says that the model should hide the implementation details. That's why I use in the example eat instead of grab_food. How we eat is an implementation detail of the Duck model itself. If we want to change how food is handled, the only thing we should have to change is the Duck class itself, and nothing else.

Open/Closed Principle

The next principle is about open/closed classes. But you might now way "But Codepoet, in Ruby all classes are open!". And you would of course be right. But that's not the whole point of the open/closed principle. If we want to follow this principle, all classes cannot be changed once they implementation is completed, except for bug fixing. If the behavior should be changed for one part of the system, the class should be open, however, to be extended in another class. This is one of those principles which is sometimes hard to follow, since requirements change, and programs evolve. So sometimes a class has to be changed, or added some functionality. Always making a new class just isn't feasible. However, keeping this principle in mind is a good thing, especially with Ruby where we have easy ways to extend classes and even objects. Maybe some methods shouldn't be added to the base class, since most of the system doesn't need them, but just to one part of the system.

Liskov Substitution Principle

LSP (not to be confused with LSD) states that any object inheriting from another has to keep the constraints of the superclass intact. This includes preconditions (this value can't be null), postconditions (return value is rounded to an integer) and invariants. This sounds too abstract? I agree, so let's make an example: Given we develop a system to measure ducks. So of course we have a RealDuck class that can swim, eat, quack, etc. Swim means we give it a value, it proceeds the given distance, and then returns. If we then ask it for the position, it will have advanced the given distance. Easy as pie! Now we want to compare the real ducks with a mechanical replacement duck (the reason to do this is left to the reader). So we subclass it, add a new method replaceBatteries, and we are done! But not quite! Even though this looks perfect on paper, we just violated Liskovs Substitution Principle. Why? Because now, if the MechanicalDuck swims, it might not get a single feet ahead because the battery is empty. But that's an invariant that the RealDuck guarantees! (Except if you insure the duck before, but come on!) Most of the violations of LSP are small, but can have a huge rippling effect if the rest of the system relies on a different behavior.

Interface Segregation Principle

Another principle with an impressive name. But this one is simple to understand: Design your interfaces so that they deal with one thing, and only one. Sounds familiar? It should, because it is similar to the Single Responsibility Principle discussed above. The difference between the two is that SRP deals mostly on the class level, while ISP is concerned mostly about the overall architecture of a system. If you follow this principle, then your interfaces will be small and to the point, without anything in there that can't be separated out. Lets look at the RealDuck and the MechanicalDuck again. The duck has to eat, of course, and that's why it is in the class. But a mechanical device doesn't have to eat! So it doesn't make any sense that the mechanical duck has an eat method. So the interface is too large for this use case, and it should be refactored out.

Dependency Inversion Principle

The Dependency Inversion Principle states that high-level and low-level modules shouldn't be tightly coupled together, and that they can be replaced. This is especially useful for testing the higher-level components, because the lover-level modules that they depend on can simply be replaced by mock modules. In Ruby, one way to achieve this goal is to add an optional argument to initialize where the dependency can be replaced. To give an example:
class Duck
  def initialize(food_source = DefaultFoodSource.new)
    @food_source = food_source
  end
end
This way we can add another food source if we want (for example, an InfiniteFoodSource), so that we can test the duck in isolation.

And That's It!

Yes, that's it. We are done. No more talking about sports when Liskovs Substitution Principle comes up. No more hand stands when someone talks about Dependency Inversion. Nice! Though always keep in mind: These are just guidelines. Principles. They aren't law. If you have a good reason to go against one of these principles, then do it! If you liked this post, you should follow me on Twitter and signup for the RSS feed!

"Working with Unix Processes" Review (By Jesse Storimer)

The last thing I want to do is dealing with low-level operating system things. That's one of the reasons why I like Ruby, because it keeps me from having to deal with all that. So why the hell would I read "Working with Unix Processes"? Well, first, it was the book of choice for the (fantastic) Ruby Rogues podcast, episode 058. And second: It was a really amazing ride! Jesse managed to get down to the lowest level and still managed to explain everything in terms I actually understand! You can get the book on Amazon for around 17.00€ ($21.00). Side note: "Working with Unix Processes" was first self published by Jesse Storimer over at workingwithunixprocesses.com, but has now been picked up by The Pragmatic Programmers.

Content

Working with Unix Processes concentrates on the rarely used Ruby module Kernel#fork and its companions. Jesse Storimer starts out with a gentle introduction to processes, process IDs and everything else you didn't know you wanted to know about processes. Did you know, for example, that processes belong to a process group? And that a whole process group can be killed at the same time? And that that's basically how a terminal emulator works? No? Neither did I! After he explains everything about processes, he goes on and explains that processes can fork. This is actually a very important feature of all operating system. All processes are spawned from one process, which is created at boot. And so he introduces us to process parents, children, and orphaned processes, and what that means. After forking, the next important thing is joining processes together again (for example to enable multicore processing). Of course, now that we have multiple processes, another important aspect is communication between processes. This is done either through signals or interprocess communication. He then ends "Working with Unix Processes" with an extensive Appendix, where he digs deeper into the inner workings of such gems as Resque or Unicorn.

The good

One part I loved about the book is that each chapter has a short note on "Use in the real world". Here Jesse Storimer highlights projects that use the aforementioned technique. Another nice thing he did was include a Spyglass server. This is a heavily documented server in Ruby that he developed specifically for the book, to highlight how his techniques can be leveraged. Reading through it is pure joy! A very nice feature of the book is that it is a living book. Since Jesse released it (Dec 20, 2011) he released 14 new version (some just fix typos, but others added full new chapters or new and extended explanations for the things he explained.

The bad

For one, the name. It is called "Working with Unix Processes", but a better name would have been "Working with Unix Processes in Ruby". A lot of the advise he gives is very Ruby specific, and all examples are in Ruby. I'm not saying it isn't applicable if you use other languages, but this book focuses on Ruby. Of course, this was the audience he had in mind, but it is something that everyone should remember when buying this book. I am also not sure what the target audience for the book is. A system programmer (which doesn't know Ruby) doesn't need to be told how fork works, or what process IDs are. For a Ruby programmer that just tries to leverage some more programming near the system, the book was a bit too terse, and the examples were too abstract.

Conclusion

Besides its shortcoming, I found it a great book to read. You can read it in 3 hours (aka a rainy Sunday afternoon) and get through it pretty quickly. If you want to dive deeper, the source code to Spyglass and the other mentioned open source projects are right at your fingertips. I recommend this book to everyone who is interested in a more systems programming approach, and wants to know the basics so he can see when they could help. Ruby knowledge is required though, or else the book won't make that much sense. The topic sounds boring, mundane even, but I think if you are at least a bit curious you should give it a chance and read it. Sparked your interest? Then don' wait and get the book! If you liked this post, you should follow me on Twitter and signup for the RSS feed!

How do CSS grids work?

When I started with a new web page I, like (probably?) most of us, didn't start from the very beginning. Why? Because there is a lot of ceremony involved in setting up the basic structure of a web app. We need a navigation. A content area. A header. A footer. And we probably arrange them in a grid. And of course we need to apply standard styling to buttons, forms, tables, and typography. Luckily, we have CSS frameworks that take out most of the guess-work and lay a good foundation for us. While it works, I always wondered how to these grids work? Is it magic?

It' (not) magic!

Let me give you an example. In Twitter's Bootstrap we can arrange content with a few divs:
<div class="row">
  <div class="offset2 span4">
    I'm offset two columns and four columns wide
  </div>
  <div class="span4">
    I'm also four columns wide
  </div>
</div>
<div class="row">
  <div class="offset2 span8">
    And I'm on a new row!
  </div>
</div>
Other frameworks use similar syntax. So... how do they work?

Bootstrap, 960 Grid System

There are a few CSS frameworks around, but I will compare Twitter's Bootstrap and 960 Grid System. I already showed a basic example with Bootstrap. Now let's compare it with 960.gs:
<div class="grid_4 prefix_2">
  I'm offset two columns and four columns wide
</div>
<div class="grid_4">
  I'm also four columns wide
</div>
<div class="clear"></div>
<div class="grid_8 prefix_2">
  And I'm on a new row!
</div>
It works similar to Bootstrap, in that we define the length with "grid_4" compared to "span4". Also we can define the offsets with "prefix_2" ("offset2" in Bootstrap). But there is a difference how we define a new row. Whereas Bootstrap uses an outer div with class "row", 960.gs relies on a clearfix div on the same level. Here I like Bootstraps approach better, because it is easier to walk through the HTML when I can collapse a whole row and don't have to search on possibly hundreds of divs on the same level to find the cell in the middle of the document.

How these grids actually work

Thanks for all the syntax, but I still didn't talk about how the magic happens. Let's look at Bootstrap first.

Bootstrap SPAN

If we take a look at bootstraps source code, we can find a file called grid.less*. The only thing this file does is call the #grid core mixin. There is also an import for the responsive version of the grid, but we will ignore that one for this walk-through.
// Fixed (940px)
#grid > .core(@gridColumnWidth, @gridGutterWidth);
(See also on Github) We can find the core mixin under, surprise, surprise, mixins.less. Bootstrap defines a mixin called .spanX (@index). It recursively creates all the .span12 .span11 etc. classes in the compiled CSS.
    .spanX (@index) when (@index > 0) {
      (~"[email protected]{index}") { .span(@index); }
      .spanX(@index - 1);
    }
    .spanX (0) {}
(On Github) The line (~"[email protected]{index}") { .span(@index); } uses Javascript Evaluation to define the class. The rules will be the LESS mixin .span (line 2), so lets look at this next.
.span (@columns) {
    width: (@gridColumnWidth * @columns) + (@gridGutterWidth * (@columns - 1));
}
(On Github) This sets the width of the .spanX to the width of the columns plus the gutter in between the columns it spans. But wait! That can't be all! We can define multiple spans in a single row! So we have some more CSS!
[class*="span"] {
      float: left;
      margin-left: @gridGutterWidth;
}
(Github) All classes that start with .span will be floated to the left. Additionally, the gutter is applied as margin on the left, so the columns have a bit distance between each other. So that about covers the .span. What about .offset?

OFFSETing Bootstrap

Offsetting works roughly the same as spanning the grid elements:
.offsetX (@index) when (@index > 0) {
      (~"[email protected]{index}") { .offset(@index); }
      .offsetX(@index - 1);
    }
    .offsetX (0) {}

    .offset (@columns) {
      margin-left: (@gridColumnWidth * @columns) + (@gridGutterWidth * (@columns + 1));
}
(Github) Again, we have the .offsetX-mixin that creates the possible offsets. This will load the .offset mixin (note the missing 'X'), which calculates the, well, offsets (highlighted lines). All the classes I used in the example except .row have been explained. Let's look at the rows.

Bootstrapping rows

This one will be easy:
.row {
      margin-left: @gridGutterWidth * -1;
      .clearfix();
}
(Github) It adds a margin to each row and then loads the .clearfix mixin. The clearfix code requires some trickery since clearfix and browsers tend to hate each other (as described in the comment).
.clearfix {
  *zoom: 1;
  &:before,
  &:after {
    display: table;
    content: "";
  }
  &:after {
    clear: both;
  }
}
(Github)

The Bootstrap Conclusion

So how does bootstrap generate the grid? Four steps are needed:
  • Create all possible .spanX classes, where the length is calculated based on column width and gutter. Also float them to the left.
  • Create all possible .offset classes with the length calculated from column width and gutter.
  • Create a .row class that acts as a clearfix.
And now we have a grid system!

So what about 960.gs?

I didn't forget! So let's do a check if they use the same system. Go to the 960.gs demo page and see for yourself. .grid_2 is defined as width: 140px and all grid classes are floated to the left and have a gutter applied. The only difference is that 960.gs seems to apply the gutter to left and right, whereas Bootstrap just applies it to the left margin. The "clear" class is a simple clearfix (who would have thought?). The .prefix_X classes have a .padding applied to the left. Again a small difference, 960.gs applies a padding, but Bootstrap relies on margin. So which one is right? ;)

So there you have it

Both systems work roughly the same, with small differences. Both use floated cells with a clearfix to make rows. Both define a cell length and a cell offset. One uses margin, the other padding. Did I miss something interesting? And are there any CSS frameworks out there which use a different system? * Which means Bootstrap is written with LESS, a CSS preprocessor like SASS.