I wanted to write a loop in Ruby to show all numbers from 0 to 9:
    i = 0
    while i < 10
      puts i
      i++
    end
The above seems reasonable to any non-Rubyist, but when you run it you get: syntax error, unexpected end. Of course, because Ruby lacks the increment operator (i++), which exists in pretty much any other language.
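For comparison, here is a minimal sketch of what stock Ruby does accept, with i += 1 standing in for i++. The second half hands the original loop to eval to confirm that the parser rejects it. The variable names (printed, broken_source, parse_failed) are only illustrative.

```ruby
# The loop as stock Ruby accepts it: i += 1 stands in for i++.
printed = []
i = 0
while i < 10
  printed << i
  i += 1
end
puts printed.join(" ")

# The original loop, kept in a string so this file itself still parses.
broken_source = <<~RUBY
  i = 0
  while i < 10
    puts i
    i++
  end
RUBY

parse_failed = false
begin
  eval(broken_source)
rescue SyntaxError
  # After "i +", the second + is read as a unary plus that still needs an
  # operand, so the parser trips over the "end" on the next line.
  parse_failed = true
end
puts "original loop rejected: #{parse_failed}"
```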
Anyone would advise me to rewrite the loop like this:
    10.times do |i|
      puts i
    end
And that’d be the right thing I guess. But what fun is it?
A proper hacker wouldn’t settle for the right thing, and I want to be a proper hacker. So let’s modify Ruby’s interpreter to support my crappy loop!
How deep do we have to dig?
Let’s explore how this could be achieved. A first naive attempt could look like this:
    class Integer
      def ++
        self = self + 1
      end
    end
But quickly we realize we can’t define a method like that. Meta-programming to the rescue:
    class Integer
      define_method '++' do
        self = self + 1
      end
    end
However, that doesn’t work either, for several reasons. First, we’ll have a bit of trouble invoking that method because of its name. But most importantly, Ruby won’t allow us to modify self; this’ll fail with: Can't change the value of self.
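A small sketch of both problems on stock Ruby. It gives the method a body that merely returns self + 1 (an assumption made so the class definition parses at all). That is enough to show that the name can be defined but only reached through dynamic dispatch, and that the caller's variable stays untouched. The bump method in the second half is a made-up name used only to trigger the error.

```ruby
class Integer
  # define_method takes the name as data, so "++" is accepted here.
  # The body only returns a new value; it cannot rebind anything.
  define_method("++") { self + 1 }
end

i = 5
# There is no literal syntax for calling a method named "++",
# so it has to be dispatched dynamically.
result = i.public_send("++")
puts "result=#{result} i=#{i}"

# Assigning to self is rejected while parsing, before any code runs.
self_assignment_rejected = false
begin
  eval("class Integer; def bump; self = self + 1; end; end")
rescue SyntaxError => e
  self_assignment_rejected = true
  puts e.message
end
```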
How’s that? Can’t an object modify itself?
It sort of can, but really it can’t. An object can modify any of its instance variables, and a string can sometimes modify itself in place.
But it turns out that numbers aren’t even stored like other objects in Ruby: they’re immediate values. That is, variables storing numbers don’t hold a reference to a generic object; they hold the value directly. The same happens for other object types, such as symbols.
The only way we can change the value of a numeric variable is by reassigning it.
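To make the distinction concrete, the sketch below contrasts a string mutated in place with an integer that can only be rebound. The variable names are illustrative.

```ruby
# A string is a regular object: two variables can share it, and an in-place
# mutation through one name is visible through the other.
s = String.new("ab")
alias_of_s = s
s << "c"
puts alias_of_s

# An integer has no in-place mutators; += computes a new value and
# rebinds only the variable on the left.
a = 1
b = a
a += 1
puts "a=#{a} b=#{b}"

# Integers report themselves as frozen, so there is nothing to mutate.
puts 1.frozen?
```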
As deep as the parser
Eventually I found the solution by looking at the similar += operator. I tried to find out how that was being implemented in Matz’s Ruby Interpreter (MRI), and it turns out that internally it’s just converting i += 1 to the same thing as i = i + 1. That’s great, we can do the same for i++.
Knowing that, we just have to update the parser so the following 3 statements are converted to the same AST:
    i = i + 1
    i += 1
    i++
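The equivalence between the first two forms can be observed from plain Ruby, without reading MRI’s source. In this sketch, Tally is a hypothetical class whose + logs each call; using += on it shows that the operator amounts to a call to + followed by a reassignment.

```ruby
# Tally is a made-up value class used only to observe how += behaves.
class Tally
  attr_reader :count

  def initialize(count, log)
    @count = count
    @log = log
  end

  # += has no method of its own; Ruby calls + and rebinds the variable.
  def +(other)
    @log << "+(#{other})"
    Tally.new(@count + other, @log)
  end
end

log = []
t = Tally.new(0, log)
original = t

t += 1       # goes through Tally#+
t = t + 1    # the spelled-out form goes through the same method

puts log.inspect
puts t.count
puts t.equal?(original)   # false: t now names a different object
```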
The patch on MRI boils down to:
- Adding support so the new operators can be tokenized: ++ will be tokenized as tINCR, and -- as its decrement counterpart.
- Adding parser rules so an expression like var_lhs tINCR is reduced to the same as var_lhs tOP_ASGN arg_rhs when the operator is += and the right-hand side is 1.
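Ripper, the parser interface in Ruby’s standard library, shows the starting point for the first step. On a stock interpreter, += already arrives as a single operator token, while ++ is split into two separate + tokens, which is what the tokenizer change has to address. This is a sketch against unpatched Ruby; the tokens helper is a made-up name, and the grammar names tINCR and tOP_ASGN do not appear in Ripper’s output.

```ruby
require "ripper"

# Drop whitespace tokens so only the meaningful ones remain.
def tokens(source)
  Ripper.tokenize(source).reject { |tok| tok.strip.empty? }
end

plus_assign = tokens("i += 1")
increment   = tokens("i++")

puts plus_assign.inspect   # += comes out as one token
puts increment.inspect     # ++ comes out as two + tokens
```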
I would never have come up with this patch if I hadn’t read the book Ruby Under a Microscope, which is an excellent resource to start diving into MRI’s source code.
In particular, the first chapter where it explains how the program is tokenized and then parsed was very relevant for this task.
Now it’s possible to run our previous code!
And because it’s just the same as invoking += 1, you can do stuff like:

    require 'date'
    d = Date.new(32)
    d++
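On an unpatched interpreter the same effect can be checked with the += form that d++ is rewritten to. A sketch, relying only on Date#+ from the standard library, which adds a number of days:

```ruby
require "date"

d = Date.new(32)   # 1 January of the year 32
d += 1             # the form the patched parser turns d++ into
puts d
```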