Tuesday, April 22, 2008

Early vs Late Binding

I've been thinking recently about programming languages (surprised?), specifically about the things that make them different. One of the really nice things about C, is that it compiles into machine code which tends to run lean and mean. By that I mean it is blazing fast and doesn't take up much memory. On the other hand, programming in Python and JavaScript has really been growing on me. There is so much flexibility to create elegant solutions quickly and without rewriting lots of existing code. In fact, I'd say greater ability to reuse existing code is a natural outgrowth of programming language flexibility.

So where does this flexibility come from? One place I tend to notice it most, is in the ability to give an existing function a new body, in other words, you can plug in different behavior in place of the default.

Here's a simple example to illustrate the idea. Let's say that we created a simple checkout register which takes a receipt, adds the sales tax, and spits out the grand total. Here's our code foundation in both Python and JavaScript (these two examples do essentially the same thing):

Python:
def CalculateTax(amount):
return amount * 0.18

class Receipt(object):

def __init__(self, items=None):
self.items = items or []

def CalculateTotal(self):
return sum([item + CalculateTax(item) for item in self.items])
JavaScript:
function calculateTax(amount) {
return amount * 0.18;
}

function Receipt(items) {
if (items) {
this.items = items;
} else {
this.items = new Array();
}
}

Receipt.prototype.calculateTotal = function() {
var total = 0;
for (var i = 0; i < this.items.length; i++) {
total += this.items[i] + calculateTax(this.items[i]);
}
return total;
}
To use the above code, you might write something like this:

Python:
my_order = Receipt([5.50, 10, 7.89])
print my_order.CalculateTotal()
JavaScript:
var myOrder = new Receipt([5.50, 10, 7.89]);
alert(myOrder.calculateTotal());
Now let's say someone asks you to change the tax rate which is used when calculating the total. Here's the catch, you're not allowed to change the existing code. It turns out this is actually really easy. You can define a new function, then make an existing function name point to the new function. Here's an example of how to inject our new code:

Python:
def CalculateHigherTax(amount):
return amount * 0.25

CalculateTax = CalculateHigherTax

print my_order.CalculateTotal()
JavaScript:
function calculateHigherTax(amount) {
return amount * 0.25;
}

calculateTax = calculateHigherTax;

alert(myOrder.calculateTotal());
After adding the above code to the foundation we started with, you will notice that the calculate total method now uses calculate-higher-tax instead of the original function, even though you are calling the same method on the same object as before. Congratulations, you have just witnessed late binding in action.

So what is late binding? The idea is that the computer decides which code should be executed while the program is running. This seems normal in scripting languages, but compiled languages often use this too (I'm looking at you Java and C++). For example, overloaded methods and polymorphism take advantage of late binding. With late binding you can change the meaning of an identifier (for example, change the behavior when you call a specific function) at just about any time.

Now lets take a look at a language which uses early binding. C is a great example. With early binding, the meaning of things like function names are locked in when the code is compiled. There is no dynamic lookup while the program is running to see which code should be executed, instead the address of the desired code is embedded directly into the binary machine code.

Here is how the same calculate-total example might look in C:
#include<stdio.h>

float CalculateTax(float amount) {
return amount * 0.18;
}

typedef struct {
float* items;
int num_items;
} Receipt;

float CalculateTotal(Receipt this_order) {
int i;
float total = 0;
for(i = 0; i < this_order.num_items; i++) {
total += this_order.items[i] + CalculateTax(this_order.items[i]);
}
return total;
}

int main(void) {
Receipt my_order;
float my_items[3] = {5.50, 10, 7.89};
my_order.items = my_items;
my_order.num_items = 3;
printf("%f\n", CalculateTotal(my_order));
}
If you try to set CalculateTax to a new function definition, you will get an error at compile time because a function cannot be changed once it is bound. Early binding tends to produce more efficient programs. However, if you want to, you can still use the flexiblity available in late binding in C.

Using function pointers, you can store the address of the code that you want to be executed, and change the address while the program is running. We can achieve the same late binding effects that I've illustrated in Python and JavaScript by making some small changes to the C code (marked in bold below). Declare a function pointer named TaxCalculator which will store the address of the desired calculate-tax function, then change CalculateTotal so that it uses the TaxCalculator instead of directly calling a calculate-tax function.
#include<stdio.h>

float CalculateTax(float amount) {
return amount * 0.18;
}

float CalculateHigherTax(float amount) {
return amount * 0.25;
}


typedef struct {
float* items;
int num_items;
} Receipt;

float (*TaxCalculator)(float) = &CalculateTax;

float CalculateTotal(Receipt this_order) {
int i;
float total = 0;
for(i = 0; i < this_order.num_items; i++) {
total += this_order.items[i] + (*TaxCalculator)(this_order.items[i]);
}
return total;
}

int main(void) {
Receipt my_order;
float my_items[3] = {5.50, 10, 7.89};
my_order.items = my_items;
my_order.num_items = 3;
printf("%f\n", CalculateTotal(my_order));
TaxCalculator = &CalculateHigherTax;
printf("%f\n", CalculateTotal(my_order));

}
There you have it!

Here's another way to think about this comparison. In high level languages which don't expose pointers, functions, variables, and other identifiers actually act like pointers.

5 comments:

amcclosky said...

you know your a computer scientist when you read this article and think BUT WHAT ABOUT FUNCTION POINTERS then get a little happy when you discover that that was where you were going with the article. lol

Jeff Scudder said...

Yes, function pointers can come in quite handy :) Now all I need to figure out is just-in-time compilation of C code to be able to create something that is a mix between scripting language and compiled language. I'm sure there are some options out there. I've just started thinking about it.

amcclosky said...

have you ever seen this: http://www.cython.org/

I just came across it the other day.

Jeff Scudder said...

Actually, I have :) I tried it out a couple of weeks ago, but attempts to compile the generated C code resulted in errors because Python.h (and others?) were nowhere to be found. It looked interesting though, I should try to get it working, I'm probably missing something simple.

Naveen said...

Jeff - you should look at Objective C. It uses late binding when implementing objects, and you can override object methods in the same way you show for python/javascript, but otherwise everything is straight C (which can't be said of C++).