Thursday, June 25, 2009

Partial Function Invocation

My wife tells me that I often jump into an explanation by starting at the beginning of my train of thought without giving any indication of where I'm going. It would be better if I began with the point I'm trying to make, then explain how I reched my conclusion, How am I doing so far? Oh wait, right... Here is my conclusion:

Allowing a function to be partially invoked, to allow some of the arguments to be specified at different times, can allow for code which is more flexible than by just using objects or pure functions.

I've been thinking about this lately as I refactored the gdata-python-client which is a library which can be used with AtomPub services. I'll spare you all the gory details, and offer a simple example of how partial function invocation might come in handy.

When making a request to a remote server, you might need the following information, just for example: username, password. URL, message body, and content type. So we start out by writing a stateless function to take this information and open a connnecion to the server, format our inputs, transmit our request, and parse the response. We'll call it post, and using it looks like this:
serverResponse = post(url, data, contentType, username, 
password)
This is all well and good, but suppose the final request, as shown above, is preceeded by a whole series of function calls. Each function would need to dutifully pass along parts of the request. Say for example that the user types in their username and password long before the request is made, so these get passed as parameters to lots of functions which only receive them so they can pass them on. In addition, the username and password are almost always the same from request to request, so the same values are being passed to the post function over and over. In cases like this, we will often use an object to hold common values.
class Requestor {
username
password

method post(url, data, conteentType) {...}
}
Now our request will look like:
client = new Requestor(username, password)
serverResponse = client.post(url, data, contentType)
What we've effectively done here is specified some of the information in advance and left other pieces of information to be specified at the last minute. I would argue that this make the code better (cleaner, less chance of human error in listing lots of parameters, perhaps less data on the call stack, etc.).

Now the question becomes: Did we extract the right pieces of information from the function call into the object? Suppose the code you are writing needs to use a different password for each service you are making a request to, but the content type of the data is always the same. Then it would have made more sense to design our class like this:
class Requestor {
username
contentType

method post(url, data, password) {...}
}
Since we've established that not everyone who is using our code has the same usage patterns, lets design for utimate flexibility. Every parameter can be specified either in the object, or in the function call. Also, if the object has a parameter already, we could override it by passing in that parameter when we call the method. This is not too difficult in in Python, so here is a non-pseudocode example:
class Requestor(object):

def __init__(self, url=None, data=None,
content_type=None, username=None,
password=None):
self.url = url
self.data = data
self.content_type = content_type
self.username = username
self.password = password

def post(self, url=None, data=None,
content_type=None, username=None,
password=None):
url = url or self.url
data = data or self.data
content_type = content_type or self.content_type
username = username or self.username
password = password or self.password
# Now we have our inputs, code to make
# the request starts here
...
If you think this seems a bit excessive, I would agree. I didn't go nearly this far when designing the library that started me thinking about this. There was one request parameter in particular though that does use this pattern. (Five points to the first person to post it in the comments. ;-)

To use the above class, you would do:
requestor = Requestor(username='...')
requestor.password = '...'
...
server_response = requestor.post(url, data, content_type)
It will also handle our alternate usage where we want to give the password to the post method and set the content type at the object level:
requestor = Requestor(username='...', content_type='...')
...
server_response = requestor.post(url, data, password='...')
We can even override parameters which are set in the object:
requestor = Requestor(password='...', content_type='...')
requestor.username = '...'
...
# Override the content_type, just on this request.
server_response = requestor.post(url, data, content_type='...')
With the above example we end up with a lot of code just to let us specify each parameter in either the object or as a function argument. In fact, this can introduce so cases where the user forgets to specify in either, which is possible because all function arguments are now optional. Wouldn't it be better if we could instead specify some of the function parameters, pass the half-specified function call around, and fill in the ramaining values when we finally invoke. For this illustration, I'm using the following syntax to show a partial invocation, < > around arguments instead of ( ).
function post(url, data, contentType, username, password) {...}

started = post<username, password>
...
serverRespense = started(url, data, contentType)
Recall our case from earler, what if the contentType is constant but the password is instead more variable:
started = post<username, contentType>
...
serverRespense = started(url, data, password)
It turns out I'm not the first person to think of this pattern, not by a long shot. Functional programming often makes use of this pattern, referred to as function currying. I found the following example for Scheme which also shows how easy this is in Haskell. The prototype library for JavaScript includes a bind function which can accomplish the same thing. Here's a paper on the topic in C++: (pdf, Google cache HTML). I also found PEP 309 which was a proposal for this in Python. Perhaps I should have called my Python example above: Function Currying using Classes. If you can think of other examples, I'd love to see them.