Thursday, 31 January 2008

C++ pocket lambda library, part 2


In the previous post in this series* I promised to tackle some more advanced topics from my pocket lambda library. To start with something, let us gratuitously choose "currying" first. Do you know what currying is? No, it has nothing to do with food, but rather with an american mathematician named, ehm..., Haskell B. Curry. In his formalization of the notion of a computable function (λ-calculus) he used (for simplicity's sake) only functions with one parameter. Come again? What about all the other functions? For this he used a trick: he defined a function which returns a new function where the first argument is fixed, and the second one remains unbound. When used recursively, we can reduce any function of n-arguments to n functions of 1 argument! Mathematics can be as cool as programming sometimes!

I bet you've seen (and used) it before! Yes, STL provides bind1st and bind2nd functions, which are doing exaclty that: currying! In other languages like Python, Perl, Ruby, Haskell, etc, it's possible too, though sometimes through a library extension. Groovy supports IMHO the most explicite (and cool) form:
    def triple_it = {x,y -> x*y }.curry(3);

1. Our task


What we want to do is to extend the STL binder such that they would work the _$1 placeholders. I think of something like the following:
    for_each(vec.begin(), vec.end(), bind(countVector, 'X', 1.222, _$1));
    for_each(vec.begin(), vec.end(), bind(countVector1, _$1, 1.222, 'Y'));
    for_each(vec.begin(), vec.end(), bind(countVector2, _$1));

    for_each(vec.begin(), vec.end(), bind(pr_sinus, _$1*2));
    const float pi = 3.14; // well, not exactly ;-)
    for_each(vec.begin(), vec.end(), -bind(sinus, _$1*(pi/180.0)) ); 
We assume here that the bind function is pulled in by the appropriate namespace definition, as I want avoid to have to write it like kmx::lambda::bind(sinus, kmx::lambda::_$1). In fact, I never tried out the fully qualified form, shame on me :-(((.

How can we do that? Well there's no need to reinvent the wheel - Boost library already has it, and it's even a part of the C++0x TR.1 standard library extension (tr1::bind). So we can look at the code, or even read about the techniques needed in the implementation**. So if you want, you can spare yourself this post and read ** instead!


2. Basic mechanisms


Well, here is the bind function for the 2 arguments case:
    // 2 args
    template <class F, class A1, class A2>
        binder<F, list2<A1, A2> > bind(F f, A1 a1, A2 a2)
    {
        typedef list2<A1, A2> list2_type;
        list2_type list(a1, a2);
        return binder<F, list2_type>(f, list);
    } 
What is it doing? Basically it stores the arguments to be bound in a custom argument list (class list2<>). Then the function and its arguments are used to create the instance of the actual binder class:
    template <class F, class List> class binder : public lambda_expr
    {
        F m_f;
        List m_list;

     public:
        binder(F f, List list): m_f(f), m_list(list) {}

        // OPEN TODO --> only working for void funct!!!
        template <class A1> void operator() (A1 a1)
        {
            list1<A1> list(a1); // store the actual arg
            m_list(m_f, list); // forward to curried funct: m_f+m_list
        }
    }; 
What is it doing now? It's a functor accepting (in the H.B.Curry's true manner) only one argument. Which one? In the context of an STL algorithm it will be the current value of the iterator. So our task is to sort out to which of the arguments of a function this iterator is bound. Take our previous example: for_each(vec.begin(), vec.end(), bind(countVector1, _$1, 1.222, 'Y')). Here the iterator has to be bound to the first argument of the function - 1.222, and 'Y' to the second and third ones. The binder object stores the actual argument in the custom argument list (class list1<>) and invokes the function m_f in the context of the bound arguments m_list, remembering the current argument in the custom argument list class list1<>. All this is accomplished by heavy use of the listN<> classes, to which we turn our attention now:
    template <class A1, class A2> class list2
    {
        // args: one of them is a placeholder!
        //  -- list1 will detect it!
        A1 m_a1;
        A2 m_a2;
     public:

       list2(A1 a1, A2 a2): m_a1(a1), m_a2(a2) {};

       // apply to arguments and placeholders!
       template <class F, class List1>
           void operator() (F f, List1 list1)
           {
               f(list1[m_a1], list1[m_a2]);
           }
    }; 
Well, list2<> doesn't do that much, it invokes the curried function, but the detection of bound arguments is redirected to the most important class here: list1<>! Let us have a look at it:
    template <class A1> class list1
    {
        A1 m_a1;

     public:
        explicit list1(A1 a1): m_a1(a1) {}

        // if placeholder:
        A1 operator[] (placeholder<1>) const { return m_a1; }

        // not placeholders: return the ball
        template <class T>
            T operator[] (T val) const { return val; }
    }; 
Well, this isn't very complicated: complier will detect when a placeholder was placed and use then the current argument (i.e. the iterator). In other cases, the stored argument value will be used, i.e. the value used to curry the function f. As the value is stored in the listN<> class, it is done via a trick: the listN<> class tries to access the placeholder's _$1 value by using the index operator of the list1<> class, but in non-placeholder cases the index value is simply returned to listN<> to be used as n-th argument (a kind of rebound!).

For the sake of completeness, i.e. in case of bind(sinus, _$1) (who'd need something like that?), we need the function call operator in the list1<> class:
    template <class A1> class list1
    {
       ...
       // apply to arguments and placeholders!
       template <class F, class List1>
           void operator() (F f, List1 list1)
           {
               f(list1[m_a1]);
           }
    };
To bind 3-argument functions we need two more templates to be added: the list3<> class and the bind(F f, A1 a1, A2 a2, A3 a3) function, for 4 arguments another two, and so on, and on... But I don't think anyone should need more than 5 arguments in their functions, do you?

3. Extensions


With the above techniques we can write bind(countVector1, _$1, 1.222, 'Y') but not bind(countVector, _$1*2) or bind(sinus, _$1*(pi/180.0)). What can we do about this? Above we wondered about the bind(sinus, _$1) case, but here we can use it to our advantage. We extend the list1<> class like that:
    template <class A1> class list1
    {
       ...
        template <class T>
            T operator[] (Mul<T>& e) const { return e(m_a1); }
   };
So we can see that the list1<> class is indeed a central one! We extended its rebound mechanism to support the multiplication. Oh, uhm, that wasn't pretty! Why couldn't we simply use a general forwading here? The reason is, I left is as a TODO in code (shame on me :-((() but never came around to implement this. I tried the following:
    A1 operator[] (lambda_expr& e) const { return e(); }
but I didn't finish it, and I don't remember if it worked and if it didn't, why. It was almost one year ago and I don't remember all the details, sorry. OK, that could be done better, but (as I said in my previous post*) "it's a simple library". If you need another operation to be supported in the bind context, simply add it to the list1<>. It remains to be clarified if the simplicity of this library is an advantage, but some measurements were reported, which say that the many layers in a fully grown-up lambda library (like Boost) have a detrimental effect on the performance, because the code gets so complex that the optimizers cannot cope with it!*** And the compilation gets longer too. So maybe such a primitive implementation isn't totally in the wrong?
There is another extension I tried. Again we have to extend the central list1<> class:
    template <class A1> class list1
    {
       ...
       // bind a void member function e.g.: int (T::*)(void *)
       template <class T, typename Ret, class List1>
           Ret operator() (Ret(T::*f)(void), List1 list1)
           {
               return ((list1[m_a1])->*f)();
           }
    };
What does it do? It is supposed to be used like that:
    vector<SomeClass> vec(10);
    for_each(vec.begin(), vec.end(), bind(&SomeClass::compute(), _$1);  
So for each object in a collection its (void) member function will be called. But again, I don't know if it actually worked, as there is no testcase for that in my code (but I think it should). And not everything that's possible makes sense anyway: here the code is misleading, we'd rather like to express this with _$1->compute()!

4. Last words


Well, that's everything I can say about currying support in my library. In the next post(s) I'll show you how to support lambda expressions like: cout << "---" << _$1 << "\n", and how to access a member or a member function of an object from a collection. What else? Maybe some conditional lambda expressions and some general discussion? Look out.

A general question can be answered here: what about the code of my library? Will it be publicly available? And what about the TODOS? And the missing orthogonality? The anwer is, I don't want to (and haven't got time, for that matter) polish the rough edges and add some more orthogonality. It was a fun writing the library, it took me only a couple of days (about 4, I think) , it is simple, and it has shown me that you don't have to be scared of the template metaprogramming. Of course, I wasn't a novice with templates, but I expected more resistance from C++. Maybe it's because of the library's simplicity?

Whatever. This library is not ready, is not complete, and I leave it as it is now.

---
* http://ib-krajewski.blogspot.com/2007/12/c-pocket-lambda-library.html
** Chris Gibson, "How Do Those Funky Placeholders Work?", Overload 73: http://www.accu.org/index.php/journals/1397
*** C++0x proposal: "Lambda expressions and closures for C++" : http://www.research.att.com/~bs/N1968-lambda-expressions.pdf

2 comments:

Dean Michael said...

If you're really interested in Functional Programming, you ought to take a look at the stuff happening within the Boost C++ Library community -- there's at least a couple of already mature libraries (Boost.Lambda and Phoenix, part of Boost.Spirit) and one that's being proposed for inclusion called Boost.Egg.

Lots of things happening in the Functional Programming niche in C++. Although it seems like Lambda support isn't going into C++0x, I'm pretty sure a library implementation should make it into TR1.

I hope this helps!

Marek Krj said...

Thanks for the comment! As I was writing this code more than a year ago, I only knew Lambda (which was too heavy-weight for my taste) and Spirit libraries. Phoenix and Egg are new to me, will have a look at them, I think.

Some discussion about language support vs. library is included in part 3 of this mini-series. As for me, the last C++ commitee Oxford 2007 Meetings stated lambda functions as more or less sure candidates, but I'm not following this too closely.