Sunday 27 April 2008

pocket C++ lambda library, part IIa


Are you wondering about the title of this entry? Well, it really should by part of a previous one*, but after looking at it I decided that the previous entry is pretty long already, and I wasn't willing to blow it up even more. And the theme isn't such an interesting one as to deserve a separate part in this mini-series. What is it we are talking about?

1. Access to the members


In the past I sometimes really wanted to be able to do the following:

    struct XXX { int value; int getValue() { return value; } };
    vecx<XXX*> vec_x;
    find_if(vec_x.begin(), vec_x.end(),  _$1->value == alive);
i.e. to access the members of an object out of the lambda function. Of course I'd like a code like _$1.value the best, but C++ doesn't allow us to overload the dot operator! Why not? This would mean that we could customize the method invocation mechanism itself! As it's the case in Groovy, Perl, Python or even Java**:

// Groovy:
Object invokeMethod(String name, Object args)
{
    log("Just calling me: $name");
    def result = metaClass.invokeMethod(this, name, args);
}
If you have that, you can do things like Rails in Ruby and builders in Groovy. You just intercept calls to the nonexisting methods (i.e. overload the methodMissing()/method_missing() in Groovy/Ruby or __getattr__() in Python) and install the "code block" (a closure, as to be exact) passed as one of the parameters in a custom hash map with the name parameter as key... You've got the message.

This isn't possible in C++, as it would introduce the metaclass notion into the language. At least it would require a common superclass for all C++ objects, and this contradicts the design of C++ classes (AFAIK) as thin wrappers for physical memory segments. On the other side, C++ has a more primitive notion of call intercepting: overloading of the -> operator! Alas, it only works with pointers, so we cannot provide a general solution for value based containers. An that is a bad thing enough.

So let's concentrate on the less ambitious goal: _$1->getValue()! Unfortunately, even this syntax canot be made work in our context! Why? Because for the C++ compiler the expression getValue() doesn't make sense! It doesn't refer to a class' method getValue(), which we could feed to the -> operator, it's just a string which we'd like to transform somehow in a function reference!

So what can we do (if anything)?

2. The Implementation


The best I could produce is the following:
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::value) == 2);
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::getValue) == 1);
Well, what can I say, it's passable. Now the compiler has a meaningful information in form of a member address, and it's not too ugly. How to implement it? With a standard technique from the first part***:
    // ->*
    template<class V, class O> struct ArrowStar : public lambda_expr {
        V O::* mptr;
        ArrowStar(V O::*m): mptr(m) { }
        V operator()(O* o) const { return o->*mptr; }
    };
we are overloading the member call operator for memebr access. For function member calls we need 2 more overloads. First for calls with 1 argument:
    template<class R, class O, class A> struct ArrowStarF : public lambda_expr {
        R(O::*fptr)(A);
        ArrowStarF(R(O::*f)(A)): fptr(f) { }
        R operator()(O* o, A a) const { return o->*fptr(a); }
    };
and for calls without arguments:
    template<class R, class O> struct ArrowStarFv : public lambda_expr {
        R(O::*fptr)();
        ArrowStarFv(R(O::*f)()): fptr(f) { }
        R operator()(O* o) const { return (o->*fptr)(); }
    }
nothing new here as well, just the standard operator overloading technique. For the == operation to work, I extended the EqTo operator from the part 1*** to do a little forwarding. I know, I should use the forwarders (like Le2_forw in part 1), but I was lazy:
    // lambda_expr ==
    template<class S, class T> struct EqTo : public lambda_expr {
        ...
        EqTo(S s, T t) : lexpr(s), val(t) { }
        template <class R>
            bool operator()(R r) { return lexpr(r) == val; }
    };
So let's do something useful at last:
    // shouldn't clash with lambda_expr: *_$1 <= *$2 !!!
    iterx = find_if(vecx.begin(), vecx.end(), _$1->*(&XXX::getVal) <= 2);
    // read field values from vecx
    vector<int> v10_1(10);
    transform(vecx.begin(), vecx.end(), v10_1.begin(), _$1->*(&XXX::getVal));
    // assign to a external counter
    int aaa;
    for_each(vecx.begin(), vecx.end(), aaa += _$1->*(&XXX::value));
Ok, the last one won't be working just now ;-), you must wait for the part 3 of the series! For the first one to work, we must extend the forwarding class from part 1*** a little for the case where only one side must be forwarded:
    template <class S, class T> struct Le2_forw : public lambda_expr
    {
        S e1;
        T e2;
        Le2_forw(S s, T t) : e1(s), e2(t) { }
        .....
        template <class U> // one side is bound! OPEN: assume left side!
            bool operator()(U a) const { return e1(a) <= e2; }
    };
And now everything is buzzing!

3. Discussion


If you think the sytax of the meber function call ist just horrible, there is another possibility to use the member functions: bind them! This you have seen (and frowned upon) in part 2*:
    for_each(vecx.begin(), vecx.end(), cout << bind(&XXX::getVal, _$1));
it's even less readable, is it? Or maybe not? Look, compared with _$1->*(&XXX::value) it's no more SUCH a bad sight! Maybe something like call_func(_$1, &XXX::getVal) would be more readable here? The advantage of this solution is that we could accept value objects in the container, and take it's address internally. I leave the decision to you, the implementation is more or less trivial.

Summing up, I couldn't achieve much progress here, because of inherent C++ design features. So maybe a macro solution? Or another level of indirection? What we really needed here is a hook for compiler errors (method not found), where we could install our own code snippet. Wait a minute! Something like compile time asserts? But how can I get around the string => function coding problem?

Something primitive like this would be possible:
    ...
    ArrowStarStr(string& name): fname(name) { }
    R operator()(O* o) const {
        return (o->func_hashtable.get(name))(); // and don't crash here!
    }
But you cannot use a POCppO anymore, you need some macro gadgetry like in Qt,
for example:
    struct XXX {
        int value;
        int getValue() { return value; }

        // now decorate:
        STORE_LAMDBA_FUNC(getValue);
    };
    // or property based:
    struct XXX {
        DEF_LAMBDA_PROPERTY(value, int);
    };
Not so pretty, not elegant, much to much effort needed. But it is the solution we have to use following the C++ language design. We just don't have introspection and cannot overload the dot. Sorry. Any ideas?

--
* http://ib-krajewski.blogspot.com/2008/01/c-pocket-lambda-library-part-2.html
** with dynamic proxies: see http://gleichmann.wordpress.com/2007/11/22/mimicry-in-action-dynamically-implement-an-interface-using-dynamic-proxy/
***http://ib-krajewski.blogspot.com/2007/12/c-pocket-lambda-library.html

2 comments:

Dean Berris said...

You can actually adapt structs into Boost.Fusion sequences and use Boost.Fusion algorithms to deal with them even from "lambda" or Boost.Spirit's Phoenix operations.

Maybe these would be worth a look for you.

Marek Krj said...

You mean something along the lines of: int i = at_c<0>(some_struct)? Well, it's not exactly what I wanted. I used to write something similiar, with names like field_1st<>(), field_2nd<>(), to access the elements of a structure earlier. But what I wanted to achieve here was more: I wanted to use the names of the fields or methods of a class. A goal too ambitious I think.