C plus plus:Modern C plus plus:Functors

From GPWiki
Jump to: navigation, search

Modern C++ : Going Beyond "C with Classes"

Introduction

We've already seen how C++ uses operator overloading and templating to use Iterators as a nice generalisation of pointers. If you've done a fair bit of programming in C, you've probably seen function pointers used as callbacks. If you're a Java programmer, maybe you used objects that implement a certain interface. Both methods work in C++, but have their drawbacks. Function pointers don't mesh well with objects, which would bother the OO crowd. Member function pointers or objects with a certain virtual base are annoying, since you have to derive from the certain base class, and are slow, since they involve virtual function calls.

Luckily, C++ has operator overloading, so we can just template on the type and use objects with overload operator()s.

Functions vs Functors

Functions, as you know, are a fundamental part of C++. They're probably second nature to you. Unfortunately, they're not all that convenient to pass around. Like arrays, they're fundamental types but aren't really "first-class" entities. You can use templates to help generate functions, but that only happens at compile-time. Wouldn't it be cool to be able to make and change functions at runtime? Well, functors let us approach that goal.

Functors are, like Iterators, pure abstractions: anything that acts like a functor is a functor. The requirements, however, are much simpler. All a functor need to be able to do is to "act like a function". This means that function pointers are Functors, as is anything with an overloaded operator().

If you look for Functors on Roguewave, you wont find anything -- they use the alternate name Function Objects.

Standard Functors

The std::lib has a number of Functors, all of which can be found in the <functional> header.

Predicates

Predicates are the subset of Functors that return bool ( or something that can be used as a boolean value ).

Relational

The relational predicates are std::equal_to, std::not_equal_to, std::greater, std::less, std::greater_equal, and std::less_equal representing ==, !=, >, <, >=, and <= respectively.

Logical Predicates

The standard provides std::logical_and, std::logical_or, and std::logical_not. You can use std::equal_to<bool> for logical xnor and std::not_equal_to<bool> for logical xor, if you need them.

Arithmetic Functors

All of the basic arithmetic operators supported by C++ are also available as Functors: std::plus, std::minus (2-parameter), std::multiplies, std::divides, std::modulus, and std::negate (1-parameter).

Using Functors With Algorithms

Here's a nice introduction to Functors:

#include <iostream>
#include <functional>
#include <algorithm>
#include <vector>
#include <iterator>
#include <cstdlib>
#include <ctime>
 
int main() {
 
    std::srand( std::time(0) );
 
    std::vector<int> v;
    std::generate_n( std::back_inserter(v), 10,
                     &std::rand );
    std::vector<int> v2( 10, 20 );
 
    std::copy( v.begin(), v.end(),
               std::ostream_iterator<int>(std::cout," ") );
    std::cout << std::endl;
 
    std::transform( v.begin(), v.end(),
                    v2.begin(),
                    v.begin(),
                    std::modulus<int>() );    
 
    std::copy( v.begin(), v.end(),
               std::ostream_iterator<int>(std::cout," ") );
    std::cout << std::endl;
 
    std::sort( v.begin(), v.end(), std::greater<int>() );
 
    std::copy( v.begin(), v.end(),
               std::ostream_iterator<int>(std::cout," ") );
    std::cout << std::endl;
 
}

First, std::generate_n is used to fill the vector with data. The vector is empty, but the magic of an Back Insert Iterator makes it so that when std::generate_n writes to the output iterator it adds it to the end of the vector. You'll notice that in this case a function pointer to std::rand makes a perfectly acceptable Functor.

The std::transform line does the equivalent of v[i] = v[i] % v2[i]; for each v[i] in v. The first 3 parameters are the 2 input ranges -- the second input range is assumed to be at least as large as the first. The fourth parameter is the output iterator. In this case it's safe to use a normal iterator into the container since we know exactly how many elements will need to be written. The fifth parameter is the binary operation performed.

There is also a 4-parameter version of std::transform which only needs one input range and takes a unary operation. Try using it and std::negate to do the equivalent of a loop of v[i] = -v[i];

Writing Your Own Functors

You probably noticed in the code above that the v2 vector serves no purpose other than to be the second range for the std::transform call. The most obvious solution to that would be to write a mod_10 function similar to the following:

int mod_10(int value) { return value % 10; }

But of course that's extremely limited. You could try templating it:

template <typename T, int V>
T mod_by(T value) { return value % V; }

Which would let you say

    std::transform( v.begin(), v.end(),
                    v.begin(),
                    &mod_by<int,10> );

But that's still too limited, since you can't specify the 10 at runtime.

What would really be best here is a structure, since a structure can have the divisor as a member variable:

template <typename T>
struct mod_by {
    T const divisor;
    mod_by(T const &d) : divisor(d) {}
    T operator()(T const &value) { return value % divisor; }
};

Which would let you say

    std::transform( v.begin(), v.end(),
                    v.begin(),
                    mod_by<int>(10) );

Notice that the divisor member is const. It's fairly common for people to be tempted to create functors with non-constant members and give operator() some sort of side effect, perhaps counting the number of times the function was called and returning a different answer after a certain point. To do so is generally not safe. Algorithms are free to copy the functor as many times as they wish, which means that the functor passed in might be copied a number of times, with different objects' operator()s called on different elements in the range.

In Closing

Using functors with your loops lets you do just about anything. Algorithms with Functor parameters can be made to be incredibly general, removing possibilities for errors in writing the loops and making it clear that a certain line applies an operation to each element ( or corresponding pair of elements ), instead of needing to hunt through the details of a specific loop.

What's Next

Anyone that's used a real functional language is probably scoffing at having to write a separate functor each time. It would be much better to be able to combine Functors and values in order to build complex functors on the fly, don't you think?