Creating C++ predicates with higher order functions and functors

Let’s say we have a collection of numbers, and we want to find out if any of the numbers is equal to 3. One might do:

bool found = false;
for(int x : numbers) {
    if (x == 3) {
        found = true;
        break;
    }
}

// interpretation of intent: for each number x in numbers, if x == 3, then found = true

Easy, that works. But a seasoned C++ developer will be quick to remind you to prefer standard algorithms, rather than reinventing the wheel. Let’s see how we can do that.

bool found = std::any_of(
         numbers.cbegin(),
         numbers.cend(),
         [](const int x) { return x == 3; });
}

// interpretation of intent: found = true if any 'int x' in range numbers.cbegin() to numbers.cend() is equal to 3

This is wonderful, because it conforms to the C++ Core Guidelines, specifically P13: Use support libraries as appropriate and P3: Express intent

However, I think the expression of intent can be improved further. The intent is to find a number equal to three.

x == 3

This condition is clear enough, but unfortunately it is kind of hidden by the distracting syntax of a lambda.

[](const int x) { /*...*/ }

I have actually seen many people being turned off from using STL algorithms, because they do not like the verbose syntax of specifying the begin + end iterators and the lambda.

In this article, let’s focus on the lambda. Ideally, this would read something like this:

bool found = std::any_of(
         numbers.cbegin(),
         numbers.cend(),
         3);

// interpretation of intent: found = true if any of the numbers in range cbegin to cend is (equal to) 3

I think this still captures the intent, and it is much more straightforward. Unfortunately, this does not compile because 3 is not a unary predicate, as expected by any_of.
What if we could do this instead?

bool found = std::any_of(
         numbers.cbegin(),
         numbers.cend(),
         equals(3));

// interpretation of intent: found = true if any of the numbers in range cbegin to cend equals 3

Great! Now we have a readable expression of intent, and it can compile. How can we achieve that?

Implementation

We can achieve that with a higher order function. A higher-order function is a function that returns another function or takes functions as arguments. In this case, we will return a function, which will be the actual predicate used by any_of:

auto equals(const int number) -> std::function<bool(const int)> {
    return [number](const int element) { return element == number; };
}

Perhaps we want to be able to use this for other types as well:

template <typename T>
auto equals(const T& other) -> std::function<bool(const T&)> {
    return [other](const T& element) { return element == other; };
}

The variable other is now captured by reference, to avoid potentially copying large objects. Furthermore, if you’re not restricted to C++11, you can use C++14’s return type inference by omitting the trailing return type. This avoids the performance overhead associated with the type-erasure of std::function:

template <typename T>
auto equals(const T& other) {
    return [other](const T& element) { return element == other; };
}

Extensibility

It is not hard to imagine that this can easily be extended for additional predicates such as greater_than, less_than, etc.

Or even to query for a specific object by id:

template <typename T>
auto has_id(const int id) {
    return [other](const T& element) { return element.id == id; };
}

///

auto obj42 = std::find_if(objects.cbegin(), objects.cend(), has_id<Object>(42));

Admittedly, the need to specify has_id<Object>(42) with explicit template argument in this last example is not as clean as simply writing has_id(42), but it is necessary for the compiler to know which type to use for T in the lambda argument list.

If you wish to be able to express a predicate like has_id without explicitly specifying the type, it is possible with a functor instead of a function:

struct has_id
{
    has_id(int id) : _id(id) {}

    template<typename T>
    bool operator()(const T& obj)
    {
        return obj.id == _id;
    }

private:
    int _id;
};

///

auto obj42 = std::find_if(objects.cbegin(), objects.cend(), has_id(42));

An additional benefit that a functor can bring is the capability to be extended for different types of objects through template specialization. Consider the following example:

struct has_id
{
    has_id(int id) : id(id) {}

    template<typename T>
    bool operator()(const T& obj)
    {
        return obj.id == id;
    }

private:
    int id;
};

struct Object {
    int id = 0;
};

class Widget {
   public:
    Widget(int id) : id(id) {}
    int get_id() const { return id; }

   private:
    int id = 0;
};

template <>
bool has_id::operator()(const Widget& obj) {
    return obj.get_id() == id;
}

///

auto obj42 = std::find_if(objects.cbegin(), objects.cend(), has_id(42));
auto widget100 = std::find_if(widgets.cbegin(), widgets.cend(), has_id(100));

Here, has_id() will work for both Objects and Widgets.
For widgets, the specialization has_id::operator(const Widget&) will be used. This knows to call get_id() to get the id of a widget.
For Objects, and other types, the generic has_id::operator(const T&) will be attempted to match the id through public field id.

The drawback with this approach, is that has_id has to be extended for each type of object that is not compatible with the generic has_id.
This drawback can be overcome by with type traits, but this is beyond the scope of this article.

Conclusion

Using higher-order functions like equals can improve readability while leveraging the power of standard algorithms. This approach keeps code expressive. This is not a new idea, but it’s something I don’t often see in practice. For more advanced predicates however, it may be more appropriate to use functors.