Polymorphic set operations for containers

With the availability of lambdas, there is little reason to no longer support polymorphic collection behaviour (“set operations”) for containers. Basically the same operations should work for any container class. They should all use the same naming, in order to support compile-time polymorphism (in templates).

With these generic algos at hand, a lot of tedious and repetitive code can be spared. They should even work for HashMap, ValueTree … you get the idea. Even String will work. Below an example for Array:

template <typename LAMBDA>
Array select (LAMBDA&& condition) const
{
    Array answer;
    for (auto each : *this)
        if (condition (each))
            answer.add (each);
    return answer;
}

template <typename LAMBDA>
Array reject (LAMBDA&& condition) const
{
    Array answer;
    for (auto each : *this)
        if (! condition (each))
            answer.add (each);
    return answer;
}

template <typename LAMBDA>
Array collect (LAMBDA&& function) const
{
    Array answer;
    for (auto each : *this)
        answer.add (function (each));
    return answer;
}

template <typename LAMBDA>
ElementType* detect (LAMBDA&& condition) const
{
    for (auto each : *this)
        if (condition (each))
            return &each;
    return nullptr;
}

template <typename LAMBDA>
bool allSatisfy (LAMBDA&& condition) const
{
    if (isEmpty())
        return true;
    for (auto each : *this)
        if (!condition (each))
            return false;
    return true;
}

template <typename LAMBDA>
bool anySatisfy (LAMBDA&& condition) const
{
    for (auto each : *this)
        if (condition (each))
            return true;
    return false;
}

All of these are functional (non-destructive, const) and therefore lend themselves to temporary filtering and processing of sets. Therefore it’s probably wise to make the results non-owning by default, even if the container took ownership of its elements.

The suggested naming is inspired by Smalltalk and could be changed to better suit Juce conventions.

Some use cases:

auto quotedNonEmptyNames = 
         names.reject ([](auto n){ return n.isEmpty(); })
              .collect ([](auto n){ return n.quoted(); });

if (names.allSatisfy ([](auto n){ return n.contains("$"); }))
    doSomething();

Extract all digits out of a String:
String name = otherString.select ([](auto ch){ return ch >= '0' && ch <= '9'; });

The idea of an abstraction layer that views String as a container of char might be a bit far reached for C++ standards, since unfortunately char isn’t an object, but it offers great expressiveness for lean code.

It wasn’t until the advent of lambdas (and the perspective of reflection on the horizon) that C++ became a serious contender for application development for me. In an attempt at porting some of my work, I was sorely missing the proposed set operations. If you never used them, you might not appreciate their usefullness yet, but maybe this can serve as an inspiration for adding more generality, expressiveness and comfort to the Juce library.

The current idiomatic C++ way of writing these functions is as non-member function accepting an iterator pair.

Indeed, they already exist in the stdlib.

select is copy_if, reject is copy_if combined with not_fn, collect is transform, detect is find_if, and the last two are all_of and any_of.

The notion of std::string as a collection of char is not at all far-fetched, it’s just that this relationship is expressed by exposing member begin and end functions, rather than inheriting from some collection interface. An advantage of this approach is that collection classes only have to implement two member functions, rather than reimplementing all the collection algorithms each time. Plus, no virtual call overhead and more opportunities for compile time optimisations.

A disadvantage is that the syntax is a little verbose, but this should improve in C++20 with the new ranges library.

Thanks for pointing to the std library. It’s certainly a workaround, but std functions don’t give Juce objects as results, which is a bit cumbersome.

What I’m also missing is partial copying of containers by element index. It’s implemented for String and could use the same interface for other indexable containers.

The whole point is offering the same naming and behaviour for all containers, so that even if you changed the base type of your containers, much your code will still compile. There is plenty of advantange in compile-time polymorphism.

EDIT: Just noticed there isn’t yet a single method for Array that returns an Array. That’s an entire category of class behaviour which is lacking.

std functions don’t give Juce objects as results, which is a bit cumbersome

There’s no way for the library to know what collection you want in a given context. What if you want to remove duplicated elements by converting a juce Array to a std set?

What I’m also missing is partial copying of containers by element index.

The STL handles this via… iterators again!

auto foo = std::string { "hello world" };
auto bar = std::string { std::next (foo.begin(), 6), foo.end() };
// bar contains "world" here

The whole point is offering the same naming and behaviour for all containers

That’s already true for the algorithms provided in the STL.

This mixing of paradigms and libraries feels extremely convoluted and backwards. Basically the “hello world” example boils down to hacking what you need everytime you need it. The code is impossible to understand w/o intimate knowledge of the std library. Doesn’t even remotely resemble object orientation, IMHO.

What I like about Juce is that is does better: Clear names and parameters for what you get.

I feel exactly the other way round. Creating purpose made containers were right at their time. With todays templating mechanisms, the need for specialised containers becomes less relevant.

As I understand it, polymorphs and virtual calls are the enemies of the optimiser, so the preference goes more away from inheritance.

I hear you, I am not a big fan of many new syntaxes of the stdlib, and I also find it inconvenient, that I have to read up new things or why certain constructs don’t work (most of the times completely understandable in hindsight).

But at the same time, I think learning the stdlib is a great tool, to get the best/fastest* possible algorithms and data structures independent of the framework you are currently using.

I think it’s worth embracing it and spending time to learn, but only time will tell…

*) exceptions prove the statement

As I understand it, polymorphs and virtual calls are the enemies of the optimiser, so the preference goes more away from inheritance.

If “preferences goes away from inheritance”, what will be left of an OO language’s purpose after all? This speed-at-all-cost obsession will eventually kill it. I see how that is useful for DSP code, but UI code doesn’t need it. Large apps need expressiveness and maintainability.

Ok, maybe my statement doesn’t deserve (wasn’t intended) as universal advice…

But once I learned the performant containers, why would I bother to use a different one just because of a different context?

Because converting back and forth between containers of the Juce library and the standard library is ugly and costly.

It’s a common mistake to consider C++ an “object oriented language”. It is not one. It is a multipurpose language that has some features that allow some object oriented constructs. But for example the standard library containers and algorithms were never even meant to be object oriented.