Chapter 19: The STL Generic Algorithms

19.1: The Generic Algorithms

Before using the generic algorithms presented in this chapter, except for those in the Operators category (defined below), the <algorithm> header file must be included. Before using a generic algorithm in the Operators category the <numeric> header file must be included.

In the previous chapter the Standard Template Library (STL) was introduced. An important element of the STL, the generic algorithms, was not covered in that chapter as they form a fairly extensive part of the STL. Over time the STL has grown considerably, mainly as a result of a growing importance and appreciation of templates. Covering generic algorithm in the STL chapter itself would turn that chapter into an unwieldy one and so the generic algorithms were moved to a chapter of their own.

Generic algorithms perform an amazing task. Due to the strength of templates, algorithms could be developed that can be applied to a wide range of different data types while maintaining type safety. The prototypical example of this is the sort generic algorithm. To contrast: while C requires programmers to write callback functions in which type-unsafe void const * parameters have to be used, internally forcing the programmer to resort to casts, STL's sort frequently allows the programmer merely to state something akin to

    sort(first-element, last-element)

Generic algorithms should be used wherever possible. Avoid the urge to design your own code for commonly encountered algorithms. Make it a habit to first thoroughly search the generic algorithms for an available candidate. The generic algorithms should become your weapon of choice when writing code: acquire full familiarity with them and make their use your `second nature'.

On the other hand, without downgrading the importance of the generic algorithms, it's clear that over the years the number of available generic algorithms have seen an almost unlimited growth. Many new algorithms were added, even though for quite a few of them it's either very easy to provide direct implementations, or which may hardly ever be used. An example is the is_sorted generic algorithm, simply returning true if a range of elements is considered sorted and false if not. How hard is it to define such a function in the occasional situation you might need it? This applies to quite a few other generic algorithms. It's of course nice to have an extensive toolbox, but do you need, e.g., screwdrivers that perfectly match the screws' heads? Most likely not.... To avoid an excessive growth of this chapter, some sections contain comparable algorithms, and in some sections links to cppreference are provided when referring to comparable algorithms.

Nevertheless, this chapter's sections cover many of the STL's generic algorithms (in alphabetical order). For each algorithm the following information is provided:

In the prototypes of the algorithms Type is used to specify a generic data type. Furthermore, the particular type of iterator (see section 18.2) that is required is mentioned as well as other generic types that might be required (e.g., performing BinaryOperations, like plus<Type>). Although iterators are commonly provided by abstract containers and comparable pre-defined data structures, at some point you may want to design your own iterators. Section 22.14 offers guidelines for constructing your own iterator classes and provides an overview of of operators that must be implemented for the various types of iterators.

Almost every generic algorithm expects an iterator range [first, last), defining the series of elements on which the algorithm operates. The iterators point to objects or values. When an iterator points to a Type value or object, function objects used by the algorithms usually receive Type const & objects or values. Usually function objects cannot modify the objects they receive as their arguments. This does not hold true for modifying generic algorithms, which are of course able to modify the objects they operate upon.

Generic algorithms may be categorized. The C++ Annotations distinguishes the following categories of generic algorithms:

19.1.1: Execution policies

Many of the following generic algorithms could very well be parallized. E.g., one of the following algorithms is generate (section 19.1.19), filling the elements with values produced by a generating function. If those values are randomly generated values then generated could very well be parallized, where each parallel thread handles a separate section of the range to be filled. Another example where parallel execution may be useful is when sorting series of values. For this the the sort generic algorithm can be used (section 19.1.54, which also contains an example of parallelized sorting).

These (and many other) generic algorithms can be executed in parallel, depending on the specified execution policy. When no execution policy is specified then the algorithms operated in their standard, sequential, way. The generic algorithms supporting execution policies have overloaded versions where the first parameter specifies the execution policy to use, followed by their remaining parameters. Function prototypes listed the following sections showing a first parameter [ExecPol,] may specify one of the execution policies introduced in this section as their first arguments. E.g., one of the sort generic algorithm's prototypes is

   void sort([ExecPol,] 
        RandomAccessIterator first, RandomAccessIterator last);
and to sort, e.g., a vector if strings (vs), it can be called as
   sort(vs.bgin(), vs.end());
or as (see below)
   sort(execution::par, vs.bgin(), vs.end());

In order to use execution policies the <execution> header file must be included, and the linking option -ltbb must be specified with linking the compiled object s.

There are four types of execution policies (all defined in the std namespace):

Whenever algorithms are called using the above policy specifications and during the execution of these algorithms functions are called generating uncaught exceptions std::terminate is called.

When using parallel execution the objects or functions passed to the generic algorithms might access data defined elsewhere. If those data are modified then it is possible that modifications are requested from different execution threads, which could result in data races or deadlocks. The programmer should ensure that data races and/or dadlocks cannot occur when using parallel execution.

19.1.2: accumulate

19.1.3: adjacent_difference

19.1.4: adjacent_find

19.1.5: all_of / any_of / none_of

19.1.6: begin / end

19.1.7: binary_search

If value is in fact present in the range of values, then this generic algorithm doesn't answer the question where value is located. If that question must be answered the generic algorithms lower_bound and upper_bound can be used. Refer to sections 19.1.30 and 19.1.61 for examples illustrating the use of these latter two algorithms.

19.1.8: copy / copy_if

19.1.9: copy_backward

19.1.10: count / count_if

19.1.11: equal

19.1.12: equal_range

19.1.13: exchange

19.1.14: fill / fill_n

19.1.15: find / find_if / find_if_not

19.1.16: find_end

19.1.17: find_first_of

19.1.18: for_each

The example also shows that the for_each algorithm may be used with functions defining const and non-const parameters. Also, see section 19.1.56 for differences between the for_each and transform generic algorithms.

The for_each algorithm cannot directly be used (i.e., by passing *this as the function object argument) inside a member function to modify its own object as the for_each algorithm first creates its own copy of the passed function object. A lambda function or a wrapper class whose constructor accepts a pointer or reference to the current object and possibly to one of its member functions solves this problem.

19.1.19: generate / generate_n

19.1.20: includes

19.1.21: inner_product

19.1.22: inplace_merge

19.1.23: iota

19.1.24: is_partitioned

19.1.25: is_permutation

19.1.26: is_sorted

19.1.27: is_sorted_until

19.1.28: iter_swap

19.1.29: lexicographical_compare

19.1.30: lower_bound

The binary_search generic algorithm (cf. section 19.1.7)can be used to determine whether or not value is present in the iterator range. The upper_bound algorithm can be used to find the last element of a series of values equal to value. The upper_bound section (19.1.61) also contains an extensive example illustrating the use of lower_bound and as upper_bound.

19.1.31: max / min

19.1.32: max_element / min_element / minmax_element

19.1.33: merge

19.1.34: minmax

19.1.35: mismatch

19.1.36: move / move_backward

19.1.37: next_permutation / prev_permutation

19.1.38: nth_element

19.1.39: partial_sort / partial_sort_copy

19.1.40: partial_sum

19.1.41: partition / partition_point / stable_partition

19.1.42: partition_copy

19.1.43: reduce

19.1.44: remove / remove_if / remove_copy / remove_copy_if

19.1.45: replace / replace_if / replace_copy / replace_copy_if

19.1.46: reverse / reverse_copy

19.1.47: rotate / rotate_copy

19.1.48: sample

19.1.49: search / search_n

19.1.50: set_difference

19.1.51: set_intersection

19.1.52: set_symmetric_difference

19.1.53: set_union

19.1.54: sort / stable_sort

19.1.55: swap / swap_ranges

19.1.56: transform

the following differences between the for_each (section 19.1.18) and transform generic algorithms should be noted: Also note that the range-based for loop can often be used instead of the transform generic algorithm. However, but different from the range-based for-loop the transform algorithm can also be used width sub-ranges and with reverse-iterators.

19.1.57: transform_reduce

19.1.58: handling uninitialized memory

Section 9.1.5 covers the placement new operator. The placement new operator is used to install values or objects in 'raw memory', i.e., memory that is already available, but hasn't yet been initialized for the intended object types.

As covered before, when calling something like auto ptr = new string{ "hello" } the string is constructed in memory specifically allocated to contain the object, and the object type's constructor initializes the object in that memory. Likewise, when calling delete ptr the string's destructor is called, followed by returning the memory allocated by new to the common pool.

When using placement new the memory to contain the object is already available, and the construction auto ptr = new (storageAddress) string{ "hello" } is used to merely construct the string at the location specified by storageAddress. That string can then (as usual) be accessed via ptr, but delete ptr cannot be used, since the memory at storageAddress was already available before invoking the placement new operator. Therefore, in these cases the remarkable situation is encountered where the object's destructor must explicitly be called (using ptr->~string()) and using delete ptr is completely wrong, causing a memory error which aborts the program.

Several generic algorithms, all supporting execution policies, are available simplifying the use of placement new. To use these algorithm the <memory> header file must be included.

Facilities are available to copy, fill, initialize, and move objects to/in uninitialized (raw) memory, as well as facilities to delete the objects stored in raw memory. Here is an overvieuw of the available facilities (cf. cppreference for more details about the algorithms handling uninitialized memory):

The algorithm Type *construct_at(Type *raw, Args &&...args) constructs an object of type Type in the raw memory at raw, passing args... to Type's constructor.

To delete the objects installed in raw memory the following facilities are available:

Here is an example:

    #include <memory>
    #include <vector>
    #include <iostream>
    #include <string>
    using namespace std;

    int main()
    {
        char raw[4 * sizeof(string)];       // raw memory to receive strings
        string *ptr = reinterpret_cast<string *>(raw);  // pointer to strings

                                            // construct 4 strings in raw
        uninitialized_default_construct_n(ptr, 4);
        destroy(ptr, ptr + 4);              // call the strings' destructors

        vector<string> vs(4, "string");     // move 4 strings to raw memory
        uninitialized_move(vs.begin(), vs.end(), ptr);

        cout << vs.front() << ", " << vs.back() << '\n' <<
                ptr[0] << ", " << ptr[3] << '\n';

        destroy(ptr, ptr + 4);              // call the strings' destructors
    }
    //  Displays:
    //      ,
    //      string, string

19.1.59: unique

19.1.60: unique_copy

19.1.61: upper_bound

The binary_search generic algorithm can be used to simply determine whether or nog value is present in the iterator range. The lower_bound generic algorithm can be used to find the first element of a series of values equal to value.

19.1.62: Heap algorithms

A heap is a kind of binary tree which can be represented by an array. In the standard heap, the key of an element is not smaller than the key of its children. This kind of heap is called a max heap. A tree in which numbers are keys could be organized as shown in figure 24.

Figure 24: A binary tree representation of a heap

Such a tree may also be organized in an array:
        12, 11, 10, 8, 9, 7, 6, 1, 2, 4, 3, 5

In the following description, keep two pointers into this array in mind: a pointer node indicates the location of the next node of the tree, a pointer child points to the next element which is a child of the node pointer. Initially, node points to the first element, and child points to the second element.

Since child now points beyond the array, the remaining nodes have no children. So, nodes 6, 1, 2, 4, 3 and 5 don't have children.

Note that the left and right branches are not ordered: 8 is less than 9, but 7 is larger than 6.

A heap is created by traversing a binary tree level-wise, starting from the top node. The top node is 12, at the zeroth level. At the first level we find 11 and 10. At the second level 8, 9, 7 and 6 are found, etc.

Heaps can be constructed in containers supporting random access. So, a list is not an appropriate data structure for a heap. Heaps can be constructed from an (unsorted) array (using make_heap). The top-element can be pruned from a heap, followed by reordering the heap (using pop_heap), a new element can be added to the heap, followed by reordering the heap (using push_heap), and the elements in a heap can be sorted (using sort_heap, which, of course, invalidates the heap).

19.1.62.1: The `make_heap' function

19.1.62.2: The `pop_heap' function

19.1.62.3: The `push_heap' function

19.1.62.4: The `sort_heap' function

19.1.62.5: An example using the heap functions

Here is an example showing the various generic algorithms manipulating heaps:
    #include <algorithm>
    #include <iostream>
    #include <functional>
    #include <iterator>
    using namespace std;

    void show(int *ia, char const *header)
    {
        cout << header << ":\n";
        copy(ia, ia + 20, ostream_iterator<int>(cout, " "));
        cout << '\n';
    }
    int main()
    {
        int ia[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                    11, 12, 13, 14, 15, 16, 17, 18, 19, 20};

        make_heap(ia, ia + 20);
        show(ia, "The values 1-20 in a max-heap");

        pop_heap(ia, ia + 20);
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20);
        show(ia, "Adding 20 (at the end) to the heap again");

        sort_heap(ia, ia + 20);
        show(ia, "Sorting the elements in the heap");


        make_heap(ia, ia + 20, greater<int>());
        show(ia, "The values 1-20 in a heap, using > (and beyond too)");

        pop_heap(ia, ia + 20, greater<int>());
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20, greater<int>());
        show(ia, "Re-adding the removed element");

        sort_heap(ia, ia + 20, greater<int>());
        show(ia, "Sorting the elements in the heap");
    }
    /*
        Displays:
            The values 1-20 in a max-heap:
            20 19 15 18 11 13 14 17 9 10 2 12 6 3 7 16 8 4 1 5
            Removing the first element (now at the end):
            19 18 15 17 11 13 14 16 9 10 2 12 6 3 7 5 8 4 1 20
            Adding 20 (at the end) to the heap again:
            20 19 15 17 18 13 14 16 9 11 2 12 6 3 7 5 8 4 1 10
            Sorting the elements in the heap:
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
            The values 1-20 in a heap, using > (and beyond too):
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
            Removing the first element (now at the end):
            2 4 3 8 5 6 7 16 9 10 11 12 13 14 15 20 17 18 19 1
            Re-adding the removed element:
            1 2 3 8 4 6 7 16 9 5 11 12 13 14 15 20 17 18 19 10
            Sorting the elements in the heap:
            20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
    */