Sums and sums of squares in C++

C++ question

Suppose you have a collection, xs, of floating point values.

typedef std::vector<double> doubles;
doubles xs;
.... code which fills xs

How would you sum the elements of xs?

Now how would you calculate the sum of squares of these elements?

It’s not a trick question but there is more than one plausible answer — a fact I was reminded of when I read a recent article by John D. Cook which demonstrates how to recast a curve fitting equation to avoid loss of precision. An obvious and idiomatic approach is to loop through the values of xs accumulating the required sums1.

double sum_xs = 0.0;
double sum_squares_xs = 0.0;

for (doubles::const_iterator xi = xs.begin(); xi != xs.end(); ++xi)
{
double const x = *xi;
sum_xs += x;
sum_squares_xs += x * x;
}

Functional programmers might sniff at this double purpose loop which repeatedly rewrites the values of sum_xs and sum_squares_xs to achieve the desired result. Surely this code ought to be expressed as an application of that well-known higher order function, reduce?

It’s hard for an imperative programmer accept this complaint: the code, as it stands, is simple, idiomatic and efficient. However … consider the same problem in Python.

Reduce in Python

Python is far from being a pure functional programming language. As it happens, reduce() is being demoted from a builtin to a library function in the next major release of the language. Here’s the deprecation warning Python 2.6 spits out:

\$ python -3 -c "print reduce(lambda x, y: x + y, range(10))"
-c:1: DeprecationWarning: reduce() not supported in 3.x; use functools.reduce()
45

I’ve no real quibble about this decision: the truth is, reduce can move aside because Python already builds in specialisations of the most common reductions. In this case sum answers our first question. Combine sum with a generator expression and we’ve answered the second one nicely too.

sum_xs = sum(xs)
sum_squares_xs = sum(x * x for x in xs)

Reduce in C++

Idiomatic C++ varies across both time and team. The C++ standard library does include reduce, and it deserves to be better known. You can find it in <numeric> under the name of std::accumulate(). Accumulate accepts as arguments an iterator range, an initial value, and optionally a binary function. From the SGI STL documentation:

Prototype

Accumulate is an overloaded name; there are actually two accumulatefunctions.

template <class InputIterator, class T>
T accumulate(InputIterator first, InputIterator last, T init);

template <class InputIterator, class T, class BinaryFunction>
T accumulate(InputIterator first, InputIterator last, T init,
BinaryFunction binary_op);

Description

Accumulate is a generalization of summation: it computes the sum (or some other binary operation) of init and all of the elements in the range [first, last).

The function object binary_op is not required to be either commutative or associative: the order of all of accumulate’s operations is specified. The result is first initialized to init. Then, for each iterator i in [first, last), in order from beginning to end, it is updated by result = result + *i (in the first version) or result = binary_op(result, *i) (in the second version).

Thus, in its simpler form, std::accumulate returns the sum of elements in the input range. Like Python, we can sum numbers; unlike Python we can also sum (i.e. concatenate) strings or valarrays2 — or indeed any other type we choose, provided we implement a suitable overload of operator+().

So, to sum the elements of xs, we write:

double sum_xs = accumulate(xs.begin(), xs.end(), 0.0);

Include <functional> and we can find their product too:

double product_xs
= accumulate(xs.begin(), xs.end(), 1.0, std::multiplies<double>());

To sum the squares of elements of xs we might supply a custom binary function.

double accum_sq(double sum_so_far, double x)
{
return sum_so_far + x * x;
}

.... double sum_squares_xs
= accumulate(xs.begin(), xs.end(), 0.0, accum_sq);

Or have we got the arguments to accum_sq() the wrong way round? This is the kind of code which risks giving reduce() a bad name. Maybe we could avoid confusion by creating an intermediate container to hold the squared values?

doubles xxs;

transform(xs.begin(), xs.end(), xs.begin(),
back_inserter(xxs), std::multiplies<double>());
double sum_squares_xs = accumulate(xxs.begin(), xxs.end(), 0.0);

Here, the first three arguments to std::transform() cause xs to be zipped with itself and std::multiplies ensures xxs gets filled with squares of elements of xs. Feed the resulting range into accumulate and we’re done. It’s cute that C++ lets you do this but a programmer may well fret about the extra storage and wonder where the original simple loop went!

Std::inner_product

Std::accumulate() is reduce but sadly C++ doesn’t do functional programming well enough to really exploit its power. Nonetheless, the algorithm deserves to be better known. As it happens there’s an even less well known member of the standard library which can solve our sum of squares puzzle.

Std::inner_product is yet another reduction. It zips up two iterator ranges multiplying together elements from the resulting pairs and summing these multiples to produce a final result. You can use it to find the scalar product of two vectors, or to calculate a weighted sum or, indeed, if we feed the range xs.begin(), xs.end() to it twice, to sum the squares of elements of xs.

double sum_squares_xs
= inner_product(xs.begin(), xs.end(), xs.begin(), 0.0);

This form is undeniably compact. If you know about std::inner_product it’s easy to understand. I’ll leave you to decide if it’s better than a hand-written loop.

Example program

Here’s a simple program designed to demonstrate the use of accumulate and inner_product.

// Examples of std::accumulate and std::inner_product
#include <functional>
#include <iostream>
#include <numeric>
#include <string>
#include <valarray>
#include <vector>

typedef std::valarray<double> xyz;

// Xyz output operator
std::ostream & operator<<(std::ostream & os, xyz const & pt)
{
os << '(';
char const * sep = "";
for (size_t i = 0; i != pt.size(); sep = ", ", ++i)
{
os << sep << pt[i];
}
os << ')';
return os;
}

// Bitwise or function, for use in reductions
unsigned bit_or(unsigned u, unsigned v)
{
return u | v;
}

// Create and return a triangle
std::vector<xyz> create_triangle()
{
std::vector<xyz> pts;
double const p = {1.,1.,0.,1.,0.,1.,0.,1.,1.};
pts.push_back(xyz(p + 0, 3));
pts.push_back(xyz(p + 3, 3));
pts.push_back(xyz(p + 6, 3));
return pts;
}

// Set up some test arrays, accumulate them and print the results to stdout.
int main()
{
int const a = { 1, 2, 3 };
int const b = { 3, 2, 1 };
std::string const s = { "http://", "wordaligned", ".org" };
bool const t = { false, true, false };
std::vector<xyz> tri = create_triangle();
unsigned m = { 1<<1, 1<<3, 1<<5 };

std::cout
<< "sum(a) "
<< std::accumulate(a, a + 3, 0)
<< "\nprod(a) "
<< std::accumulate(a, a + 3, 1, std::multiplies<int>())
<< "\nsum_sqs(a) "
<< std::inner_product(a, a + 3, a, 0)
<< "\ndot(a, b) "
<< std::inner_product(a, a + 3, b, 0)
<< "\nconcat(s) "
<< std::accumulate(s, s + 3, std::string(""))
<< "\nany(t) " << std::boolalpha
<< std::accumulate(t, t + 3, false, std::logical_or<bool>())
<< "\ncentroid(tri) "
<< std::accumulate(tri.begin(), tri.end(), xyz(0., 3)) / 3.
<< "\nbitor(m) " << std::hex << "0x"
<< std::accumulate(m, m + 3, 0, bit_or)
<< '\n';

return 0;
}

This program outputs:

sum(a) 6
prod(a) 6
sum_sqs(a) 14
dot(a, b) 10
concat(s) http://wordaligned.org
any(t) true
centroid(tri) (0.666667, 0.666667, 0.666667)
bitor(m) 0x2a

1 In the case of John D. Cook’s article, which draws attention to the nuts and bolts of a calculation, the explicit loop is an ideal formulation.

2 ☡ A note of caution! Although you can add std::strings together, using std::accumulate to concatenate a large number of strings may class as premature pessimization since each addition is likely to involve dynamic memory allocation. I must also point out that std::valarray has an uncertain footing in the C++ Standard Library. In his excellent C++ standard library reference book Nicolai Josuttis sticks the boot in:

The valarray classes were not designed very well. In fact, nobody tried to determine whether the final specification worked. This happened because nobody felt “responsible” for these classes. The person who introduced valarrays to the C++ standard library left the committee a long time before the standard was finished.