Joined Output and the Fencepost Problem

2006-10-30, , , Comments

If fenceposts are placed 10m apart then how long is a row of 10 fenceposts? Quickly now …

If you answered 100m you got it wrong. Since the gap between two posts is 10m and there are 9 gaps in a row of 10 posts, the correct answer is 90m.

This is an example of the fencepost problem. It comes up all the time in computing. In this post, I tackle the example of using C++ to output a range of items.

C++ Range Output

In Omit Needless Code, Kevlin Henney develops some concise and expressive code to output a collection of strings separated by newlines:

std::vector<std::string> items;
... // populate items
typedef std::ostream_iterator<std::string> out;
std::copy(items.begin(), items.end(), out(std::cout, "\n"));

It’s the "\n" which adds the newlines. We could have used any other string — a simple space " " or a comma followed by a space ", " might be more suitable in some cases.

Unfortunately, though, this code adds a newline after each item in the list, including the last item, which is rarely what we want — just like we didn’t want the gap after the terminal post in the fencepost problem. For whitespace, such as a newline, this may not be too upsetting; by happy coincidence, we may even want an extra newline. Usually, though, the extra gap is a nuisance, as the following example shows.

The Full Monty

#include <iostream>
#include <iterator>
#include <string>
#include <vector>

typedef std::vector<std::string> breakfast;

void
add_full_monty(breakfast & add_to)
{
    static char const * items[] =
    {
        "bacon",
        "eggs",
        "beans",
        "toast",
        "sausage",
        "tomatoes",
        "black-pudding",
    };
    static char const ** const last_item
        = items + sizeof items / sizeof items[0];
    add_to.insert(add_to.end(), items, last_item);
}

void
output_breakfast(breakfast const & items, std::ostream & to)
{
    typedef std::ostream_iterator<breakfast::value_type> out;
    std::copy(items.begin(), items.end(), out(to, " and "));
}

int
main(int argc, char * argv[])
{
    breakfast my_breakfast;
    add_full_monty(my_breakfast);
    std::cout << "For breakfast I had ";
    output_breakfast(my_breakfast, std::cout);
    std::cout << '.' << std::endl;
    return 0;
}

Running this program produces:

For breakfast I had bacon and eggs and beans and toast and sausage and tomatoes and black-pudding and .

which isn’t entirely satisfying. There’s one " and " too many. Before fixing the problem, did you notice the surplus comma after the "black-pudding" and just before the end of the static array initialiser?

    static char const * items[] = {
        ....
        "black-pudding",
    };

This looks rather like another fencepost problem; as if a careless cut-and-paste has resulted in one comma too many. In fact, although the comma doesn’t need to be there, it’s innocuous, since, in a rare moment of generosity, C++ (and C) specifically allow for it. Permitting this extra separator makes life much easier for code-generators which don’t have to worry about treating the final item in a collection differently.

Python Joined-Up Output

Here’s some Python code which does a similar job. Running:

def full_monty():
    return "spam", "spam", "eggs", "spam",

print "For breakfast I had %s." % " and ".join(full_monty())

produces:

For breakfast I had spam and spam and eggs and spam.

The string join method uses the string " and " to fill the gaps between tuple items, which is just what we wanted.

Note here the trailing comma after the final "spam" in the return from the full_monty function. Again, it’s not necessary in this case, but you would need it to return a tuple comprising a single item.

def healthy_breakfast():
    return "muesli",

For consistency, I prefer to include this comma even when it’s not required.

C++ Joined-Up Output

Here’s some C++ code which will do the job of outputting a range of items, separated by a join string, and which doesn’t suffer from the fencepost problem

joined_output.hpp
#if !defined JOINED_OUTPUT_HPP_INCLUDED
#define JOINED_OUTPUT_HPP_INCLUDED

#include <ostream>

template <class Iterator>
void
joined_output(std::ostream & put_to,
              Iterator first,
              Iterator last,
              char const * join_with)
{
    if (first != last)
    {
        Iterator curr = first++;
        while (first != last)
        {
            put_to << *curr << join_with;
            curr = first++;
        }
        put_to << *curr;
    }
}
#endif // defined JOINED_OUTPUT_HPP_INCLUDED

It’s not especially pretty but at least this library code can be tucked away somewhere and used as required rather than replicated everywhere joined-up output is needed. Its use is straightforward, as the following test code demonstrates:

test_joined_output.cpp
#include "joined_output.hpp"
#include <cassert>
#include <list>
#include <sstream>
#include <vector>
#include <iostream>
#include <iterator>

void
test_bidirectional_joined_output()
{
    int const ii[] = { 0, 1, 2, 3, 4, 5 };
    int const * const end_ii = ii + sizeof ii / sizeof *ii;
    std::list<int> const ll(ii, end_ii);

    std::stringstream put_to;
    joined_output(put_to, ii, end_ii, " ");
    joined_output(put_to, ll.begin(), ll.end(), " ");
    assert(put_to.str() == "0 1 2 3 4 5" "0 1 2 3 4 5");

    put_to.str(std::string());
    joined_output(put_to, ii, end_ii, "");
    joined_output(put_to, ll.begin(), ll.end(), "");
    assert(put_to.str() == "012345" "012345");

    put_to.str(std::string());
    joined_output(put_to, ii, ii, " ");
    joined_output(put_to, ll.begin(), ll.begin(), " ");
    joined_output(put_to, ll.end(), ll.end(), " ");
    assert(put_to.str().empty());

    put_to.str(std::string());
    joined_output(put_to, ii + 1, ii + 2, " ");
    assert(put_to.str() == "1");
}

void
test_input_joined_output()
{
    std::istringstream input("0 1 2 3 4 5");
    std::ostringstream output;
    joined_output(output,
                  std::istream_iterator<int>(input),
                  std::istream_iterator<int>(),
                  " + ");
    assert(output.str() == "0 + 1 + 2 + 3 + 4 + 5");
}

int
main(int argc, char * argv[])
{
    test_bidirectional_joined_output();
    test_input_joined_output();
    return 0;
}

Specialised C++ Joined-Up Output

If we were dealing with a bi-directional or random access iterator then our implementation of joined_output() could get away without using any local variables:

#include <algorithm>
#include <iterator>
#include <ostream>

template <class BidirectionalIterator>
void
joined_output(std::ostream & put_to,
              BidirectionalIterator first,
              BidirectionalIterator last,
              std::string const & join_with)
{
    if (first != last)
    {
        typedef typename
            std::iterator_traits<BidirectionalIterator>::value_type V;
        std::copy(first, --last,
            std::ostream_iterator<V>(put_to, join_with));
        put_to << *last;
    }
}

This being C++, we probably do fret about such micro-optimisations in library code. So, for completeness, here’s the file joined_output.hpp reworked to show how joined_output() can be specialised according to iterator type. I prefer the earlier and simpler version but I’m posting this version in case I ever need to remember how to do this sort of thing again.

joined_output.hpp
#if !defined JOINED_OUTPUT_HPP_INCLUDED
#define JOINED_OUTPUT_HPP_INCLUDED

#include <algorithm>
#include <cassert>
#include <iterator>
#include <ostream>

template <class Iterator>
void
joined_output(std::ostream & put_to,
              Iterator first,
              Iterator last,
              char const * join_with)
{
    if (first != last)
    {
        joined_output(typename std::iterator_traits<Iterator>::iterator_category(),
                      put_to, first, last, join_with);
    }
}

template <class BidirectionalIterator>
void
joined_output(std::bidirectional_iterator_tag,
              std::ostream & put_to,
              BidirectionalIterator first,
              BidirectionalIterator last,
              char const * join_with)
{
    assert(first != last);
    typedef typename
        std::iterator_traits<BidirectionalIterator>::value_type V;
    std::copy(first, --last,
        std::ostream_iterator<V>(put_to, join_with));
    put_to << *last;
}

template <class InputIterator>
void
joined_output(std::input_iterator_tag,
              std::ostream & put_to,
              InputIterator first,
              InputIterator last,
              char const * join_with)
{
    assert(first != last);
    InputIterator curr = first++;
    while (first != last)
    {
        put_to << *curr << join_with;
        curr = first++;
    }
    put_to << *curr;
}

#endif // defined JOINED_OUTPUT_HPP_INCLUDED