Joined Output and the Fencepost Problem
If fenceposts are placed 10m apart then how long is a row of 10 fenceposts? Quickly now …
If you answered 100m you got it wrong. Since the gap between two posts is 10m and there are 9 gaps in a row of 10 posts, the correct answer is 90m.
This is an example of the fencepost problem. It comes up all the time in computing. In this post, I tackle the example of using C++ to output a range of items.
C++ Range Output
In Omit Needless Code, Kevlin Henney develops some concise and expressive code to output a collection of strings separated by newlines:
std::vector<std::string> items; ... // populate items typedef std::ostream_iterator<std::string> out; std::copy(items.begin(), items.end(), out(std::cout, "\n"));
It’s the "\n" which adds the newlines. We could have used any other
string — a simple space " " or a comma followed by a space
", " might be more suitable in some cases.
Unfortunately, though, this code adds a newline after each item in the list, including the last item, which is rarely what we want — just like we didn’t want the gap after the terminal post in the fencepost problem. For whitespace, such as a newline, this may not be too upsetting; by happy coincidence, we may even want an extra newline. Usually, though, the extra gap is a nuisance, as the following example shows.
The Full Monty
#include <iostream>
#include <iterator>
#include <string>
#include <vector>
typedef std::vector<std::string> breakfast;
void
add_full_monty(breakfast & add_to)
{
static char const * items[] =
{
"bacon",
"eggs",
"beans",
"toast",
"sausage",
"tomatoes",
"black-pudding",
};
static char const ** const last_item
= items + sizeof items / sizeof items[0];
add_to.insert(add_to.end(), items, last_item);
}
void
output_breakfast(breakfast const & items, std::ostream & to)
{
typedef std::ostream_iterator<breakfast::value_type> out;
std::copy(items.begin(), items.end(), out(to, " and "));
}
int
main(int argc, char * argv[])
{
breakfast my_breakfast;
add_full_monty(my_breakfast);
std::cout << "For breakfast I had ";
output_breakfast(my_breakfast, std::cout);
std::cout << '.' << std::endl;
return 0;
}
Running this program produces:
For breakfast I had bacon and eggs and beans and toast and sausage and tomatoes and black-pudding and .
which isn’t entirely satisfying. There’s one " and " too
many. Before fixing the problem, did you notice the surplus comma after the
"black-pudding" and just before the end of the static array
initialiser?
static char const * items[] = {
....
"black-pudding",
};
This looks rather like another fencepost problem; as if a careless cut-and-paste has resulted in one comma too many. In fact, although the comma doesn’t need to be there, it’s innocuous, since, in a rare moment of generosity, C++ (and C) specifically allow for it. Permitting this extra separator makes life much easier for code-generators which don’t have to worry about treating the final item in a collection differently.
Python Joined-Up Output
Here’s some Python code which does a similar job. Running:
def full_monty():
return "spam", "spam", "eggs", "spam",
print "For breakfast I had %s." % " and ".join(full_monty())
produces:
For breakfast I had spam and spam and eggs and spam.
The string join method
uses the string " and " to fill the gaps between tuple items, which
is just what we wanted.
Note here the trailing comma after the final "spam" in the return
from the full_monty function. Again, it’s not necessary in this
case, but you would need it to return a tuple comprising a single
item.
def healthy_breakfast():
return "muesli",
For consistency, I prefer to include this comma even when it’s not required.
C++ Joined-Up Output
Here’s some C++ code which will do the job of outputting a range of items, separated by a join string, and which doesn’t suffer from the fencepost problem
#if !defined JOINED_OUTPUT_HPP_INCLUDED
#define JOINED_OUTPUT_HPP_INCLUDED
#include <ostream>
template <class Iterator>
void
joined_output(std::ostream & put_to,
Iterator first,
Iterator last,
char const * join_with)
{
if (first != last)
{
Iterator curr = first++;
while (first != last)
{
put_to << *curr << join_with;
curr = first++;
}
put_to << *curr;
}
}
#endif // defined JOINED_OUTPUT_HPP_INCLUDED
It’s not especially pretty but at least this library code can be tucked away somewhere and used as required rather than replicated everywhere joined-up output is needed. Its use is straightforward, as the following test code demonstrates:
#include "joined_output.hpp"
#include <cassert>
#include <list>
#include <sstream>
#include <vector>
#include <iostream>
#include <iterator>
void
test_bidirectional_joined_output()
{
int const ii[] = { 0, 1, 2, 3, 4, 5 };
int const * const end_ii = ii + sizeof ii / sizeof *ii;
std::list<int> const ll(ii, end_ii);
std::stringstream put_to;
joined_output(put_to, ii, end_ii, " ");
joined_output(put_to, ll.begin(), ll.end(), " ");
assert(put_to.str() == "0 1 2 3 4 5" "0 1 2 3 4 5");
put_to.str(std::string());
joined_output(put_to, ii, end_ii, "");
joined_output(put_to, ll.begin(), ll.end(), "");
assert(put_to.str() == "012345" "012345");
put_to.str(std::string());
joined_output(put_to, ii, ii, " ");
joined_output(put_to, ll.begin(), ll.begin(), " ");
joined_output(put_to, ll.end(), ll.end(), " ");
assert(put_to.str().empty());
put_to.str(std::string());
joined_output(put_to, ii + 1, ii + 2, " ");
assert(put_to.str() == "1");
}
void
test_input_joined_output()
{
std::istringstream input("0 1 2 3 4 5");
std::ostringstream output;
joined_output(output,
std::istream_iterator<int>(input),
std::istream_iterator<int>(),
" + ");
assert(output.str() == "0 + 1 + 2 + 3 + 4 + 5");
}
int
main(int argc, char * argv[])
{
test_bidirectional_joined_output();
test_input_joined_output();
return 0;
}
Specialised C++ Joined-Up Output
If we were dealing with a bi-directional or random access iterator
then our implementation of joined_output() could get away without
using any local variables:
#include <algorithm>
#include <iterator>
#include <ostream>
template <class BidirectionalIterator>
void
joined_output(std::ostream & put_to,
BidirectionalIterator first,
BidirectionalIterator last,
std::string const & join_with)
{
if (first != last)
{
typedef typename
std::iterator_traits<BidirectionalIterator>::value_type V;
std::copy(first, --last,
std::ostream_iterator<V>(put_to, join_with));
put_to << *last;
}
}
This being C++, we probably do fret about such micro-optimisations
in library code. So, for completeness, here’s the file
joined_output.hpp reworked to show how joined_output() can be
specialised according to iterator type. I prefer the earlier
and simpler version but I’m posting this version in case I ever need
to remember how to do this sort of thing again.
#if !defined JOINED_OUTPUT_HPP_INCLUDED
#define JOINED_OUTPUT_HPP_INCLUDED
#include <algorithm>
#include <cassert>
#include <iterator>
#include <ostream>
template <class Iterator>
void
joined_output(std::ostream & put_to,
Iterator first,
Iterator last,
char const * join_with)
{
if (first != last)
{
joined_output(typename std::iterator_traits<Iterator>::iterator_category(),
put_to, first, last, join_with);
}
}
template <class BidirectionalIterator>
void
joined_output(std::bidirectional_iterator_tag,
std::ostream & put_to,
BidirectionalIterator first,
BidirectionalIterator last,
char const * join_with)
{
assert(first != last);
typedef typename
std::iterator_traits<BidirectionalIterator>::value_type V;
std::copy(first, --last,
std::ostream_iterator<V>(put_to, join_with));
put_to << *last;
}
template <class InputIterator>
void
joined_output(std::input_iterator_tag,
std::ostream & put_to,
InputIterator first,
InputIterator last,
char const * join_with)
{
assert(first != last);
InputIterator curr = first++;
while (first != last)
{
put_to << *curr << join_with;
curr = first++;
}
put_to << *curr;
}
#endif // defined JOINED_OUTPUT_HPP_INCLUDED