Joined Output and the Fencepost Problem
If fenceposts are placed 10m apart then how long is a row of 10 fenceposts? Quickly now …
If you answered 100m you got it wrong. Since the gap between two posts is 10m and there are 9 gaps in a row of 10 posts, the correct answer is 90m.
This is an example of the fencepost problem. It comes up all the time in computing. In this post, I tackle the example of using C++ to output a range of items.
C++ Range Output
In Omit Needless Code, Kevlin Henney develops some concise and expressive code to output a collection of strings separated by newlines:
std::vector<std::string> items; ... // populate items typedef std::ostream_iterator<std::string> out; std::copy(items.begin(), items.end(), out(std::cout, "\n"));
It’s the "\n"
which adds the newlines. We could have used any other
string — a simple space " "
or a comma followed by a space
", "
might be more suitable in some cases.
Unfortunately, though, this code adds a newline after each item in the list, including the last item, which is rarely what we want — just like we didn’t want the gap after the terminal post in the fencepost problem. For whitespace, such as a newline, this may not be too upsetting; by happy coincidence, we may even want an extra newline. Usually, though, the extra gap is a nuisance, as the following example shows.
The Full Monty
#include <iostream> #include <iterator> #include <string> #include <vector> typedef std::vector<std::string> breakfast; void add_full_monty(breakfast & add_to) { static char const * items[] = { "bacon", "eggs", "beans", "toast", "sausage", "tomatoes", "black-pudding", }; static char const ** const last_item = items + sizeof items / sizeof items[0]; add_to.insert(add_to.end(), items, last_item); } void output_breakfast(breakfast const & items, std::ostream & to) { typedef std::ostream_iterator<breakfast::value_type> out; std::copy(items.begin(), items.end(), out(to, " and ")); } int main(int argc, char * argv[]) { breakfast my_breakfast; add_full_monty(my_breakfast); std::cout << "For breakfast I had "; output_breakfast(my_breakfast, std::cout); std::cout << '.' << std::endl; return 0; }
Running this program produces:
For breakfast I had bacon and eggs and beans and toast and sausage and tomatoes and black-pudding and .
which isn’t entirely satisfying. There’s one " and "
too
many. Before fixing the problem, did you notice the surplus comma after the
"black-pudding"
and just before the end of the static array
initialiser?
static char const * items[] = { .... "black-pudding", };
This looks rather like another fencepost problem; as if a careless cut-and-paste has resulted in one comma too many. In fact, although the comma doesn’t need to be there, it’s innocuous, since, in a rare moment of generosity, C++ (and C) specifically allow for it. Permitting this extra separator makes life much easier for code-generators which don’t have to worry about treating the final item in a collection differently.
Python Joined-Up Output
Here’s some Python code which does a similar job. Running:
def full_monty(): return "spam", "spam", "eggs", "spam", print "For breakfast I had %s." % " and ".join(full_monty())
produces:
For breakfast I had spam and spam and eggs and spam.
The string join method
uses the string " and "
to fill the gaps between tuple items, which
is just what we wanted.
Note here the trailing comma after the final "spam"
in the return
from the full_monty
function. Again, it’s not necessary in this
case, but you would need it to return a tuple comprising a single
item.
def healthy_breakfast(): return "muesli",
For consistency, I prefer to include this comma even when it’s not required.
C++ Joined-Up Output
Here’s some C++ code which will do the job of outputting a range of items, separated by a join string, and which doesn’t suffer from the fencepost problem
#if !defined JOINED_OUTPUT_HPP_INCLUDED #define JOINED_OUTPUT_HPP_INCLUDED #include <ostream> template <class Iterator> void joined_output(std::ostream & put_to, Iterator first, Iterator last, char const * join_with) { if (first != last) { Iterator curr = first++; while (first != last) { put_to << *curr << join_with; curr = first++; } put_to << *curr; } } #endif // defined JOINED_OUTPUT_HPP_INCLUDED
It’s not especially pretty but at least this library code can be tucked away somewhere and used as required rather than replicated everywhere joined-up output is needed. Its use is straightforward, as the following test code demonstrates:
#include "joined_output.hpp" #include <cassert> #include <list> #include <sstream> #include <vector> #include <iostream> #include <iterator> void test_bidirectional_joined_output() { int const ii[] = { 0, 1, 2, 3, 4, 5 }; int const * const end_ii = ii + sizeof ii / sizeof *ii; std::list<int> const ll(ii, end_ii); std::stringstream put_to; joined_output(put_to, ii, end_ii, " "); joined_output(put_to, ll.begin(), ll.end(), " "); assert(put_to.str() == "0 1 2 3 4 5" "0 1 2 3 4 5"); put_to.str(std::string()); joined_output(put_to, ii, end_ii, ""); joined_output(put_to, ll.begin(), ll.end(), ""); assert(put_to.str() == "012345" "012345"); put_to.str(std::string()); joined_output(put_to, ii, ii, " "); joined_output(put_to, ll.begin(), ll.begin(), " "); joined_output(put_to, ll.end(), ll.end(), " "); assert(put_to.str().empty()); put_to.str(std::string()); joined_output(put_to, ii + 1, ii + 2, " "); assert(put_to.str() == "1"); } void test_input_joined_output() { std::istringstream input("0 1 2 3 4 5"); std::ostringstream output; joined_output(output, std::istream_iterator<int>(input), std::istream_iterator<int>(), " + "); assert(output.str() == "0 + 1 + 2 + 3 + 4 + 5"); } int main(int argc, char * argv[]) { test_bidirectional_joined_output(); test_input_joined_output(); return 0; }
Specialised C++ Joined-Up Output
If we were dealing with a bi-directional or random access iterator
then our implementation of joined_output()
could get away without
using any local variables:
#include <algorithm> #include <iterator> #include <ostream> template <class BidirectionalIterator> void joined_output(std::ostream & put_to, BidirectionalIterator first, BidirectionalIterator last, std::string const & join_with) { if (first != last) { typedef typename std::iterator_traits<BidirectionalIterator>::value_type V; std::copy(first, --last, std::ostream_iterator<V>(put_to, join_with)); put_to << *last; } }
This being C++, we probably do fret about such micro-optimisations
in library code. So, for completeness, here’s the file
joined_output.hpp
reworked to show how joined_output()
can be
specialised according to iterator type. I prefer the earlier
and simpler version but I’m posting this version in case I ever need
to remember how to do this sort of thing again.
#if !defined JOINED_OUTPUT_HPP_INCLUDED #define JOINED_OUTPUT_HPP_INCLUDED #include <algorithm> #include <cassert> #include <iterator> #include <ostream> template <class Iterator> void joined_output(std::ostream & put_to, Iterator first, Iterator last, char const * join_with) { if (first != last) { joined_output(typename std::iterator_traits<Iterator>::iterator_category(), put_to, first, last, join_with); } } template <class BidirectionalIterator> void joined_output(std::bidirectional_iterator_tag, std::ostream & put_to, BidirectionalIterator first, BidirectionalIterator last, char const * join_with) { assert(first != last); typedef typename std::iterator_traits<BidirectionalIterator>::value_type V; std::copy(first, --last, std::ostream_iterator<V>(put_to, join_with)); put_to << *last; } template <class InputIterator> void joined_output(std::input_iterator_tag, std::ostream & put_to, InputIterator first, InputIterator last, char const * join_with) { assert(first != last); InputIterator curr = first++; while (first != last) { put_to << *curr << join_with; curr = first++; } put_to << *curr; } #endif // defined JOINED_OUTPUT_HPP_INCLUDED