Singly Linked Lists in C++
In a recent post I wrote about removing items from a singly linked list. I presented a couple of alternative implementations, and in the comments section readers suggested yet more versions.
My implementations were written in C: the post was inspired by something Linus Torvalds had said about low-level programming skills, and I’m guessing he meant C programming skills. The fact is, C programmers are condemned to reimplement these basic functions on this basic structure because the C standard library has nothing to say about singly linked lists. Until recently the C++ standard library was similarly silent on the subject, only offering a doubly linked list.
C++ introduces … the linked list!
That’s all changed now with the introduction of std::forward_list
. The class interface doesn’t mention links or pointers but a quick glance through its contents makes it clear that if you imagine the container to be a templated version of a classic singly-linked list, you won’t go far wrong.
This gives forward_list
some members you won’t find elsewhere in the std::
namespace. For example, std::forward_list::before_begin()
, which returns an iterator just before the beginning of the list — much as the more familiar end()
is just past the end.
Why is before_begin()
necessary? You can’t dereference it and you can’t reverse through a forward list till you get to it. Well, since forward list iterators can only go forwards, instead of the familiar sequence insert()
, erase()
and emplace()
methods we have insert_after()
, erase_after()
and emplace_after()
, not to mention splice_after()
. The before-the-beginning iterator allows you to use these operations to modify the node at the head of the list.
Quick question: what’s the complexity of std::list::size()
?
Trick question: and how about std::forward_list::size()
?
Remove_if for forward lists
Using pointers-to-pointers to modify linked lists gives this elegant and compact C implementation of remove_if()
, which de-lists all nodes which match a supplied predicate.
void remove_if(node ** head, remove_fn rm) { for (node** curr = head; *curr; ) { node * entry = *curr; if (rm(entry)) { *curr = entry->next; free(entry); } else curr = &entry->next; } }
How does the C++ standard library support this algorithm?
Std::remove_if()
operates on an iterator range, remove_if(first, last, pred)
. All it requires is that the iterators are forward iterators so it should just work on a forward_list
.
Hang on though: what if pred(*first)
is true? How can a node be removed from a linked list without reference to its predecessor? Actually, the node isn’t removed — the value it contains gets overwritten by the value in the first node (if any!) for which the predicate returns false. In fact, remove_if
doesn’t remove anything from the container! What it does is return an iterator, call it new_last
, such that the range (first, new_last)
holds the elements which have been kept, and (new_last, last)
holds those which have been “removed”, which is to say they can still be dereferenced but their value is implementation dependent.
Remove_if
usually pairs up with erase:
container.erase(remove_if(first, last, pred), last);
There is no std::forward_list::erase(iterator)
— remember, we can only erase after — so the usual remove_if algorithm is of little use for forward lists.
Forward_list::remove_if()
As usual, we should prefer member functions to algorithms with the same names. C++’s forward_list
has its very own remove_if
which manipulates pointers rather than moves values, and which really does remove and destroy items.
Forward_list::remove_if()
operates on the list as a whole, not an iterator range — as we’ve seen, an iterator cannot remove itself. I took a look at a couple of implementations of this function to see how it’s done.
Here’s LLVM’s libc++ implementation:
template <class _Tp, class _Alloc> template <class _Predicate> void forward_list<_Tp, _Alloc>::remove_if(_Predicate __pred) { iterator __e = end(); for (iterator __i = before_begin(); __i.__ptr_->__next_ != nullptr;) { if (__pred(__i.__ptr_->__next_->__value_)) { iterator __j = _VSTD::next(__i, 2); for (; __j != __e && __pred(*__j); ++__j) ; erase_after(__i, __j); if (__j == __e) break; __i = __j; } else ++__i; } }
There’s no need for any special treatment of the first list node here, since we have its predecessor, the before_begin()
node. The function does double back though, figuring out the next range to erase before going back to erase it — and the erase function isn’t pretty.
template <class _Tp, class _Alloc> typename forward_list<_Tp, _Alloc>::iterator forward_list<_Tp, _Alloc>::erase_after(const_iterator __f, const_iterator __l) { __node_pointer __e = const_cast<__node_pointer>(__l.__ptr_); if (__f != __l) { __node_pointer __p = const_cast<__node_pointer>(__f.__ptr_); __node_pointer __n = __p->__next_; if (__n != __e) { __p->__next_ = __e; __node_allocator& __a = base::__alloc(); do { __p = __n->__next_; __node_traits::destroy(__a, _VSTD::addressof(__n->__value_)); __node_traits::deallocate(__a, __n, 1); __n = __p; } while (__n != __e); } } return iterator(__e); }
For comparison, here’s how GCC’s libstdc++ does the same thing.
template<typename _Tp, typename _Alloc> template<typename _Pred> void forward_list<_Tp, _Alloc>:: remove_if(_Pred __pred) { _Node* __curr = static_cast<_Node*>(&this->_M_impl._M_head); while (_Node* __tmp = static_cast<_Node*>(__curr->_M_next)) { if (__pred(*__tmp->_M_valptr())) this->_M_erase_after(__curr); else __curr = static_cast<_Node*>(__curr->_M_next); } }
Here, erasing (after a) node reads:
template<typename _Tp, typename _Alloc> _Fwd_list_node_base* _Fwd_list_base<_Tp, _Alloc>:: _M_erase_after(_Fwd_list_node_base* __pos) { _Node* __curr = static_cast<_Node*>(__pos->_M_next); __pos->_M_next = __curr->_M_next; _Tp_alloc_type __a(_M_get_Node_allocator()); allocator_traits<_Tp_alloc_type>::destroy(__a, __curr->_M_valptr()); __curr->~_Node(); _M_put_node(__curr); return __pos->_M_next; }
This version walks through the list removing nodes which match the predicate as it finds them. Don’t be confused by &this->_M_impl._M_head
: it’s not the node at the head of the list, it’s the one before.
Lessons from C++
Maybe this code wouldn’t persaude Linus Torvalds to rethink his opinion of C++, but if you can see past the angle brackets, underscores and allocators, it’s simple enough.
It’s also:
- subtle, so I’m glad someone else has written and checked it
- generic, so I can put what I want in a list without casting or indirection
- standard, so I know what to expect
- supported
The before-begin node idea serves equally well in C, enabling list modifiers which have no need of double indirection or special case code for the list head.
void remove_after(node * prev, remove_fn rm) { while (prev->next != NULL) { node * curr = prev->next; if (rm(curr)) { prev->next = curr->next; free(curr); } else prev = curr; } }
Pass this function the before-begin node to remove all items from the list which match the predicate.