## Group When

Phil Nash’s recent tweet intrigued me.

He later clarified what he was after — and had now found — linking to a solution posted a couple of years ago by Tomas Petricek. The function `groupWhen` splits a sequence into groups, starting a new group whenever the predicate returns true.

``` module Seq =
/// Iterates over elements of the input sequence and groups adjacent elements.
/// A new group is started when the specified predicate holds about the element
/// of the sequence (and at the beginning of the iteration).
///
/// For example:
///    Seq.groupWhen isOdd [3;3;2;4;1;2] = seq [; [3; 2; 4]; [1; 2]]
let groupWhen f (input:seq<_>) = seq {
use en = input.GetEnumerator()
let running = ref true

// Generate a group starting with the current element. Stops generating
// when it founds element such that 'f en.Current' is 'true'
let rec group() =
[ yield en.Current
if en.MoveNext() then
if not (f en.Current) then yield! group()
else running := false ]

if en.MoveNext() then
// While there are still elements, start a new group
while running.Value do
yield group() |> Seq.ofList }

```

Here’s a nice Haskell version coded up by @sdarlington.

Maybe takewhile and dropwhile could power a Python solution, but my first choice would be itertools.groupby. `Groupby` chops a sequence into subsequences, where the elements of each subsequence have the same key value. A suitable key function, in this case, must change its return value every time the sequence yields an element for which the predicate holds. It could toggle between a pair of values, for example. Or it could just count the number of times the predicate holds.

```class count_p:
''' Return a value which increments every time the predicate holds.
'''
def __init__(self, pred):
self._n = 0
self._pred = pred

def __call__(self, v):
self._n += self._pred(v)
return self._n

def group_when(pred, xs):
return (gp for _, gp in groupby(xs, count_p(pred)))

```

Here, `group_when` accepts an iterable and returns an iterable sequence of iterable groups. Clients choose how to consume the results.

```>>> def odd(v): return v % 2
>>> xs = group_when(odd, [3, 3, 2, 4, 1, 2])
>>> print([list(g) for g in xs])
[, [3, 2, 4], [1, 2]]

```

Note that `count_p` does something very like itertools.accumulate. Here’s another version of `group_when` which takes advantage of this.

```def group_when(pred, xs):
xs, ys = tee(xs)
accu = accumulate(map(pred, ys))
return (gp for _, gp in groupby(xs, lambda _: next(accu)))

```

§

After a short break, here’s a third version of `group_when`. This is the first time I’ve found a use for `takewhile` and `dropwhile`. Beware: as the teed streams `xs` and `ys` diverge, the amount of backing storage required will grow … only for the stored values to then be dropped!

```from itertools import *

def group_when(p, xs):
def notp(x): return not p(x)
xs = iter(xs)
while True:
x = next(xs)
xs, ys = tee(xs)
yield chain([x], takewhile(notp, xs))
xs = dropwhile(notp, ys)

def odd(x):
return x % 2

[list(g) for g in group_when(odd, [3, 3, 2, 4, 1, 2])] # [, [3, 2, 4], [1, 2]]

```