Eurovision 2008 charts

2008-05-29

Last time I visited the Google chart API I discovered maps had been added. Since then I’ve been itching to use them. Eurovision 2008 got me scratching that itch.

Votes for Serbia

As ever the API is a delight to use, if somewhat restricted. You’re limited to selecting a geographical area from a small set, but happily this set includes Europe. Chart data is supplied as a string of concatenated ISO 3166-1 alpha-2 country codes (e.g. “TV” for “Tuvalu”), and the chart value for each data point (i.e. country) maps to a colour gradient. Your palette is limited: water masses get a fill colour; a pair of colours provides the gradient used for countries in the chart data string; and omitted countries get a default colour.
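
As a rough sketch of how those pieces fit into a request, the snippet below assembles a map URL by hand. The parameter names (cht, chtm, chld, chd, chco, chf, chs) are the ones used by the script later in this article; the function name europe_map_url, the colours and the three sample countries are made up for illustration.

```python
def europe_map_url(codes, values, default='FFCC00', lo='FFFFFF',
                   hi='000066', sea='00FFCC', size=(250, 125)):
    '''Build a chart URL for a map of Europe (illustrative helper).

    codes  -- list of ISO 3166-1 alpha-2 country codes
    values -- one simple-encoding character per code
    '''
    if len(codes) != len(values):
        raise ValueError('need exactly one value per country code')
    params = [
        'cht=t',                              # map chart
        'chtm=europe',                        # geographical area
        'chld=' + ''.join(codes),             # concatenated country codes
        'chd=s:' + ''.join(values),           # one value per country
        'chco=%s,%s,%s' % (default, lo, hi),  # default colour, then gradient
        'chf=bg,s,' + sea,                    # fill colour for water
        'chs=%dx%d' % size,                   # width x height in pixels
    ]
    return 'http://chart.apis.google.com/chart?' + '&'.join(params)

url = europe_map_url(['FR', 'DE', 'ES'], ['A', 'e', '9'])
print(url)
```

Countries left out of the codes list fall back to the default colour, which is the property the script relies on to highlight the country being voted for.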

To depict Eurovision results I chose suitably lurid colours. The results for Serbia, who hosted the event in 2008, are shown at the top of this page. Serbia appears in yellowy-orange, and the mauvey-bluey colours show who voted for Serbia — the darker the shade, the higher the vote[1].
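
The shading depends on how each vote is scaled onto the chart's value range. Here is a minimal sketch of that scaling, assuming the "simple" text encoding the script uses (62 symbols: A-Z, a-z, 0-9). The name encode_votes and the sample votes are hypothetical.

```python
import string

# The 62 symbols of the simple encoding: 'A' is the lowest value, '9' the highest.
SIMPLE = string.ascii_uppercase + string.ascii_lowercase + string.digits

def encode_votes(votes, hi_score):
    '''Scale each vote onto 0..61 and return the encoded string.'''
    top = len(SIMPLE) - 1
    return ''.join(SIMPLE[top * v // hi_score] for v in votes)

encoded = encode_votes([0, 6, 12], 12)
print(encoded)
```

A zero vote lands on the light end of the gradient and the top vote on the dark end, with everything else spread in between.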

The Maps

Here are the maps in order of Eurovision 2008 scores. Hover your mouse over them to see the country and the score. If you’re wondering why you can’t see Israel, it’s because Israel isn’t in Europe.

Votes for Russia
Votes for Ukraine
Votes for Greece
Votes for Armenia
Votes for Norway
Votes for Serbia
Votes for Turkey
Votes for Azerbaijan
Votes for Israel
Votes for Bosnia and Herzegovina
Votes for Latvia
Votes for Georgia
Votes for Portugal
Votes for Iceland
Votes for Denmark
Votes for Albania
Votes for Spain
Votes for Sweden
Votes for France
Votes for Romania
Votes for Croatia
Votes for Finland
Votes for United Kingdom
Votes for Germany
Votes for Poland

The Script

Here’s the script I used to generate the pictures. It’s hacky, sub-optimal and packed with workarounds to get the job done. As Mark Dominus puts it:

Not everything we do is a brilliant, diamond-like jewel, polished to a luminous gloss with pages torn from one of Donald Knuth’s books.

I do think the code also shows how adept Python is at working with XML (XHTML in this case) and text processing, and why every programmer should know at least one scripting language.

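As input it needs a saved copy of the Wikipedia page for the Eurovision Song Contest 2008 and the ISO 3166-1 country code list (list-en1-semic-2.txt). The URLs for both are given at the bottom of the script.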

Eurovision charts
'''Generate Eurovision charts using Google chart API.

Method:
* get a mapping between country names and country codes
* find the final results table in the Wikipedia Eurovision page
* extract voting information from this table
* munge all this information into google chart URLs
'''
import string
import xml.dom

def internal_co_name(co_name):
    '''Convert a country name into a form used internally.
    
    We remove punctuation and convert to uppercase. This helps map
    between the official country names and the less standard ones used
    on the Wikipedia page.
    '''
    ascii = string.ascii_letters
    return ''.join(c.upper() for c in co_name if c in ascii)

def country_codes_dict(iso_3166_fp):
    '''Return a dict mapping country names to 3166-1-alpha-2 codes.
    
    We get these from the ISO website in an ISO-8859-1 encoded text file
    which contains a header followed by records of the form
    AFGHANISTAN;AF
    '''
    import re
    import codecs
    ccode_match = re.compile(r"^([^;]+);(\w\w)", re.UNICODE).match
    lines = codecs.iterdecode(iso_3166_fp, "iso-8859-1")
    co_codes = dict((internal_co_name(m.group(1)), m.group(2))
                    for m in map(ccode_match, lines) if m)
    # Add some shortened forms
    co_codes.update(dict(MACEDONIA="MK", BOSNIA="BA", RUSSIA="RU", MOLDOVA="MD"))
    # Hack! Serbia should be RS but, as at 2008-05-28, the Google
    # chart API seems to want the country code for the former Serbia
    # and Montenegro.  This line can be removed once the chart API
    # matches its documentation.
    co_codes["SERBIA"] = "CS"
    return co_codes

def tree_walk(node):
    '''Recursively walk the nodes in a tree.'''
    yield node
    for node1 in node.childNodes:
        for node2 in tree_walk(node1):
            yield node2

def tree_find(root, pred):
    '''Return the first node for which the predicate holds, or None.''' 
    for node in tree_walk(root):
        if pred(node):
            return node

def next_sibling_find(node, pred):
    '''Return the first next-sibling node for which the predicate holds.'''
    while node.nextSibling:
        node = node.nextSibling
        if pred(node):
            return node

def children(node, name):
    '''Return (tag-)named child elements of a node.'''
    return node.getElementsByTagName(name)

def is_text(node):
    return node.nodeType == xml.dom.Node.TEXT_NODE

def is_element(node):
    return node.nodeType == xml.dom.Node.ELEMENT_NODE

def text(node):
    '''Return text from a node of the general form <td><b>12</b></td> or <td>0</td>'''
    while not is_text(node) and node.childNodes:
        node = node.childNodes[0]
    return node.data if is_text(node) else None

def country_votes(tr):
    '''Convert a row from the results table.
    
    Returns country name, score and votes for that country.
    '''
    td = children(tr, 'td')
    co_name, score = text(td[0]), int(text(td[1]))
    votes = [int(text(cell)) for cell in td[2:] if text(cell)]
    return co_name, score, votes

def results(wiki_table, co_code_dict):
    '''Extract the results from the Wikipedia results table.
    
    Returns the country codes for the column headings and a list of votes.
    '''
    import operator
    second = operator.itemgetter(1)
    trs = children(wiki_table, 'tr')
    ths = children(trs[1], 'th')[1:]
    # The title of each column header 'a' element looks like "ESCFranceJ.svg"
    # The [3:-5] slice converts this to "France", and we then look up
    # the 3166 alpha-2 code
    ICN = internal_co_name
    cols = [co_code_dict[ICN(children(th, 'a')[0].getAttribute('title')[3:-5])]
            for th in ths]
    votes = sorted((country_votes(tr) for tr in trs[2:]), 
                   key=second, reverse=True)
    return cols, votes

def results_table(wiki_page):
    '''Return the results table from the Wikipedia Eurovision results.
    
    By inspection, this is the first table after the "Final_2" node.
    '''
    import xml.dom.minidom
    def final_2(n):
        return is_element(n) and n.getAttribute('id') == 'Final_2'
    def htm_table(n):
        return is_element(n) and n.tagName == 'table'
    doc = xml.dom.minidom.parse(wiki_page)
    node = tree_find(children(doc, 'body')[0], final_2)
    return next_sibling_find(node.parentNode, htm_table)

def eurovision_vote_map(co_name, co_codes, scores, hi_score, missing):
    '''Return the URL of a map showing Eurovision votes for a country.
    '''
    # Use simple text encoding
    simple = string.ascii_uppercase + string.ascii_lowercase + string.digits
    simple_hi_ix = len(simple) - 1
    mapurl = (
        'http://chart.apis.google.com/chart?'
        'cht=t&chtm=europe&'  # Map of Europe
        'chld=%(countries)s&' # String of country codes
        'chd=s:%(values)s&'   # Values for these countries, simple encoding
        'chco=%(def_colour)s,%(lo_colour)s,%(hi_colour)s&'
        'chf=bg,s,%(sea_colour)s&'
        'chs=%(width)dx%(height)d')
    # Use a hack here to highlight the country being voted for.  Don't
    # include this country in the chart data, then it will get the
    # default colour. Assign all missing entries and zero scoring
    # entries the 'lo_colour'.
    omit = internal_co_name(co_name)
    values = ''.join(simple[simple_hi_ix * score // hi_score] for
                     score in scores) + 'A' * (len(missing)//2)
    countries = ''.join(c for c in co_codes if c != omit) + missing
    width, height = 250, 125 # The maximum map size is 440, 220
    def_colour, sea_colour = 'FFCC00', '00FFCC'
    lo_colour, hi_colour = 'FFFFFF', '000066'  
    return mapurl % locals()

def get_map_urls(wiki_results_fp, ccodes_fp):
    '''Return the URLs for Eurovision results charts, ordered by score.
    '''
    wiki_table = results_table(wiki_results_fp)
    co_codes = country_codes_dict(ccodes_fp)
    cols, votes = results(wiki_table, co_codes)
    # We only really need the countries which 1) weren't part of
    # Eurovision and 2) which appear in the Google chart of Europe, but
    # I don't have a definitive list of these. So just treat every
    # country in the world not in Eurovision as missing.
    missing = ''.join(set(co_codes.values()) - set(cols))
    hi_score = max(v for _, _, vv  in votes for v in vv)
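    # Pass the country's 2-letter code rather than its name, so that
    # eurovision_vote_map can match it against the codes in cols and
    # leave it out of the chart data.
    return "\n".join(eurovision_vote_map(co_codes[internal_co_name(co_name)],
                                         cols, vv, hi_score, missing)
                    for co_name, score, vv in votes)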

if __name__ == '__main__':
    # Download from: 
    # http://en.wikipedia.org/wiki/Eurovision_Song_Contest_2008
    # http://www.iso.org/iso/list-en1-semic-2.txt
    # (or use urllib.urlopen on these urls).
    print get_map_urls(open('Eurovision_Song_Contest_2008'),
                       open('list-en1-semic-2.txt'))
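
To see the DOM-walking part in isolation, here is a cut-down sketch that runs the same idea over a tiny hand-written table rather than the Wikipedia page. The vote cells and the cell_text and row_votes names are invented for the example, and it is written for Python 3, whereas the script above targets Python 2.

```python
import xml.dom.minidom

# A made-up fragment shaped like the results table: name, score, then votes.
XHTML = '''<table>
<tr><td>Russia</td><td>272</td><td><b>12</b></td><td></td><td>8</td></tr>
<tr><td>Ukraine</td><td>230</td><td>10</td><td><b>12</b></td><td></td></tr>
</table>'''

def cell_text(node):
    '''Return the text inside <td><b>12</b></td> or <td>0</td>, else None.'''
    while node.nodeType != node.TEXT_NODE and node.childNodes:
        node = node.childNodes[0]
    return node.data if node.nodeType == node.TEXT_NODE else None

def row_votes(tr):
    '''Return (country, score, votes), skipping empty vote cells.'''
    tds = tr.getElementsByTagName('td')
    votes = [int(cell_text(td)) for td in tds[2:] if cell_text(td)]
    return cell_text(tds[0]), int(cell_text(tds[1])), votes

doc = xml.dom.minidom.parseString(XHTML)
rows = [row_votes(tr) for tr in doc.getElementsByTagName('tr')]
print(rows)
```

Each row ends up with one vote fewer than there are vote columns, because the empty cell is dropped.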

Serbia and Montenegro?

My program produced bizarre output at first: something was wrong with Serbia, and I assumed I’d made an off-by-one error. As I write this (2008-05-29) there seems to be a problem with the way the Google chart API handles the country code for Serbia, “RS”. That’s why I’ve substituted “CS”, the old code for Serbia and Montenegro, a country which ceased to exist in 2006. I’ll have to adjust the code as the situation (both political and Googley) develops, since the maps which appear on this page are served live by Google.
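
The workaround amounts to a one-entry override on the name-to-code lookup. Here is a simplified sketch, assuming records of the form NAME;CODE as in the ISO list. The sample records and the names normalise and load_codes are illustrative, and the override only reflects the API behaviour observed on the date given above.

```python
import re
import string

RECORD = re.compile(r'^([^;]+);(\w\w)$')

# Stand-in for the ISO file: a header line followed by NAME;CODE records.
SAMPLE = ['Header line to be ignored', 'SERBIA;RS', 'TUVALU;TV',
          'BOSNIA AND HERZEGOVINA;BA']

def normalise(name):
    '''Keep ASCII letters only, upper-cased, so name variants compare equal.'''
    return ''.join(c.upper() for c in name if c in string.ascii_letters)

def load_codes(lines, overrides=None):
    '''Map normalised country names to 2-letter codes, then apply overrides.'''
    codes = {}
    for line in lines:
        m = RECORD.match(line.strip())
        if m:  # lines that are not NAME;CODE records are skipped
            codes[normalise(m.group(1))] = m.group(2)
    codes.update(overrides or {})
    return codes

# Assumption: the chart API still expects the old Serbia and Montenegro code.
codes = load_codes(SAMPLE, overrides={'SERBIA': 'CS'})
print(codes['SERBIA'], codes['BOSNIAANDHERZEGOVINA'])
```

Keeping the override in one place means it can be deleted in a single step once the API accepts “RS”.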

Terry Wogan

I’m not going to draw any conclusions from these charts except to say that I never thought I’d mention Wogan on this site, let alone link to the Telegraph. Eurovision voting is clearly a sensitive topic. I recommend a visit to this Andy Brice article, where he uses software similar to his wedding table planner to render some more detailed voting maps. The analogy of all these countries having to sit next to each other for a musical event, like grumpy in-laws at a wedding, made me laugh.

Andy Brice’s pictures were produced with C++ and Qt following the same QA procedure that I used with the code for this article:

I wrote some throwaway code to generate these images in C++ and Qt over a few hours on a wet bank holiday Sunday. QA amounted to ‘that looks about right’.


[1] Yes, I am colour-blind!