Big City Skyline Puzzle

2007-10-01, , , Comments

There’s a relatively short supply of computer science puzzles[1] and many new ones simply re-spin the classics — once you’ve removed the packaging they come down to the same old binary search, quick sort, bit-vector, …

So I was interested to find one which was new to me, referenced in this post on the official Google blog.

Little big Google

Actually, Google supply a couple of puzzles. The subtext of both is that Google is a big company working on big problems, but hey!, a childish sense of wonder, curiosity and competition are still encouraged. The first puzzle takes you to a new job in Big City, where the beautiful skyline is filled with millions of impossibly tall thin buildings. The second expands your scope (perhaps due to over-crowding in BC?) — now you’re at Google Moon, searching nothing less than the Universe for nothing more than stock quotes.

I’m not going to spoil either puzzle. I won’t talk about the lunar one since I haven’t had time to look at it yet, except to say that it looks like a variant on the traveling salesman problem (which won’t make it any easier to solve).

Big City Skyline

The Big City Skyline puzzle requires careful budgeting of time and space. I don’t think it’s giving much away to say that a quadratic algorithm won’t cut it. An O(N log N) solution will, though, provided each stage completes in roughly a microsecond — not unreasonable on a 2GHz machine. The 512MB space limit doesn’t look too bad since the largest numbers involved fit comfortably into 8 bytes: all you have to do is choose a reasonably compact container and avoid replicating the inputs.

As usual I put together some unit tests[2] and a first implementation using Python.

Wasting Resources

As a rule of thumb, I reckon Python to be an order of magnitude more wasteful of CPU cycles and memory than my favourite low-level language, C++.

What do I mean by an order of magnitude? Well, it’s a factor somewhere between 2 and 10. The problem is, I really have no better understanding or control than that. Sometimes it’s 2, sometimes it’s 10, sometimes it’s somewhere in between. I have looked at the CPython implementation, but not closely enough to understand how much memory a list of N large integers requires, nor how long it takes to iterate through such a list. The language reference provides few guarantees.

This wastefulness doesn’t just mean a Python program consumes proportionately more resources. It also means there will be a point at which a Python program fails because it exhausts resources and the machine starts to thrash — and at this point:

  1. big-O complexity analysis ceases to be of any practical value
  2. you’ll need a higher-spec machine
  3. or a leaner program

The Big City Skyline puzzle specifies a target platform (a platform similar to the one I used for the exercise), ruling out option 2.

As you may have guessed, my first Python solution bumped into my machine’s limits well before I’d reached the “hard” values of N — a disappointment, though perhaps not a huge surprise. Actually, either a pass or a fail would have been a surpise: I had no way of knowing beforehand. Python does ship with an [array module][array module] which I’m guessing might have done the job — had sizeof(int) been 8 on my platform that is, which it isn’t. There’s also a [numpy module][numpy module] which provides far more, but which I’ve never had cause to use yet.

Accounting for Resources

I’ve never had cause to use numpy because I’m proficient in C and C++. One great thing about C, and to a lesser extent, C++, is that it’s easy to predict how a simple program will consume resources — especially if, as in this case, you can use standard vectors and algorithms. I reworked my Python program into C++, checked it passed the tests. It then crunched through a random Big City skyline in well under 30 seconds. The important point is that this was not a surprise. I knew how much memory would be needed and (roughly) how much time would be taken.

I was pleased how clean the C++ solution was. It reminded me that on a good day, the C++ standard library allows you to write code which is clear, concise and efficient.


[1]

You’ll find a decent list of these puzzles and an interesting meta-discussion over at Coding Horror (look in the comments too, where you’ll find a broken solution to the Google Skyline puzzle).

[2]

Unit tests and some timing tests for the skyline puzzle are available via anonymous subversion access at: http://svn.wordaligned.org/svn/etc/skyline