I wish I could find a dedicated feed for Computerworld’s “A to Z of Programming Languages” series — I’d subscribe, they’re great reading. (My second wish is for online magazines to find a better way of generating revenue than all those noisy adverts.) Subscribed or not, I found and read the recent interview with Larry Wall about Perl. Perl 6 gets its first mention in the Q and A on the second page.
Would you have done anything differently in the development of Perl if you had the chance?
Either nothing or everything. See Perl 6.
Except you can’t really see Perl 6 yet. This ambitious new version of the language is, Larry Wall says, scheduled for release on Christmas Day. When pressed further on progress, he adds
We’re certainly well into the second 80 percent.
A metric all software developers can relate to!
As a consequence of its protracted emergence, some of Perl 6’s best features have been backported to Perl 5.10. Larry Wall, again:
One of the most popular things is the use of “say” instead of “print”.
I don’t follow Perl closely enough to know if
say is two characters fewer to type, it’s chatty, and I’m all for variety.
Perl 6 may have slipped another Christmas, but notably, soberly, sensibly, Python recently hit a milestone in its own ambitious trajectory. Python 3.0 (final) was released on December 3rd 2008. Python 3.0 is, I think, the first version of the language which breaks backwards compatibility: so, for example, a Python 2.2 program should work unchanged in Python 2.6, but a Python 2.6 program is unlikely to work in Python 3.0.
It’s a bold move, and one which has taken a lot of smart people a lot of hard work. For many others the hard work has just begun: forking the language marks the start, not the end, of a period of transition.
As Perl 5.10 anticipates Perl 6, so Python 2.6 anticipates Python 3.0. Some features, such as binary literals, have been backported from 3.0; the -3 flag warns about Python 3.0 incompatibilities in 2.6 code; and a new tool, 2to3, converts 2.6 code into 3.0 code.
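As a small illustration of the backported features, here is a sketch of binary literals, which behave the same way in 2.6 and 3.0. The variable names are made up for the example.

```python
# Binary literals and the bin() builtin arrived in Python 2.6 and carry over to 3.0.
mask = 0b101010            # binary literal
as_text = bin(mask)        # integer back to its binary string form
parsed = int("101010", 2)  # int() with an explicit base, for comparison
```

To check an existing script, `python2.6 -3 script.py` prints warnings for constructs that change in 3.0, and `2to3 script.py` shows the proposed conversion as a diff (script.py is a placeholder name).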
Despite looking forwards in this way, Python 2.6 is unlikely to mark the end of the Python 2.N line, and even for new users on greenfield projects Python 3.0 may not be a wise choice. For one thing it’s new, whereas (e.g.) 2.5 is battle-hardened; for another, many popular third-party libraries and frameworks have yet to be released against 3.0. Although the standard documentation for Python 3.0 is complete, the “current documentation” linked to from the Python homepage resolves to Python 2.6.1, and that’s where you’re likely to find yourself if you e.g. google for help on a particular Python topic, or click a link from an online article. If you’re after a book on Python, the choice for Python 3.0 is limited.
As with Perl 6,
$ python3.0 -c "say = print"
$ python2.6 -c "say = print"
  File "<string>", line 1
    say = print
              ^
SyntaxError: invalid syntax
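The session above works in 3.0 because print is an ordinary function there, so it can be bound to another name like any other object. A minimal sketch, written for Python 3 (the buffer is only there to capture the output for inspection):

```python
import io

# In Python 3, print is a function, so giving it a second name is plain assignment.
say = print

buffer = io.StringIO()
say("Happy new year!", file=buffer)  # say accepts the same keyword arguments as print
greeting = buffer.getvalue()
```

Python 2.6 can opt in to the same behaviour by putting `from __future__ import print_function` at the top of a module.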
Here’s a blot on Python 2.6 and its predecessors: the range() builtin function returns a complete list, even if you only want to consume its elements one at a time. xrange(), which generates numbers on demand, is more efficient and generally what’s needed. Similarly in 2.6 filter() returns a complete list; for elements on demand, use itertools.ifilter(). And Python 2.6 provides both lazy and complete ways to access keys, values and items in a dict.
Note the redundancy here: range() is equivalent to list(xrange()).
Python 3.0 simplifies things, letting range() do what xrange() used to, and eliminating the awkwardly named alternatives. dict.iteritems() etc. have gone; thus dict.items() is lazy, and if you need the complete list of all (key, value) pairs in a dict, list(dict.items()) does the job.
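A short sketch of the Python 3 behaviour described above. The data and variable names are invented for illustration.

```python
# Python 3: the on-demand forms are the defaults.
numbers = range(10**12)          # no trillion-element list is built
first_three = list(numbers[:3])  # a range can be sliced; list() materialises on request

evens = filter(lambda n: n % 2 == 0, range(10))  # an iterator, not a list
evens_list = list(evens)

ages = {"ada": 36, "alan": 41}
items_view = ages.items()        # a view onto the dict, not a copied list
ages["grace"] = 85               # the view reflects the later insertion
pairs = sorted(items_view)       # build a full list of (key, value) pairs when needed
```

The pattern is the same throughout: the builtin hands back something cheap, and wrapping it in list() or sorted() is an explicit request for the complete collection.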
These changes add little to the power of the language, and may even seem to wilfully break backwards compatibility. For me, they’re about consistency, and reducing interfaces to a minimal complete set. I applaud them.
☀ ☁ ☂ ☃
More significantly, Python 3.0 builds in proper support for Unicode, or at least the basis for proper Unicode support. The problem here being, Unicode is necessarily complex — as any system which encompasses so many subtle cultural differences must be — and however cleverly Python has adapted to the challenge, some of this complexity must rise to the surface of Python 3.0 programs.
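One place where the complexity surfaces is the split between text and encoded bytes. A minimal Python 3 sketch, with illustrative names and UTF-8 chosen as an assumption for the example:

```python
text = "caf\u00e9"               # str: four Unicode code points, spelling "café"
data = text.encode("utf-8")      # bytes: the same word, encoded
char_count = len(text)
byte_count = len(data)           # the accented letter takes two bytes in UTF-8
round_trip = data.decode("utf-8")

# Mixing the two types is an error rather than a silent conversion.
try:
    text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```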
Is this complexity really essential? Could a modern language reasonably ignore Unicode, or delegate its support to a standard library? Has Python become less attractive to learners and novices? When Paul Graham launched a new lisp dialect, Arc, at the start of 2008, he noted:
I went to a talk last summer by Guido van Rossum about Python, and he seemed to have spent most of the preceding year switching from one representation of characters to another. I never want to blow a year dealing with characters. Why did Guido have to? Because he had to think about compatibility. But though it seems benevolent to worry about breaking existing code, ultimately there’s a cost: it means you spend a year dealing with character sets instead of making the language more powerful.
Which is why, incidentally, Arc only supports Ascii. MzScheme, which the current version of Arc compiles to, has some more advanced plan for dealing with characters. But it would probably have taken me a couple days to figure out how to interact with it, and I don’t want to spend even one day dealing with character sets.
Sad to say, it would take me more than a couple of days to figure out MzScheme’s advanced character plan, so I’m not qualified to comment on Paul Graham’s decision. Many others did, at the time, and if you follow the link in the blockquote above, you’ll find a few words of explanation which I’ll paraphrase here: Arc is not about the details of character sets, it’s a high-level language, for writing short programs.
I class Python as a high-level language too, and regard its power and accessibility as the source of its popularity. Python is also a mainstream language and one increasingly used at the heart of internationalised applications. I agree with James Bennett: Unicode support is fundamental and necessary.
Anyone who’s visited Word Aligned before will know that most of the example code here is in Python. I’m aware that on several occasions I’ve waved away Unicode issues (an anagram solver which fails to identify “face” as an anagram of “café”, for example).
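For the curious, here is one way such a check could be made accent-insensitive. This is a hypothetical sketch, not the anagram solver from the earlier article: the function names are invented, and it assumes that stripping combining marks is the behaviour wanted.

```python
import unicodedata

def base_letters(word):
    """Return the sorted lower-case letters of word with accents removed."""
    decomposed = unicodedata.normalize("NFD", word)  # split "é" into "e" plus a combining accent
    return sorted(c.lower() for c in decomposed if not unicodedata.combining(c))

def is_anagram(first, second):
    return base_letters(first) == base_letters(second)

naive = sorted("face") == sorted("caf\u00e9")  # plain comparison treats "é" as a different letter
relaxed = is_anagram("face", "caf\u00e9")
```

Whether "é" should count as "e" is itself a cultural judgement, which is the kind of decision Unicode forces into the open.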
Like Paul Graham, I can justify my decision. I want the code presented on this site to work, but not just so you can cut and paste it. I’m not a library provider. I use Python here primarily because it’s succinct and accessible. I want you to read it! Sometimes blurring the distinction between characters and bytes makes for short and sweet examples.
If I switch to Python 3.0, will I still be able to cut these corners? Or will my code become more fiddly because it must deal more explicitly with character encoding issues? The truth is, I don’t know yet, I’ve only written one Python 3.0 program.
At work, our choice is obvious. We shall continue to use Python 2.5 for the immediate future. There’s no advantage to switching even a point revision at this stage in the project I’m working on, and the third-party code we depend on has yet to be released against Python 3.0.
On this website, I can take a complementary and more forward-looking approach. Anything published in 2008 and before should work with Python 2.6. Anything published in 2009 and beyond should work with Python 3.0.
print('Happy new year!')