Someone jokingly asked me how I would sort a million 32-bit integers in 2 megabytes of RAM, using Python [...] obviously this is a joke question -- the data alone would take up 4 megabytes, assuming binary encoding
This question seems to come up a lot at Google - didn't their CEO ask it of Obama and McCain when they visited?
I like that instead of going into a complicated solution to a tricky problem, GvR shows the very simple and general way to solve a slightly easier version more people are likely to encounter. Unfortunately I was sort of hoping he would take the Programming Pearls approach instead of treating it as a joke. Bently showed several ways to do this kind of constrained memory sorting given differing assumptions, but perhaps that kind of thing just isn't interesting for the places Python is used.
I wrote this algorithm to solve a problem for myself one night about 10 years ago (agree - not a very tough problem, but very useful if you don't have enough memory). I told a client about it and he tried to patent it! I never checked to see if it went through...
Well, it was a complicated data format for many 10s of GB of seismic data on machines that probably had 32MB or 64MB of memory. The partial sort and merge did the trick for me.
Stephenson came to do a reading in Cambridge, MA a few weeks ago, of a couple amusing conversations. I haven't read the rest of it yet, but he writes fantastic dialog.
It's not a terribly hard exercise, I'd expect that almost any programmer could do the math and figure out that a solution requires files. After that I'd expect a candidate to at least sketch out a logical solution.
It's cool that it's Guido, though. And that it's a complete solution.
The point here is to show off (among other things) the buffered I/O in Py3k. It's much easier (i.e. shorter, more readable) in Py3k to implement an efficient solution than in earlier Python versions. If it were a really hard problem, it would be harder to see the benefits of the new language features.
This question seems to come up a lot at Google - didn't their CEO ask it of Obama and McCain when they visited?
I like that instead of going into a complicated solution to a tricky problem, GvR shows the very simple and general way to solve a slightly easier version more people are likely to encounter. Unfortunately I was sort of hoping he would take the Programming Pearls approach instead of treating it as a joke. Bently showed several ways to do this kind of constrained memory sorting given differing assumptions, but perhaps that kind of thing just isn't interesting for the places Python is used.