Archived News for July 2007 - Something Emporium

Linux Killer?

So Tycho said this was just about the most fascinating thing he'd read lately, and I have to agree.

Jack Thompson would both love and hate Reiser; he's a case study for both sides of the argument that violent videogames create violent people. I'd definitely be interested in seeing where the kid, Rory Reiser ends up in 10 years.

I'm at work and it is boring me stupid.

Planet Blogging

If anyone is blogging elsewhere I could aggregate their posts into SE, if they'd like. A bit like Planet (example: Debian Planet). I'd just need an RSS feed of your posts (or a URL if you don't know what RSS is).

New insights on poverty and life around the world

I feel I've been far too sceptical in tone for the last few weeks. So, I wanted to post something amazing and good. Here's Hans Rosling, a Swedish professor, with an amazing 20 minute presentation.

The software is nice - and a webapp (holy crap). But I think it's the data itself, and the funny and intriguing running commentary from Hans, that really makes the presentation. I also like the way he pronounces Chile.

Don't miss the last three minutes, but don't skip ahead to watch them either. You'll wreck the bathos.

Eagle vs. Shark

Pingback

I think I'm almost finished an implementation of a conforming Pingback client, using PEAR's XML_RPC2 classes. I'm just having a little trouble testing.

The specification recommends pinging its url, which gives tracking.damowmow.com as the pingback server address. But pinging that gives a 500 error. And I can't tell if the message you get while making a normal HTTP request to it is a joke!

There's also the main page for the domain, which is just as unhelpful.

So here are some random URLs I'm going to ping. If you're pinged by this post, well, thanks for helping me out:

Greg Reinacker's weblog - works!
Ryan's journal - ~~doesn't!~~ Yay, works!
Sam Ruby's interwingly - stuff!

Ctaatgt

Gattaca is a fantastic movie. It had been on my 'to watch' list for years, until I finally watched it last night.

It has Uma Thurman in it. I was waiting for her to flip out and go on a crazed killing spree. Apparently the future will be just like the present, only cars will have green headlights.

Smoking pays a fairly big part in the proceedings, which I guess symbolises that these guys aren't perfect - one of them is Sky Captain in a wheelchair (action figure available from Mattel). Apparently, Titan is just like tobacco smoke in a red wine glass. Sounds awesome.

The film, released in 1997 and set in the 'not too distant future', mentions laws that ban discrimination on the basis of genome. In April this year, such laws were actually passed. The Genetic Information Nondiscrimination Act makes genoism illegal in the US. Holy crap! I guess that means that, in twenty years time, all hallways will be as funky as this one:

My father was right. It didn't matter how much I lied on my resume. My real resume was in my cells. Why should anybody invest all that money to train me when there were a thousand other applicants with a far cleaner profile? Of course, it's illegal to discriminate, 'genoism' it's called. But no one takes the law seriously. If you refuse to disclose, they can always take a sample from a door handle or a handshake, even the saliva on your application form. If in doubt, a legal drug test can just as easily become an illegal peek at your future in the company.

Skiing

I'm away skiing until Saturday evening. Hopefully the bad weather will have moved far enough south for me to get some runs in tomorrow. Wish me luck.

Edit: We had awesome weather every day. Very cold (ice on the inside of the cabin), but sunny and clear.

SJD - Songs from a Dictaphone

I got my copy of SJD's new album, Songs from a Dictaphone, today. The CD itself looks like a reel to reel tape.

I got all excited checking the album out with cdparanoia. It gave twelve tracks, one more than the listed eleven. Ooh! Turns out there's just thirteen seconds of (absolute) silence before the first track. Damn. I was hoping for something cool hidden there.

Theme? Oh, yes. This from the track listing:

Jesus
I Am the Radio
Lucifer

And yeah, track six fits as you'd think. The lyrics (included in the inset - great): "God's guitar is on the radio. He plays the bass on just one string oh." and so on. There's also a bit in common between the last couple of songs ("Forty flights up, with the travelling blues", "Falling through the air till they hit the ground").

I don't know if I like the transition to SJD being a full, consistent band in recording. I'm sure it makes a lot of sense for touring - no problem with that. And these songs are much more suited for a band than any of SJD's previous repertoire. The idea still makes me nervous though.

I guess it's the music that matters though. I love the gospel touches here and there. Holy hell the last song is good.

I'm a sucker for lyrics, and this album has some awesome, awesome ones. The density is incredible: a stanza is repeated no more than twice, most of the songs have no repeated lyrics at all. Some fragments:

Lucifer, I've been so fucking bad,
And you're the best excuse I've ever had.

Just a note
To say hello
While I watch
The garden grow
And here's a weed
And here's a thorn
And here's a welcome
I've outworn

Just spilling ink
Upon a page
Green for envy
Red for rage
Blue is something
I can't hide
And black is just
Too beautiful to keep inside

North Western motorway is quiet tonight
A blue green flicker from the windows
Of every house that we pass by

Flight of the Conchords

Flight of the Conchords has been getting better after the first couple of shows. I've still only heard a handful of original songs, though - the majority are taken from their live shows. EZTV have picked them up, for those who are interested.

Is it because of my mould farm? Is it because of aspergillus fillius? Is it something from a while ago? From when we're at school, and you said you got a hickey from Judy Bailey, and I told everyone it was from a vacuum cleaner? It it because I drank all the orange juice? Is it because I drank all the apple juice? Is it something subtle? Is it because I eat too loudly?

It's business, it's business time!

Business hours are over!

New Zealand - Why Not?

I can't help but post this one. Yep, more Flight of the Conchords caps. Again, their funniest show yet. With an original song (about how Brett has it going on). Let's review:

Brett has bulimia. And an eye-patch.

Holy hell. John Hodgman! (he's a PC)

And then, in order, we have:

Jemaine Clement as (1972) Ziggy Stardust Bowie!

Jemaine Clement as (1980) Ashes to Ashes music video Bowie!

And, best of all

Jemaine Clement as (1986) Labyrinth Bowie!

All followed by a massive space opera music video.

Cyclic Linked Lists and Racing Pointers

It's common for programmers to be given interview questions about programming. Usually, these questions involve implementing quick, small algorithms - for example, you might be asked to write a function that reverses its (string) input.

Today, I heard an interesting question to do with linked lists. A linked list is a data structure consisting of a number of nodes, each of which has (in this case) a single link to another node. Mathematicians: think 'directed graph'.

Here's an example of a simple linked list:

           +------+
head ----> | Node |
           |------|
           | +data|       +------+
           | +next| ----> | Node |
           +------+       |------|
                          | +data|
                          | +next| ----> NULL
                          +------+

To traverse the list, we'd start at 'head', a pointer to the first node. We'd then grab successive nodes, by using the 'next' property of whatever node we're up to. Finally, we'd reach NULL, the end of our list.

The advantage of a linked list is that it's quick (O(1)) to add items to (and remove them from) one end of the list, which makes the linked list very helpful for implementing a stack. For those interested, retrieving a particular node from a linked list is a O(n) task, because the list must be traversed until the node is found.

Anyway, there's a class of linked list we can call cyclic. A perfectly cyclic list is one where the nth node's 'next' pointer points to the same place as 'head':

               +--------------------+
               |                    |
               v                    |
           +------+                 |
head ----> | Node |                 |
           |------|                 |
           | +data|       +------+  |
           | +next| ----> | Node |  |
           +------+       |------|  |
                          | +data|  |
                          | +next| -+
                          +------+

But, consider that a cyclic list need not be perfectly cyclic. Instead, the cycle could involve any subset of the list. Here's an example of a very small cycle:

                  +---+
                  v   |
-> [ ] -> [ ] -> [ ] -+

And a larger one:

           +-----------------+
           v                 |
-> [ ] -> [ ] -> [ ] -> [ ] -+

The question was: write a (good) algorithm that checks if a linked list is cyclic. The algorithm should be optimised for memory usage.

At first, that doesn't seem too hard. Here's a naïve solution:

Solution 1: For each node, if the node's next pointer is NULL, then the list is not cyclic.

This idea is perfectly correct, and we'll use it in all our solutions; if we ever come across a NULL pointer, our job is done.

But there's a major problem with this idea: we can't implement it as an algorithm to find cyclic lists. To see why, consider that, in a cyclic list, we don't know when we've reached the end of the list and started looping back round. Which is a major problem: if we can't tell when we've finished traversing the list, and we don't come across a NULL, this decision method won't finish.

If our linked list is perfectly cyclic then it's not a worry. We simply need to check if we've reached 'head' - if we have, we've checked all the nodes without coming across a NULL, and the list is not (perfectly) cyclic. But consider a non-perfect cyclic linked list, like the one I gave above. In this case, we can only tell if we've looped back upon ourselves by checking against every node we've visited. And that brings us to:

Solution 2: For each node, store the node's location. As we traverse the list looking for a NULL pointer, compare the current node with the list of previously visited locations. If the current node is there, we've finished traversing the list, and it is cyclic.

This is a working solution. The problem is that it is inefficient, both in terms of space (memory) and time (CPU cycles).

To see why it's slow, consider a list of a hundred items. As well as traversing the list (an O(n) operation), you'd be doing a number of comparisons (0 < x <= n) at each node. This works out to be something like n² (= 10,000) operations.

More pertinently, we're storing a pointer to each node in memory. So, we're using n * [the size of a pointer] units of space. The question didn't specify CPU time as a constraint. So, let's try optimising the memory we're using.

Solution 3: For each node, iterate from the head node to the current node, checking if the node points to any of the nodes between the two.

This solution uses two pointers that keep track of where in the list we are up to. The first, the 'outside' pointer, just goes through the list. The second, the 'inside' pointer, checks that none of the nodes between the start of the list and the 'outside' node are pointed to by the 'outside' node.

So, in effect, this algorithm targets the right-most node in the diagrams above - the node with the long pointer. Here's another way to think about it: instead of storing the nodes visited, this method uses the fact that the list has already stored them. We only need remember our position.

While this algorithm is very efficient in memory, it's about as slow as solution 2. The number of comparisons done at worst case is something like n² / 2 (triangular), which means it's still O(n²).

This is a correct answer to the question. It's as efficient in memory as you can get. But it's a little unsatisfying. There is another solution - I found it fairly remarkable. It uses the same amount of memory, and is O(n):

Solution 4: Start with two pointers to the first element of the list. Now, move the first pointer one node ahead of its current position. Move the second pointer two nodes ahead of its current position. If the first pointer equals the second, the list is cyclic. Repeat.

This algorithm is, to my mind, not very intuitive. I call it the racing pointers method. It works because:

The second pointer is going faster than the first, so
It will reach the cyclic portion of the list before the first pointer.
The algorithm will keep running, however, until the first pointer eventually enters the cyclic portion.
At which time, the second pointer will loop back, behind the first
And approach it. Then:
- If the second pointer is one node behind the first pointer, they will meet at the next movement.
- If the second pointer is two nodes behind the first pointer, they will meet in two movements.
- And so on.

If the list is perfectly cyclic, then the faster pointer will catch the slower one exactly half way through the list. I think the worst performance you could wring out of the algorithm would be n. I think it's fairly clear that the fast pointer can be at most min(size of cycle, n/2) nodes behind the slow pointer.

In any case, I think it's a beautiful method, because it's so different from the others.

Skiing

I'm back at the snow until Sunday evening, from tomorrow afternoon. I'll see if I can take some cellphone footage or something this time, although you really look like a tosser skiing with your phone out.