Case 26: How many hyphens?

5th December 2002

Filed by: Officer Taylor

The Offence

The Internet Engineering Task Force (IETF) does a fine job of producing RFCs - the specifications that prescribe the technical details of how the Internet works. These documents, unlike those produced by official standards bodies such as ISO and ANSI, are freely available to anyone who wants them.

Occasionally, though, the editorial process is not as thorough as it ought to be. This is from the specification of what makes a legal Internet host-name, in RFC 1034:

[The components of Internet host names] must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen.

The problem here is a small but critical one: the specification refers to ``letters'' (plural) and ``digits'' (plural), but ``hyphen'' (singular). Is it saying that you can only have one hyphen in a host-name component? It turns out that no-one really knows the answer to this: the legality or otherwise of host names such as shs-sales-mkting.co.uk is not really well established one way or the other.
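To make the two competing readings concrete, here is a minimal Python sketch. It is our own illustration, not anything taken from the RFC: the pattern and the function names (legal_many_hyphens, legal_at_most_one_hyphen) are hypothetical, and the pattern simply transcribes the quoted sentence (start with a letter, end with a letter or digit, interior of letters, digits and hyphens).

```python
import re

# One host-name component, transcribed from the quoted sentence:
# starts with a letter, ends with a letter or digit, and has only
# letters, digits and hyphens in between.
LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def legal_many_hyphens(label: str) -> bool:
    """Reading 1: "hyphen" names a kind of character; any number is fine."""
    return LABEL.fullmatch(label) is not None


def legal_at_most_one_hyphen(label: str) -> bool:
    """Reading 2: the singular is deliberate; one hyphen at most."""
    return legal_many_hyphens(label) and label.count("-") <= 1


# Check each component of the host name mentioned in the text.
for label in "shs-sales-mkting.co.uk".split("."):
    print(label, legal_many_hyphens(label), legal_at_most_one_hyphen(label))
```

Under the first reading every component of shs-sales-mkting.co.uk passes; under the second, shs-sales-mkting is rejected because it contains two hyphens.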

The Verdict

We are tempted to return a verdict of not guilty on the grounds of diminished responsibility - the people who write RFCs are not paid for their work, and have to find the time on evenings and weekends.

But, hey, badgers to that. We of the SAGP have never found anyone not guilty before, and we're not going to start now! Accordingly we find the accused guilty of unacceptable vagueness in what is supposed to be a formal specification.

The Sentence

We feel compelled to sentence the IETF harshly, not so much because the offence is particularly grave, but because its consequences are so wide-reaching. Because of a moment's carelessness on the part of the author, and a few more moments on the part of the reviewers, there are many thousands - maybe even millions - of computers on the Internet whose names may or may not be legal, and no-one knows for sure.

In fact, it's even worse than that: one perfectly reasonable reading of the specification we quoted would be that each Internet host-name component must consist of an arbitrary number of letters and digits, plus exactly one hyphen. In which case, this site would have to be renamed to something like sa-gp.mike-taylor.or-g.u-k. Not an appealing prospect. (Fortunately, no-one takes that interpretation seriously, otherwise every single .com address would have to change to either .c-om or .co-m.)
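For the sceptical, the "exactly one hyphen" reading can be sketched in a few lines of Python. As before, this is an illustration under our own assumptions: the names legal_exactly_one_hyphen and one_hyphen_renamings are hypothetical, and the pattern is only a transcription of the quoted sentence.

```python
import re

# One host-name component: starts with a letter, ends with a letter or
# digit, with only letters, digits and hyphens in between.
LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def legal_exactly_one_hyphen(label: str) -> bool:
    """The perverse reading: well-formed AND containing exactly one hyphen."""
    return LABEL.fullmatch(label) is not None and label.count("-") == 1


def one_hyphen_renamings(label: str) -> list[str]:
    """Every way to insert a single hyphen that makes the label legal."""
    candidates = [label[:i] + "-" + label[i:] for i in range(1, len(label))]
    return [c for c in candidates if legal_exactly_one_hyphen(c)]


print(legal_exactly_one_hyphen("com"))  # False
print(one_hyphen_renamings("com"))      # ['c-om', 'co-m']
```

The hyphen can only go in the interior, since a component may neither start nor end with one, which is why a three-letter name such as com has exactly two possible renamings.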

With all this in mind, we sentence the IETF, arbitrarily and vindictively, to a hundred and fifty years' hard labour. And let that be a lesson to the rest of you.

Next case!

Feedback to <mike@indexdata.com> is welcome!