Y2K Et Al

Saturday, March 20, 1999

Introduction

This is written for a US audience and presumes an understanding of the North American dialing and US Postal conventions.

Y2K dominates the news. But it's only one of many related issues that can generate a ripple effect of problems infiltrating everyday life. Should we split area codes? What happens when the post office changes zip codes? Can we anticipate the consequences, and are we ready to deal with them?

I just visited a small company's web site that had a Y2K statement explaining that dates between 80 and 99 would be treated as being in the 1900's and that numbers less than 80 would be interpreted as being in the next millennium. They added humorously that they won't need to worry about this so-called solution until 2080. But what about old orders on file from before 1980? Or subtle problems like sorting files, some of which may be dated before 1980?
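The windowing rule that statement describes can be sketched in a few lines of Python. This is only an illustration, assuming a pivot of 80; the function name and the sample orders are hypothetical and not taken from the company's software.

```python
PIVOT = 80  # assumption: the pivot described in the company's statement


def expand_year(two_digit_year, pivot=PIVOT):
    """Expand a two-digit year using a fixed window around the pivot."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("expected a two-digit year")
    century = 1900 if two_digit_year >= pivot else 2000
    return century + two_digit_year


# Hypothetical orders really placed in 1975, 1999 and 2001,
# but stored with only two digits.
stored_years = [75, 99, 1]
expanded = [expand_year(y) for y in stored_years]

print(expanded)          # [2075, 1999, 2001]
print(sorted(expanded))  # [1999, 2001, 2075]
```

The 1975 order comes back as 2075 and sorts after everything else, which is exactly the kind of subtle problem the window leaves behind.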

I visited the Sony site. Unlike the smaller software company, Sony has vast resources and a well-maintained site. But there was no sign of the splits in the 508 and 617 area codes. No sign of the local zip code changes from 021xx to 024xx. In fact, whenever I go to a site that asks for my zip code, I usually have to give the old zip codes in order to find the information I need.

It is too much to expect every site to anticipate all possible changes. It is not always clear what the appropriate action is. Does one update the area codes in an historic database of old business cards to keep them current or does one preserve them as archived records?

But these aren't only problems with computer systems. Have you ever been excited by a poster for an event only to find out that the event it advertised took place last year? Or have you looked at your address book and not known whether an area code is still valid because you don't know when you made the entry or when the area codes were split?

The reason for using two-digit years in computers is not just that it takes less space; it's that we have taken a familiar representation and moved it into a new domain without rethinking it.

The Role of Context

Computers have forced us to examine how we handle and represent information.

We live in a very complex and ever-changing world. We can cope with the complexity by focusing on a small amount of what we experience. We don't need to understand the physics of the floor to know that it will support us or even that there is a floor. We just concentrate on getting to our destination.

When we ask what time it is, often it is sufficient to say "a quarter to." This makes a lot more sense than having to say "13:45 on the 23rd day of March in the year 1999 of the Common Era (or Anno Domini) according to the Gregorian calendar in Eastern Standard Time." But if we want to remember an appointment, we need to record more information. If it is soon, it might be just the day of the week, or maybe the year, month and day if it is further into the future.

We don't always get this right. A note that says "Joe called at 11:45" won't be meaningful if it isn't read till the next day. Or if we fail to think about someone reading our announcement a year later, the announcement will become wrong when interpreted in the new context. If we agree to talk at 9:00, is that in my time zone or yours?

Stale information is only an occasional problem with paper-based systems because old information simply fades away. But the Web is rife with stale information or information that has lost its context. A meeting announcement with just the day of the week and a time might seem fine when viewed in the context of a calendar but baffle someone who goes directly to a web page after finding it in a search engine.

Time is not the only source of confusion. Is that price in Dollars or Euros, with or without tax, is it yesterday's price or today's? Did I get the phone number before or after the area code change? As long as old information generally fades away and we don't encounter these confusions too often, we can laugh or grumble and move on.

Information

But we are increasingly storing information in databases. We are now posting information to the Web where it doesn't fade away. We are using phone numbers as identifiers. We keep addresses in databases, both personal and corporate.

More important, we are relying on idiot savants, our computers, to interpret the information. And few of us understand how to represent information in this new environment.

Representing information is a fundamental communications skill. Just as we must learn to avoid idioms when speaking to people who do not share our culture, we need to learn how to represent information so that it can be used by computers and so that it makes sense when viewed in other contexts and times. It's OK for me to type 01 and assume my computer is smart enough to translate it into 2001. But when it is stored in a database, the "01" loses its context. It must be stored as 2001.

Our public policy must also reflect the persistence of information:

  • Area code splits are no longer tolerable. Because phone numbers are stored in too many places, we can't change old ones. Instead of splitting area codes, we must add new ones.
  • Zip codes cannot change. Not only are they stored in many places, but changes happen so rarely that we don't have the tools to change all the personal address book databases.

Computers not only force us to reconsider our practices, they also give us the tools to implement new policies. The dialing pattern and zip codes are more than numbers; they encode policies. The area code and exchange tell the phone network how to route a call. The first three digits of the zip code represent the postal distribution center. If we build a new postal center, we must change the zip code. Or, at least, we used to.

As we've shown with 800 numbers, we no longer interpret the number itself for routing (e.g., 800-273 meaning that the call is directed to New York City). The number is now used to look up the route in a database. Without such flexibility, any change to an 800 number due to moving or changing carriers would require notifying everyone who had a copy of the number so that they could update their own databases. Just as we now do with area code splits!
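The difference between interpreting a number and looking it up can be sketched as follows. This is a toy model, not a description of how the phone network is implemented; the number, the carrier names and the `route_call` function are all made up for illustration.

```python
# Hypothetical routing table: toll-free number -> current destination.
# Both the number and the destinations are invented for this sketch.
routes = {
    "+1-800-555-0100": "carrier-A/new-york",
}


def route_call(number):
    """Look the number up instead of reading meaning out of its digits."""
    return routes[number]


before = route_call("+1-800-555-0100")

# The business moves and changes carriers. Only the table changes;
# every printed or stored copy of the number stays valid.
routes["+1-800-555-0100"] = "carrier-B/boston"
after = route_call("+1-800-555-0100")

print(before)  # carrier-A/new-york
print(after)   # carrier-B/boston
```

Because the number is only a key, the policy it used to encode lives in one place and can change without touching anyone's address book.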

Representation

The issues of Y2K, area codes and zip codes are fairly straightforward. It is easy to understand that we need to represent them in ways that allow them to be stored and then viewed in new contexts.

It's also obvious that dates should be represented in full form, including the full four-digit year and time zone. But the issue gets more complicated as we think more about it. In a calendar, I just write the day by itself. But if I expand that date, I need to show the full expanded date since it can be viewed outside the month table. It's fine for my email reader to show me a friendly date as long as it stores the date in full form. But if I forward a meeting notice that has been localized for my reading, the recipient won't know how to interpret the date. Leap seconds are another issue which I've written about separately.
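A minimal sketch of the distinction between the stored form and the friendly form, using Python's standard datetime module. The meeting time is hypothetical, and US Eastern time is modeled as a fixed UTC-5 offset, which is an assumption; a real system would consult a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Standard Time modeled as a fixed UTC-5 offset.
EST = timezone(timedelta(hours=-5), "EST")

# Hypothetical meeting time.
meeting = datetime(1999, 3, 23, 13, 45, tzinfo=EST)

# Full form: four-digit year, month, day, time and offset.
stored = meeting.isoformat()

# Friendly form: fine for display, useless once it loses its context.
friendly = meeting.strftime("%H:%M")

# The full form survives a round trip; the friendly form could not.
restored = datetime.fromisoformat(stored)

print(stored)    # 1999-03-23T13:45:00-05:00
print(friendly)  # 13:45
```

Forwarding `stored` lets the recipient's software localize the time again; forwarding `friendly` passes along only what made sense on the sender's screen.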

In fact, the issues of representation are very complex and beyond the modest scope of this essay. The examples of Y2K, area codes, and zip codes are fairly simple. Seeing them as issues of representation can provide a guide to setting policies for area code splits and related issues.

These concepts of representation also extend to problem solving: we should think of solving a problem as finding a representation that makes it sufficiently simple. Copernicus didn't prove that the Earth revolved around the Sun. He showed that modeling the solar system with the Sun at the center was a simpler way to explain his observations than the Ptolemaic system of epicycles.

Understanding representation is part of functional literacy in a society in which we are all storing information in databases and publishing it on the Internet.

The issue itself is not new; it provides a way to explain what we already do. When we give directions, what information is important to the task and what is not? When do we need to show all the twists and turns and when can we just give a schematic map? How do we handle ambiguous information? In an address book, when "John Smith, VP Sales" changes, do we want to update the title to track John Smith or do we want to change the name to track the VP of Sales?

Computers have given us new tools for solving problems. But the larger importance is in giving us tools for thinking. And they force us to be explicit about issues such as representation.

As we move more of our information into databases and use computers to handle it, we need to be aware of the fact that the information will be received in a different context and possibly in a significantly different time-frame than originally intended. A memo about the Y2K problem dated 3/1/99 highlights this. Not only is the year two digits but 3/1 could be March 1st in the United States or January 3rd in Europe.
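The ambiguity of that memo date is easy to demonstrate. In this sketch the same string is parsed under both conventions with Python's `datetime.strptime`; note that its `%y` directive applies a window of its own, mapping 69 through 99 to the 1900s and 00 through 68 to the 2000s.

```python
from datetime import datetime

memo_date = "3/1/99"

# Same characters, two conventions.
us_reading = datetime.strptime(memo_date, "%m/%d/%y").date()
european_reading = datetime.strptime(memo_date, "%d/%m/%y").date()

print(us_reading.isoformat())        # 1999-03-01
print(european_reading.isoformat())  # 1999-01-03
```

Neither reading is wrong given its context. Only the full form, such as 1999-03-01, carries its interpretation with it.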

Y2K, Area Codes, and Zip Codes (Re)examined

As we think about representations, it's worth revisiting familiar problems from a larger perspective.

Y2K: This issue is already receiving a lot of attention, but little understanding. The two-digit date field is a consequence of using our casual practices when designing software. The space savings rewarded this approach. The good news is that we are used to recovering from such confusion, even if this might be on a large scale.

While we are thinking about dates, we also need to remember that dates are written as m/d/y (Month/Day/Year) in the United States and as d/m/y in other parts of the world.

Area Codes: We must stop splitting area codes. Phone numbers are stored in too many places. The Web makes the issue obvious, but we also have school alumni lists and other printed material that grows stale as the world changes around it.

Omitting the area code is a convenient shorthand, just like omitting the 19 in writing the year. But as more of our calls go beyond the local area served by the 7-digit phone number, it will be easier to dial a full number than to remember the special case of local calls. Numbers are often written without the "1-" prefix since, in early systems, it was often not required or even permitted. Taking advantage of the fact that it is the same as the international prefix, it would be best to simply write the number as "+1-aaa-xxx-nnnn" so that it is valid internationally. The "+" means that the next number is a country code. By "coincidence", the "toll" prefix and the North American country code are the same.
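Rewriting stored numbers into the "+1-aaa-xxx-nnnn" form could look something like the sketch below. The function name and its `default_area_code` parameter are hypothetical, and the code assumes every input is a North American number.

```python
import re


def to_international(number, default_area_code=None):
    """Return a North American number written as +1-aaa-xxx-nnnn."""
    digits = re.sub(r"\D", "", number)  # keep only the digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the "1" toll prefix / country code
    if len(digits) == 7:
        # A local number is meaningless without its context.
        if default_area_code is None:
            raise ValueError("7-digit number needs an area code from context")
        digits = default_area_code + digits
    if len(digits) != 10:
        raise ValueError(f"cannot interpret {number!r}")
    return f"+1-{digits[:3]}-{digits[3:6]}-{digits[6:]}"


print(to_international("1 (617) 555-1212"))  # +1-617-555-1212
print(to_international("555-1212", "617"))   # +1-617-555-1212
```

The 7-digit case shows the problem in miniature: the program has to be told the area code, because the shorthand dropped the one piece of context needed to read the number elsewhere.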

Once we start rethinking the phone number, we can address the issue of dialing people rather than phones. Why does a household with 10 people have a single phone number? Why not assign 10 different numbers or allow additional digits? An email address serves a similar purpose as an identifier for reaching a person's mailbox. The two naming systems will merge. (See my essay on email addressing for a more complete discussion on email.) The terms "number" and "dial" are misleading relics of mechanical switches and rotary dials.

Given these changes, the frequent splitting of area codes seems like a futile, but annoying, response to the larger trends.

Zip Codes: There is an even stronger reason to stop reassigning them. We can't expect full support for such a rare event. While they may have started out as tools for the Post Office, zip codes are now owned by society at large.

Encoding routing information in zip codes is an example of a common practice of encoding information in identifying numbers. The area-code/exchange of a phone number identifies the location of a phone. Item number "NY-242K" might be in the New York warehouse, with the "K" indicating that it is a kit. Such encoding was very useful before computers and remains a convenient way to make identifiers meaningful to people, but it makes these systems harder to change.

People have gotten used to assuming similar zip codes are located near each other even though it is often not true. Zip codes have also become keys in geographic databases. Whatever the original intent of the number scheme was, it is important to be aware of the consequences of a change. In computing, we've learned how to maintain compatibility with existing applications, whether or not they conform to the official rules. The Post Office must recognize that its numbering system is not just for mail delivery.

Letter to the Boston Globe

The pragmatic purpose of this essay is to educate those making near-term decisions about whether to split the area codes in Boston rather than create overlays. Towards this end:

With all the focus on Y2K and, occasionally, area code splitting, it is helpful to step back and understand that these are examples of the larger issues of how we represent information.

The Y2K problem is the consequence of carrying over into software the normal practice of omitting the "19" when we write the year. People are used to interpreting dates in a context (such as "now") that computers don't have. We see a similar problem when we write "3/1/99" and then publish it to the international audience on the Web. Most of the world will see this as a date in January.

The practice of omitting the area code to make dialing easier is very similar. It was very helpful in the days of mechanical phones when few calls were "long distance". As more, if not most, of our calls require an area code, this "savings" now becomes the burden of having the dialing rules change as we cross the boundaries between Wellesley, Newton, Waltham and Weston.

Much more important is the fact that phone numbers are stored in many address books and documents. It is difficult to find them all and change them for every split. The Web is now rife with stale area codes. If I see a 617-890 number, is it an old Waltham number or a new Boston number? There is no way to tell.

The zip code change (021 to 024) is even more problematic. Very few databases have been updated with the new zip codes. As with Y2K, it's no surprise that we are not prepared for rare events. Y2K is beneficial as a reminder to design systems to be resilient in the face of change.

We must recognize that the consequence of changing these identifiers goes well beyond printing new stationery. We need a simple and stable phone number such as "+1-617-555-1212". The plus means that this number is valid anywhere in the world.

For a more philosophical discussion, see http://www.frankston.com/public/essays/Y2KEtAl.asp.

Bob Frankston Site