The Zune bug

I’m not sure if it’s really true, but it comes from a reputable source:

Cause for ZUNE leapyear problem

It’s kind of a funny bug, actually:


year = ORIGINYEAR; /* = 1980 */
while (days > 365) {
    if (IsLeapYear(year)) {
        if (days > 366) {
            days -= 366;
            year += 1;
        }
    } else {
        days -= 365;
        year += 1;
    }
}

I had heard that it had something to do with leap seconds, but it was actually a simple problem with leap years, and a very easy one to fix. On the last day of a leap year, days is exactly 366: the loop condition (days > 365) holds, but the inner check (days > 366) does not, so neither days nor year changes and the loop never ends. That also explains why Microsoft’s workaround was just to let the battery run out, wait until the day after the last day of the leap year, and turn the device on again (source): one day later, days is 367 and the loop terminates.
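For illustration, here is one way the loop could be rewritten so that it always terminates. This is only a sketch, not the fix Microsoft actually shipped. The wrapper function YearFromDays and the variable daysInYear are names I made up, the IsLeapYear body is my own stand-in for the one the original code calls, and I am assuming that days is a 1-based count starting at January 1 of ORIGINYEAR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ORIGINYEAR 1980

/* Stand-in for the IsLeapYear() the original snippet calls (assumption):
   the usual Gregorian rule. */
static bool IsLeapYear(int year) {
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Hypothetical wrapper around the loop. On entry *days is a 1-based day
   count starting at January 1 of ORIGINYEAR (assumption). Returns the
   year and leaves the 1-based day of that year in *days. */
static int YearFromDays(int *days) {
    assert(days != NULL && *days >= 1);

    int year = ORIGINYEAR;
    for (;;) {
        int daysInYear = IsLeapYear(year) ? 366 : 365;

        /* The remaining days fit in the current year, so stop. This is
           the exit the original loop lacks when days == 366 in a leap
           year. */
        if (*days <= daysInYear) {
            break;
        }

        /* Otherwise consume the whole year and move on. */
        *days -= daysInYear;
        year += 1;
    }
    return year;
}
```

Under those assumptions, December 31, 2008 is day 10593 (28 years of 365 days, plus 7 leap days, plus 366). The original loop spins forever on that input, while this version returns 2008 and leaves days at 366. The point of the design is that every pass through the loop either exits or consumes a full year, so there is no path that leaves both variables unchanged.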

I’m so glad I’m not working on software that gets deployed to devices in the wild, where you can’t just press a button (or maybe a couple of buttons) to deploy a fix. If something like this happened to any of my software, somebody would be paged, and we could find the bug, patch it, and deploy the fix in less than an hour (depending on whether higher-priority builds and deployments were going on at the same time). Poor Zune-ers.


2 Responses to “The Zune bug”


  1. 1 Brian January 5, 2009 at 3:39 pm

    unit test FAIL

    • 2 michelgoldstein January 5, 2009 at 6:59 pm

      Certainly it should have been unit tested, and a unit test could have caught it. The problem was that the person writing the tests didn’t think about the edge case of the last day of a leap year. If they had, they probably wouldn’t even have needed a unit test to realize the code was broken.

      The TDD camp would say that if you had only thought of the unit tests first, you probably would have written a last-day-of-leap-year test, but I still think it all depends on what the person had in mind as a possible edge case for this code. Anyway, I’m not trying to say that the engineers who wrote and tested this code are not to blame for this bug; I’m just saying that I can see how it could easily have been missed in the testing process. It would be even easier to miss if all the testing was done at the system level, where testing for a specific date is very painful and sometimes not even possible, unless you test for 4 years to cover every date in a 4-year leap year cycle.

      Oh, and I think this slashdot discussion is much better than my blog on this one (or on pretty much anything that makes it into slashdot):

      http://slashdot.org/article.pl?sid=09/01/04/2034248

