The Bug House


Last update: Tue, May 12, 2009



 



Bug Name: The First Recorded Computer Bug

Supplier: Grace Murray Hopper

Alias: "Bug Zero"

Product: Harvard Mark II Aiken Relay Calculator (Computer)        

Bug: [Photograph of the moth]

Bug first observed: 3:45 PM, September 9, 1947

Observer: Grace Murray Hopper

Fix: Removed moth from relay contacts

Current resting place of bug: Smithsonian Institution's National Museum of American History

Discussion: The above is a photograph of the very first recorded computer bug, discovered wedged in a relay of the Harvard Mark II computer, then under development, by the legendary Naval officer, "Amazing Grace" Murray Hopper. As you can see, it is quite dead, making it eligible for inclusion in our Hall of Fame.

Entomological etymology: Conventional wisdom holds that this moth was the genesis of the term, "bug." Evidence exists, however, that the term, "bug," was in common use among hardware developers at least as far back as Thomas Alva Edison and his researchers. From an 1878 Edison letter:

It has been just so in all of my inventions. The first step is an intuition, and comes with a burst, then difficulties arise—this thing gives out and [it is] then that "Bugs"—as such little faults and difficulties are called—show themselves and months of intense watching, study, and labor are requisite before commercial success or failure is certainly reached.

Thomas Edison to Tivadar Puskás, Nov 13, 1878, Edison papers, Edison National Lab, National Park Service, West Orange, N.J., quoted in Thomas P. Hughes, American Genesis: A History of the American Genius for Invention, Penguin, 1989, p 75.

From the Oxford English Dictionary:

n. A defect or fault in a machine, plan, or the like. orig. U.S. 1889 Pall Mall Gaz. 11 Mar. 1/1 Mr. Edison, I was informed, had been up the two previous nights discovering `a bug' in his phonograph--an expression for solving a difficulty, and implying that some imaginary insect has secreted itself inside and is causing all the trouble.

A look at the larger view of the log book just below reveals the line, "First actual case of bug being found," attributed to Rear Admiral Hopper. The word, "actual," would seem to imply the team's earlier informal use of the word to describe "virtual" bugs in the same manner we do today.

None of this takes away from their claim that, at least in the computer world, this was the "first actual case of bug being found."

Log book with entry memorializing Grace Murray Hopper's discovery of a "bug" (moth) in the Harvard Mark II Aiken Relay Calculator (Computer)

"Bug Zero" Log Book Page

Source: Department of the Navy, Naval Historical Center, Washington, DC

As a footnote, a friend of mine spent his early life as an appliance repairman. He was called upon to fix a washing machine whose mechanism made strange and disturbing squealing sounds whenever it was turned on. He was able to solve two mysteries at once when he discovered both the source of the unnerving screeching and the location of the neighbor's missing cat.

(The cat was fine except for a smear of grease and a crushing loss of dignity.)


Bug Name: Macintosh Disk Trash

Duration: Exactly 17 years, 2 months

Supplier: Apple Computer, Inc.

Alias: 

Product: Mac OS 1 through OS 9

Bug: To eject a floppy disk, users would drag the image of the disk into the trash.

Bug first observed: Around 1986 (originally listed as January 24, 1984; see Reader Response below)

Observer: Tog

Class of error: Establishing two different resulting behaviors for the same object

Principle: Use different objects, with different appearance, to support different resulting behaviors

Bug Fixed: March 24, 2001

Fix: 

Discussion: A real-world trash can is where you put things you never want to see again. Yes, you can normally fetch them out if you get to them quickly enough, but your initial purpose in putting them in there is to get rid of them forever.

A forgotten engineer at Apple decided the trash can on the Mac ought to serve two purposes. He created a new rule: sometimes you put something in the trash to signify you want to get rid of it, and sometimes you put it in the trash to signify you don't want to get rid of it.

What could be clearer?

Of course, the rule was a little more complex: If the object was a file, get rid of it, but if the object was a whole volume, then remove its image from the desktop and unmount it cleanly, so no harm befalls it.

You try warning new users to be very, very careful not to accidentally drop their one-page memo in the trash and destroy it, but at the same time insist that they drop their 100 gig external drive in that very same trash. Why? Because if they fail to drop their drive in the trash, the system might destroy it when they pull the plug!

Why was the bug so persistent? Files and volumes were different enough that the error rate stayed low, so Apple's engineering management never received enough flak to fix it. They further rationalized that the trash trick was actually a shortcut for power users. The long way around, for regular users, remained in the menu bar. The problem was that, in this case, the shortcut was so superior that no one ever used the other method.

Why so attractive? Fitts' Law: The larger the target object, the faster it can be acquired. Because the trash was a large object close to a corner, users could mouse to it with great speed. (The corner pins the mouse, so they can hit the corner as fast and hard as they want without overshooting. See my discussion of Fitts' Law for further details.)
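To put rough numbers on this: Fitts' Law is commonly written in its Shannon form as MT = a + b * log2(D/W + 1), where D is the distance to the target, W is the target's size along the line of travel, and a and b are constants fitted from experiments. The sketch below is only an illustration. The constants, the pixel sizes, and the choice to model the corner as a very wide target (because the screen edge stops the pointer) are all assumptions, not measurements from any Mac.

```python
import math


def fitts_time(distance: float, width: float, a: float = 0.1, b: float = 0.15) -> float:
    """Estimated movement time in seconds (Shannon form of Fitts' Law).

    a and b are placeholder constants chosen for illustration only;
    real values must be fitted from user testing.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1)


# Hypothetical targets, all 800 pixels away from the pointer.
small_icon = fitts_time(distance=800, width=16)     # small menu-style target
big_trash = fitts_time(distance=800, width=64)      # larger icon, mid-screen
corner_trash = fitts_time(distance=800, width=640)  # edge-pinned: huge effective width

for label, seconds in [("small icon", small_icon),
                       ("big trash", big_trash),
                       ("corner trash", corner_trash)]:
    print(f"{label}: {seconds:.2f} s")
```

Whatever constants are used, the ordering stays the same: the bigger the effective target, the shorter the estimated time, which is why a large object pinned near a corner is so quick to hit.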

Where people got into trouble: I've covered the first problem the bug caused. It was difficult for new users to make any sense of the trash's split personality.

A second problem plagued users new and old alike. In the early days of the Macintosh, we didn't have those fancy hard disks you kids have today. We had floppy disks, and floppies didn't hold a lot. Often, we would use a whole floppy to transport a single document between office and home. One disk, one document.

After working for several hours on said document, it was time to eject the disk by dragging it to the trash. Often disk and document would both be out on the desktop, both with similar names. It was really easy for even the most expert user to end up dragging the image of the document to the trash, instead of the disk.

Users usually quickly realized the problem when the disk was not immediately ejected—unless they got distracted by a phone call or something. Even then, at least in the early days, all would turn out OK. When they realized their error, they would open the trash, and there would be their precious work.

This was an outcome up with which the engineers would not put.

A new and "improved" trash can was introduced.
When empty, it was the familiar Macintosh trash.
            


But when it was full, magic happened:

The engineers just thought they were providing users with a "neato" way to tell an empty can from a full one, but to many users, the new appearance suggested a painfully distended belly. Millions of people developed the unnecessary and undesirable habit of immediately emptying the trash as soon as the swelling showed. Drop something in; empty the Trash. Drop something in; empty the Trash. It became unconscious habit.

Remember the fellow who dropped the document in the trash by accident? Now, when he saw the trash instantly swell up, he was likely to erase it forever (drop something in; empty the trash), relieving the trash of its apparent distress, even as another portion of his brain, in very real distress, was yelling, "No! No! No!"

Recipe for disaster: Take one scary bug, add a merely irritating bug, and stir. Result: A fatal bug.

Bug Fixed: March 24, 2001

Fix: System X has two distinct objects, a trash can and a disk ejector. They do not look anything alike.

The trash can looks like a photo of a real trash can.             
 

When it's full, the can itself looks almost the same. (It should be identical.) It has some neatly crumpled up paper in it, but no longer looks in distress.
 

The ejector object is an entirely different shape and carries a standardized symbol, instead of having a photorealistic appearance.

These two objects take up the same place on the screen, a neat trick, and one properly based on history.

How do you have two objects occupy the same place? It's easy, as long as they don't do it at the same time. Under System X, as soon as a user begins to drag a volume image, the trash can disappears, to be replaced by the eject object.

Since, at the time the switch takes place, the user is looking at the volume they are moving, not the trash can, many users are probably not consciously aware the switch is happening. The correct object is always there when they arrive. The machine tracks the user's mental mode.

Conversely, errors are immediately picked up because the desired object is not there when they arrive.
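The twin-object behavior described above can be sketched in a few lines. This is an illustrative model only, not Apple's code: the names Kind, Item, Target, target_shown_during_drag, and drop are all hypothetical. The point is that the kind of object being dragged selects which single target is on screen, so each target has exactly one resulting behavior.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Kind(Enum):
    FILE = auto()
    VOLUME = auto()


class Target(Enum):
    TRASH = auto()
    EJECT = auto()


@dataclass(frozen=True)
class Item:
    name: str
    kind: Kind


def target_shown_during_drag(dragged: Item) -> Target:
    """Pick the one object displayed at the drop location for this drag."""
    return Target.EJECT if dragged.kind is Kind.VOLUME else Target.TRASH


def drop(dragged: Item) -> str:
    """Each target does exactly one thing; no object has a split personality."""
    if target_shown_during_drag(dragged) is Target.EJECT:
        return f"unmount and eject {dragged.name}"
    return f"move {dragged.name} to trash"


print(drop(Item("Memo", Kind.FILE)))
print(drop(Item("Work Disk", Kind.VOLUME)))
```

A user who grabs the document instead of the disk arrives to find a trash can where they expected the ejector, which is the immediate error signal the old design lacked.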

Longest time from redesign to implementation: Among the records to which this bug might lay fair claim is the more than 16 years between the time the twin-object fix, as described above, was designed and specified and the time it was implemented. I know when it was designed and specified, because I was the one who did it, at Apple, back in 1985. It was satisfying to see it finally come to light in System X.

Not that I share any delusions that they found my design in a forgotten drawer and shouted, "Eureka!" No, they just came up with what seemed the most reasonable solution, sharing my keen sense of the obvious.

Sadly, at the same time as they fixed the disk trash bug, they trashed the primary advantage of the disk trash strategy to begin with. They relocated the trash to, well, a random location on the screen, by putting it in the Dock, a design bug unto itself, removing much of the Fitts' Law advantage and any possibility of motor memory in the process. (See my Top Nine Reasons the Dock Still Sucks.) Two steps forward, one step back.

Reader Response

Hi Tog!

While browsing your site, as I do every now and then, I came across the "Macintosh Disk Trash" bug. You say:

"A forgotten engineer at Apple decided the trash can on the Mac ought to serve two purposes. He created a new rule: sometimes you put something in the trash to signify you want to get rid of it, and sometimes you put it in the trash to signify you don't want to get rid of it."

If I might respectfully differ... my recollection of events is slightly different. (And I think it was Steve Capps who came up with it...)

In 1984 the Mac had only one floppy drive. Therefore, in System 1.0, it was impossible to eject and completely forget about a volume - but it was necessary to be able to copy stuff from one floppy to another. So, you had to select the floppy icon and then eject it from the menu, but a grayed-out version of the icon would remain. You could insert a second floppy but often had to reinsert the original immediately afterwards - the hated "floppy shuffle".

But once the shuffle was finished, you would proceed with the original floppy in place and would drag the grayed-out second icon to the Trash, to tell the system you wouldn't use that floppy anymore. So that's reasonably coherent.

The same applied for dual-drive systems. But somewhere before System 3.2 (I forget the exact version) when dual-drive systems were becoming more common, and the first hard drives began to appear, the select-eject-trash thing became too cumbersome and the natural way was to implement "drag icon to trash" as a shortcut for that - even though it was a strange way of overloading the Trash!

So many people started using that shortcut that the original way was forgotten and the grayed-out icons were de-implemented, so that icons vanished as soon as they were ejected. But everybody had gotten used to it. An interesting case of learning a non-logical thing by abbreviating a progression of logical steps!


So at the very least you might want to change the origin of this bug to a little later than "January 24, 1984"...

-Rainer Brockerhoff

Tog's Response

Rainer's memory exceeded my own. His analysis is excellent. I would only add that dragging the grayed-out image to the trash was just as bad an idea as dragging a non-grayed-out image. The model was that objects had a separate and distinct representation, a sort of ghostly shell that was left behind when the object was removed. That was a pretty complex concept in itself, one that left many users more than a little confused. Nonetheless, it was a requirement for a computer designed specifically to disallow the addition of external disk drives, hard or floppy. The problem arose in then treating this ghostly presence as an object in its own right, one that could be thrown in the trash. That just made people even more confused: how many objects were there, one or two? How could there be two when there was only one floppy being represented?

The whole trash thing was an expedient short-cut. It should have been implemented then as it finally was with the advent of System X, wherein the trash immediately changes to an eject icon as soon as the user begins to drag a volume object toward it. This was not difficult code to write; it was just easier to ship it as a hack.


Bug Name: Automobile Self-Destruct Switch

Supplier: Remco

Bug's Favorite saying: "Burn, Baby, Burn!"

Product: Remco Lube Pump for Lexus RX-300

Bug: The driver must accurately toggle a hidden, completely unlabelled switch inside the engine compartment in response to changing conditions. If, even once, the switch is forgotten or flipped the wrong way, it will destroy the $5000 engine and transmission within five minutes.

Class of error: Engineers assuming Mom and Dad have the same talent, training, and attention to mechanical detail as themselves.

Principle:  Human-machine interaction design should be performed by interaction designers.

Discussion: Was this some sort of dark nightmare that user interaction designers awaken from screaming? No, it was a real, aftermarket device for Lexus RX-300s being sold by a company in Florida called Remco.

Lexus had a problem with their RX-300 series transmissions: They had a tendency to go bad if you towed the car any distance. This might not seem like much of a problem to you, but, then, you don’t live in a 40-foot motorhome, travelling the country. To people like us who must drag a car after us, towability is everything.

Lexus didn’t know they had a problem when they released the car, so they had put a four-year warranty on the transmission, specifically certifying it for towing. When the claims started coming in, they downshifted, and, in 2002, pulled all warranty coverage entirely for towing their new vehicles.

That’s where Remco stepped in. They make well-reviewed aftermarket products to make untowable vehicles towable. In general, their products are well-engineered and built with solid, reliable components. Unfortunately, their design for the RX-300, while equally well-engineered, was a user-interaction disaster.

Remco pumps run transmission fluid through the transmission while the car is being towed, so the gears do not dry out, which Lexus had identified as the cause of the failures. A small fluid-shunt switch was used to switch between running the transmission fluid through its normal path or running it through the pump.

If the car were towed with the switch in the "drive" position, no lubrication would take place during the tow. If the car were driven with the switch in the "tow" position, no lubrication would take place because no fluid would be delivered to the transmission's own pump. Leave the switch in the wrong position, and you replaced the possibility of a transmission failure with a sure thing.

Few users would recognize the importance of taking it on themselves to address the problem by labelling the switch and methodically going through a self-prepared checklist each and every time. Even those who went that far might not recognize the importance of having their copilot (spouse) recheck their results. (That saved us once.) Because the switch was under the hood of the car, no casual glance would let you know you were facing disaster.

Proposed Specific Fix: Enough information, in this case, was readily available that the system itself could determine the correct position of the switch since the car was connected electrically to the RV during towing. Replacing the manual switch with a system-controlled solenoid switch would have solved the problem.
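As a rough sketch of what "system-controlled" could mean here: the controller reads signals the car already has and sets the solenoid itself, so no human ever touches a switch. Everything below is hypothetical. The signal names (tow_harness_powered, engine_running) and the rule that a running engine always wins are assumptions made for illustration, not Remco's or Lexus's design.

```python
from enum import Enum


class ValvePosition(Enum):
    DRIVE = "drive"  # fluid follows its normal path to the transmission's own pump
    TOW = "tow"      # fluid is routed through the aftermarket lube pump


def select_valve_position(tow_harness_powered: bool, engine_running: bool) -> ValvePosition:
    """Choose the solenoid position from sensed state instead of driver memory.

    Assumption: a running engine means the car is being driven (or is about
    to be), so the normal fluid path takes priority even if the tow harness
    is still plugged in.
    """
    if engine_running:
        return ValvePosition.DRIVE
    if tow_harness_powered:
        return ValvePosition.TOW
    return ValvePosition.DRIVE  # parked and unhooked: default to the normal path


# The controller would re-evaluate whenever either signal changes.
print(select_valve_position(tow_harness_powered=True, engine_running=False))
print(select_valve_position(tow_harness_powered=True, engine_running=True))
```

The design choice worth noting is that both wrong states from the original product (towing in "drive", driving in "tow") become unreachable rather than merely discouraged.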

General fix: Can you even imagine a luxury car company like Lexus installing as original equipment an undocumented, under-the-hood switch that must be flipped several times per month, depending on changing conditions, with failure resulting in immediate, widespread destruction? Of course not! Why? Because they have professional designers trained to understand the capabilities and foibles of human beings.

The Three Mile Island near-disaster was a result of just this sort of lack of understanding of humans and their propensity for error. The solution for the nuclear industry was not to "retrain" their engineers, but to hire hundreds of human factors professionals.

99% of systems engineers have no more business designing the human interface than I do working on the Linux kernel. However, they are not the real problem, because they don't want to design the human interface to begin with. The engineering managers are at fault for forcing them to do so. The managers need to start hiring interaction designers before the disasters happen.

For a terrifying, yet amusing, blow-by-blow account of one user's misadventures with the Remco pump, read my "Anatomy of a Panic."

Bug on list since: 1 Feb 2005


Bug Name: Crash of European Space Agency Flight 501: Ariane 5 Rocket

Duration: 36.7 seconds

Supplier: European Space Agency (Contractor Not Identified)

Alias: The most expensive bug resulting from one line of code

Product: Ariane 5 Rocket

Bug: Computer program trying to stuff a 64-bit floating-point number into a 16-bit signed integer.

Bug first observed: 36.7 seconds after launch, June 4, 1996

Submitted by: Terrence Martineau

Class of error: When you ASS-U-ME... you can make an ASS out of U and the European Space Agency, burn up $370 million in equipment, and jeopardize 10 years and $7 billion worth of work.

Principles:

  1. Systems, no matter how mundane they may seem, must fail gracefully.
  2. Turn off unnecessary systems when not needed.
  3. Realistic modelling and testing should be done of any system before it is released.

Discussion:

It turned out the one line of code was actually unnecessary.
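For readers who want to see the failure mode in miniature, here is a sketch of what can happen when a 64-bit floating-point value is forced into a 16-bit signed integer. It is an illustration in Python, not the flight software: the function names and the sample value are made up, and the "saturating" version is just one possible way to fail gracefully, not the fix that was actually adopted.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_wrapping(value: float) -> int:
    """Keep only the low 16 bits, as an unchecked conversion would: garbage out."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def to_int16_checked(value: float) -> int:
    """Raise on overflow. Safe only if some caller actually handles the error."""
    if not INT16_MIN <= value <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return int(value)


def to_int16_saturating(value: float) -> int:
    """Clamp to the nearest representable value (NaN is not handled in this sketch)."""
    if value > INT16_MAX:
        return INT16_MAX
    if value < INT16_MIN:
        return INT16_MIN
    return int(value)


reading = 40000.0  # hypothetical sensor-derived value, larger than 16 bits can hold

wrapped = to_int16_wrapping(reading)    # -25536: silently wrong
clamped = to_int16_saturating(reading)  # 32767: wrong, but bounded and predictable
try:
    to_int16_checked(reading)
except OverflowError as err:
    # An error nobody catches is not graceful failure; something must decide what to do.
    print("conversion failed:", err)
```

None of the three options is free. Wrapping hides the problem, checking moves it to whoever must handle the error, and clamping trades accuracy for predictability. The first principle above amounts to making that choice deliberately.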

For a complex explanation in engineer-speak: http://en.wikipedia.org/wiki/Ariane_5_Flight_501

For an extremely lucid explanation by James Gleick: http://www.around.com/ariane.html


You may submit candidates for this list. Bugs must have already been exterminated and be widely known or have particular historical significance. Send your candidates using the general format that follows. Do not worry about extracting the Class and Principle.

Bug Name:

Duration:  

Supplier: 

Alias: 

Product: 

Bug: 

Bug first observed: 

Observer: [your name here]

Class of error: 

Principle: 

Bug Fixed: 

Fix: 

Discussion: 


---
 
Contact Us:  Bruce Tognazzini
 
Copyright Bruce Tognazzini.  All Rights Reserved