views:

5481

answers:

38

Is there a good list of "worst software project failures ever" in the history of software development?

For example in Canada a "gun registry" project spent around two billion dollars.

This is, of course, insane, even if the final product "sort of worked".

I have heard of an FBI Case file system which there have been several attempts to rewrite, all of them so far, failures.

There is a book on the subject (Software Runaways). There doesn't seem to be a software "boondoggle" list or "fiasco" list on Wikipedia that I can see.

(Update: On grounds of human sympathy, Therac-25 would be the 'winner' of this question. However, I was thinking more of software projects whose main deliverable was software, as opposed to firmware projects like Therac-25, where the hardware and firmware together are capable of killing people. The question was also intended to address boondoggles and bureaucratic failures rather than tragic explosions or deaths. In terms of pure software monetary debacles, which was my intended question, there are several contenders, and an interesting future community wiki would be "what are the common traits among large software project failures with budgets over $100 million US".)

+6  A: 

Windows Vista comes to mind

Cody C
can throw in Windows ME for that matter
Cody C
+1 Vista cost $10B, by one estimate. http://seattletimes.nwsource.com/html/businesstechnology/2003460386_btview04.html
Bill Karwin
Anyone remember Microsoft BOB?
RobH
To be fair, most of Vista was used as the basis of Win7, which looks to be fairly successful.
Wahnfrieden
@Wahnfrieden: And Vista's server version, Server 2008, seems to have been good, from everything I've heard. I'd call it a series of UI problems, which have likely been fixed in 7.
David Thornley
Vista had far more than just UI problems at launch. People have just forgotten that since a lot of the bad stuff was fixed in patches.
Wahnfrieden
I've had some funky issues with Server 2008 configuration screens not doing what they say they're going to do, but once it's running, it seems solid.
Brian Knoblauch
+12  A: 

The worst one is when you are responsible for it.

z-boss
I know a satellite engineer who secretly hopes the rocket will blow up so he won't have to worry that his little fiddly bit on the satellite will fail.
Nosredna
@Nosredna: That sounds like a plot from a TV detective show.
Michael Myers
I'm not saying he DID anything about his wish. That I know of.
Nosredna
+15  A: 

I recommend you take a look at thedailywtf.com for numerous articles about software failure. In many cases the stories are true; only the names have been changed to protect the guilty.

Randolpho
Unfortunately, from what I've read, some of the facts are changed to make it funnier.
David Thornley
Probably true. :)
Randolpho
+11  A: 

The Netscape rewrite.

Raj More
+25  A: 

Mars Climate Orbiter 23 September 1999 Orbiter Crash landed on surface due to metric-imperial mix-up

Otávio Décio
Seriously, what was the excuse for using imperial measurements in a scientific project? More importantly, why wasn't there adequate end-to-end testing to detect this kind of error months and months before launch?
Juliet
No excuse, the upcoming Constellation Program (replacement for the space shuttle) will still use imperial units. http://www.newscientist.com/article/dn17350-nasa-criticised-for-sticking-to-imperial-units.html
Ludwig Weinzierl
Since Americans don't use metric measurements, it actually isn't very surprising. However, that it wasn't checked and caught in the qa process in a project this size...
HLGEM
Americans DO use metric measurements in science, from grade school on. We don't use them for cooking or driving or checking the temperature or measuring how tall and fat we are, but we do use them for science, and we have for a long time. At least since the 1970s.
Nosredna
@Ludwig Weinzierl. That article does a pretty good job of explaining why they won't switch to metric. Money.
Nosredna
To support Nosredna, we do use metric for almost all calculations. I don't know what engineers use, but I know optics are in both imperial and metric units (mainly for ease of calculation).
Steve
"...At least since the 1970s". Unless you're NASA.
KitsuneYMG
NASA's problem is that they are perpetually using assets from the previous mission, so there's no obvious time to completely retool.
Nosredna
Should have used boost::units !
tragomaskhalos
Almost everyone misunderstands this event. They were aware that the contractor was using imperial measurements - that's allowed by NASA for American contractors. The issue had nothing to do with the measurements conversion - that was just a symptom of the problem. It's a much more nuanced issue. People just like to sensationalize it: those dumb scientists!
Wahnfrieden
Nosredna: I think metric is more widespread than you think. I was born in 1980, and in American public school I was *only* taught the metric system (because "we'll all be using it by the time you graduate from high school"!). If I ask my coworker the temperature outside, I'll get a response in °C. When I shop for groceries, I buy a liter of olive oil -- and when the label has both I still look at the metric, because it's easier to compare price-per-unit. Even with no further action, eventually the old people like my parents (who don't know metric) will die.
Ken
Shoulda used Smalltalk, where (with the appropriate class libs loaded) "2 meters + 3 inches + 6 furlongs" is perfectly valid (and evaluates correctly!) :-)
Bob Jarvis
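For anyone wondering what the boost::units and Smalltalk comments are getting at: the mix-up is usually described as one side producing thruster impulse in pound-force seconds while the other side read the numbers as newton seconds. Below is a minimal Python sketch of a unit-tagged quantity. The `Impulse` class and its method names are made up for illustration and have nothing to do with the actual flight or ground software; the point is only that a bare float can never enter the sum without saying which unit it is in.

```python
from dataclasses import dataclass

# Exact by definition: 1 lbf = 0.45359237 kg * 9.80665 m/s^2
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """Hypothetical unit-tagged impulse, always stored in newton seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # The conversion happens once, at the boundary where the unit is known.
        return cls(value * LBF_S_TO_N_S)

    def __add__(self, other: object) -> "Impulse":
        # Refuse to add a bare number whose unit is unknown.
        if not isinstance(other, Impulse):
            return NotImplemented
        return Impulse(self.newton_seconds + other.newton_seconds)


total = Impulse.from_newton_seconds(10.0) + Impulse.from_pound_force_seconds(1.0)
print(total.newton_seconds)
```

Adding a plain `2.0` to an `Impulse` raises `TypeError`, which is the whole benefit: the mistake shows up when the code runs in a test, not after launch.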
+7  A: 

Here's a list of some of the worst:

http://www.wired.com/software/coolapps/news/2005/11/69355

These include the Morris worm, the Kerberos vulnerability, the Therac-25, and the Mariner I space probe failure, among others.

AlbertoPL
the soviet pipelines story doesn't sound very plausible IMO
ammoQ
nothing to do with opinion, it's been made public and did happen. http://www.msnbc.msn.com/id/4394002
AlbertoPL
ammoQ is right. The story has no credibility, because there are no facts to support it. There was no pipeline explosion in 1982 - that's a fact. Everything else is one man's words. But I'm sure his book sells good. A bit more on this here: http://www.bookscape.co.uk/short_stories/computer_hoaxes.php
Igor Krivokon
hmm, I get to point the finger at a lot of people who told me this was true... seems you're right.
AlbertoPL
+6  A: 

According to The Mythical Man Month, OS/360.

Zack
It was hardly a failure in any way... First, it still lives on through z/OS 40 years later. Second, it sure was late and over budget, but that is just ordinary trouble for software projects, especially ones that big.
wazoox
Fred Brooks, the manager of the project would tell you differently... something along the lines of how humbling it is to make such an expensive mistake.
San Jacinto
+52  A: 

The Ariane 5 integer overflow:

[image: OUCH]

darkrain
That one is hard to beat in terms of spectacularity.
Fredrik Mörk
actually.... it's tragic :(
darkrain
I don't know if it's tragic... there wasn't human life lost, to my recollection... just a huge ton of money.
San Jacinto
Very tragic, but so true! I guess this one is especially sensitive since it's a "minor" bug with huge consequences
Henri
Interesting. A 64bit floating point to 16 bit integer conversion. Number was too big for the 16 bit integer. Code was in Ada. Although other parts of the code had protection, that part didn't. The result was an exception, and the flight system interpreted the diagnostic bit pattern as flight control data. Fancy!
Nosredna
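To make the conversion problem in the comment above concrete, here is a small Python sketch of narrowing a floating-point value to a 16-bit signed integer. This is not the Ada flight code; the function names are invented for illustration. It shows two ways of protecting the conversion: fail loudly, or clamp to the representable range.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Truncate toward zero and raise if the result does not fit in 16 bits."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in a 16-bit signed integer")
    return result


def to_int16_saturating(value: float) -> int:
    """Truncate toward zero and clamp to the 16-bit signed range."""
    return max(INT16_MIN, min(INT16_MAX, int(value)))


print(to_int16_checked(1234.9))      # fits, fractional part is dropped
print(to_int16_saturating(40000.0))  # too large, clamped to INT16_MAX
```

Which of the two is right depends on what the caller does with the failure. As the comment notes, the trouble on the flight was not only the unprotected conversion but also that the resulting diagnostic data was treated as flight data.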
on the same note, wasn't there a nasa satellite that flew into the sun? they later found one team was working in metric and one in imperial.
James
I think that the 64 -> 16 bit conversion wasn't protected because in the previous generation hardware they proved that it couldn't overflow 16 bits. I remember this story from my systems engineering course.
KitsuneYMG
Is this a project failure or a bug? I'd say a bug.
CodeSlave
was ready to add this...seems to be an expensive one also
andreas
@ CodeSlave...this is definitely an example of project failure. When the project explodes into little pieces that is ultimate failure. Think about it.
Devtron
@James: Mars, not the sun. :) That was the Mars Climate Orbiter (http://en.wikipedia.org/wiki/Mars_Climate_Orbiter)
Greg D
@James that story is widely misunderstood. They KNEW that the two teams were working in metric and imperial - that's allowed by NASA. The actual problem had more to do with management procedures than anything to do with metric vs imperial. It's a much more nuanced issue.
Wahnfrieden
They should have used Mercurial.
orokusaki
I blame the French.
Rusty
This is not purely a software project failure. It was an engineering project failure, and the overall system fault that destroyed the vehicle was a software fault, but it was a software + engineering failure whenever a vehicle can explode due to a software bug. (The software should be able to fail completely without a vehicle exploding, or else the hardware engineering team are to blame for engineering a potential time bomb.)
Warren P
@Warren P: I believe the explosion was the self-destruct mechanism being invoked because of the bug (because the rocket was going off course because the exception was being interpreted as flight control data), rather than being a direct result of the bug.
Rich
+22  A: 

Netscape 6.0. It cost them a considerable lead in the browser wars.

Here's Why: "Things You Should Never Do"

JohnFx
If you did read it, have a look at In Search of Stupidity: Over 20 Years of High-Tech Marketing Disasters. Netscape has its own chapter :) +1.
Sylvain
Thanks JohnFx for that link.
drikoda
+1 great article
markh44
+47  A: 

Duke Nukem Forever.

Justin Largey
A classic, that transcends the gaming industry.
pearcewg
Now it truly is forever.
Nosredna
@Nosredna, maybe not. I'm pretty sure the parent company has sued for rights for the franchise.
Simucal
Half Life 2: Episode 3 just might join Duke Nukem Forever on the vaporware shelf. :-)
Warren P
Actually, Duke Nukem Forever was taken over by Gearbox Software and is expected to be released sometime in 2011.
Bernard
I came here to say this! @Bernard: about time, it's been in dev since 1998!
Dave
+26  A: 

Windows ME

Will Eddins
the worst operating system built by Microsoft
jerbersoft
@jerbersoft that's saying a lot !
GuiSim
http://www.deanliou.com/WinRG/
Arnis L.
orokusaki
Vista is a fine operating system. The problems that plagued it were driver issues, mostly because of Nvidia's late support and BSOD-causing issues, and the "I'm a PC" Apple ads. The latter is what really spread bad hype about it.
Will Eddins
@orokusaki: ME was not an early XP, but a later 98. XP was built on 2000, ME not.
ammoQ
@ammoQ... i don't think he'll let the facts get in his way
John
@Will Eddins: Vista has problems still, and had more problems when it was released. Driver problems were partly Microsoft's fault, since hardening video drivers for anti-piracy purposes made them harder to write. You're also missing one of the big sources of ill-will, the "Ready for Vista" certification that Microsoft allowed for computers that couldn't handle the new UI. The Apple ads were irrelevant compared to that certification fiasco.
David Thornley
@David Thornley: Yeah, I didn't think about the Ready for Vista fiasco, but that seems like a marketing fail more than a software issue.
Will Eddins
@orokusaki:ME had absolutely nothing to do with XP. It was a completely different architecture - from the DOS low level starting, up to the user-mode system libraries.
slacker
+9  A: 

How about the Pentium bug- http://en.wikipedia.org/wiki/Pentium_FDIV_bug

Andriyev
Somewhere I have a key chain with one of those Pentiums in Lucite.
Nosredna
Nice souvenir :) We had brief case analysis sessions on the Pentium bug and also other software failures like Ariane 5 failure, Patriot Missile failure etc. in our Formal methods course.
Andriyev
The bug itself wasn't nearly as bad as the horrendous mishandling of the PR afterwards.
JohnFx
Yeah, the PR was worse. Still, the bug was pretty bad. 4195835*3145727/3145727 should give back 4195835, but the flawed Pentium gave 4195579.
Nosredna
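A quick way to see how large that error is, using only the two numbers quoted in the comment above. The Python below is just a sketch of the arithmetic: it does not emulate the flawed divider, it compares the reported result with what exact and ordinary double-precision arithmetic give. The variable names are mine.

```python
from fractions import Fraction

X, Y = 4195835, 3145727
REPORTED_FLAWED_RESULT = 4195579  # figure quoted in the comment above

# Exact rational arithmetic: dividing by Y and multiplying by Y returns X.
exact = Fraction(X, Y) * Y

# Ordinary double precision on a correct FPU lands on X or within rounding of it.
float_result = (X / Y) * Y

# Size of the error reported for the flawed chip.
error_on_flawed_chip = X - REPORTED_FLAWED_RESULT
relative_error = error_on_flawed_chip / X

print(exact, float_result, error_on_flawed_chip, relative_error)
```

The reported result is off by 256, a relative error of roughly 6 parts in 100,000. That is many orders of magnitude worse than normal double-precision rounding, which is why this particular pair of numbers became the standard test.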
Technically that's a hardware error, although you could say that the high level chip design language stuff is software. But then, everything is software, if you include anything designed with a formal language, on a computer.
Warren P
Way overblown. Only notable because of the botched response from Intel, which was completely clueless about human factors.
Brian Knoblauch
+46  A: 

Therac-25.

Let us all remember that our carelessness, in life and in whatever profession we choose, has real consequences.

San Jacinto
This is the one I was going to post.
Nosredna
Not as simple as the others (and thus less obviously preventable), but more egregious since it involves lethal radiation aimed at a person :0
StuffMaster
I agree, but every article I've ever read about the problem hints at very poor planning for safety. For instance, on most missile systems, there are mechanical safeties since the software isn't trusted for the level of reliability required.
San Jacinto
Is this a project failure or a bug? I'd say a bug.
CodeSlave
Read a couple articles about how they handled this bug. It was both.
San Jacinto
If the question was "most devastating software flaw" this would win. I suppose that it resonates with the public too, and could have been a catalyst for all kinds of government regulatory changes.
Warren P
+6  A: 

Three people died (and three others horribly injured) due to a small and rare race condition in a medical radiation device.

A really sad story.

SPWorley
we were thinking along the same line, i see.
San Jacinto
And this one is also the one I was going to post. :-)
Nosredna
+13  A: 

Crystal Reports

Brownman98
LOL @ CRYSTAL REPORTS!
JonH
Too bad I can only give ONE upvote for this!
Brian Knoblauch
+7  A: 

Chrysler's infamous Payroll application, that spawned the Extreme programming concept.

Something like 5+ years of development, and it never cut a single check, then had the plug pulled. Chrysler then banned the practice of Extreme Programming.

Neil N
I know a lot more non-XP projects that have met the same end :)
Sylvain
Check your facts. "Never cut a single check" is flat wrong (though it only paid 10,000 people out of the 87,000 initially planned), the plug was pulled when the company got bought out (and after a key non-engineering position couldn't be filled), and while there was an announcement that Chrysler had 'de facto' banned XP, they later started using it again. http://en.wikipedia.org/wiki/Chrysler_Comprehensive_Compensation_System
Joe White
Congrats on your wikipedia skills.
Neil N
Interesting! Thanks, I hadn't heard of this. But as a distruster of buzzwords (agile, extreme, etc.), this one makes me chuckle. As if those who 'canned' the project could put the whole thing down to XP and have any idea what they were even disliking. Irony.
Warren P
"Extreme" makes people think "risky", and risky is exactly what you DON'T want with payroll! Payroll is probably the single most important thing as far as keeping employees from going postal! So yeah, "Extreme Programming" will immediately be blamed for a failure if it was in place.
Brian Knoblauch
+4  A: 

The 1985 failure by the IRS to adequately test the new (but overdue and over budget) Sperry Univac system.

The agency sent refunds of tens of thousands of dollars to people who were owed nothing. They failed to send refunds to people who were owed money.

The departments were so backed up that ceiling tiles were removed so that tax returns could be shoved up into the ceiling. Returns (including checks from taxpayers) were flushed down toilets and taken home by employees to be thrown away. All so employees could look like they were keeping up.

In the wake of the fiasco, a new $20 billion plan was put into place to overhaul the system. It was also a fiasco.

Nosredna
+11  A: 

In Britain, the NHS National Programme for IT. It might reach GBP 20 billion, so 30 billion USD.

It's now part of law that every British contractor must work on the project at some point...

gbn
And it may be scrapped. Now we can afford all those databases to track our every physical and internet movement in real time. Yay!
Callum Rogers
Of course now Clegeron will be trying to scrap the ID database too. We sure know how to piss IT money up a wall in the UK :)
Neil Aitken
@Neil Aitken: I prefer the David and Nick portmanteau :-)
gbn
@gbn indeed, all hail Dick Clegeron :)
Neil Aitken
+13  A: 

The worst?... How about the Ballistic Missile Early Warning System (BMEWS) nearly causing global thermonuclear destruction after detecting the moon rising over the horizon and erroneously classifying it as an incoming missile attack from Siberia (Chapter 2 "Boardwalks across the Tar Pit" from Mechanizing Proof by Donald MacKenzie).

That's pretty scary $h!t!

gnovice
That's no moon! -- Oh wait, nevermind, it is.
kenj0418
"On October 5, 1960, the moonrise occurred directly in the path of the Thule detection radar, producing a strong signal return. While the computer system never generated an impact prediction, the large amount of data caused enough concern that the equipment was subsequently modified to reject moon returns based on their long (2 second) delay." Is that book overstating it or is Wikipedia understating it? http://en.wikipedia.org/wiki/Ballistic_Missile_Early_Warning_System
Matthew Lock
+2  A: 

Another payroll project I don't think has been mentioned is Wisconsin's payroll system. It has cost $28.4 million so far, but they reckon $12 million more is needed.

This HAS to be a software failure imho.

Univ. of Wisconsin's 30-Year-Old Payroll System Needs a $40 Million Fix

Improfane
A: 

There is a company in Boise, whose name I will not mention. Their purpose was to build a shopping engine that would let female shoppers enjoy a virtual-mall-like shopping experience. Instead, the project had 68 high-end developers (enough to build an operating system) and hemorrhaged millions of dollars each year, then withered away over two years' time.

orokusaki
how did they want to create a 'virtual mall' experience?
Click Upvote
They wanted the shop to be a mall of other people's shops (some sort of large-scale affiliate system), with a personalized feel (based on some ridiculous profiling algorithms).
orokusaki
+5  A: 

What about the Denver Airport Automated Baggage Handling System? The system's budget was $193 million in 1994 dollars, and software delays were costing $1 million a day. According to Scientific American (September 1994), the system consisted of 100 networked computers connected to 5,000 electric eyes, 400 radio receivers and 56 bar-code scanners.

Kwang Mark Eleven
No, the Denver International Airport (DIA) Baggage Handling System (BHS) is just one cluster in the cluster-frak that was much of the DIA project. The biggest mistake in the BHS was that they tried to do 8 years of work in 2 years, especially when it was a very new technology (only one airport in Germany had a similar system, and it was a fraction of the size of DIA's).
CodeSlave
+3  A: 

Air Traffic Control System. The US still uses a card-based system designed in the 60's. I know that at least 1 major program to computerize it failed. If the system were computerized correctly, airport congestion would be mitigated

...at the risk of complete, immediate collapse if anything goes wrong with the new system. At least with the old system only being partially computerized, they have *some* idea what's going on when the computer system hiccups. NextGen really scares me. Knowing what I do about technology, and how they plan to deploy it, it looks to have powerful (and needed) features, yet the specifications look VERY fragile...
Brian Knoblauch
+9  A: 

One case, more of a classic project failure, closer to the Canadian example mentioned by OP:

AKE (Finnish Vehicle Administration, a public agency) ordered a project to overhaul its information systems in 1999, mainly from the companies TietoEnator (now Tieto) and WM-data (now part of Logica).

It was supposed to be ready 2003, but has been continually postponed. In 2007 it was estimated that it'd be ready 2011. So right now the project has been going on for a decade, and it will be at least 8 years late!

The budget has gone through the roof too: original estimate was 16 million €; actual total costs as of 2009 have been more than 50 million € (~70 million USD), more than three times the original estimate, so far.

From news articles, it seems like a proper mess of every kind of leadership, coordination and requirements problem: teams with overlapping responsibilities; no-one having an adequate picture of what was actually needed when the project started; the project responsible at AKE having been changed at least 5 times.

So, nothing as spectacular as Mars landers crashing, or people dying because of this (afaik!), but these sort of things are probably among the most common failures in this field. The main consequences: loads of wasted taxpayer money, and screwed reputation for the software/IT industry. :-\

Sources (in Finnish):

Jonik
Then again, the companies responsible do have also plenty of other projects in their portfolio with similar success rates. In this sense I would not say the AKE project screwed the reputation of software/IT industry; instead it merely reinforces the already existing image :)
Schedler
+6  A: 

How about the Sergeant York gun? http://en.wikipedia.org/wiki/Sergeant_York_Gun

"Unable to hit drones moving even in a straight line, the tests were later relaxed to hovering ones. The radar proved unable to lock even to this target, as the return was too small. The testers then started adding radar reflectors to the drone to address this "problem", eventually having to add four. Easterbrook, still covering the ongoing debacle, described this as being similar to demonstrating the abilities of a bloodhound by having it find a man standing alone in the middle of an empty parking lot, covered with steaks. The system now tracked the drone, and after firing a lengthy burst of shells the drone was knocked off target."

billmcc
In February 1982 the prototype was demonstrated for a group of US and British officers at Fort Bliss, along with members of Congress and other VIPs. When the computer was activated, it immediately started aiming the guns at the review stands, causing several minor injuries as members of the group jumped for cover.
Click Upvote
Sounds a bit like the demonstration scene in Robocop 1
sum1stolemyname
+3  A: 

Definitely not the worst failure, but probably one of the most stupid...

Microsoft Zune Leap year bug:

behrk2
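For context on the Zune bug: the failure is widely reported as a loop in the device's clock code that converted "days since January 1, 1980" into a year, and that never terminated on the last day of a leap year. The Python below is a paraphrase of that logic, not the original C driver source. The function names, the 1-based day count, and the iteration cap (added so the demo stops instead of hanging) are my own assumptions.

```python
def is_leap_year(year: int) -> bool:
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_from_days_buggy(days: int, year: int = 1980, max_iterations: int = 1000):
    """Paraphrase of the reported loop; day 1 is January 1 of `year`."""
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            raise RuntimeError("loop is not making progress")
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # When days == 366 in a leap year nothing changes,
            # so the real loop would spin forever here.
        else:
            days -= 365
            year += 1
    return year, days


def year_from_days_fixed(days: int, year: int = 1980):
    """Same conversion, with the year length decided before comparing."""
    while True:
        length = 366 if is_leap_year(year) else 365
        if days <= length:
            return year, days
        days -= length
        year += 1


DEC_31_2008 = 10593  # 29 years * 365 days + 8 leap days since January 1, 1980
print(year_from_days_fixed(DEC_31_2008))
```

Calling `year_from_days_buggy(DEC_31_2008)` trips the iteration cap, while the day before and the day after convert fine, which matches the reports that affected devices froze for one day and then recovered.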
+1  A: 

Microsoft server crash nearly causes 800-plane pile-up http://www.techworld.com/opsys/news/index.cfm?newsid=2275

meade
+1  A: 

At least Therac-25 shipped.

My vote goes to the California Department of Motor Vehicles, which hired Tandem and Ernst & Young to replace its aging IBM mainframe-based driver registry system with a shiny new one. The five-year project, launched in 1987 with a budget of $25 million, was cancelled in 1994 after $44 million had been spent and there was still no delivery date in sight. The state then spent another half million dollars to find out what had gone wrong.

Robert Rossney
But it killed people!
Callum Rogers
I wasn't kidding. It wasn't just that Therac-25 shipped. Once the defect was fixed, hospitals continued to use it. The company's still in business. That's not a failed project. It's an engineering failure, and a software-development-practice failure, and a regulatory failure. Software geeks think it's epochal because it was a case where bad software killed people. But that's the nature of engineering defects. The DC-10 killed people, too - a lot more people than Therac-25 did. And it was McDonnell-Douglas's flagship product for a decade.
Robert Rossney
Some times, it's better not to ship
ckarras
...as a California taxpayer, I helped fund that. And the next attempt.
DaveE
You should look into the California Administrative Office of the Courts's CCMS project, speaking of things you'll be paying for as long as you live here.
Robert Rossney
I worked on that near the beginning, when it was a simple port of the 1980-era CCMS LA County was using. Unlimited scope creep and high-level champions....
DaveE
A: 

Munich migration to Linux. (It counts as a software project, because they are writing their own Linux distro.)

quant_dev
Did they fail? The project is still ongoing. LiMux is the first Linux-based workplace certified for industry use - http://en.wikipedia.org/wiki/LiMux
mjustin
Exactly, it's still ongoing after sth like 10 years...
quant_dev
"31 December 2009: The first step, the complete switch to OpenOffice.org enabling the Open Document Format as standard format is done" That sounds like success to me.
Wayne Werner
How long did it take and how much did it cost?
quant_dev
+1  A: 

Waste Management suing SAP for $100 million seems like it could be on the list in terms of dollar values on a failed project.

JB King
+1  A: 

Duke Nukem Forever...

Mica
+1  A: 

The Distributed Computing Environment, at least for OS/2.

The Taligent operating system.

The IBM Workplace OS.

Large chunks of money ($2 billion for the last, according to Wikipedia), large chunks of time, and you have probably never heard of any of them.

But those are just the ones from my resume.

Tommy McGuire
OS/2 itself could be described as a huge failure. Not because it was never finished, but because IBM actually believed that Microsoft would keep supporting it even when they were building Windows... If that's not Über Fail, then i don't know what is.
Tor Valamo
+1  A: 

Goldmine CRM. Without doubt.

Worzel
heh... I worked on that.
Jherico
A: 

Y2K - Has anyone actually calculated how much it cost all over the world to fix this?

Rafal Ziolkowski
Are you saying the effort to fix it was a failure? If so, why? Also, I'd argue that this is not a single project but hundreds of them.
JohnFx
Not a project, but a moniker for a class of date-related bugs, most of them in COBOL code, and which I think was mostly a bonanza for those who made money selling "Y2k certified" replacements for aging corporate mainframe infrastructure.
Warren P
The assumptions that sooo many folks made were valid at the time - very few systems, electronic or otherwise, lasted untouched for more than 10 years. I sure don't want every implementation decision I make critiqued 30 years after the fact.
DaveE
@Warren: In 1999, I had to check some of our C++ software, and found some date bugs. I also had to upgrade our CVS and Gnats installations, which were not Y2K-compliant. I have a great deal of sympathy for the programmers of 1970 who wanted to save two columns of data on a punch card. I have much less for the programmers of the 1990s.
David Thornley
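For readers who never met the "two columns on a punch card" problem, here is a small Python sketch of what goes wrong when only the last two digits of the year are stored, together with the common "windowing" patch. The function names and the pivot value of 70 are arbitrary choices for illustration, not taken from any particular system.

```python
def years_between_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive arithmetic on two-digit years, as stored to save space."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 70) -> int:
    """Windowing: two-digit years at or above the pivot are 19xx, the rest 20xx."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    return 1900 + yy if yy >= pivot else 2000 + yy


# An account opened in '99 and checked in '00:
print(years_between_two_digit(99, 0))    # negative elapsed time
print(expand_year(0) - expand_year(99))  # one year, as intended
```

Windowing only moves the cliff: with a pivot of 70, the year 2070 reads as 1970. That is why many fixes widened the field to four digits instead.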
A: 

I go to school at UCA. Their website redesign cost over $130,000. Most of that I gather went to the Sungard CMS which sucks tremendously.

Seriously - there is no excuse for that crap.

Wayne Werner
A: 

Samsung PC Studio... Samsung New PC studio...

You can say whatever you want but this is the fact... :)

mangia
A: 

Winamp 3. (Nullsoft media player)

:P

Andrei Rinea