tags:
views: 500
answers: 11

It seems like old iron runs rock-solid software. Why is that? Is it because the software is so mature that all the bugs have been worked out? Or is it because people have gotten so used to the bugs that they no longer recognize them and just work around them? Were the software specs perfect from day one, so that once the software was written, everything just worked? I'm trying to understand how we got from the mainframe computing days, which everyone now remembers as just working, to feeling that TDD is the way to go.

+1  A: 

Oh, they definitely have bugs--see thedailywtf.com for some more entertaining examples. That said, most of the "mainframe" applications one sees today have had 30 years to get all the kinks worked out, so they have a bit of an advantage over most applications created in the last few years.

Wyatt Barnett
+18  A: 

Why on Earth do you think they don't have bugs?

IBM has a vast infrastructure of bug reporting and resolution tools (PMRs, APARs and PTFs) which is heavily used.

Mainframe software which hasn't been touched for many years will certainly be well understood (in terms of its foibles) and will likely have had many bugs either fixed or worked around. All of the new stuff being developed nowadays actually plans for a certain number of bugs and patches from GA (general availability) to at least GA + 36 months.

The mainframe espouses RAS principles (reliability, availability and serviceability) beyond what most desktop hardware and software could ever aspire to - that's only my opinion of course, but I'm right :-) That's because IBM knows all too well that the cost of fixing bugs increases a great deal as you move through the development cycle (it's a lot cheaper to fix a bug in unit testing than in production - a cost in terms of both money and reputation).

A great deal of effort and cost is expended on releasing only bug-free software, but even IBM doesn't get it perfect.

paxdiablo
It is certainly not always the case that old mainframe software that hasn't been touched is well understood. It is much more likely that it has been completely forgotten about, and that managers wake up in cold sweats wondering what would happen if one of those old programs that never break abended. Thank god I work on Windows software. Nothing ever works long enough to get forgotten about!!!
Modan
@Modan, *in terms of its foibles*. Its *internals* may not be understood at all (the source may even be gone) but, if it hasn't abended in the last twenty years with all the garbage data it would have been subjected to in that time, it's incredibly unlikely to start tomorrow. And if it had abended, you would have found a workaround or replaced it.
paxdiablo
+9  A: 

There are no bugs in mainframe software, only features.

Itay Moav
As opposed to common desktop applications, which have *undocumented* features.
voyager
Preemptive use of community wiki. Well done.
Kobi
A: 

While I don't have experience with mainframes, I'm guessing it's the first point you made: the software has been around for decades. Most remaining bugs will have been worked out.

Besides, don't forget fiascos like Y2K. All of the bugs people have stumbled on have been worked out, and over 20 years most situations will probably have occurred. But every once in a while, a new situation does manage to come along that makes even 20-year-old software stop working.

(Another interesting example of this is a bug found in, I believe, BSD Unix. It was discovered a year or so ago, after being around for 20 years without anyone running into it.)
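A minimal sketch of the classic two-digit-year trap behind Y2K (the function and field names here are hypothetical):

    # Years stored as two digits (e.g. 85 for 1985), differences taken
    # by plain subtraction -- fine for decades, wrong at the rollover.
    def account_age_years(opened_yy, current_yy):
        return current_yy - opened_yy

    print(account_age_years(85, 99))  # 14: correct throughout the 1900s
    print(account_age_years(85, 0))   # -85: nonsense once 2000 arrives

The bug sits dormant until the one input the code never saw in 20 years finally shows up.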

Edan Maor
+1  A: 

I think programming used to be an advanced field that only a chosen few engineers could work in. The world of programming is now much, much bigger, with lower entry barriers in every respect.

AraK
+4  A: 

There are PLENTY of bugs in mainframe software; they are just not publicized as much, due to the relatively small group of developers affected. Just ask someone who does mainframe development how many ABENDs they see on a daily basis!

ennuikiller
ABEND! I still call it that from time to time, along with the infamous 0C4.
Preet Sangha
//SYSABEND DD SYSOUT=A - the start of a very, very long day.
Paul Tomblin
It's not developers who are affected, it's customers. And the number of people affected by a bug in mainframe software can be *huge* - such as a recent bank debacle in Australia where no account information was available over the net and many online transactions were delayed for days. What do you think is at the back end of all that online banking infrastructure, SQL Server on a couple of PCs? :-)
paxdiablo
+1  A: 

I learned to use debuggers and analyse core dumps on big-iron mainframes. Trust me, those tools only came about because of bugs. You're just plain wrong.

However, mainframe architectures have been designed for stability under high stress (well, compared to non-mainframe systems, say), so maybe you can argue they are better in that way. But code-wise? Nah, the bugs are still there...

Preet Sangha
+6  A: 

I used to work on mainframe apps. The earlier apps didn't have many bugs because they didn't do much. We wrote hundreds if not thousands of lines of FORTRAN to do what you'd do with a couple of formulas in Excel now. But when we went from programs that got their input by putting one value in columns 12-26 of card 1, and another value in columns 1-5 of card 2, etc, to ones that took input from an interactive ISPF screen or a light pen and output on a Calcomp 1012 plotter or a Tektronix 4107 terminal, the bug count went up.
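A rough modern rendering of that fixed-column card input, using the column layout from the answer (the record contents and variable names are made up):

    # Two "cards": a value in columns 12-26 of card 1 and a value in
    # columns 1-5 of card 2. Python slices are 0-based and end-exclusive,
    # so columns 12-26 become [11:26] and columns 1-5 become [0:5].
    cards = [
        "PART-001   12.500000000000",  # card 1
        "42                        ",  # card 2
    ]

    load_factor = float(cards[0][11:26])  # columns 12-26
    cycle_count = int(cards[1][0:5])      # columns 1-5
    print(load_factor, cycle_count)       # 12.5 42

Shift a value by a column and the program may silently read the wrong number, which gives a flavor of the input style the interactive front ends replaced.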

Paul Tomblin
+1  A: 

I think it's a few things. First, the fix-a-bug-and-recompile cycle was usually more expensive on mainframes. This meant the programmer couldn't just slop out code and "see if it works". By doing compile and runtime simulations in your head, you can spot more bugs than by letting the compiler catch them.

Second, everybody and their brother wasn't a "programmer." They were usually highly trained specialists. Now programs come from guys sitting in their basement with a high school diploma. Nothing wrong with that!!! But it does tend to produce more bugs than code from an engineer who's been doing it professionally for 20 years.

Third, mainframe programs tend to have less interaction with their neighbors. In Windows, for example, a bad app can crash the one next to it or the entire system. Mainframes usually have segmented memory, so all a program can crash is itself. Given the tons of things running on your typical desktop system, from all kinds of marginally reliable sources, any program tends to be flaky to some degree.
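A small sketch of that isolation point, in ordinary OS-process terms rather than anything mainframe-specific: one process dying hard does not take its parent down.

    # Each process gets its own protected memory; a crash in the child
    # is contained. (Illustrative only -- plain OS processes, not a mainframe.)
    import multiprocessing
    import os

    def doomed_job():
        os.abort()  # simulate the app crashing hard

    if __name__ == "__main__":
        child = multiprocessing.Process(target=doomed_job)
        child.start()
        child.join()
        print("child exit code:", child.exitcode)  # negative on Unix: killed by a signal
        print("parent is still running fine")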

Maturity is definitely a factor. A COBOL credit-card processing program that was written 20 years ago and has been refined over and over to eliminate bugs is less likely to have a problem than the 0.1 version of any program. Of course, there is the issue that these old, endlessly rewritten programs usually end up as spaghetti code that's nearly impossible to maintain.

Like anything, it depends mostly on the programmer(s) and their methodology. Do they do unit testing? Do they document and write clean code? Do they just slop-and-drop code into the compiler to see if there are any bugs (hoping the compiler can catch them all)?
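For contrast with slop-and-drop, here is the unit-testing discipline being asked about in miniature (the function under test and its business rule are hypothetical):

    import unittest

    # A made-up business rule: 1% of the balance per full 30 days overdue,
    # and never a fee on a non-positive balance.
    def late_fee(balance, days_overdue):
        if balance <= 0 or days_overdue <= 0:
            return 0.0
        return balance * 0.01 * (days_overdue // 30)

    class LateFeeTest(unittest.TestCase):
        def test_no_fee_when_not_overdue(self):
            self.assertEqual(late_fee(100.0, 0), 0.0)

        def test_fee_accrues_per_30_days(self):
            self.assertEqual(late_fee(100.0, 60), 2.0)

    if __name__ == "__main__":
        unittest.main()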

Deverill
+1  A: 

My experience with mainframe application software (as opposed to operating systems) is pretty out of date, but my recollection is that the majority of applications are batch applications that are, logically, very simple:

a) Read an input file
b) Process each record (if you are feeling daring, update a database)
c) Write an output file

No user input events to worry about, a team of qualified operators to monitor the job as it runs, little interaction with external systems, etc, etc.

Now the business logic may be complex (especially if it's written in COBOL 68 and the database isn't relational) but if that's all you have to concentrate on, it's easier to make reliable software.
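A minimal sketch of that batch shape (file names, record layout and the "business logic" are all made up):

    # a) read an input file, b) process each record, c) write an output file
    with open("input.dat") as infile, open("output.dat", "w") as outfile:
        for record in infile:
            fields = record.rstrip("\n").split("|")
            fields[-1] = fields[-1].upper()  # stand-in for the business logic
            outfile.write("|".join(fields) + "\n")

No events, no concurrency, no user to surprise you: the whole state of the run is the input file, which is a big part of why such jobs are easy to make reliable.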

Bigwave
"batch" takes on different connotations in large systems. Batch processing is (in small part) tuned to give maximum or identified CPU time towards a job vs. being interrupted and rescheduled like what may happen in a common Unix system.The 3 steps you gave above is the same way it worked on every system at the same time. Today's mainframes do just about everything that we're doing elsewhere today.
Xepoch
+1  A: 

I've never worked on software for mainframes myself, but my dad was a COBOL programmer in the 1970s.

When you wrote software in those days, finding bugs was not as simple as compiling your source code and reading the error messages the compiler spat back at you, or running your program and watching what it did wrong. A typist had to punch the program onto punch cards, which would then be read into the computer, which would print out the results of your program.

My dad told me that one day someone came with a cart full of boxes of paper and put them next to the door of the room where he was working. He asked "What's that?!", and the guy told him "That's the output of your program". My dad made a mistake which caused the program to print out a huge amount of gibberish on a stack of paper that could have used up a whole tree.

You learn from your mistakes quickly that way...

Jesper