Reading through some of the questions here, the general consensus seems to be that there continues to be an enormous amount of COBOL code "out there", not just because it's a nightmare to refactor or re-code, but simply because for a certain market segment (financials etc.), it has proven itself to be more than capable of holding its own. But what is it about the language that causes it to be so? How can something that is several decades old continue to perform well enough to hold its own against more modern languages, with all the commensurate improvements in memory management etc.? Have the COBOL compilers etc. simply improved silently in the background? Or is there something inherent in the language that means it is extremely efficient for a given set of operations?
It's because COBOL programs (at least the old ones) are very simply structured, so it's not at all difficult to compile them to efficient machine code. For example, "good old" COBOL programs have no need for efficient memory management, because dynamic allocation of memory simply doesn't happen; the memory layout is fixed at compile time.
The COBOL language was designed in the 1950s to match the capabilities of the slow, RAM-limited machines available at the time, which also lacked interactive terminals. Many aspects of the design were chosen so that programs compile into straightforward machine code with no optimization needed. For instance, there are no variables in the modern sense, only a single block of working storage, with names that refer to byte arrays of a specific fixed length starting at a fixed location. COBOL programs compile to efficient machine code by design.
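To make the "single block of working storage" idea concrete, here is a rough Python sketch of it. It is only an illustration, not how any COBOL compiler is implemented: the field names, offsets, lengths and the `move`/`fetch` helpers are all made up for this example.

```python
# Hypothetical layout: each name maps to a fixed (offset, length) pair that is
# decided once, up front, much like names in COBOL working storage.
LAYOUT = {
    "CUSTOMER-ID": (0, 6),
    "CUSTOMER-NAME": (6, 20),
    "BALANCE": (26, 9),
}
STORAGE_SIZE = 35  # sum of the field lengths above

# One block of storage, allocated once; nothing is allocated afterwards.
working_storage = bytearray(b" " * STORAGE_SIZE)


def move(value: str, field: str) -> None:
    """Copy text into a field, space-padding or truncating to the fixed length."""
    offset, length = LAYOUT[field]
    data = value.encode("ascii")[:length].ljust(length, b" ")
    working_storage[offset:offset + length] = data


def fetch(field: str) -> str:
    """Read a field back as text from its fixed position in the block."""
    offset, length = LAYOUT[field]
    return working_storage[offset:offset + length].decode("ascii")


move("000042", "CUSTOMER-ID")
move("ADA LOVELACE", "CUSTOMER-NAME")
move("000012550", "BALANCE")
```

Because every name resolves to a constant offset and length, reading or writing a field is just a fixed-size copy; there is no allocator or garbage collector involved, which is the point the answer is making.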
As CPUs got faster and RAM got more plentiful, COBOL compilers did add new features like key-indexed file I/O, a built-in MERGE algorithm, and support for interactive text terminals. Nowadays there is even object-oriented COBOL.
So part of the reason is that the code was portable to new CPU architectures since it was a high-level language, yet very efficient since it was designed to not use fancy features like those found in ALGOL-60, an ancestor of C. And part of the reason is that COBOL evolved to fit into newer OSes and capabilities. For instance, SQL databases are just more sophisticated forms of the simple table-oriented files that COBOL was designed to handle. Overlay linkers allowed huge COBOL programs to be written as long as the execution flow was roughly sequential. Any feature that was better done in Assembler or PL/1 or FORTRAN could be accessed via PROCEDURE calls.
The closest modern language to COBOL is Python, because you can write clean programs that almost read like English without extraneous punctuation everywhere, but you can leverage a large and sophisticated library of features rather than having to code your own all the time. Of course Python has adopted all of the features of ALGOL-60 and more, because it was designed in the modern era when you don't have to fit everything into 16k of RAM.
Another point in favour of COBOL in a financial setting is that it provides native data types and mathematical operators for doing decimal fixed-point arithmetic (see: [Packed Decimal][1]).
Packed Decimal is useful for doing financial computations because it maintains a fixed number of digits before and after the decimal point. This makes it a little easier to deal with rounding of financial amounts.
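As a rough illustration of why a fixed number of decimal digits helps with money, here is a small Python sketch using the standard `decimal` module. It is not COBOL and does not use packed decimal storage; the helper names `to_money` and `add_interest`, the two-digit scale and the half-up rounding rule are assumptions chosen for the example.

```python
from decimal import Decimal, ROUND_HALF_UP

CENTS = Decimal("0.01")  # assumed scale: two digits after the decimal point


def to_money(value: str) -> Decimal:
    """Parse a decimal string and round it to the fixed scale."""
    return Decimal(value).quantize(CENTS, rounding=ROUND_HALF_UP)


def add_interest(balance: Decimal, rate: Decimal) -> Decimal:
    """Apply a rate and round the result back to whole cents."""
    return (balance + balance * rate).quantize(CENTS, rounding=ROUND_HALF_UP)


# Binary floating point cannot represent 0.10 or 0.20 exactly...
float_total = 0.10 + 0.20
# ...whereas decimal fixed point keeps the amounts exact.
decimal_total = to_money("0.10") + to_money("0.20")
```

Here `float_total` is not equal to `0.3`, while `decimal_total` compares equal to `Decimal("0.30")`. With a fixed scale, the only place rounding can happen is the explicit `quantize` step.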
Few languages other than COBOL, PL/1 and Algol can efficiently do arithmetic in decimal fixed point. IBM mainframe computers have dedicated hardware circuits for doing calculations in BCD, which helps keep COBOL performance somewhere up in the stratosphere.
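For readers who have not seen the format, the following Python sketch shows roughly what a packed decimal (BCD) value looks like in memory: two decimal digits per byte, with the final half-byte holding the sign (`0xC` for positive, `0xD` for negative, following the usual IBM convention). The function names are invented for this example, and it handles integers only; in COBOL the position of the decimal point is implied by the field's declaration rather than stored.

```python
def pack_decimal(value: int) -> bytes:
    """Encode an integer as packed decimal: one digit per nibble, sign nibble last."""
    sign = 0x0D if value < 0 else 0x0C
    nibbles = [int(ch) for ch in str(abs(value))] + [sign]
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad with a leading zero to fill whole bytes
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack_decimal(packed: bytes) -> int:
    """Decode packed decimal bytes back into an integer."""
    nibbles = []
    for byte in packed:
        nibbles.extend((byte >> 4, byte & 0x0F))
    sign = nibbles.pop()  # the last nibble is the sign
    magnitude = int("".join(str(n) for n in nibbles))
    return -magnitude if sign == 0x0D else magnitude


packed = pack_decimal(12345)
```

In this sketch `pack_decimal(12345)` produces the three bytes `12 34 5C` in hex, so the digits can be read straight off a memory dump. Hardware with BCD instructions works on this representation directly, without converting to binary first.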
[1]: http://en.wikipedia.org/wiki/Packed%5Fdecimal "Packed Decimal"
On OS/360 and its descendants the ratio was about four assembly instructions to one COBOL verb; the hardware designers had a good look at the COBOL spec and built an instruction set to support it.
Even seemingly monstrous statements like:
PERFORM BEGIN-PARA THROUGH END-PARA VARYING I FROM 1 BY 2 UNTIL I > MAX_ARRAY.
translate to about eight assembly instructions (only one of which is inside the loop).
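For anyone who doesn't read COBOL, the control flow of that statement is roughly the following, sketched in Python. This assumes the loop is meant to run until `I` exceeds `MAX_ARRAY`, with the condition tested before each pass; the `perform_varying` helper and its `body` callback (standing in for the paragraphs `BEGIN-PARA` through `END-PARA`) are made up for the illustration and say nothing about the generated assembly.

```python
def perform_varying(start: int, step: int, limit: int, body) -> int:
    """Mimic PERFORM ... VARYING I FROM start BY step UNTIL I > limit."""
    i = start                  # FROM
    while not (i > limit):     # UNTIL is checked before each pass
        body(i)                # stands in for BEGIN-PARA through END-PARA
        i += step              # BY
    return i                   # value of I once the loop has finished


visited = []
final_i = perform_varying(1, 2, 7, visited.append)
```

With a limit of 7, the body runs for 1, 3, 5 and 7, and `I` is left at 9 afterwards. The loop bookkeeping is just an initialise, a compare and an add, which is why so little machine code is needed for it.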