At a superficial level, languages grow from their predecessors because there is an existing pool of users who already know the syntax, and because there are only so many workable ways to design a syntax in the first place.
At a more meaningful level, a prior language may be in use in repetitive ways that could be made easier by adding some syntax and behavior. I'm thinking of how one did OOP in C before C++ came along, or of the distinction between Fortran with GOTO and Algol with block structure: it was a pain to keep writing labels by hand when they could be generated automatically.
Personally, I could be wrong, but I don't see general-purpose languages (GPLs) evolving much further (which is not to say small languages won't proliferate). I do think domain-specific languages (DSLs) will continue to grow, and I think one of the key features of any GPL will be how well it assists the creation of new DSLs.
I think this because there is a continuum of representations between problem-specific data structures on one hand and programming languages on the other. Any data that is read by some program is, in a sense, expressed in a language, and that program is its interpreter. The only thing that really separates the extremes is the degree of sophistication and generality of the interpreter.
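To make this concrete, here is a minimal sketch: a made-up data format with three commands, and the few lines of Python that read it. The format and all of its command names are hypothetical, invented purely for illustration; the point is that the function reading the data is already an interpreter for a tiny language.

    # A tiny invented "data format": one command per line, e.g.
    #   set x 3
    #   add x 4
    #   print x
    # The reader below is, in the sense described above, an
    # interpreter for this little language.
    def interpret(source):
        env = {}  # variable bindings
        for line in source.splitlines():
            if not line.strip():
                continue  # skip blank lines
            op, *args = line.split()
            if op == "set":       # set <name> <value>
                env[args[0]] = int(args[1])
            elif op == "add":     # add <name> <value>
                env[args[0]] += int(args[1])
            elif op == "print":   # print <name>
                print(env[args[0]])
            else:
                raise ValueError("unknown command: " + op)

    interpret("set x 3\nadd x 4\nprint x")  # prints 7

Add generality (expressions, loops, definitions) and the same reader shades into a programming-language interpreter.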
What I look for in a DSL is the property of minimum redundancy with respect to its intended problem domain. The idea is this: suppose a program is written (by a human, at a keyboard) to correctly implement some requirements, and then a single coherent change is made to those requirements. Some amount of editing must be done to the program to correctly implement that change. The redundancy of the language with respect to the domain is the size of such edits, averaged (somehow) over the space of possible changes. A very simple way to measure it is to run a diff program between the before and after versions of the code; the number of differences is a measure of the redundancy for that change. That is what I look for to be minimized before I'll say a language is well adapted to a domain.
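As a sketch of that measurement (assuming a line-level diff is an acceptable proxy for edit size; the function names here are mine, not an established tool), Python's standard difflib can count the differences for one change, and the average over a sample of changes estimates the redundancy:

    import difflib

    def edit_size(before, after):
        # Count the lines added or removed between two versions of
        # the program, one coherent requirements change apart.
        diff = difflib.unified_diff(before.splitlines(),
                                    after.splitlines(), lineterm="")
        return sum(1 for line in diff
                   if line.startswith(("+", "-"))
                   and not line.startswith(("+++", "---")))

    def redundancy(samples):
        # Average edit size over a sample of (before, after) pairs,
        # one pair per requirements change.
        return sum(edit_size(b, a) for b, a in samples) / len(samples)

A language well adapted to its domain would drive this average down toward the size of the requirement change itself.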
If the redundancy of a language is minimized, then fewer edits are required to implement a functional change; not only is the code likely to be shorter, but there are fewer opportunities to introduce bugs.
The way programmers are currently taught, these ideas lie in their future. Minimizing the redundancy of source code is not yet seriously valued. What we have instead are bandwagons like OOP that, in spite of their obvious value, tend to lead to massively redundant code.
One promising development is the rise of code generators, but again they are in danger of becoming ends in themselves rather than serving the goal of reducing source-code redundancy.