Hi. For a long time I have been studying the assembly output of C compilers, as well as CPU architecture. This may sound silly to you, but some of it seems very inefficient to me. Please don't be angry if I am wrong and there is a reason for these principles that I do not see. I will be very glad if you tell me why it is designed this way. I actually believe I am wrong; I know the brilliant people who put PCs together had reasons for doing so. What exactly am I asking about? I'll tell you right away, using C as an example:

1: Stack local scope memory allocation:

So, typical local memory allocation uses the stack: just copy esp to ebp and then address all local memory via ebp. OK, I would understand this if you explicitly needed to allocate RAM at the default stack addresses, but if I understand correctly, a modern OS uses paging as a translation layer between the application and physical RAM, so the address you request is translated before it reaches an actual RAM byte. So why not just say 0x00000000 is int a, 0x00000004 is int b, and so on? And access them with just mov 0x00000000,#10? You won't actually access memory blocks 0x00000000 and 0x00000004, but whichever ones the OS has set up in the paging tables. Actually, since allocation via ebp and esp uses indirect addressing, "my" way would be even faster.

2: Variable allocation duplication:

When you run an application, the loader loads its code into RAM. When you create a variable or a string, the compiler generates code that pushes these values onto the top of the stack when they are created in main. So there is an actual instruction to do so, and the actual number in memory. That makes two copies of the same value in RAM: one in the form of an instruction, the second in the form of actual bytes in RAM. But why? Why not just work out, when the variable is declared, which memory block it will occupy, and then insert that memory location wherever it is used?

+17  A: 

How would you implement recursive functions? What you are describing is equivalent to using global variables everywhere.

That's just one problem. How can you link to a precompiled object file and be sure it won't corrupt the memory of your procedures?
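The recursion problem can be sketched in C. In this minimal illustration, a `static` local stands in for the question's "one fixed address per variable" scheme; the function names `sum_fixed` and `sum_stack` are hypothetical and exist only for this example.

```c
/* Sum 1..n recursively, keeping the local at ONE fixed address.
 * 'static' is used here to imitate the fixed-address proposal. */
int sum_fixed(int n)
{
    static int local;            /* shared by every active call */
    local = n;
    if (n == 0)
        return 0;
    int rest = sum_fixed(n - 1); /* the inner calls overwrite 'local' */
    return local + rest;         /* 'local' is no longer n here */
}

/* Same algorithm with an ordinary automatic (stack) variable:
 * every call gets its own copy of 'local'. */
int sum_stack(int n)
{
    int local = n;
    if (n == 0)
        return 0;
    int rest = sum_stack(n - 1);
    return local + rest;
}
```

With these definitions, `sum_stack(3)` returns 6, while `sum_fixed(3)` returns 0, because the innermost call leaves 0 in the shared variable before any outer call reads it.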

Mehrdad Afshari
Beat me to it :(
Blindy
Well, if the object file is linked statically, you could solve that, because you would know all the syntax. And is there some reason for the second question?
B.Gen.Jack.O.Neill
It would also make safe multithreading very difficult if every thread that called function Foo() had to use the same memory addresses for its local variables.
Jeremy Friesner
b-gen-jack-o-neill: Not easily. And my point is: how would you pass an argument to a function? I don't quite get your second question.
Mehrdad Afshari
+2  A: 

Since you are comparing assembler and C (which are very close together from an architectural standpoint), I'm inclined to say that you're describing micro-optimization, which is meaningless unless you profile the code to see if it performs better.

In general, programming languages are evolving towards a more declarative style (i.e. telling the computer what you want done, rather than how you want it done). When you program in an imperative language (like assembly or C), you specify in extreme detail how you want the problem solved. This gives the compiler little room to make optimization decisions on your behalf.

However, as the languages become more declarative, the compilers are getting smarter, because we are giving them the room they need to make more intelligent performance optimizations.

Robert Harvey
I have to totally disagree with this. There are many, many good researchers who do great work on what you are calling "micro-optimization." Actually, I'm not sure what you mean, since this question directly addresses more efficient compiler output. I think all this business about "declarative programming" is more about playing to a fad than getting at the meat and potatoes here. Presumably we'd like to have some way for all this declarative stuff to perform with reasonable efficiency. As an aside, I think much more is being made of this declarative stuff than is really warranted.
BobbyShaftoe
Doing micro-optimizations on a program is usually foolish. Doing micro-optimizations on the code a compiler generates makes much more sense, and this isn't even micro-optimization. These are proposals for structural changes in compiled code.
David Thornley
+6  A: 
  1. Because C (and most other languages) support recursion, so a function can call itself, and each call of the function needs separate copies of any local variables. Also, on most current processors, your way would actually be slower -- indirect addressing is so common that processors are optimized for it.

  2. You seem to want the behavior of C (or at least that C allows) for string literals. There are good and bad points to this, such as the fact that even though you've defined a "variable", you can't actually modify its contents (without affecting other variables that are pointing at the same location).
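The string-literal behavior mentioned in point 2 can be sketched as follows. This is only an illustration; the helper names `greeting_literal` and `greeting_copy` are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* The literal has static storage duration: no copy is made, and the
 * pointer stays valid after the function returns. Its contents must
 * not be modified. */
const char *greeting_literal(void)
{
    return "hello";
}

/* An array initialised from a literal is a separate, modifiable copy
 * that lives in the function's own automatic storage. */
void greeting_copy(char *out, size_t out_size)
{
    char local[] = "hello";   /* copy of the literal's bytes */
    local[0] = 'j';           /* fine: changes only the copy */
    if (out_size == 0)
        return;
    strncpy(out, local, out_size - 1);
    out[out_size - 1] = '\0'; /* always terminate the result */
}
```

Calling `greeting_copy` on a 16-byte buffer leaves "jello" in it, while `greeting_literal()` still yields "hello": the literal is the single shared instance, and only data that must be modifiable is copied.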

Jerry Coffin
+2  A: 

The answers to your questions are mostly wrapped up in the different semantics of the different storage classes:

  • Google "data segment"
  • Think about the difference in behavior between global and local variables.
  • Think about how constant and non-constant variables have different requirements when functions are called repeatedly (or, as Mehrdad says, recursively).
  • Think about the difference between static and non-static (automatic) variables, again in the context of multiple or recursive calls.
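The last bullet can be made concrete with a small sketch in C. The function names `count_static` and `count_auto` are hypothetical and chosen only for this illustration.

```c
/* Static storage: one instance, initialised once, and it keeps its
 * value between calls. */
int count_static(void)
{
    static int calls = 0;
    return ++calls;
}

/* Automatic storage: a fresh instance is created and initialised on
 * every call, so nothing carries over. */
int count_auto(void)
{
    int calls = 0;
    return ++calls;
}
```

Three successive calls to `count_static()` return 1, 2 and 3, while every call to `count_auto()` returns 1. The automatic variable has to be set up again on each call, which is why its initial value is stored with the code and written into the stack frame at run time.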
dmckee
+1  A: 
  1. If every function put its first variable at offset 0 and so on, then you would have to change the memory mapping each time you enter a function (you could not allocate all variables to unique addresses if you want recursion). This is doable, but with current hardware it's very slow. Furthermore, the address translation performed by virtual memory is not free either; it's actually quite complicated to implement efficiently.
    Addressing off ebp (or any other register) costs having a mux (to select the register) and an adder (to add the offset to the register). The time taken for this can often be overlapped with other operations.

  2. If you want to be able to modify the static value, you have to copy it to the stack. If you don't (by declaring it 'const'), then a good C compiler will not copy it to the stack.

augustss