I have read a lot of posts about "string literals" on SO, most of which have been about best-practices, or where the literal is NOT located in memory.

I am interested in where the string DOES get allocated/stored, etc.

I did find one intriguing answer here, saying:

Defining a string inline actually embeds the data in the program itself and cannot be changed (some compilers allow this by a smart trick, don't bother).

However, that answer was about C++, not to mention that it says not to bother.

I am bothering. =D

So my question is, again, where and how is my string literal kept? Why should I not try to alter it? Does the implementation vary by platform? Does anyone care to elaborate on the "smart trick?"

Thanks for any explanations.

+5  A: 

A common technique is to put string literals in a "read-only data" section, which gets mapped into the process's address space as read-only (which is why you can't change them).

It does vary by platform. For example, simpler chip architectures may not support read-only memory segments so the data segment will be writable.

Rather than trying to figure out a trick to make string literals changeable (it will be highly dependent on your platform and could change over time), just use arrays:

char foo[] = "...";

The compiler will arrange for the array to get initialized from the literal and you can modify the array.
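To make the difference concrete, here is a minimal sketch contrasting an array initialized from a literal with a pointer to the literal itself. The function name modify_array_copy and the "hello" strings are made up for illustration; the commented-out write is the undefined-behavior case and may or may not crash on your platform.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: edits an array copy of a literal and writes
 * "<array> <literal>" into out. */
void modify_array_copy(char *out, size_t out_size)
{
    char foo[] = "hello";      /* array initialized from the literal: writable */
    const char *lit = "hello"; /* points at the literal itself: treat as read-only */

    foo[0] = 'J';              /* fine: changes the array's own storage */
    /* ((char *)lit)[0] = 'J';    undefined behavior: the literal may be in read-only memory */

    snprintf(out, out_size, "%s %s", foo, lit);
}
```

Calling it with a 32-byte buffer leaves "Jello hello" in the buffer: the array changed, the literal did not.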

R Samuel Klatchko
+1 for arrays! thx
Norman Ramsey
Yes, I use arrays when I want to have mutable strings. I was just curious. Thanks.
Chris Cooper
A: 

It depends on the format of your executable. One way to think about it is that if you were programming in assembly, you might put string literals in the data segment of your assembly program. Your C compiler does something like that, but it all depends on what system your binary is being compiled for.

Parappa
+2  A: 

There is no one answer to this. The C and C++ standards just say that string literals have static storage duration, any attempt at modifying them gives undefined behavior, and multiple string literals with the same contents may or may not share the same storage.
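A short sketch of what those guarantees mean in practice; the function names greeting and literals_share_storage are hypothetical. Returning a pointer to a literal is safe because of its static storage duration, while whether two identical literals compare equal by address is left to the compiler, so the second function may return either value.

```c
#include <stdbool.h>

/* Safe: the literal has static storage duration, so the pointer
 * stays valid after the function returns. */
const char *greeting(void)
{
    return "hello";
}

/* Unspecified result: the compiler may or may not merge identical
 * literals into one object, so do not rely on either answer. */
bool literals_share_storage(void)
{
    const char *a = "hello";
    const char *b = "hello";
    return a == b;
}
```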

Depending on the system you're writing for, and the capabilities of the executable file format it uses, they may be stored along with the program code in the text segment, or they may have a separate segment for initialized data.
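One rough way to see where things end up on your own system is to print a few addresses and compare them. This is only a sketch: the names const_global, mutable_global and print_addresses are made up, and the addresses and their grouping depend entirely on the compiler, linker and operating system. On many hosted platforms the literal tends to land near the const global and far from the stack variable, but nothing guarantees that.

```c
#include <stdio.h>

static const int const_global = 42; /* often placed with read-only data */
static int mutable_global = 7;      /* initialized, writable data */

/* Hypothetical helper: prints the address of a literal next to other
 * kinds of objects and returns how many lines were printed. */
int print_addresses(void)
{
    int local = 0;                  /* automatic storage, usually the stack */
    const char *literal = "hello";
    int n = 0;

    n += printf("string literal : %p\n", (void *)literal) > 0;
    n += printf("const global   : %p\n", (void *)&const_global) > 0;
    n += printf("mutable global : %p\n", (void *)&mutable_global) > 0;
    n += printf("local variable : %p\n", (void *)&local) > 0;
    return n;
}
```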

How you determine the details also varies by platform, but most toolchains include tools that can tell you where the literals were placed. Some will even give you control over details like that, if you want it (e.g. GNU ld allows you to supply a script to tell it how to group data, code, etc.).

Jerry Coffin
+2  A: 

gcc makes a .rodata section that gets mapped "somewhere" in the address space and is marked read-only.

Visual C++ (cl.exe) makes a .rdata section for the same purpose.

You can look at the output from dumpbin (on Windows) or objdump (on Linux) to see the sections of your executable.

E.g.

>dumpbin vec1.exe
Microsoft (R) COFF/PE Dumper Version 8.00.50727.762
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file vec1.exe

File Type: EXECUTABLE IMAGE

  Summary

        4000 .data
        5000 .rdata  <-- here are strings and other read-only stuff.
       14000 .text
Alex