ansaurus

Question

NULL definition problem on 64 bit system

Answer 1

A:

I previously replied with the answer below. But then I saw that I had misinterpreted several pieces of information and given an incorrect answer. Just out of curiosity, I did the same test with VS2008 and got different results. This is just a brain exercise...

Why do you need the second one ? Both headers say the same thing.
And it does not even matter if you write 0 or NULL or ((void *)0)
All of them will take 8 bytes.

I did a quick test on a 64-bit platform with GCC 4.1.3

#include <string.h>

void str_concat(char *po_buf, int pi_max, ...)
{
    strcpy(po_buf, "Malkocoglu"); /* bogus */
}

int main()
{
    char buf[100];
    str_concat(buf, 100, "abc", 1234LL, "def", 5678LL, "ghi", 2345LL, "jkl", 6789LL, "mno", 3456LL, 0, "pqx", 0);
    return 1;
}

And this is the assembly generated by the compiler...

main:
.LFB3:
    pushq %rbp
.LCFI3:
    movq %rsp, %rbp
.LCFI4:
    subq $192, %rsp
.LCFI5:
    leaq -112(%rbp), %rdi
    movl $0, 64(%rsp)                          0
    movq $.LC2, 56(%rsp)                       "pqx"
    movl $0, 48(%rsp)                          0
    movq $3456, 40(%rsp)                       3456LL
    movq $.LC3, 32(%rsp)                       "mno"
    movq $6789, 24(%rsp)                       6789LL
    movq $.LC4, 16(%rsp)                       "jkl"
    movq $2345, 8(%rsp)                        2345LL
    movq $.LC5, (%rsp)                         "ghi"
    movl $5678, %r9d                           5678LL
    movl $.LC0, %r8d                           "def"
    movl $1234, %ecx                           1234LL
    movl $.LC1, %edx                           "abc"
    movl $100, %esi                            100
    movl $0, %eax
    call str_concat
    movl $1, %eax
    leave
    ret

Notice all the stack displacements are 8 byte...

The compiler treats 0 as if it were a 32-bit data type.
Although it uses the correct displacement on the
stack pointer, the value pushed should not be 32-bit!

I did the same test with VS2008 , the assembly output is as follows :

mov QWORD PTR [rsp+112], 0
lea rax, OFFSET FLAT:$SG3597
mov QWORD PTR [rsp+104], rax
mov QWORD PTR [rsp+96], 0
mov QWORD PTR [rsp+88], 3456  ; 00000d80H
lea rax, OFFSET FLAT:$SG3598
mov QWORD PTR [rsp+80], rax
mov QWORD PTR [rsp+72], 6789  ; 00001a85H
lea rax, OFFSET FLAT:$SG3599
mov QWORD PTR [rsp+64], rax
mov QWORD PTR [rsp+56], 2345  ; 00000929H
lea rax, OFFSET FLAT:$SG3600
mov QWORD PTR [rsp+48], rax
mov QWORD PTR [rsp+40], 5678  ; 0000162eH
lea rax, OFFSET FLAT:$SG3601
mov QWORD PTR [rsp+32], rax
mov r9d, 1234    ; 000004d2H
lea r8, OFFSET FLAT:$SG3602
mov edx, 100    ; 00000064H
lea rcx, QWORD PTR buf$[rsp]
call ?str_concat@@YAXPEADHZZ   ; str_concat

This time the compiler generates different code and treats 0 as a 64-bit data type (notice the QWORD keyword). Both the value and the stack displacement are correct. VS and GCC behave differently.

Malkocoglu 2009-11-04 14:51:48

Actually I just perked up: `int` is 4 bytes on a 64-bit platform, right?

Matthieu M. 2009-11-04 14:56:10

It depends on system/compiler/ABI but yes generally int is 4 bytes but 0 is not. 0 is a number, it is not a type (like int/long/char) and it is promoted to whatever compiler thinks it should be...

Malkocoglu 2009-11-04 14:58:20

Downvoted because 0 is *not* 8 bytes. I ran into this with some Gnome software in the very early days of Gentoo AMD64. In a variadic argument list, C does not know to convert 0 to a pointer so you must cast to void*.

Zan Lynx 2009-11-04 14:58:38

A constant 0 on a 64-bit system is 4 bytes. Just try to print sizeof(0) and sizeof(NULL) and you'll see. And in my case the size does matter, since I pass NULL on the stack.
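The size check suggested here can be sketched in a few lines of C. This is a minimal illustration, and the function name `report_zero_and_null_sizes` is hypothetical. In C an unsuffixed `0` has type `int`, so `sizeof(0)` equals `sizeof(int)` on every platform, while `sizeof(NULL)` depends on how the implementation defines the macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Prints the sizes discussed in the thread and returns sizeof(0).
   The printed numbers depend on the platform and on how NULL is defined. */
size_t report_zero_and_null_sizes(void)
{
    /* An unsuffixed 0 has type int in C, so this holds everywhere. */
    assert(sizeof(0) == sizeof(int));

    printf("sizeof(0)      = %zu\n", sizeof(0));
    printf("sizeof(NULL)   = %zu\n", sizeof(NULL));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return sizeof(0);
}
```

On an LP64 system the first line reports 4 and the last reports 8, which is exactly the mismatch that matters when a plain 0 is used where a pointer is expected.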

dimba 2009-11-04 15:01:52

@Zan Lynx - the function str_concat is heavily used in the application, and I would be happy with a solution where I can pass NULL, which is promised by C++ to be a null pointer of size 8 bytes.

dimba 2009-11-04 15:04:08

In the AMD64 ABI it says, "INTEGER This class consists of integral types that fit into one of the general purpose registers.". As the general purpose register is 64-bit wide, 0 is promoted to 64-bit. What GCC version is this ?

Malkocoglu 2009-11-04 15:16:57

gcc 4.1.2 (see 1st line of the question)

dimba 2009-11-04 15:25:54

Neither C nor C++ make an absolute requirement, but the "spirit" of both standards would be that `sizeof(int)*CHAR_BIT==64` on a 64-bit system. Problem is, neither provided a `short short` so compiler vendors don't have reasonable names for 32 and 16 bits types, but they have plenty of names for types bigger than `int`. As a result `int` ends up as 32 bits. Note that NULL may also be defined as `0ULL` in which case its type _would_ be 64 bits on those systems.

MSalters 2009-11-04 15:42:15

@Malkocoglu: Not only is it absolutely useless to "look at the assembly output" in this case, but your analysis of the output is incorrect. Displacements in the stack do not matter. What matters is that your `0`s are placed onto the stack by a `movl`. `movl` is a 4-byte copy, not 8 (`movq` would copy 8), meaning that the compiler treats your 0's as an ordinary 4-byte `int`, as it should. I.e. your compiler reserves 8 bytes for each argument, but initializes only as much as really necessary (4 in the case of `0`). If you interpret this argument as an 8-byte pointer, you'll get garbage in the high-order bytes.

AndreyT 2009-11-05 14:57:27

@AndreyT: http://www.x86-64.org/documentation/assembly.html look at the section named Implicit zero extend...

Malkocoglu 2009-11-05 15:07:32

@Malkocoglu: You are pretty lost. "Zero extension" is completely irrelevant here. There's no such thing as "implicit zero extension" when writing into *memory*. It only works with registers. The calling code puts exactly 4 bytes into memory and only 4 bytes will be initialized. The upper 4 bytes will be garbage. The function code will think that it needs to read 8 bytes, and it will read all 8, including the garbage. No "zero extension" in that case either.
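The partial initialization described here can be imitated in portable C. The sketch below is only a simulation under stated assumptions: the function name is hypothetical, the 0xFF fill stands in for whatever stale bytes happen to be in the argument slot, and memcpy replaces the real argument passing (reading a variadic int as a pointer is undefined behavior, so it cannot be demonstrated directly).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simulates an 8-byte argument slot holding stale data, stores a
   4-byte int 0 into it (what the movl does), and then reads the
   whole slot back as one 64-bit value, as the callee would. */
uint64_t read_slot_after_4_byte_store(void)
{
    unsigned char slot[8];
    int32_t zero = 0;
    uint64_t as_wide;

    assert(sizeof slot == sizeof as_wide);

    memset(slot, 0xFF, sizeof slot);         /* pretend garbage in the slot */
    memcpy(slot, &zero, sizeof zero);        /* only 4 of the 8 bytes are overwritten */
    memcpy(&as_wide, slot, sizeof as_wide);  /* the reader takes all 8 bytes */
    return as_wide;
}
```

Whatever the byte order, the value read back is nonzero, so a callee comparing it against a null pointer would not recognize the terminator.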

AndreyT 2009-11-05 15:22:27

@Malkocoglu: ... And all that on top of the fact that looking into some specific assembly output demonstrates absolutely nothing relevant to the original question, even if some particular compiler somehow happened to generate some particular code that would somehow manage to work in this case.

AndreyT 2009-11-05 15:24:26

I have a tendency to turn a snowball into an avalanche. I feel I am doomed...

Malkocoglu 2009-11-05 16:34:10

No, MSVC does not treat `0` as a 64-bit datatype, which you can easily verify by looking at the result of `sizeof(0)`. What you see in the disassembly is just that MSVC decided to clean up the garbage that would be left in the 8-byte memory word after placing a 4-byte `0` into it. It just felt like it. This means that the code might actually "work" on MSVC just because of that charitable behavior of MSVC. Nevertheless, it does not make the code even remotely correct.

AndreyT 2009-11-05 16:37:37

Okay, I accept; it is incorrect. But doesn't "It just felt like it" translate to the Windows ABI? I could not find a solid reference, but maybe the Windows (VS) ABI mandates that if it does not know the type, it treats it as 64-bit. Because of what it says here http://msdn.microsoft.com/en-us/library/ms235286.aspx

Malkocoglu 2009-11-05 17:09:46

Answer 2

+3 A:

Removing the __GNUG__ case, and inverting the ifdef/endif in the second file, BOTH files do:

#undef NULL
#if defined(__cplusplus)
#define NULL 0
#else
#define NULL ((void *)0)
#endif

Which is to say that they define NULL as ((void *)0) for C compilations and 0 for C++.

So the simple answer is "Don't compile as C++".

Your real problem is your desire to use NULL in your variadic argument list, combined with your compiler's unpredictable argument sizing. What you MIGHT try is writing "(void *)0" instead of NULL to terminate your list, and force the compiler to pass an 8-byte pointer instead of a 4-byte int.

John R. Strohm 2009-11-04 14:56:54

Since it is a list of 'const char *' values that is passed, it seems logical to use '(char *)0' rather than '(void *)0', though I agree the end result is indistinguishable.

Jonathan Leffler 2009-11-04 15:10:31

Answer 3

+1 A:

You may not be able to fix the includes because system includes are a twisty maze.

You might fix the problem by using (void*)0 or (char*)0 instead of NULL.

After considering it I am rejecting my previous idea of redefining NULL. That would be a bad thing to do and could mess up a lot of other code.

Zan Lynx 2009-11-04 15:01:23

Redefining NULL leads to madness greater than that which caused the question to be asked.

Jonathan Leffler 2009-11-04 15:07:31

Answer 4

+5 A:

One solution - possibly even the best, but certainly very reliable - is to pass an explicit null char pointer to your function calls:

str_concat(buffer, sizeof(buffer), "str1", "str2", ..., (char *)0);

or:

str_concat(buffer, sizeof(buffer), "str1", "str2", ..., (char *)NULL);

This is standard recommended practice for the execl() function in POSIX systems, for example, and for precisely the same reason - the trailing arguments of a variable-length argument list are subject to usual promotions (char or short to int; float to double), but cannot otherwise be type safe.

It is also why C++ practitioners generally avoid variable-length argument lists; they are not type safe.
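To make the terminator convention concrete, here is a minimal sketch of what the receiving side might look like. It is not the asker's actual function: the name `str_concat_sketch`, its parameter names, and the truncation policy are all assumptions for illustration, and it assumes every variadic argument is a `char *`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Concatenates the variadic char * arguments into buf, which has room
   for max bytes including the terminating '\0'. The argument list must
   end with an explicit null char pointer, e.g. (char *)0. */
void str_concat_sketch(char *buf, size_t max, ...)
{
    va_list ap;
    char *s;
    size_t used = 0;

    assert(buf != NULL && max > 0);
    buf[0] = '\0';

    va_start(ap, max);
    /* Each argument is read back as a pointer, so the terminator must
       have been passed as a pointer too, not as a plain int 0. */
    while ((s = va_arg(ap, char *)) != NULL) {
        size_t len = strlen(s);
        if (len > max - 1 - used)
            len = max - 1 - used;   /* truncate rather than overflow */
        memcpy(buf + used, s, len);
        used += len;
        buf[used] = '\0';
    }
    va_end(ap);
}
```

A call then looks like str_concat_sketch(buffer, sizeof(buffer), "str1", "str2", (char *)0); the cast guarantees that the last argument has the pointer type that va_arg reads back.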

Jonathan Leffler 2009-11-04 15:01:35

Answer 5

+12 A:

There's no "NULL definition problem" in this case. There's a problem with how you are trying to use NULL in your code.

NULL cannot be portably passed to variadic functions in C/C++ by itself. You have to explicitly cast it before passing, i.e. in your case you have to pass (const char*) NULL as the terminator of the argument list.

Your question is tagged as C++. In any case, regardless of size, in C++ NULL will always be defined as an integer constant. It is illegal in C++ to define NULL as a pointer. Since your function expects a pointer (const char *), no definition of NULL will ever work for it in C++ code.

For cleaner code you can define your own constant, like

const char* const STR_TERM = NULL;

and use it in the calls to your function. But you will never be able to meaningfully use just NULL for that purpose. Whenever a plain NULL is passed as a variadic argument, it is a blatant portability bug that has to be fixed.
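A typed terminator of this kind could be used as follows. This is a sketch under assumptions: `count_strings` is a hypothetical helper, not the asker's function, and callers are assumed to pass `const char *` values throughout so that the type read by va_arg matches the type passed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Typed terminator: always a const char *, whatever NULL expands to. */
static const char *const STR_TERM = NULL;

/* Counts the const char * arguments that precede STR_TERM. */
size_t count_strings(const char *first, ...)
{
    va_list ap;
    size_t n = 0;
    const char *s = first;

    va_start(ap, first);
    while (s != STR_TERM) {
        n++;
        s = va_arg(ap, const char *);  /* same type as the terminator */
    }
    va_end(ap);
    return n;
}
```

Because STR_TERM has a pointer type of its own, the call site no longer depends on whether NULL happens to be defined as 0, 0L, or something else.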

Added: your update claims that "C++ standard promises NULL of 8 byte size" (on a 64-bit platform I presume). This just doesn't make any sense. C++ standard does not promise anything like that about NULL.

NULL is intended to be used as an rvalue. It has no specific size and there's no valid use of NULL where its actual size might even remotely matter.

Quoting from ISO/IEC 14882:1998, section 18.1 'Types', paragraph 4:

The macro NULL is an implementation defined C++ null pointer constant in this International Standard (4.10).¹⁸⁰⁾

¹⁸⁰⁾ Possible definitions include 0 and 0L, but not (void*)0.

AndreyT 2009-11-04 15:04:10

Perfectly acceptable to pass a non-const null pointer - and shorter.

Jonathan Leffler 2009-11-04 15:08:07

"NULL cannot be portably passed to variadic functions in C/C++ by itself." - No, it can't be done in C++. I'm pretty sure it's fine in C.

Chris Lutz 2009-11-05 08:00:50

@Chris Lutz: No, it is not fine in C. How could you know whether you are passing `0`, `0L`, `(void *) 0` or something else?

AndreyT 2009-11-05 14:47:54
