views:

183

answers:

5

From the man page on my system:

void *memmove(void *dst, const void *src, size_t len);

DESCRIPTION
The memmove() function copies len bytes from string src to string dst.
The two strings may overlap; the copy is always done in a non-destructive
manner.

From the C99 standard:

6.5.8.5 When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, theycompare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

The emphasis is mine.

The arguments dst and src can be converted to pointers to char so as to alleviate strict aliasing problems, but is it possible to compare two pointers that may point inside different blocks, so as to do the copy in the correct order in case they point inside the same block?

The obvious solution is if (src < dst), but that is undefined if src and dst point to different blocks. "Undefined" means you should not even assume that the condition returns 0 or 1 (this would have been called "unspecified" in the standard's vocabulary).

An alternative is if ((uintptr_t)src < (uintptr_t)dst), which is at least unspecified, but I am not sure that the standard guarantees that when src < dst is defined, it is equivalent to (uintptr_t)src < (uintptr_t)dst). Pointer comparison is defined from pointer arithmetic. For instance, when I read section 6.5.6 on addition, it seems to me that pointer arithmetic could go in the direction opposite to uintptr_t arithmetic, that is, that a compliant compiler might have, when p is of type char*:

((uintptr_t)p)+1==((uintptr_t)(p-1)

This is only an example. Generally speaking very little seems to be guaranteed when converting pointers to integers.

This is a purely academic question, because memmove is provided together with the compiler. In practice, the compiler authors can simply promote undefined pointer comparison to unspecified behavior, or use the relevant pragma to force their compiler to compile their memmove correctly. For instance, this implementation has this snippet:

if ((uintptr_t)dst < (uintptr_t)src) {
            /*
             * As author/maintainer of libc, take advantage of the
             * fact that we know memcpy copies forwards.
             */
            return memcpy(dst, src, len);
    }

I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C. For instance, no-one ticked when answering this SO question.

+3  A: 

For two memory areas to be valid and overlapping, I believe you would need to be in one of the defined situations of 6.5.8.5. That is, two areas of an array, union, struct, etc.

The reason other situations are undefined are because two different objects might not even be in the same kind of memory, with the same kind of pointer. On PC architectures, addresses are usually just 32-bit address into virtual memory, but C supports all kinds of bizarre architectures, where memory is nothing like that.

The reason that C leaves things undefined is to give leeway to the compiler writers when the situation doesn't need to be defined. The way to read 6.5.8.5 is a paragraph carefully describing architectures that C wants to support where pointer comparison doesn't make sense unless it's inside the same object.

Also, the reason memmove and memcpy are provided by the compiler is that they are sometimes written in tuned assembly for the target CPU, using a specialized instruction. They are not meant to be able to be implemented in C with the same efficiency.

Lou Franco
Yes, but the specification of `memmove` does not say that it is invalid to call it with pointers inside different blocks. If it is invalid, the specification should say it so that programmers can use `memcpy` instead in this case.
Pascal Cuoq
@Pascal- It shouldn't be invalid to do that. Your implementation of memmove should be smart enough to detect this and run memcpy instead. The programmer shouldn't have to worry about which function to call.
bta
There is no way to detect that in just C, but yes, it needs to work in both cases. It isn't a requirement that memcpy and memmove need to be implementable with just C -- in fact, perfect versions probably cannot be, which is all the more reason why the standard wants compiler writers to provide it.
Lou Franco
+6  A: 

I think you're right, it's not possible to implement memmove efficiently in standard C.

The only truly portable way to test whether the regions overlap, I think, is something like this:

for (size_t l = 0; l < len; ++l) {
    if (src + l == dst) || (src + l == dst + len) {
      // they overlap, so now we can use comparison,
      // and copy forwards or backwards as appropriate.
      ...
      return dst;
    }
}
// No overlap, doesn't matter which direction we copy
return memcpy(dst, src, len);

You can't implement either memcpy or memmove all that efficiently in portable code, because the platform-specific implementation is likely to kick your butt whatever you do. But a portable memcpy at least looks plausible.

C++ introduced a pointer specialization of std::less, which is defined to work for any two pointers of the same type. It might in theory be slower than <, but obviously on a non-segmented architecture it isn't.

C has no such thing, so in a sense, the C++ standard agrees with you that C doesn't have enough defined behaviour. But then, C++ needs it for std::map and so on. It's much more likely that you'd want to implement std::map (or something like it) without knowledge of the implementation than that you'd want to implement memmove (or something like it) without knowledge of the implementation.

Steve Jessop
Oh, beautiful (comparison for equality of only valid pointers).
Pascal Cuoq
I hope that by "beautiful", you mean "My eyes! The goggles do nothing!" ;-)
Steve Jessop
No, I mean "beautiful" because the pseudo-code is technically, as far as I can see, an answer to the question, rather than a discussion about why the C standards are the way they are.
Pascal Cuoq
+2  A: 

For starters, the C standard is notorious for having problems in the details like this. Part of the problem is because C is used on multiple platforms and the standard attempts to be abstract enough to cover all current and future platforms (which might use some convoluted memory layout that's beyond anything we've ever seen). There is a lot of undefined or implementation-specific behavior in order for compiler writers to "do the right thing" for the target platform. Including details for every platform would be impractical (and constantly out-of-date); instead, the C standard leaves it up to the compiler writer to document what happens in these cases. "Unspecified" behavior only means that the C standard doesn't specify what happens, not necessarily that the outcome cannot be predicted. The outcome is usually still predictable if you read the documentation for your target platform and your compiler.

Since determining if two pointers point to the same block, memory segment, or address space depends on how the memory for that platform is laid out, the spec does not define a way to make that determination. It assumes that the compiler knows how to make this determination. The part of the spec you quoted said that result of pointer comparison depends on the pointers' "relative location in the address space". Notice that "address space" is singular here. This section is only referring to pointers that are in the same address space; that is, pointers that are directly comparable. If the pointers are in different address spaces, then the result is undefined by the C standard and is instead defined by the requirements of the target platform.

In the case of memmove, the implementor generally determines first if the addresses are directly comparable. If not, then the rest of the function is platform-specific. Most of the time, being in different memory spaces is enough to ensure that the regions don't overlap and the function turns into a memcpy. If the addresses are directly comparable, then it's just a simple byte copy process starting from the first byte and going forward or from the last byte and going backwards (whichever one will safely copy the data without clobbering anything).

All in all, the C standard leaves a lot intentionally unspecified where it can't write a simple rule that works on any target platform. However, the standard writers could have done a better job explaining why some things are not defined and used more descriptive terms like "architecture-dependent".

bta
I understand the compromises of defining a standard for existing and future implementations. My question was not a complaint about C (millions of programmers can't be all wrong), but having to make one own's classification between undefined-but-acceptable constructs and plain undefined constructs is a burden on each low-level C programmer (and on me!). And I can imagine that sometimes, you have to write your own memmove-like function because standard memmove is not exactly what you need; you'd like to write it in C, and you'd like it to avoid the intermediate copy.
Pascal Cuoq
Also, the C standard leaves things to define and document to the compiler writer, but some compilers are written by people who do not care to provide additional guarantees to those of the standard. I have heard of compiler writers wriggling out of penalties for generating incorrect code when accessing an `union` field other than the last one written to by invoking the standard. That shouldn't be allowed (gcc promotes this from undefined to implementation-defined, as an example of what compiler writers should typically document).
Pascal Cuoq
@Pascal: there are potential good reasons for that union example - in order to make cast-through-union work, you might have to make exceptions to optimisations that rely on the strict aliasing rules. I don't think that compiler writers "should" do this (except that because they want a lot of existing incorrect code to work, they have to): I think programmers shouldn't use the idiom.
Steve Jessop
@Steve Jessop That was an embedded compiler. It **had** to provide a way to convert some bits from a type to another (again, without requiring an intermediate copy) or it was unusable for its intended purpose. Gcc's choice to allow this when done through an union and disallowing it through pointer casts at least gives you a way to do it.
Pascal Cuoq
@Pascal: why does it have to provide a way to do it without a copy? That said, I think there are some cases where a union cast is valid in C99, assuming that you know the representations of the type involved. I can just never remember what they are.
Steve Jessop
+1  A: 

I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C

But it's not proof. There's absolutely no way to guarantee that you can compare two arbitrary pointers on an arbitrary machine architecture. The behaviour of such a pointer comparison cannot be legislated by the C standard or even a compiler. I could imagine a machine with a segmented architecture that might produce a different result depending on how the segments are organised in RAM or might even choose to throw an exception when pointers into different segments are compared. This is why the behaviour is "undefined". The exact same program on the exact same machine might give different results from run to run.

The oft given "solution" of memmove() using the relationship of the two pointers to choose whether to copy from the beginning to the end or from the end to the beginning only works if all memory blocks are allocated from the same address space. Fortunately, this is usually the case although it wasn't in the days of 16 bit x86 code.

JeremyP
I think it is proof that even if the language retains the fast comparison for pointers known to be in the same block, it must also provide a comparison that works as long as the pointers are of the same type. Steve Jessop's answer convinced me to switch to C++ :)
Pascal Cuoq
@Pascal Cuoq: Why restrict all of the pointers to objects of the same type to coming from the same address space? It works for modern flat address space architectures, but in the general case, may not even be logically possible.
JeremyP
"it must also provide a comparison" - well, what's actually needed for memmove is a way to test whether the memory regions overlap. If one points to "red" segment, and the other points to "blue" segment, then you know they don't overlap, which is enough that memmove can delegate to memcpy. You don't need there to be an order relationship between "red" and "blue" (although an implementation presumably could choose one, perhaps arbitrarily).
Steve Jessop
@Steve: I think the complaint most people make is that the implementation is not forced to impose such an order, and some people would like it to be forced to. Of course that would impose a huge performance hit to pointer comparison on such implementations. On the other hand, *my* complaint is that C still allows legacy things like non-flat memory. Thankfully POSIX adds some sanity.
R..
@R.. Agree with you except that the non flat memory model is not quite as legacy as those of us who program modern general purpose computers would like to think.
JeremyP
+1  A: 

Here's another idea, but I don't know if it's correct. To avoid the O(len) loop in Steve's answer, one could put it in the #else clause of an #ifdef UINTPTR_MAX with the cast-to-uintptr_t implementation. Provided that cast of unsigned char * to uintptr_t commutes with adding integer offsets whenever the offset is valid with the pointer, this makes the pointer comparison well-defined.

I'm not sure whether this commutativity is defined by the standard, but it would make sense, as it works even if only the lower bits of a pointer are an actual numeric address and the upper bits are some sort of black box.

R..
It doesn't necessarily work though if the upper bits of the integer value resulting from the converted pointer are the address, and the lower bits are the black box. Imagine a hypothetical system on which the smallest addressable unit of memory from assembly code is 4 bits. Then I *think* it's valid, and it's certainly fairly natural, for the conversion to `uintptr_t` to result in `(uintptr_t)(ptr+1) == ((uintptr_t)ptr) + 2`. The "black box" at the bottom here is just a single 0 bit, and could result in a false negative of the test "do they overlap?"
Steve Jessop
@Steve: excellent example.
R..