In case of integer overflow, what is the result of (unsigned int) * (int)? unsigned or int?
I was auditing the following function when this question came up. The function is vulnerable at line 17.
1. // Create a character array and initialize it with init[]
2. // repeatedly. The size of this character array is specified by
3. // w*h.
4. char *function4(unsigned int w, unsigned int h, char *init)
5. {
6. char *buf;
7. int i;
8.
9. if (w*h > 4096)
10. return (NULL);
11.
12. buf = (char *)malloc(4096+1);
13. if (!buf)
14. return (NULL);
15.
16. for (i=0; i<h; i++)
17. memcpy(&buf[i*w], init, w);
18.
19. buf[4096] = '\0';
20.
21. return buf;
22. }
Suppose both `w` and `h` are very large unsigned integers. The multiplication at line 9 can then wrap around and pass the validation.
Now the problem is at line 17, which multiplies `int i` by `unsigned int w`. If the result is `int`, the product can be negative, which accesses a position before `buf`. If the result is `unsigned int`, the product is never negative, which accesses a position after `buf`.
It's hard to write code to verify this: `int` is too large. Does anyone have ideas on this?
Added:
The question here is not how bad the function is, or how to improve it.
The question is:
`what is the result of (unsigned int) * (int) ? unsigned or int?`
Is there any documentation that specifies this? I am searching for it.
Added:
I guess there is no need to discuss whether `(unsigned int) * (int)` produces `unsigned int` or `int`, because from C's perspective they are just bytes. Therefore, the following code holds:
unsigned int x = 10;
int y = -10;
printf("%d\n", x * y); // print x * y in signed integer
printf("%u\n", x * y); // print x * y in unsigned integer
Therefore, it does not matter what type the multiplication returns. What matters is whether the consumer takes `int` or `unsigned`.
So, now the question becomes:
"Does the array indexer `somearray[value]` take an `int` as input,
or an `unsigned` as input?"