I want to make a dummy Win32 EXE file that is much larger than it should be. By default, a boilerplate Win32 EXE file is 80 KB. I want a 5 MB one for testing some other utilities.

My first idea was to add a resource, but as it turns out, embedded resources are not the same as 5 MB of code when it comes to memory allocation. Could I reference a large library and end up with a huge EXE file? If not, perhaps I could script a few thousand similar methods like AddNum1, AddNum2, and so on?

Any simple ideas are very appreciated.

+5  A: 

You can create big static arrays of dummy data. That would bump your EXE size, though it would not be real code.

jv42
That does seem like the simplest and easiest-to-control method for something like this.
Andrew Barber
I thought of this too when I saw the question, but won't it be optimized out?
legends2k
@legends2k: The simple answer to that is to turn off optimizations...
sth
I was thinking of including a boatload of windows libraries to make it bigger. Any merit to that?
Phil
@Phil maybe if you can link them statically. Otherwise it's just a bunch of exports/references.
enriquein
@sth: I know one can turn optimizations off, but I think there should be some other way to do it without losing optimizations, such as adding a binary resource (say, a 5 MB one) to the binary via a .rc file.
legends2k
Reason is, when testing, optimizations might be needed i.e. to match the actual non-bloated code. Also rc should work since OP said it's Win32.
legends2k
Resource will not work. Somehow the Win32 CreateProcess method KNOWS resources are not allocated in the same memory space.
Phil
Powerbasic has a compiler macro named #BLOAT that does this. Maybe other compilers can do this as well? I know people use this technique on trojans to attempt to match the real app's size.
enriquein
+14  A: 

What about simply defining a large static char array?

char const bigarray[5*1024*1024] = { 1 };

See also my other answer in this thread where I suggest statically linking to big libraries. This surely will pull in real code if you just reference enough code of the libraries.

EDIT: Added a non-zero initialization, as data containing zeros only is treated in an optimized fashion by the compiler/linker.

EDIT: Added reference to my other answer.

EDIT: Added const qualifier, so bigarray will be placed amongst code by many compilers.
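
To address the "won't it be optimized out?" worry from the comments, here is a minimal sketch of the same array with one deliberate use added. The names PADDING_SIZE and touch_padding are made up for illustration, and whether a given compiler or linker keeps the array still depends on its settings, so treat this as something to try rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical size constant: 5 MiB of padding. */
#define PADDING_SIZE (5u * 1024u * 1024u)

/* The non-zero first element means the array cannot be described as
   "all zeros", so its bytes have to be stored in the image.
   Only element 0 is set to 1; the remaining elements are 0. */
const unsigned char bigarray[PADDING_SIZE] = { 1 };

/* Reading through a volatile pointer gives the optimizer a use of the
   array that it is not allowed to remove. */
unsigned char touch_padding(size_t index)
{
    const volatile unsigned char *p = bigarray;
    return p[index % PADDING_SIZE];
}
```

Calling touch_padding(0) once from main should be enough of a reference; after building, compare the EXE size with and without the array to confirm it actually landed in the file.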

Peter G.
Not quite what I want to do. I want the EXE on disk to be larger, not the memory usage. Thank you though.
Phil
If you never use it, it should never get loaded into physical memory. So unless you're concerned about the impact it has on the available virtual address space, don't worry about it.
Tyler McHenry
@Phil: You say you want the size larger only on the disk and not the memory usage but then in the actual question, you say memory allocation should be 5MiB. Am I missing something?
legends2k
Tyler, I retract my initial response :) This looks like it may work well. Testing...
Phil
@legends2k: Sorry, the size on disk is important too. I am using CreateProcess, which allocates memory. I need it to allocate as much memory as the size of the file on disk.
Phil
Does it matter whether it's 5MB of code or 5MB of data? Generating 5MB of code is a lot harder.
Ferruccio
As far as I understand, this code will not make the executable bigger, only the memory allocated at runtime?
Klaim
@Klaim, static POD objects are allocated at link time which means they are in the executable.
Peter G.
Is it true for all compilers?
Klaim
@Klaim I know of no exception. It's also true that many compilers will place const static POD objects together with code in a read-only section. I added the const in my code example now.
Peter G.
Thanks, I thought it was not guaranteed but now that I think about my experience in embedded software, I remember that the size of the executable was dependent on the number of elements in a const static table... Thanks for the confirmation.
Klaim
+9  A: 
char big[5*1024*1024] = {1};

You need to initialize it to something other than 0 or the compiler/linker may optimize it.

Ferruccio
This will only initialize the first element, the rest will be zero. http://stackoverflow.com/questions/201101/how-to-initialize-an-array-in-c
SuperJames
That's true, but for the purposes of this question it doesn't matter exactly what it's initialized to. Setting the first element to a non-zero value seems to be enough to prevent the compiler from optimizing that variable. In other words when you set it to all zeros the compiler simply says "there should be 5 million zeroes here". Whereas this forces it to say "there's a one, followed by a zero, followed by a zero..."
Ferruccio
+1  A: 

Write a program that generates a lot of code.

printf("000000000");
printf("000000001");
// ...
printf("010000000");
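
A rough sketch of such a generator, assuming the generated calls are wrapped in a single function. The names emit_printf_calls and bloat are invented here, and very large counts may run into compiler limits on function size, so splitting the output across several functions might be needed.

```c
#include <stdio.h>

/* Writes a C source file defining bloat(), a function made of `count`
   printf calls that each print a different 9-digit string.
   Returns `count` on success, or -1 on a write error. */
int emit_printf_calls(FILE *out, int count)
{
    int i;

    if (fputs("#include <stdio.h>\n\nvoid bloat(void)\n{\n", out) == EOF)
        return -1;
    for (i = 0; i < count; i++) {
        /* %09d zero-pads the counter: 000000000, 000000001, ... */
        if (fprintf(out, "    printf(\"%09d\");\n", i) < 0)
            return -1;
    }
    if (fputs("}\n", out) == EOF)
        return -1;
    return count;
}
```

Call it from main with stdout (or a file opened for writing) and the number of lines you want, then compile the output together with the real program and call bloat() somewhere so it counts as referenced.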
Amnon
Yes that's the most obvious way to produce a lot of extra code (as opposed to just static data). You can also do it using copy-and-paste, leaning on the paste key.
ChrisW
@ChrisW: if you're using copy-and-paste, exponential copy-and-paste is better than leaning on a key: Ctrl-A,C,V,V, repeat log(n) times
Amnon
+3  A: 

Use a big array of constant data, like explicit strings:

char *dummy_data[] = {
    "blajkhsdlmf..(long script-generated random string)..",
    "kjsdfgkhsdfgsdgklj..(etc...)...jldsjglkhsdghlsdhgjkh",
};

Unlike variable data, constant data often falls in the same memory section as the actual code, although this may be compiler- or linker-dependent.

Edit: I tested the following and it works on Linux:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, j;

    puts("char *dummy_data[] = {");
    for (i = 0; i < 5000; i++) {
        fputs("    \"", stdout);
        for (j = 0; j < 1000; j++) putchar('a' + rand() % 26);
        puts("\",");
    }
    puts("};");
    return 0;
}

Both this code and its output compile cleanly.

Edgar Bonet
I tried something like this and ended up with a C2026 error. Looks like there is a 16K limit on arrays?
Phil
If your strings are 1K long, then you only need 5K elements in the array, which makes the array itself about 20K with 4-byte pointers (it's an array of pointers to constant strings).
Edgar Bonet
+6  A: 

How about just adding binary zeroes to the end of the .exe?

kotlinski
Why not add some hex zeros? Those ones are bigger :P.
notJim
A: 

If all else fails, you could still create an assembly language source file where you have an appropriate number of db statements emitting bytes into the code segment, and link the resulting code object to your program as extern "C" { ... }.

You might need to play with the compiler/linker to prevent the linker from optimizing away that dummy "code" object.

ndim
+3  A: 

I've found that even with optimizations, raw strings are kept as is in the compiled executable file.

So the way to go is:

  • go to http://lipsum.org/
  • generate a lot of text
  • add a cpp in your program
  • add a static const string that will have the generated text as value
  • compile
  • check the size.

If your compiler has a limit on raw string size (?), then just make one paragraph per static string.

The added size should be easy to guess.

Klaim
+2  A: 

You could try creating some sort of recursive template that would generate a lot of different instantiations. This could possibly cause a big increase in code size.

Mark B
Also compilation time; templates are one of the biggest reasons C++ compiles so slowly.
imgx64
A: 

Use #define to define lots of macros that each hold a very long string, and use those macros in many places in your program.

sadananda salam
+7  A: 

Fill the EXE file with NOPs in assembler.

phpMyID
I was about to suggest that.
George Edison
Damn, if I had known I would get voted up for this, I would have used my real profile =|
phpMyID
+1  A: 

I admit, I'm a Linux/UNIX guy. Is it possible to statically link an executable in Windows? You could then reference some heavy libs and blow up your code size as much as you want without writing too much code yourself.

Another idea I pondered while reading your comment to my first answer is appending zeros to your file. As said, I'm no Windows expert, so this might not work.

Peter G.
"Is it possible to statically link an executable in Windows?" -- Yes it is, but the linker will only link/include the objects from the library which are needed (referenced) by the application.
ChrisW
@ChrisW : There may be an option like "--whole-archive" for ld in linux to force the linker to include everything ??
Elenaher
Yes this works. Try statically linking wxWidgets.
Quandary
+8  A: 

If it's the file size you want to increase then append a text file to the end of the exe of the required size.

I used to do this when customers would complain of small exes. They didn't realize that small exes are just as professional as larger exes. In fact in some languages there is a bloat() command to increase the size of exes, usually in BASIC compilers.

EDIT: Found an old link to a piece of code that people use: http://www.purebasic.fr/english/viewtopic.php?f=12&t=38994

An example: http://programmers.stackexchange.com/questions/2051/what-is-the-craziest-stupidest-silliest-thing-a-client-boss-asked-you-to-do/2698#2698

Gary Willoughby
What??!! Customers complaining of small EXEs? I don't think I've ever dealt with a customer that dumb.
ptomato
Yep, believe it or not! It's similar to a heavy camera. The heavier it is, the 'better' it must be! Bloat initial program releases and with each successive update claim smaller memory footprints due to further optimizations! ;)
Gary Willoughby
Isn't there some checksum validation for EXEs that will fail if you append a file?
Amnon
Not unless you've programmed the exe to check itself.
Gary Willoughby
There is one advantage to heavy cameras: they are less prone to camera shake (Newton's F=ma and all that!). Can't really say the same about large EXEs though :-)
psmears
+2  A: 

Use Boost and compile the executable with debug information.

tstenner
+1  A: 

Add a 5MB (bmp) image.

JackN
A: 

You could do this:

REM generate gibberish of the desired size
dd if=/dev/random of=entropy count=5000k bs=1
REM copy the entropy to the end of the file
copy /b someapp.exe + entropy somefatapp.exe

If it were a batch file, you could even add it as a post compilation step so it happened automatically.

You can generally copy as much information as you want to the end of an EXE. All the code and resources are stored as offsets from the beginning of the file, so increasing its size shouldn't affect it.

(I'm assuming you have dd in Windows. If not, get it).
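
If dd is not available, the same append can be sketched in plain C. This version writes zero bytes rather than random data, and the function name append_padding is made up for the example; it only grows the file on disk and does nothing to the sections the loader maps.

```c
#include <stdio.h>

/* Appends `count` zero bytes to the file at `path`.
   Returns 0 on success, -1 on failure. */
int append_padding(const char *path, unsigned long count)
{
    static const unsigned char zeros[4096]; /* zero-initialized chunk */
    FILE *f = fopen(path, "ab");            /* "ab": append in binary mode */
    int ok = 1;

    if (f == NULL)
        return -1;
    while (ok && count > 0) {
        /* Write at most one chunk per iteration. */
        size_t chunk = count < sizeof zeros ? (size_t)count : sizeof zeros;
        if (fwrite(zeros, 1, chunk, f) != chunk)
            ok = 0;
        count -= (unsigned long)chunk;
    }
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For example, append_padding("somefatapp.exe", 5UL * 1024 * 1024) on a copy of the original EXE would add 5 MiB to it.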

Seth
A: 

Write a code generator that generates arbitrary random functions. The only trick then is making sure that it doesn't get optimized out and with separate compilation that shouldn't be hard.
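
As a hedged sketch of what such a generator could look like: the one below emits deterministic rather than random functions, borrowing the AddNum naming from the question. The names emit_functions and bloat_table are invented for the example, and count is assumed to be at least 1 so the generated table is not empty. Each function gets different constants so that a linker which folds identical functions has less to merge, and the table gives every function at least one reference.

```c
#include <stdio.h>

/* Writes `count` small functions (AddNum0, AddNum1, ...) followed by a
   table of pointers to all of them. Assumes count >= 1.
   Returns 0 on success, -1 on a write error. */
int emit_functions(FILE *out, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        /* Different constants per function, so no two bodies are identical. */
        if (fprintf(out, "int AddNum%d(int x) { return x * %d + %d; }\n",
                    i, i + 3, i) < 0)
            return -1;
    }
    /* The table references every generated function by name. */
    if (fputs("\nint (*const bloat_table[])(int) = {\n", out) == EOF)
        return -1;
    for (i = 0; i < count; i++) {
        if (fprintf(out, "    AddNum%d,\n", i) < 0)
            return -1;
    }
    return fputs("};\n", out) == EOF ? -1 : 0;
}
```

Compiling the generated file as its own translation unit, as suggested above, and reading bloat_table from the main program should make it harder for the toolchain to drop the functions, though that is worth checking against the final EXE size.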

BCS
+1  A: 

After you do all the methods listed here, compile with the debug flag and with the highest optimization flag (gcc -g -O3).

Aboelnour
A: 

Statically link wxWidgets to your application. It will instantly become 5 MB large.

Quandary
A: 

Add lots of resources

myeviltacos