Hello all, complete newbie here,

I need to find a way to store 250 KB of plain text numbers inside my program's executable file.

Usually, I would put the data in a separate file and let the program read it while it is running, but that's not an option here. Instead, the program and the data need to be in one executable file.

I have absolutely no idea how to do it (except writing 250.000 #defines :-) and I'd appreciate any suggestions.

Thank you very much!

+4  A: 

Store it as a const array:

/* Maximum number of digits in a number, adjust as necessary */
#define NUMBER_MAX_LENGTH 16

/* How many numbers you have (in this case 250K), adjust as necessary */
#define NUMBER_OF_NUMBERS (250 * (1 << 10))

const char data[NUMBER_OF_NUMBERS][NUMBER_MAX_LENGTH+1] =
 { "12345", "2342841", "129131", "18317", /* etc */ };

Presumably you know your data set so you can come up with the appropriate value for NUMBER_MAX_LENGTH in your case.

You can also of course write a script that transforms a flat file of numbers into this format. If you want, you could even keep the numbers in a plain-text data file and have the script generate the corresponding C code as above during your build.

I wrote it that way because you said "plain text numbers", indicating that you need them as strings for some reason. If you'd rather have them as integers, it's even simpler:

/* How many numbers you have (in this case 250K), adjust as necessary */
#define NUMBER_OF_NUMBERS (250 * (1 << 10))

const int data[NUMBER_OF_NUMBERS] =
 { 12345, 2342841, 129131, 18317, /* etc */ };

This assumes that none of your numbers is too large to fit in an int.

Tyler McHenry
You can just use `data[]` instead of using a number of numbers macro.
Billy ONeal
+8  A: 

How about an array of some sort? Just put that definition in a file and compile it into your program:

int external_data[] =
{
    ...
};

You can have the compiler tell you how many elements are in external_data:

/* Element count; the largest valid index is one less than this. */
size_t external_data_count = sizeof(external_data) / sizeof(*external_data);
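As a minimal sketch of how that computed count might be used, the following walks the whole table without hard-coding its length. The four values and the names `external_data_count` and `sum_external_data` are placeholders chosen for illustration, not part of the original answer.

```c
#include <stddef.h>

/* Placeholder values; the real table would hold the full data set. */
static const int external_data[] = { 12345, 2342841, 129131, 18317 };

/* Element count, computed by the compiler from the initializer list. */
static const size_t external_data_count =
    sizeof(external_data) / sizeof(external_data[0]);

/* Sum every element, using the computed count as the loop bound. */
long long sum_external_data(void)
{
    long long total = 0;
    for (size_t i = 0; i < external_data_count; i++)
        total += external_data[i];
    return total;
}
```

Adding or removing entries in the initializer changes the count automatically, so the loop bound never goes stale.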
R Samuel Klatchko
+1 for using the empty bracket syntax.
Billy ONeal
+1 for suggesting the definition in a *separate* file. I'm currently using this technique, and when the data changes I only have to change that file and rebuild.
Thomas Matthews
A: 

It sounds like you're trying to avoid putting it in a source file, but that's exactly what I'd do:

int numbers[250000] = {1, 2, ...};

It's technically possible to keep them as a plain file and write a linker directive file that creates a new data section of the proper size and combines them, but there's really no reason to. Put that definition in a separate file and #include it into the file that needs it.

Michael Mrozek
Note you could just make that `numbers[]` and the compiler will count for you.
Billy ONeal
Yeah, but if I know how many things are supposed to be in the array I like to include it, both so people looking at it know the size immediately, and so if I mess up and forget/duplicate one I'll get a compile-time error
Michael Mrozek
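One caveat on the explicit size: in C, an initializer list with too many entries is diagnosed (some compilers only warn), but a list with too few entries is accepted and the rest of the array is silently zero-filled. A sketch that gets a compile-time check in both directions, assuming a C11 compiler, is to leave the brackets empty and assert the count. `EXPECTED_COUNT`, `NUMBERS_COUNT`, `numbers_count`, and the four values are hypothetical.

```c
#include <stddef.h>

/* Hypothetical expected size; this would be 250000 for the real data. */
#define EXPECTED_COUNT 4

/* Empty brackets: the compiler counts the initializers. */
static const int numbers[] = { 1, 2, 3, 4 };

#define NUMBERS_COUNT (sizeof(numbers) / sizeof(numbers[0]))

/* C11: compilation fails if an entry is missing or duplicated. */
_Static_assert(NUMBERS_COUNT == EXPECTED_COUNT,
               "numbers[] has the wrong number of entries");

/* Expose the count to other code. */
size_t numbers_count(void)
{
    return NUMBERS_COUNT;
}
```

The expected size stays visible to readers in one place, which keeps the documentation benefit of writing the size explicitly.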
A: 

You could adapt this solution to numbers:

static const wchar_t *systemList[] = {
    L"actskin4.ocx",
    L"advpack.dll",
    L"asuninst.exe",
    L"aswBoot.exe",
    L"AvastSS.scr",
    L"avsda.dll",
    L"bassmod.dll",
    L"browseui.dll",
    L"CanonIJ Uninstaller Information",
    L"capicom.dll",
    L"cdfview.dll",
    L"cdm.dll",
    L"d3dx9_24.dll",
    L"d3dx9_25.dll",
    L"d3dx9_27.dll",
    L"d3dx9_28.dll",
    L"d3dx9_29.dll",
    L"d3dx9_30.dll",
    L"danim.dll",
    L"dfrgntfs.exe",
    L"dhcpcsvc.dll",
    L"dllhost.exe",
    L"dnsapi.dll",
    L"drivers\\aavmker4.sys",
    L"drivers\\apt.sys",
    L"drivers\\aswFsBlk.sys",
    L"drivers\\aswmon.sys",
    L"drivers\\aswmon2.sys",
    L"drivers\\aswRdr.sys",
    L"drivers\\aswSP.sys",
    L"drivers\\aswTdi.sys",
    L"drivers\\avg7core.sys",
    L"drivers\\avg7rsw.sys",
    L"drivers\\avg7rsxp.sys",
    L"drivers\\avgclean.sys",
    L"drivers\\avgmfx86.sys",
    L"drivers\\avgntdd.sys",
    L"drivers\\avgntmgr.sys",
    L"drivers\\avgtdi.sys",
    L"drivers\\avipbb.sys",
    L"drivers\\cmdmon.sys",
    L"drivers\\gmer.sys",
    L"drivers\\inspect.sys",
    L"drivers\\klick.sys",
    L"drivers\\klif.sys",
    L"drivers\\klin.sys",
    L"drivers\\pxcom.sys",
    L"drivers\\pxemu.sys",
    L"drivers\\pxfsf.sys",
    L"drivers\\pxrd.sys",
    L"drivers\\pxscrmbl.sys",
    L"drivers\\pxtdi.sys",
    L"drivers\\rrspy.sys",
    L"drivers\\rrspy64.sys",
    L"drivers\\ssmdrv.sys",
    L"drivers\\UMDF",
    L"drivers\\USBSTOR.SYS",
    L"DRVSTORE",
    L"dxtmsft.dll",
    L"dxtrans.dll",
    L"en-us",
    L"extmgr.dll",
    L"fntcache.dat",
    L"hal.dll",
    L"icardie.dll",
    L"ie4uinit.exe",
    L"ieakeng.dll",
    L"ieaksie.dll",
    L"ieakui.dll",
    L"ieapfltr.dat",
    L"ieapfltr.dll",
    L"iedkcs32.dll",
    L"ieframe.dll",
    L"iepeers.dll",
    L"iernonce.dll",
    L"iertutil.dll",
    L"ieudinit.exe",
    L"ieui.dll",
    L"imon1.dat",
    L"inseng.dll",
    L"iphlpapi.dll",
    L"java.exe",
    L"javaw.exe",
    L"javaws.exe",
    L"jgdw400.dll",
    L"jgpl400.dll",
    L"jscript.dll",
    L"jsproxy.dll",
    L"kbdaze.dll",
    L"kbdblr.dll",
    L"kbdbu.dll",
    L"kbdkaz.dll",
    L"kbdru.dll",
    L"kbdru1.dll",
    L"kbdtat.dll",
    L"kbdur.dll",
    L"kbduzb.dll",
    L"kbdycc.dll",
    L"kernel32.dll",
    L"legitcheckcontrol.dll",
    L"libeay32_0.9.6l.dll",
    L"Macromed",
    L"mapi32.dll",
    L"mrt.exe",
    L"msfeeds.dll",
    L"msfeedsbs.dll",
    L"msfeedssync.exe",
    L"msftedit.dll",
    L"mshtml.dll",
    L"mshtmled.dll",
    L"msrating.dll",
    L"mstime.dll",
    L"netapi32.dll",
    L"occache.dll",
    L"perfc009.dat",
    L"perfh009.dat",
    L"pncrt.dll",
    L"pndx5016.dll",
    L"pndx5032.dll",
    L"pngfilt.dll",
    L"px.dll",
    L"pxcpya64.exe",
    L"pxdrv.dll",
    L"pxhpinst.exe",
    L"pxinsa64.exe",
    L"pxinst.dll",
    L"pxmas.dll",
    L"pxsfs.dll",
    L"pxwave.dll",
    L"rasadhlp.dll",
    L"rasmans.dll",
    L"riched20.dll",
    L"rmoc3260.dll",
    L"rrsec.dll",
    L"rrsec2k.exe",
    L"shdocvw.dll",
    L"shell32.dll",
    L"shlwapi.dll",
    L"shsvcs.dll",
    L"sp2res.dll",
    L"spmsg.dll",
    L"ssiefr.EXE",
    L"STKIT432.DLL",
    L"streamhlp.dll",
    L"SWSC.exe",
    L"tzchange.exe",
    L"url.dll",
    L"urlmon.dll",
    L"vsdata.dll",
    L"vsdatant.sys",
    L"vsinit.dll",
    L"vsmonapi.dll",
    L"vspubapi.dll",
    L"vsregexp.dll",
    L"vsutil.dll",
    L"vswmi.dll",
    L"vsxml.dll",
    L"vxblock.dll",
    L"webcheck.dll",
    L"WgaLogon.dll",
    L"wgatray.exe",
    L"wiaservc.dll",
    L"windowspowershell",
    L"winfxdocobj.exe",
    L"wmp.dll",
    L"wmvcore.dll",
    L"WREGS.EXE",
    L"WRLogonNtf.dll",
    L"wrlzma.dll",
    L"wuapi.dll",
    L"wuauclt.exe",
    L"wuaueng.dll",
    L"wucltui.dll",
    L"wups.dll",
    L"wups2.dll",
    L"wuweb.dll",
    L"x3daudio1_0.dll",
    L"xactengine2_0.dll",
    L"xactengine2_1.dll",
    L"xactengine2_2.dll",
    L"xinput1_1.dll",
    L"xinput9_1_0.dll",
    L"xmllite.dll",
    L"xpsp3res.dll",
    L"zlcomm.dll",
    L"zlcommdb.dll",
    L"ZPORT4AS.dll"
};
Billy ONeal
+1  A: 

I agree with the previous answers. The best way is to simply store the data in the source code and compile it into the program. For the sake of argument, you could also study the executable file format, append the data to the executable (this is how a lot of viruses work), and have the program read the data back from its own file. http://refspecs.freestandards.org/elf/elf.pdf describes the ELF executable format. Once again, this is for the sake of argument and is not recommended.

Romain Hippeau
+2  A: 

Let's assume the numbers are constants. Let's assume that you can compute this list once, in a "pre-compilation" stage. Let's assume that there is a function that can "return" that list.

Stage one: write an application that calls getFooNumber() and works perfectly. Nice.

Stage two: Take that function and put it in another project. Now, let's write a small application that will generate the 250,000 lines of C code.

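#include <stdio.h>

#define MAX_BLABLA 250000

/* The function from stage one; link it into this generator. */
int getFooNumber(long i);

int main(void)
{
  FILE *f = fopen("fooLookupTable.h", "w");
  long i;

  if (f == NULL)
    return 1;

  fprintf(f, "#ifndef FOO_HEADER\n");
  fprintf(f, "#define FOO_HEADER\n\n");

  /* int rather than char, so values printed with %d fit in an element */
  fprintf(f, "int blabla[] = {\n\t");
  for (i = 0; i < MAX_BLABLA; i++)
  {
     fprintf(f, "%d", getFooNumber(i));
     if (i + 1 != MAX_BLABLA)
         fprintf(f, ",");
     if ((i + 1) % 10 == 0)   /* ten values per line */
         fprintf(f, "\n\t");
  }
  fprintf(f, "\n};\n\n");
  fprintf(f, "#endif // FOO_HEADER\n");
  fclose(f);
  return 0;
}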

This will create the list Billy ONeal talked about.

Stage 3: Then take the header file you just created in stage 2 and use it inside the first project, so that the new getFooNumber() returns the value from the lookup table.

Stage 4: Learn to use Qt, and understand that you can embed the file directly and load it using QFile(":application/numberz.txt").

Notes:
* The C code above is untested.
* If you are using Windows or Mac, you can probably do something similar with the resource system (the Mac has a similar thing, no?).
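Stage 3 might look roughly like the sketch below. The inline `blabla` array stands in for the contents of the generated fooLookupTable.h, with placeholder values. The `long` parameter, the bounds check, and the -1 error return are assumptions, not something the answer specifies.

```c
#include <stddef.h>

/* Stand-in for the generated fooLookupTable.h (placeholder values). */
static const int blabla[] = { 10, 20, 30, 40 };

#define BLABLA_COUNT (sizeof(blabla) / sizeof(blabla[0]))

/*
 * New getFooNumber(): a table lookup instead of a computation.
 * Returns -1 for an out-of-range index (assumed error convention).
 */
int getFooNumber(long i)
{
    if (i < 0 || (size_t)i >= BLABLA_COUNT)
        return -1;
    return blabla[i];
}
```

Callers from stage one keep using getFooNumber() unchanged; only its body moves from computing a value to reading the table.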

elcuco
+1, creating a little program to generate C code from data is very handy.
nos
+5  A: 

You could just generate an array definition. For example, suppose you have numbers.txt:

$ head -5 numbers.txt
0.99043748698114
0.0243802034269436
0.887296518349228
0.0644020236531517
0.474582201929554

I've generated it for the example using:

$ perl -E'say rand() for (1..250_000)' >numbers.txt

Then, to convert it to a C array definition, you could use a script:

$ perl -lpE'BEGIN{ say "double data[] = {"; }; 
>     END{ say "};" }; 
>     s/$/,/' > data.h < numbers.txt 

It produces:

$ head -5 data.h
double data[] = {
0.99043748698114,
0.0243802034269436,
0.887296518349228,
0.0644020236531517,

$ tail -5 data.h
0.697015237317363,
0.642250552146166,
0.00577098769553785,
0.249176256744811,
};

It could be used in your program as follows:

#include <stdio.h>    
#include "data.h"

int main(void) {
  // print first and last numbers
  printf("%g %g\n", data[0], data[sizeof(data)/sizeof(*data)-1]);
  return 0;
}

Run it:

$ gcc *.c && ./a.out
0.990437 0.249176
J.F. Sebastian
++ for code generation
Eli Bendersky
A: 

What platform are you running on? If you are on Windows and the numbers won't change over time, just add your text file to the program's resources using the resource compiler, and read it from your code.

Eugene Mayevski 'EldoS Corp
+2  A: 

You can use the xxd command with the -i option to convert any file to an unsigned char array definition in C. If you are on Windows, you can run it under Cygwin.
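For a file named numbers.txt, `xxd -i numbers.txt` emits an array called `numbers_txt` and a length variable called `numbers_txt_len`. The sketch below uses a hand-written stand-in of that shape (the bytes of "12\n345\n6\n") and parses it back into integers. The embedded bytes are not NUL-terminated, so the parser walks them by length instead of calling strtol. `parse_numbers` is a hypothetical helper that handles only non-negative decimal integers.

```c
#include <stddef.h>

/* Hand-written stand-in for xxd -i output: the bytes of "12\n345\n6\n". */
unsigned char numbers_txt[] = {
  0x31, 0x32, 0x0a, 0x33, 0x34, 0x35, 0x0a, 0x36, 0x0a
};
unsigned int numbers_txt_len = 9;

/*
 * Parse non-negative decimal integers separated by non-digit bytes.
 * Stores at most max_out values in out and returns how many were stored.
 */
size_t parse_numbers(long *out, size_t max_out)
{
    size_t count = 0;
    long value = 0;
    int in_number = 0;

    for (unsigned int i = 0; i < numbers_txt_len; i++) {
        unsigned char c = numbers_txt[i];
        if (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            in_number = 1;
        } else if (in_number) {
            /* A separator ends the current number. */
            if (count < max_out)
                out[count++] = value;
            value = 0;
            in_number = 0;
        }
    }
    /* Handle a final number with no trailing separator. */
    if (in_number && count < max_out)
        out[count++] = value;
    return count;
}
```

In a real build, the two definitions at the top would come from the generated file rather than being typed by hand.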

epatel
A: 

This is not the solution (that was given before), but: don't put the data in a header file. Write a header that declares a function returning the array, then define the array and the function in a .c file. Otherwise, each file that includes the header defines the array again, and you will end up in a compilation mess...
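A minimal sketch of that split, shown in one listing for brevity. The file names numbers.h and numbers.c, the function `get_numbers`, and the three values are hypothetical.

```c
#include <stddef.h>

/* ---- numbers.h (hypothetical): declaration only, safe to include anywhere ---- */
const double *get_numbers(size_t *count);

/* ---- numbers.c (hypothetical): the single definition of the data ---- */
static const double numbers[] = { 0.5, 0.25, 0.125 };  /* placeholder values */

/* Return a pointer to the table; optionally report its element count. */
const double *get_numbers(size_t *count)
{
    if (count != NULL)
        *count = sizeof(numbers) / sizeof(numbers[0]);
    return numbers;
}
```

Because the array is `static` inside the .c file, only that one file is recompiled when the data changes, and no other translation unit can define it a second time.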

Markus Pilman