Did you compile with debugging symbols enabled? That could account for a large portion of the size. Also how are you determining the size of the binary? Assuming you're on a UNIX-like platform are you using a straight "ls -l
" or the "size
" command. The two could give greatly different results if the binary contains debugging symbols. For example, here are the results I get when building the Boost.Regex "credit_card_example.cpp" example.
$ g++ -g -O3 foo.cpp -lboost_regex-mt
$ ls -l a.out
-rwxr-xr-x 1 void void 483801 2010-05-20 10:36 a.out
$ size a.out
text data bss dec hex filename
73330 492 336 74158 121ae a.out
Similar results occur when just generating the object file:
$ g++ -c -g -O3 foo.cpp
$ ls -l foo.o
-rw-r--r-- 1 void void 622476 2010-05-20 10:40 foo.o
$ size foo.o
text data bss dec hex filename
49119 4 40 49163 c00b foo.o
EDIT: Added some static linking results ...
Here's the binary size when statically linking. It's closer to what you're getting:
$ g++ -static -g -O3 foo.cpp -lboost_regex-mt -lpthread
$ ls -l a.out
-rwxr-xr-x 1 void void 2019905 2010-05-20 11:16 a.out
$ size a.out
text data bss dec hex filename
1204517 5184 41976 1251677 13195d a.out
It's also possible that much of the large size is coming from other libraries the Boost.Regex library depends on. On my Ubuntu box, the dependencies for the Boost.Regex shared library are the following:
$ ldd /usr/lib/libboost_regex-mt.so.1.38.0
linux-gate.so.1 => (0x0053f000)
libicudata.so.40 => /usr/lib/libicudata.so.40 (0xb6a38000)
libicui18n.so.40 => /usr/lib/libicui18n.so.40 (0x009e0000)
libicuuc.so.40 => /usr/lib/libicuuc.so.40 (0x00672000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0x001e2000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x001eb000)
libm.so.6 => /lib/tls/i686/cmov/libm.so.6 (0x00110000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x009be000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0x00153000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x002dd000)
/lib/ld-linux.so.2 (0x00e56000)
The ICU libraries can get quite large. Besides debugging symbols, perhaps they are the primary contributors to the size of your binary. Furthermore, in the statically linked case, it looks like the Boost.Regex library itself is comprised of large object files:
$ size --totals /usr/lib/libboost_regex-mt.a | sort -n
0 0 0 0 0 regex_debug.o (ex /usr/lib/libboost_regex-mt.a)
0 0 0 0 0 usinstances.o (ex /usr/lib/libboost_regex-mt.a)
0 0 0 0 0 w32_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
text data bss dec hex filename
435 0 0 435 1b3 regex_raw_buffer.o (ex /usr/lib/libboost_regex-mt.a)
480 0 0 480 1e0 static_mutex.o (ex /usr/lib/libboost_regex-mt.a)
1543 0 36 1579 62b cpp_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
3171 632 0 3803 edb regex_traits_defaults.o (ex /usr/lib/libboost_regex-mt.a)
5339 8 13 5360 14f0 c_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
5650 8 16 5674 162a wc_regex_traits.o (ex /usr/lib/libboost_regex-mt.a)
9075 4 32 9111 2397 regex.o (ex /usr/lib/libboost_regex-mt.a)
17052 8 4 17064 42a8 fileiter.o (ex /usr/lib/libboost_regex-mt.a)
61265 0 0 61265 ef51 wide_posix_api.o (ex /usr/lib/libboost_regex-mt.a)
61787 0 0 61787 f15b posix_api.o (ex /usr/lib/libboost_regex-mt.a)
80811 8 0 80819 13bb3 icu.o (ex /usr/lib/libboost_regex-mt.a)
116489 8 112 116609 1c781 instances.o (ex /usr/lib/libboost_regex-mt.a)
117874 8 112 117994 1ccea winstances.o (ex /usr/lib/libboost_regex-mt.a)
131104 0 0 131104 20020 cregex.o (ex /usr/lib/libboost_regex-mt.a)
612075 684 325 613084 95adc (TOTALS)
You could get up to ~600K coming from Boost.Regex alone if some or all of those object files get linked into your binary.