Hi,

I'm using CentOS 5.4 x86_64 with Boost 1.42.0 on a cluster that uses Open MPI 1.3.3. I'm writing a shared library that uses shared memory to store large amounts of data for multiple processes to use. There's also a loader application that reads the data from files and loads it into the shared memory.

When I run the loader application, it determines exactly how much memory it needs to store the data, then adds 25% for overhead. For just about every file, that works out to over 2 GB of data. When I make the memory request using Boost's Interprocess library, it reports that it has successfully reserved the requested amount of memory. But when I start to use it, I get a "Bus error". From what I can tell, the bus error is the result of accessing memory outside the range that is actually available to the memory segment.

So I started looking into how shared memory works on Linux and what to check to make sure my system is configured to allow that large an amount of shared memory.

  1. I looked at the "files" at /proc/sys/kernel/shm*:
    • shmall - 4294967296 (counted in pages; with 4 KiB pages that's 16 TiB total)
    • shmmax - 68719476736 (64 GiB)
    • shmmni - 4096

  2. I ran the ipcs -lm command:
    ------ Shared Memory Limits --------
    max number of segments = 4096
    max seg size (kbytes) = 67108864
    max total shared memory (kbytes) = 17179869184
    min seg size (bytes) = 1
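
For reference, the same limits can be read in one go from the shell (standard sysctl, nothing Boost-specific):

sysctl kernel.shmmax kernel.shmall kernel.shmmni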

From what I can tell, those settings indicate that I should be able to allocate enough shared memory for my purposes. So I wrote a stripped-down program that creates a large amount of data in shared memory:


#include <iostream>

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/containers/vector.hpp>

namespace bip = boost::interprocess;

typedef bip::managed_shared_memory::segment_manager segment_manager_t;
typedef bip::allocator<long, segment_manager_t> long_allocator;
typedef bip::vector<long, long_allocator> long_vector;

int main(int argc, char ** argv) {
    // Remove any stale segment on startup and clean up again on exit.
    struct shm_remove  {
        shm_remove()    { bip::shared_memory_object::remove("ShmTest"); }
        ~shm_remove()   { bip::shared_memory_object::remove("ShmTest"); }
    } remover;

    // 280 million longs (2.24 GB on x86_64), plus 5% for segment overhead.
    size_t szLength = 280000000;
    size_t szRequired = szLength * sizeof(long);
    size_t szRequested = (size_t) (szRequired * 1.05);
    bip::managed_shared_memory segment(bip::create_only, "ShmTest", szRequested);

    std::cout << 
        "Length:       " << szLength << "\n" <<
        "sizeof(long): " << sizeof(long) << "\n" <<
        "Required:     " << szRequired << "\n" <<
        "Requested:    " << szRequested << "\n" <<
        "Allocated:    " << segment.get_size() << "\n" <<
        "Overhead:     " << segment.get_size() - segment.get_free_memory() << "\n" <<
        "Free:         " << segment.get_free_memory() << "\n\n";

    long_allocator alloc(segment.get_segment_manager()); 
    long_vector vector(alloc);

    if (argc > 1) { // any command-line argument enables preallocation
        std::cout << "Reserving Length of " << szLength << "\n";
        vector.reserve(szLength);
        std::cout << "Vector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n\n";
    }

    // Append one long at a time, logging capacity and free space every 1%.
    for (size_t i = 0; i < szLength; i++) {
        if ((i % (szLength / 100)) == 0) {
            std::cout << i << ": " << "\tVector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n";
        }
        vector.push_back(i);    
    }
    std::cout << "end: " << "\tVector Capacity: " << vector.capacity() << "\tFree: " << segment.get_free_memory() << "\n";

    return 0;
}

Compiled it with the line:

g++ ShmTest.cpp -lboost_system -lrt

Then ran it with the following output (edited to make it smaller):

Length:       280000000
sizeof(long): 8
Required:     2240000000
Requested:    2352000000
Allocated:    2352000000
Overhead:     224
Free:         2351999776

0:      Vector Capacity: 0      Free: 2351999776
2800000:        Vector Capacity: 3343205        Free: 2325254128
5600000:        Vector Capacity: 8558607        Free: 2283530912
8400000:        Vector Capacity: 8558607        Free: 2283530912
11200000:       Vector Capacity: 13693771       Free: 2242449600
14000000:       Vector Capacity: 21910035       Free: 2176719488
...
19600000:       Vector Capacity: 21910035       Free: 2176719488
22400000:       Vector Capacity: 35056057       Free: 2071551312
...
33600000:       Vector Capacity: 35056057       Free: 2071551312
36400000:       Vector Capacity: 56089691       Free: 1903282240
...
56000000:       Vector Capacity: 56089691       Free: 1903282240
58800000:       Vector Capacity: 89743507       Free: 1634051712
...
89600000:       Vector Capacity: 89743507       Free: 1634051712
92400000:       Vector Capacity: 143589611      Free: 1203282880
...
142800000:      Vector Capacity: 143589611      Free: 1203282880
145600000:      Vector Capacity: 215384417      Free: 628924432
...
212800000:      Vector Capacity: 215384417      Free: 628924432
215600000:      Vector Capacity: 293999969      Free: 16
...
260400000:      Vector Capacity: 293999969      Free: 16
Bus error

If you run the program with a parameter (any value will do; it just needs to increase argc), it preallocates the vector but still hits a bus error at the same array index.

I checked the size of the "files" at /dev/shm using the ls -ash /dev/shm command:

total 2.0G
   0 .     0 ..  2.0G ShmTest

And just like with my original application, the size of the allocated shared memory is capped at 2 GB. Given that it "successfully" allocated 2352000000 bytes of memory, the file should be 2.19 GiB (dividing by 1024*1024*1024).

When I run my actual program to load data using MPI, I get this error output:

Requested: 2808771120
Received: 2808771120

[c1-master:13894] *** Process received signal ***
[c1-master:13894] Signal: Bus error (7)
[c1-master:13894] Signal code:  (2)
[c1-master:13894] Failing at address: 0x2b3190157000
[c1-master:13894] [ 0] /lib64/libpthread.so.0 [0x3a64e0e7c0]
[c1-master:13894] [ 1] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess26uninitialized_copy_or_moveINS0_10offset_ptrIlEEPlEET0_T_S6_S5_PNS_10disable_ifINS0_11move_detail16is_move_iteratorIS6_EEvE4typeE+0x218) [0x2b310dcf3fb8]
[c1-master:13894] [ 2] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container6vectorIlNS_12interprocess9allocatorIlNS2_15segment_managerIcNS2_15rbtree_best_fitINS2_12mutex_familyENS2_10offset_ptrIvEELm0EEENS2_10iset_indexEEEEEE15priv_assign_auxINS7_IlEEEEvT_SG_St20forward_iterator_tag+0xa75) [0x2b310dd0a335]
[c1-master:13894] [ 3] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container17containers_detail25advanced_insert_aux_proxyINS0_6vectorIlNS_12interprocess9allocatorIlNS4_15segment_managerIcNS4_15rbtree_best_fitINS4_12mutex_familyENS4_10offset_ptrIvEELm0EEENS4_10iset_indexEEEEEEENS0_17constant_iteratorISF_lEEPSF_E25uninitialized_copy_all_toESI_+0x1d7) [0x2b310dd0b817]
[c1-master:13894] [ 4] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost9container6vectorINS1_IlNS_12interprocess9allocatorIlNS2_15segment_managerIcNS2_15rbtree_best_fitINS2_12mutex_familyENS2_10offset_ptrIvEELm0EEENS2_10iset_indexEEEEEEENS3_ISD_SB_EEE17priv_range_insertENS7_ISD_EEmRNS0_17containers_detail23advanced_insert_aux_intISD_PSD_EE+0x771) [0x2b310dd0d521]
[c1-master:13894] [ 5] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess6detail8Ctor3ArgINS_9container6vectorINS4_IlNS0_9allocatorIlNS0_15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEEEEEEENS5_ISF_SD_EEEELb0EiSF_NS5_IvSD_EEE11construct_nEPvmRm+0x157) [0x2b310dd0d9a7]
[c1-master:13894] [ 6] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEE28priv_generic_named_constructIcEEPvmPKT_mbbRNS0_6detail18in_place_interfaceERNS7_INSE_12index_configISB_S6_EEEENSE_5bool_ILb1EEE+0x6fd) [0x2b310dd0c85d]
[c1-master:13894] [ 7] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN5boost12interprocess15segment_managerIcNS0_15rbtree_best_fitINS0_12mutex_familyENS0_10offset_ptrIvEELm0EEENS0_10iset_indexEE22priv_generic_constructEPKcmbbRNS0_6detail18in_place_interfaceE+0xf8) [0x2b310dd0dd58]
[c1-master:13894] [ 8] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN7POP_LTL16ExportPopulation22InitializeSharedMemoryEPKc+0x1609) [0x2b310dceea99]
[c1-master:13894] [ 9] ../LookupPopulationLib/Release/libLookupPopulation.so(_ZN7POP_LTL10InitializeEPKc+0x349) [0x2b310dd0ebb9]
[c1-master:13894] [10] MPI_Release/LookupPopulation.MpiLoader(main+0x372) [0x4205d2]
[c1-master:13894] [11] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3a6461d994]
[c1-master:13894] [12] MPI_Release/LookupPopulation.MpiLoader(__gxx_personality_v0+0x239) [0x420009]
[c1-master:13894] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13894 on node c1-master exited on signal 7 (Bus error).
--------------------------------------------------------------------------

I'm really not sure where to go from here. Does anyone have any suggestions on what to try?


Posted to the Boost bug tracker at: https://svn.boost.org/trac/boost/ticket/4374

A: 

Well, hard to tell. Just my 2 cents: from my experience, accessing memory outside the correct range gets you a SegFault (I'm not sure about shared memory, though). I've encountered "Bus error" more often with alignment problems, like accessing a 4-byte long at an address that isn't a multiple of 4 on a 32-bit platform.

My suggestion is to check that you are not accessing a misaligned address. As I haven't spotted any low-level casts or the like in your code, it seems unlikely, but one never knows...
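
For illustration, this is the kind of code that can produce it (my own sketch, not taken from your program; note that x86/x86_64 normally tolerates misaligned loads, so this mostly bites on alignment-strict CPUs like SPARC):

#include <iostream>

int main() {
    char buffer[sizeof(long) * 2];
    // Casting to long* at an odd offset yields a misaligned pointer.
    // Dereferencing it is undefined behavior; alignment-strict hardware
    // raises SIGBUS ("Bus error"), while x86 just does a slower access.
    long * misaligned = reinterpret_cast<long *>(buffer + 1);
    *misaligned = 42;
    std::cout << *misaligned << "\n";
    return 0;
}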

my2c

neuro
+2  A: 

Well, if you keep looking for the answer long enough...

On Linux, the shared memory mechanism it uses (tmpfs, mounted at /dev/shm) is limited by default to half the system RAM. So on my cluster it's 2 GB, because we have 4 GB of system RAM. When the program tried to allocate the shared memory segment, it allocated up to the maximum size left on /dev/shm.

But the issue came when the Boost library didn't indicate an error, or even report the correct amount of free memory, when it couldn't allocate the requested amount. It was just happy to apparently chug along until it reached the end of the segment and then errored.
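
One check that would have caught this up front (my own sketch, not a Boost facility) is to compare the requested size against the free space on the tmpfs before creating the segment, e.g. with statvfs(3):

#include <sys/statvfs.h>
#include <iostream>

// Free bytes on the filesystem backing /dev/shm. On Linux, Boost's
// managed_shared_memory segments live there, so this bounds how large
// a segment can actually grow before faulting on first touch.
static unsigned long long shm_free_bytes() {
    struct statvfs vfs;
    if (statvfs("/dev/shm", &vfs) != 0)
        return 0; // treat failure as "no space known"
    return (unsigned long long) vfs.f_bavail * vfs.f_frsize;
}

int main() {
    unsigned long long requested = 2352000000ULL; // size from the test above
    if (shm_free_bytes() < requested) {
        std::cerr << "/dev/shm too small for " << requested << " bytes\n";
        return 1;
    }
    // ... only now create the managed_shared_memory segment ...
    return 0;
}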

The long-term solution is to update the /etc/fstab file to make the change permanent, but a command-line call can be run to increase the size of the available shared memory on each node until reboot:

mount -o remount,size=XXX /dev/shm

Where XXX is the amount of memory to make available (for example, size=4G).
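
To make it permanent, the matching /etc/fstab entry would look something like this (a sketch; adjust the size for your nodes and merge it with whatever options your distro already puts on that line):

tmpfs    /dev/shm    tmpfs    defaults,size=4G    0 0

You can confirm the new size afterwards with df -h /dev/shm.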

This was figured out with the help of http://www.cyberciti.biz/tips/what-is-devshm-and-its-practical-usage.html

CuppM
Interesting. You should post something on the Boost dev list. By the way, I think you can accept your own answer :) Thanks for the answer.
neuro
Yeah, a bit frustrating too. :) I posted it to the Boost bug tracker at https://svn.boost.org/trac/boost/ticket/4374 .
CuppM