tags:

views:

109

answers:

5

This is a basic question about what options are available for defining the scope of some parameters. I'd particularly like to have a sense of the tradeoffs in speed, memory, code clarity, ease of implementation, and code safety. My programming experience is clearly modest, and I have little formal training. I suspect one or two options will be obvious 'right' answers.

I have a simulation encapsulated in its own class called Simulation. Some of the parameters used in the simulation are read in from a set of files when the Simulation is initialized, and these parameters are currently stored in a few two-dimensional arrays. These arrays, demPMFs and serotypePars, are currently both private members of Simulation:

// Simulation.h

class Simulation
{
 Simulation( int simID );
 ~Simulation();

 public:
  // ...[code snipped]...

 private:
  double demPMFs[ NUM_SOCIODEM_FILES ][ INIT_NUM_AGE_CATS ];
  double serotypePars[ NUM_EPID_FILES ][ INIT_NUM_STYPES ];
  // ...[code snipped]...
};

The simulations mostly involve modifying instances of class Host. Member functions of class Host frequently need access to parameter values stored in serotypePars. Previously, I had all the data now in serotypePars stored as const numerics interpreted at compile time, but now that these parameters are read from a file, things are a little more complicated (for me, anyway).

What is the fastest and safest way for me to grant all members of class Host 'read-only' access to serotypePars? Fastest deserves priority: There are no other classes in my program, so global is a (crude) potential solution; all simulations will run from identical parameter sets. I'm reluctant to pass the array to individual functions that act on Host members, since there are probably a dozen and some are quite nested. (For example, I have intermediate wrapper structs that cannot take two-dimensional arrays as arguments, and I'm unsure what syntactical work-arounds there might be. I would like to stick with arrays for speed purposes, and my non-uniform prng generators all expect arrays.)

I would greatly appreciate knowing what would require the least modification of the existing code while not introducing huge dangers.

Thanks for any insight you can offer.


A related challenge I have is not knowing exactly how to access Simulation members from the Host class, unless these members are passed to the Host function. Here's basically what happens:

// main.cpp

int main() {
  for ( int s = 1; s < NUM_SIMS+1; s++ ) {
   Simulation thisSim(s);
   thisSim.runDemSim();
   thisSim.runEpidSim();
   // ... code snipped ...
 }
}

Function Simulation::runDemSim() and Simulation::runEpidSim() create, modify, and destroy many instances of class Host. Pointers to these instances are stored in a Boost MultiIndex container, which is where the intermediate wrapper functions come in. One of the many modifications involves ultimately calling Host::calcRecovery, which needs access to serotypePars:

// Host.cpp

// ... code snipped ...

double Host::calcRecovery( int s, double currentTime, double infectionTime, boost::mt19937& rng ) {
  // ...code snipped...
  effectiveMean = serotypePars[ MEAN_DURATION_INDEX ][ s ] * currentInfections * pastInfections;
  double rt = rgamma( effectiveMean, serotypePars[ VAR_DURATION_INDEX ][ s ], rng );
}

(Apologies if TMI.) Simply declaring serotypePars public in the Simulation class definition resulted in a "serotypePars was not declared in this scope" error in Host.cpp.


Solution summary

GMan's suggestion that I package all the simulation parameters in a private class, e.g., SimulationPars, seems like the most elegant and extensible route. An instance of SimulationPars could belong in every Simulation, and a pointer to SimulationPars can be passed to the constructor of every Host within a given Simulation. Thanks to everyone for the thoughtful discussions.

+1  A: 
public:    
const double** GetSerotypes() { return serotypePars; }
DeadMG
I would return a reference to the array.
GMan
How would a Host created by (within?) a Simulation use this member function? In main.cpp, I create the simulation with Simulation thisSim; and then run it with another Simulation member function. Calling thisSim.getSerotypes() from Host.cpp return a "'thisSim' was not declared in this scope" error.
Sarah
@Sarah: Only you know what you're talking about right now; show us code.
GMan
If thisSim is not accessible from whatever scope you are talking about, then how are you "running" it?
tlayton
This isn't correct. `double foo[X][Y]` is a flat 2D array of doubles, not an array of double pointers. For the record, the correct version would be: `const double (*GetSerotypes())[INIT_NUM_STYPES] { return serotypePars; }` , but I recommend following GMan's solution and using a typedef and reference, as it is much clearer.
Joey Adams
@GMan and @tlayton: Edited description.
Sarah
Sorry. I upgraded to use auto whenever possible because I'm lazy and have lost some smarts.
DeadMG
+1  A: 

You could write inline accessor methods, such as:

public:
    inline double readDemPMFs(int x, int y) {
        return demPMFs[x][y];
    }
    inline double readSerotypePars(int x, int y) {
        return serotypePars[x][y];
    }

As long as the compiler honors your inline usage, this should be just as fast as direct access, since it tells the compiler to embed the function at every occurrence rather than generating function calls.

Joey Adams
If it's defined in the definition it's already `inline`. And for what it's worth, `inline` really has nothing to do with code inlining anymore, compilers are way better at programmers at figuring out what should be inlined.
GMan
+2  A: 

This is normal:

class Simulation
{ 
public:
    // typedef's should be used liberally, it's easier to read
    typedef double pmf_type[NUM_SOCIODEM_FILES][INIT_NUM_AGE_CATS];
    typedef double sero_type[NUM_EPID_FILES][INIT_NUM_STYPES];

    // accessors
    const pmf_type& get_pmf(void) const
    {
        return demPMFs;
    }

    const sero_type& get_sero(void) const
    {
        return serotypePars;
    }

private:
    pmf_type demPMFs;
    sero_type serotypePars;
};
GMan
This looks nice but I'm still stymied by how an instance of class Host, created by a Simulation, would be able to call a Simulation public member function. The specific instance of the Simulation is only defined in main.cpp. (See the edit in the question description.)
Sarah
Then you need to add a reference or pointer to creating Simulation in the Host.
DeadMG
@DeadMG: I'm afraid I don't follow. A Simulation creates many Hosts. Hosts do not create Simulations.
Sarah
@Sarah: He means the constructor of `Host` should take in a pointer or reference to a `Simulation` instance to store and refer to; namely the one that made the `Host`, so the host knows what simulation it's simulating. (Your design seems rather unorthodox.)
GMan
@GMan: I would like to learn what orthodox is. What setup would you have used? Do I use "this" to try to pass a reference to the Simulation to every Host it creates?
Sarah
@Sarah: Yup, `this` works good. In honesty, I don't know what you're trying to do so it's tough to say how a design should change. I'd keep what you have since it's working for you though.
GMan
@Gman: Thanks. This is a large agent-based model; each Simulation involves creating/infecting/destroying hundreds of thousands of Hosts. Simulations are stochastic and hence many must be run for each parameter set. Eventually I will run them in parallel, but now each is run serially. All I'm trying to do is allow Hosts to read serotypePars (now read in from text file, initialized in main.cpp, and passed to each Simulation) without having to pass these arrays through several nested Simulation member functions to get to the one Host member function that actually needs the data.
Sarah
@Sarah: Ah, I see. I was thinking Host hosted simulations, which is why I thought it was weird for the simulations to be making the hosts. This is normal design, then. :P You might factor out the parameters, into a SimulationSettings class of some sort. That class has those two arrays and these accessor methods, and it's a private member of a Simulation. Then the Host doesn't need a pointer to the entire Simulation, but just to the Settings.
GMan
But fundamentally, the answer appears to be that I need to thread a pointer to the Simulation or the SimulationSettings class all the way through the functions involved in creating or modifying Hosts; there's no way, short of declaring these things global, that I can allow Hosts automatic access to serotypePars.
Sarah
+1  A: 

Write a global function to read your configuration into two local arrays and construct your Simulation with those local arrays:

double demPMFs[ NUM_SOCIODEM_FILES ][ INIT_NUM_AGE_CATS ];
double serotypePars[ NUM_EPID_FILES ][ INIT_NUM_STYPES ];
loadConfig(&demPMFs, &serotypePars);

Simulation s(demPMFs, serotypePars);

Make the data members public and const and modify your constructor to take the values of these arrays as arguments and initialize them as part of the constructor's initialization list:

public:
    Simulation(double const** demPMFs, double const** serotypePars)
        : demPMFs(demPMFs), serotypePars(serotypePars)
    {}

    double const** demPMFs;
    double const** serotypePars;

You end up with a simulation containing constant configuration. You can safely expose these constant arrays to the public without wrapping them in any accessors.

lrm
This makes sense and avoids having to reread the parameter files upon every instantiation of a Simulation.
Sarah
Having those `const` doesn't provide any benefit, and now the class is non-assignable. Might as well leave them private and provide const accessors. @Sarah: If that's a problem, you need to refactor. Make a `Config` class or whatever, and feed that into the `Simluation`.
GMan
If the data is read once and never modified this method provides convenient access without the risk of external modification. You don't have to protect the data with accessors because the compiler will not let you overwrite or modify the the data.
lrm
@lrm: Right, but you get the same effect with const functions, with no loss.
GMan
A: 

I'm not sure if there's a way to make the values read-only available to ONLY instances of the Host class, but in terms of making them generally read-only at the public access level, I would suggest using const references:

private:
double demPMFs[ NUM_SOCIODEM_FILES ][ INIT_NUM_AGE_CATS ];
double serotypePars[ NUM_EPID_FILES ][ INIT_NUM_STYPES ];

public:
const double (& demPMFsReadOnly)[ NUM_SOCIODEM_FILES ][ INIT_NUM_AGE_CATS ];
const double (& serotypeParsReadOnly)[ NUM_EPID_FILES ][ INIT_NUM_STYPES ];

Then add a pair of initializers to the Simulation constructor:

Simulation(int simId) : demPMFsReadOnly(demPMFs), serotypeParsReadOnly(serotypePars){
    ...
}

This way, you can access demPMFsReadOnly and serotypeParsReadOnly as regular read-only values, rather than pointers or functions, at the public level, but you can change the values of demPMFs and serotypePars at the private level.

It takes a bit more code in the Simulation class itself than some other methods, but it saves on time (both yours and the program's) on the long run.

P.S. It's important that the private arrays are declared before the public ones; they are initialized in that order, and the private arrays must be initialized first so that the public references can be initialized to refer to them.

P.P.S. The parentheses in the public reference declarations are important too. Without them, the compiler will try to interpret these statements as declaring arrays of references (which is ILLEGAL).

tlayton
How does this save program time? Versus just a normal const function, you get nothing and lose assignability.
GMan
Its program speed should be just as good as an inline function returning a const reference to the private array. What do you mean by assignability that isn't already forbidden by constness?
tlayton
The assignment operator can't be generated for classes with references as members. So what I mean is, why take that away and gain nothing over the function version?
GMan
If use of the default-generated assignment operator is desired, then you're right; throwing in a pair of parentheses here and there is a small price to pay. However, in any situation where these sort of syntactic details are of such concern, I would write my own assignment operator anyway to insure proper functionality. I certainly acknowledge that that is personal preference though, and by no means necessary.
tlayton
Writing things the compiler will do for you is generally bad practice, you're just introducing a way to mess up. :/
GMan