I'm trying to create a simple multi-threaded game engine for the game I want to write. So far, everything has worked without any problems, and I even know what steps I have to take to finish it.

There is only one thing I don't know (well, technically, I know a solution for it, but I'm hoping there is something more elegant and faster). Basically, I have a separate thread for every part of my engine: graphics, input, physics, audio, etc.

The physics thread has a complete scene node structure of the world, where it simulates everything. However, I now have to get this structure over to my graphics thread, with the least overhead possible. Ideally, it should only transfer the parts which changed since the last update.

I have components in place for transferring this data; the only problem is generating it.

So far, I have thought of two different approaches:

  • Copy the whole structure for every update: very simple, but possibly time and memory intensive (I don't have experience with large engines; would this be viable?)
  • Keep track of which parts of the scene changed by marking the scene nodes with flags, and then copy over only the changed parts

Approach one would copy a large amount of memory but need little processing; approach two would do the reverse: more processing, less memory copied.

Is there a general answer as to which approach would be faster in a typical gaming environment?
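To make the two approaches concrete, here is a minimal C++ sketch. The names `NodeState`, `PhysicsNode`, `snapshot_all`, and `collect_dirty` are hypothetical, and the scene is flattened into a vector (the real node hierarchy is left out), so treat it as an illustration rather than the engine's actual design.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-node payload: position (3 floats), orientation
// quaternion (4 floats), and a 32-bit ID.
struct NodeState {
    float position[3];
    float orientation[4];
    std::uint32_t id;
};

// Physics-side node: the payload plus a flag set whenever it changes.
struct PhysicsNode {
    NodeState state{};
    bool dirty = false;

    void set_position(float x, float y, float z) {
        state.position[0] = x;
        state.position[1] = y;
        state.position[2] = z;
        dirty = true;  // remember that this node must be sent
    }
};

// Approach one: copy every node, with no bookkeeping at all.
std::vector<NodeState> snapshot_all(const std::vector<PhysicsNode>& scene) {
    std::vector<NodeState> out;
    out.reserve(scene.size());
    for (const PhysicsNode& node : scene) {
        out.push_back(node.state);
    }
    return out;
}

// Approach two: copy only nodes changed since the last call,
// clearing each flag as the node is collected.
std::vector<NodeState> collect_dirty(std::vector<PhysicsNode>& scene) {
    std::vector<NodeState> out;
    for (PhysicsNode& node : scene) {
        if (node.dirty) {
            out.push_back(node.state);
            node.dirty = false;
        }
    }
    return out;
}
```

Note that `collect_dirty` still visits every node to test its flag; keeping a separate list of dirty node indices would avoid that scan at the cost of more bookkeeping on each write.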

+3  A: 

No, there is no generally accepted answer; it is a current area of research in game development.

My 2 cents is the conventional wisdom: which one to use really depends on your specific use case. If your game has lots of data (i.e. it's very memory intensive, like most blockbuster titles), you'll probably want to transmit only the changes. If your game isn't memory intensive (e.g. arcade games), you can probably get away with copying the entire structure.

I'd suggest implementing both and hooking up performance timers to see which works better for you; it is possible to implement an architecture which can handle both methods transparently.
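One way this could look, as a rough C++ sketch: the receiving side applies updates by node ID, so it does not care whether it was handed a full snapshot or only the changed nodes, and a small timer helper compares the two. `RenderScene`, `NodeState`, and `average_microseconds` are hypothetical names, and node removal is deliberately not handled here.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical per-node payload shared by both threads.
struct NodeState {
    float position[3];
    float orientation[4];
    std::uint32_t id;
};

// Graphics-side mirror keyed by node ID. A full snapshot and a delta
// are applied the same way: each entry overwrites the node with that ID.
class RenderScene {
public:
    void apply(const std::vector<NodeState>& update) {
        for (const NodeState& state : update) {
            nodes_[state.id] = state;
        }
    }
    std::size_t size() const { return nodes_.size(); }

private:
    std::unordered_map<std::uint32_t, NodeState> nodes_;
};

// Runs `work` the given number of times and returns the average
// wall-clock time per run in microseconds.
double average_microseconds(const std::function<void()>& work, int iterations) {
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < iterations; ++i) {
        work();
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    return elapsed.count() / iterations;
}
```

Timing `apply` once with a full snapshot and once with a typical delta, on scenes of realistic size, gives a first comparison. The cost of producing each update on the physics side should be measured the same way.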

Not Sure
Some theory: the peak transfer rate of DDR2-1066 RAM is 8533 MB/s, and I want at least 100 updates per second, so one update has to be smaller than about 85 MB. One physics node would have to be at least 32 bytes (28 for position and orientation and at least 4 for a unique ID, if using 32-bit ints as IDs). Assuming 10,000 objects, this would mean 312.5 kB to copy per update (about 0.4% of the maximum). However, this doesn't include the overhead for the node hierarchy. Should be easily possible.
Mononofu
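The estimate in the comment above can be checked with a few lines of C++. This is only a sketch: `NodeState`, `bytes_per_update`, and `budget_fraction` are hypothetical names, the layout assumes 4-byte floats, and the peak bandwidth figure is the one quoted in the comment, not a measured value.

```cpp
#include <cstdint>

// Hypothetical node layout from the estimate: 7 floats (position plus
// orientation quaternion) and a 32-bit ID.
struct NodeState {
    float position[3];
    float orientation[4];
    std::uint32_t id;
};
static_assert(sizeof(NodeState) == 32, "assumes 4-byte float and no padding");

// Bytes copied when every node is sent in one update.
constexpr double bytes_per_update(std::uint64_t node_count) {
    return static_cast<double>(node_count * sizeof(NodeState));
}

// Share of the per-update bandwidth budget that one full copy uses,
// given a peak transfer rate in bytes per second.
constexpr double budget_fraction(std::uint64_t node_count,
                                 double peak_bytes_per_second,
                                 double updates_per_second) {
    return bytes_per_update(node_count) /
           (peak_bytes_per_second / updates_per_second);
}
```

With 10,000 nodes, 8533 MB/s, and 100 updates per second, `budget_fraction` comes out at roughly 0.4% of the theoretical peak. Real copies rarely reach peak bandwidth, so this is an optimistic lower bound on the cost.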