A piece of code I'm working on has to analyze a foreign file format produced by another program - a "replay" from a game, to be more exact. In this replay, every action produced by the players is saved with a variable number of arguments.
My software produces an analysis of the users' actions, doing things like producing a graph of their actions per minute throughout the game, etc. To give detailed information, internally every action is transformed into an object with its own methods. But with tens of thousands of actions even for the simplest games, this analysis takes time, and I'm now looking for a way to speed it up when the replay has already been analyzed once.

I had a couple of ideas, but I'm not sure which one I should apply:
1 - Some kind of serialization to save the action objects' state on disk, so that the objects can be reloaded straight from it? I'm not sure this would have a significant impact on performance, since it would still have to do all the object creation.
2 - Creating a large pool of every object type beforehand and reusing the objects when the user moves from replay to replay, avoiding the creation time?

I'm not sure how to proceed here, so if you have any good ideas on how to design this in a fast way, please feel free to share. Note that using disk space to save a replay's state once analyzed is not an issue, and these are "high end" gamers' computers, so I can take some liberties as to how many resources I consume as long as it speeds up the process.

Thanks in advance for any help

+7  A: 
  • derive each object from TComponent
  • make all properties you want to save published
  • create one root component as the owner of the others
  • use a TFileStream or TMemoryStream to store and load the root
Uwe Raabe
+1. Perfect idea.
Fabricio Araujo
+3  A: 

You currently have

GameRecordOnDisk {contains many action definitions} 
                             ---> RepresentationOfActionsInMemory

Do you have any idea where the time is going in making that transformation? Reading from disk? Parsing the data? Creating the objects? Setting up linkages between actions (perhaps searching lists of things?).

I think you need to get some performance tools and analyse what's going on. Performance tuning is notoriously unintuitive. You quite often find an apparently innocuous line of code is amazingly expensive.

You might then be driven to devise a more optimised on-disk representation, or make your data structures more efficient or whatever. But without facts you run the risk of carefully improving the performance of a piece of code by 1000% only to find you just removed 1% of the total overhead.
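As a rough illustration of measuring before optimising, here is a minimal sketch in Python (chosen only for brevity; the idea carries over to any language). It times each stage of a made-up loading pipeline separately. The stage names and the `load_replay` body are assumptions standing in for the real parsing, object creation and linking code, and a proper profiler would give finer detail than this.

```python
import time
from contextlib import contextmanager

timings = {}  # phase name -> accumulated seconds


@contextmanager
def timed(phase):
    """Accumulate wall-clock time spent inside the with-block under `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + (time.perf_counter() - start)


def load_replay(raw_bytes):
    # Hypothetical pipeline: each stage is a stand-in for the real work.
    with timed("parse"):
        records = raw_bytes.split(b";")
    with timed("build objects"):
        actions = [{"raw": r} for r in records if r]
    with timed("link actions"):
        for prev, cur in zip(actions, actions[1:]):
            cur["prev"] = prev
    return actions


def report():
    """Return (phase, seconds, share of total) rows, slowest phase first."""
    total = sum(timings.values()) or 1.0
    rows = sorted(timings.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, secs, secs / total) for name, secs in rows]
```

Calling `report()` after a load shows which phase dominates, which is the fact you need before deciding whether serialization, pooling or a different on-disk format is worth the effort.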

djna
+1 for not proposing any "solution" without knowing anything about the details, and for trying to get the questioner to find out those details first. Your advice regarding when and how to optimize is very sound. This would have made a much better accepted answer.
mghie
thanks, nice to know it makes sense :-)
djna
A: 

Great idea expressed by Uwe Raabe.

As another option:

If you know the number of objects that will be created beforehand: create them all in one batch and then just access them.

If the number varies each time but you know that it is on the order of 10,000, then create objects in batches of 100 at once. It will improve performance a bit, but not by much. I do not think that object creation is your major bottleneck.

Andrew
How would you create objects in packs of 100? You have to execute the constructor calls so that's just not possible IMHO. The only way I see is to use an object pool and reuse objects. But I'm pretty sure that object creation is not the problem and it is senseless to implement something like that without having actually measured.
Smasher
A: 

Once your program analyzes the game file, save all your analysis information to a file, with the same name as the game file, but with a different suffix.

e.g. You read in X40938.log and you output X40938.ana (assuming you want ana as your suffix).

Then whenever someone uses your program to analyze a game file, check for the associated .ana file. If it exists, then load it (fast), else analyze the game file (slow) and save the .ana file so it will be fast next time.

If the game files can be updated by the program, then you can compare the timestamp (last changed date) of the game file to that of your .ana file, and if game file's timestamp is later than the timestamp of the .ana file, then you'll have to reanalyze.
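The check-then-load flow above could look roughly like this. It is a minimal sketch in Python, not tied to the asker's actual code: `analyze` is a hypothetical callable that does the slow analysis, and `pickle` is just a convenient stand-in for whatever on-disk format the analysis results end up using.

```python
import os
import pickle


def cache_path_for(replay_path):
    """Map X40938.log to X40938.ana (the .ana suffix follows the example above)."""
    base, _ext = os.path.splitext(replay_path)
    return base + ".ana"


def load_or_analyze(replay_path, analyze):
    """Return analysis results, reusing the .ana file when it is up to date.

    `analyze` is a hypothetical callable that performs the slow analysis.
    """
    cache_path = cache_path_for(replay_path)
    # The cache is valid only if it exists and is not older than the replay.
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(replay_path)):
        with open(cache_path, "rb") as f:
            return pickle.load(f)      # fast path: load pre-analyzed data
    result = analyze(replay_path)      # slow path: analyze from scratch
    with open(cache_path, "wb") as f:
        pickle.dump(result, f, protocol=pickle.HIGHEST_PROTOCOL)
    return result
```

The first call for a given replay pays the full analysis cost and writes the `.ana` file; later calls only read it back, unless the replay file has a newer modification time.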

Your concern is that creating the objects again will be slow. I doubt it. I'm sure it is the analysis of the game data that is slow. You should find loading the pre-analyzed data to be much faster.

lkessler
Why save the last changed date in the .ana file? That's what file modification timestamps are for. Just compare the timestamps of the two files to see whether the analyzed data is up-to-date.
mghie
That's a good idea, mghie. If the timestamp of the .ana file is after that of the game file, then the .ana file will be up to date. So the timestamp need not be stored in the .ana file. I'll update my answer.
lkessler
A: 

In MFC there is this notion of "Serialization".

You can do it in any language. Basically you just write routines to walk your data structure, and as they go, they write the essential data, in binary, to your file.

When writing out an array, make sure you write the array size before its elements.

When writing a structure based on a pointer, first write a boolean telling if the pointer is not-null.

Then you write a reader that does the same thing, except at the point where it would read an array, it first reads the size, allocates the array and reads the elements. At the point where you read a pointer, first read the boolean. If it is 0, just make the pointer null and skip it. If not, allocate the pointer and proceed to read its contents.

In MFC these two functions are actually coded into common routines called "Serialize". You're basically doing a depth-first tree walk when you write, and the same thing when you read.
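To make the write/read symmetry concrete, here is a small sketch in Python using the standard `struct` module rather than MFC. The action layout (an integer `id` plus a list of integer `args`) is an assumption made for illustration; the real replay actions will have their own fields. Each optional action is preceded by a presence flag, and every array is preceded by its size.

```python
import struct


def write_action(out, action):
    """Write one action: presence flag, then id, then a length-prefixed arg list."""
    if action is None:
        out.write(struct.pack("<?", False))   # null marker, nothing follows
        return
    out.write(struct.pack("<?", True))
    out.write(struct.pack("<I", action["id"]))
    args = action["args"]
    out.write(struct.pack("<I", len(args)))   # array size before its elements
    for a in args:
        out.write(struct.pack("<i", a))


def read_action(inp):
    """Mirror of write_action: read the flag first, then the fields in order."""
    (present,) = struct.unpack("<?", inp.read(1))
    if not present:
        return None
    (action_id,) = struct.unpack("<I", inp.read(4))
    (count,) = struct.unpack("<I", inp.read(4))
    args = list(struct.unpack(f"<{count}i", inp.read(4 * count)))
    return {"id": action_id, "args": args}


def write_actions(out, actions):
    """Write the number of actions, then each action in turn."""
    out.write(struct.pack("<I", len(actions)))
    for a in actions:
        write_action(out, a)


def read_actions(inp):
    """Read the count, then that many actions."""
    (count,) = struct.unpack("<I", inp.read(4))
    return [read_action(inp) for _ in range(count)]
```

`out` and `inp` can be any binary file-like objects, for example a file opened with `"wb"` or `"rb"`. The reader never has to guess: every decision it makes is driven by a flag or count that the writer emitted first.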

Since all I/O is binary it is about as fast as it can possibly be.

Mike Dunlavey