I have a genetic program (written in C#) that uses financial time-series data. It currently works, but I want to redesign the architecture to be more robust. My main goals are to:

  • present the time-series data to the expression trees sequentially.
  • allow expression trees to access previous data rows when needed.
  • optimize the performance of data access while evaluating the expression trees.
  • keep a common interface so various types of data can be used.

Here are the possible approaches I've thought about:

  1. I can evaluate the expression tree by passing in a data row into the root node and let each child node use the same data row.
  2. I can evaluate the expression tree by passing in the data row index and letting each node get the data row from a shared DataSet (currently I'm passing the row index and going to multiple synchronized arrays to get the data).
  3. Hybrid: an immutable data set is accessible by all of the expression trees and each expression tree is evaluated by passing in a data row.

The benefit of the first approach is that the data row is being passed into the expression tree and there is no further query done on the data set (which should increase performance in a multithreaded environment). The drawback is that the expression tree does not have access to the rest of the data (in case some of the functions need to do calculations using previous data rows).

The benefit of the second approach is that the expression trees can access any data up to the latest data row, but unless I specify what that row is, I'll have to iterate through the rows and figure out which one is the last one.

The benefit of the hybrid is that it should generally perform better and still provide access to the earlier data. It supports two basic "views" of data: the latest row and the previous rows.
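The hybrid could be sketched roughly as follows. This is only a minimal illustration under assumptions: `TimeSeries`, `IRowView`, `ViewAt`, `Current`, `Previous`, `MaxLag` and `Demo.Momentum` are hypothetical names, and the data is assumed to be numeric columns addressed by integer index. The data set is copied once and never mutated, and each evaluation receives a lightweight view that exposes the latest row plus look-back into earlier rows.

```csharp
using System;

// Hypothetical read-only view handed to an expression tree:
// the row being evaluated plus look-back into earlier rows.
public interface IRowView
{
    double Current(int column);            // value in the row being evaluated
    double Previous(int column, int lag);  // value `lag` rows before the current row
    int MaxLag { get; }                    // number of earlier rows available
}

// Hypothetical immutable data set shared by all expression trees.
public sealed class TimeSeries
{
    private readonly double[][] _rows;

    public TimeSeries(double[][] rows)
    {
        // Deep copy so the data cannot change after construction,
        // which makes concurrent reads safe without locking.
        _rows = new double[rows.Length][];
        for (int i = 0; i < rows.Length; i++)
            _rows[i] = (double[])rows[i].Clone();
    }

    public int Count => _rows.Length;

    public IRowView ViewAt(int index)
    {
        if (index < 0 || index >= _rows.Length)
            throw new ArgumentOutOfRangeException(nameof(index));
        return new RowView(_rows, index);
    }

    private sealed class RowView : IRowView
    {
        private readonly double[][] _rows;
        private readonly int _index;

        public RowView(double[][] rows, int index)
        {
            _rows = rows;
            _index = index;
        }

        public int MaxLag => _index;

        public double Current(int column) => _rows[_index][column];

        public double Previous(int column, int lag)
        {
            // The view never exposes rows later than the current one.
            if (lag < 0 || lag > _index)
                throw new ArgumentOutOfRangeException(nameof(lag));
            return _rows[_index - lag][column];
        }
    }
}

public static class Demo
{
    // Example node function: one-row change in column 0, or 0 if there is no earlier row.
    public static double Momentum(IRowView view) =>
        view.MaxLag >= 1 ? view.Current(0) - view.Previous(0, 1) : 0.0;
}
```

Because a view only knows its own index, a tree cannot read past the row it is being evaluated on, and the "which row is the last one" question from the second approach does not come up.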

Do you guys know of any design patterns or do you have any tips that can help me build this type of system? Should I use a DataSet to hold and present the data, or are there more efficient ways to present rows of data while maintaining a simple interface?

FYI: All of my code is written in C#.

+1  A: 

Most of what you describe concerns operations, which should not be the starting point for an OO design. I suggest creating a RowObject class that maps to a single row of the data table, and a RowObjectManager class that holds a collection of RowObject instances along with related operations, such as invoking the algorithm. This is much like the Facade pattern. You can then encapsulate the algorithm in its own class and supply it through dependency injection, which decouples it from the RowObjectManager class.

Then pass the OBJECT itself to the algorithm, rather than a property of the object such as its index, and have the algorithm return the result to the caller.
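A rough sketch of that structure might look like the following. Only the names RowObject and RowObjectManager come from the suggestion above; the `Timestamp` and `Close` properties, the `IRowAlgorithm` interface, the `CloseValue` class and the `EvaluateAll` method are assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// One row of the data table. The two columns here are hypothetical.
public sealed class RowObject
{
    public RowObject(DateTime timestamp, double close)
    {
        Timestamp = timestamp;
        Close = close;
    }

    public DateTime Timestamp { get; }
    public double Close { get; }
}

// Hypothetical abstraction over the algorithm so it can be injected.
public interface IRowAlgorithm
{
    double Evaluate(RowObject row);
}

// Trivial example algorithm: returns the row's closing value unchanged.
public sealed class CloseValue : IRowAlgorithm
{
    public double Evaluate(RowObject row) => row.Close;
}

// Facade over the row collection; the algorithm arrives via the constructor.
public sealed class RowObjectManager
{
    private readonly IReadOnlyList<RowObject> _rows;
    private readonly IRowAlgorithm _algorithm;

    public RowObjectManager(IEnumerable<RowObject> rows, IRowAlgorithm algorithm)
    {
        if (rows == null) throw new ArgumentNullException(nameof(rows));
        if (algorithm == null) throw new ArgumentNullException(nameof(algorithm));
        _rows = new List<RowObject>(rows); // private copy of the caller's rows
        _algorithm = algorithm;
    }

    public int Count => _rows.Count;

    // Passes each RowObject (not its index) to the algorithm, in order.
    public IEnumerable<double> EvaluateAll()
    {
        foreach (var row in _rows)
            yield return _algorithm.Evaluate(row);
    }
}
```

Swapping in a different algorithm then only means passing another `IRowAlgorithm` implementation to the constructor; RowObjectManager itself does not change.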

sza
So I should pass a RowObjectManager into the evaluation function of the expression tree and it should call the appropriate method to either get the last RowObject or any of the previous RowObject(s)?
Lirik
Yes. You can use a static method in RowObjectManager to maintain the current index, and add next() and pre() methods that return the RowObject after or before the current one. Then pass the object you get into the algorithm, call the method in the algorithm class to process it, and return the result.
sza