Overview
I have a multivariate timeseries of "inputs" of dimension N that I want to map to an output timeseries of dimension M, where M < N. The inputs are bounded in [0,k] and the outputs are in [0,1]. Let's call the input vector for some time slice in the series "I[t]" and the output vector "O[t]".

Now if I knew the optimal mapping of pairs <I[t], O[t]>, I could use one of the standard multivariate regression / training techniques (such as NN, SVM, etc) to discover a mapping function.

Problem
I do not know the relationship between specific <I[t], O[t]> pairs; rather, I have a view on the overall fitness of the output timeseries, i.e. the fitness is governed by a penalty function on the complete output series.

I want to determine the mapping / regressing function "f", where:

     O[t] = f (theta, I[t]) 

Such that the penalty function P(O) is minimized:

     argmin P( f(theta, I) )
      theta

[Note that the penalty function P is applied to the resultant series generated from multiple applications of f to the I[t]'s across time. That is, f is a function of I[t] and not of the whole timeseries.]

The mapping between I and O is complex enough that I do not know what functions should form its basis. I therefore expect to have to experiment with a number of basis functions.
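
To make the setup concrete, here is a minimal sketch of the formulation: theta is tuned against a penalty that only sees the whole output series, never individual <I[t], O[t]> pairs. Everything specific in it is an assumption for illustration: the logistic form of `f`, the placeholder `penalty` (roughness of the series plus distance of its mean from 0.5), and the plain random-search optimizer. None of these is proposed as the right choice; they only show where a real basis and a real P would plug in.

```python
import math
import random

def f(theta, x):
    """Hypothetical per-time-slice map: one logistic unit per output, so O[t] is in (0, 1)."""
    out = []
    for w in theta:  # w = [bias, weight_1, ..., weight_N]
        z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
        z = max(-50.0, min(50.0, z))  # clamp to avoid overflow in exp
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out

def penalty(O):
    """Placeholder P(O), evaluated on the complete output series (an assumption)."""
    rough = sum((a - b) ** 2
                for prev, cur in zip(O, O[1:])
                for a, b in zip(prev, cur))
    mean = sum(sum(o) for o in O) / (len(O) * len(O[0]))
    return rough + (mean - 0.5) ** 2

def series_penalty(theta, I):
    # f is applied to each I[t] independently; P sees the resulting series.
    return penalty([f(theta, x) for x in I])

def random_search(I, M, iters=500, step=0.1, seed=0):
    """Derivative-free search over theta; keeps a candidate only if P improves."""
    rng = random.Random(seed)
    N = len(I[0])
    theta = [[rng.gauss(0, 1) for _ in range(N + 1)] for _ in range(M)]
    start = best = series_penalty(theta, I)
    for _ in range(iters):
        cand = [[w + rng.gauss(0, step) for w in row] for row in theta]
        p = series_penalty(cand, I)
        if p < best:
            theta, best = cand, p
    return theta, start, best

# Synthetic inputs: T=50 time slices, N=4, bounded in [0, k] with k=3.
rng = random.Random(1)
I = [[rng.uniform(0, 3) for _ in range(4)] for _ in range(50)]
theta, start, best = random_search(I, M=2)
print(start, best)
```

Because P is treated as a black box, the random search could be swapped for any derivative-free optimizer (for example an evolutionary strategy or simulated annealing) without changing `f` or `penalty`.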

I have a view on one way to approach this, but do not want to bias the proposals.

Ideas?

A: 

It depends on your definition of the optimal mapping and the penalty function. I'm not sure if this is the direction you're taking, but here are a couple of suggestions:

  • For example, you can find a mapping of the data from the higher-dimensional space to a lower-dimensional space that tries to preserve the original similarity between data points (something like Multidimensional Scaling [MDS]).

  • Or you can map the data to a lower-dimensional space that accounts for as much of the variability in the data as possible (Principal Component Analysis [PCA]).
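
As a rough illustration of the PCA suggestion, the sketch below projects an N-dimensional input series onto its M leading principal components using NumPy's SVD. The data are synthetic and the helper name `pca_project` is made up for this example. Note that the resulting scores are not confined to [0,1] and take no account of the penalty function, so they would still need rescaling or further fitting.

```python
import numpy as np

def pca_project(X, m):
    """Project the rows of X (T x N) onto the m leading principal components."""
    Xc = X - X.mean(axis=0)                    # center each input dimension
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)        # fraction of variance per component
    return Xc @ Vt[:m].T, explained[:m]

# Synthetic inputs: T=50 time slices, N=4, bounded in [0, k] with k=3.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 3.0, size=(50, 4))
Z, ratio = pca_project(X, m=2)
print(Z.shape, ratio)
```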

Amro