views: 184

answers: 2

Hi Guys,

I am working on a turn-based-game AI using a neural-network technique known as NEAT. I am attempting to train a network that can move around a two-dimensional (X & Y coords) space given a variety of values that are stored in what is effectively a two-dimensional array.

I can see two strategies for using the neural network:

  1. For each "cell" in the grid, provide the scores from the different heuristics as inputs to neurons and create an NN that is effectively a very complicated "scoring" system. Move the Non-Player Character (NPC) to the location with the highest score (a rough sketch of what I mean is just after this list).

  2. Create a compressed value for each heuristic measure (somehow compressed into as few bits as possible) and provide an input neuron for each of these measures.
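
To make option one a bit more concrete, the loop I have in mind looks roughly like this (MATLAB-ish pseudocode only; evaluateNetwork, net, heuristics and moveNPC are made-up names standing in for whatever NEAT and the game engine actually provide):

scores = zeros(50, 50);
for i = 1:50
    for j = 1:50
        inputs = squeeze(heuristics(i, j, :));       % heuristic values for this cell
        scores(i, j) = evaluateNetwork(net, inputs); % hypothetical NEAT evaluation
    end
end
[~, best] = max(scores(:));                          % cell with the highest score
[bestX, bestY] = ind2sub(size(scores), best);
moveNPC(bestX, bestY);                               % hypothetical game call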

I am quite interested in option two because it requires the fewest calculations (the runtime of the game is quite long), however I am confused as to what approach I could use to create the "small representation" version of the two-dimensional heuristic values. I know there are techniques such as Fourier transforms out there, however I don't know if these will suit my problem. Basically I am looking for a way to convert a 50x50 array of doubles into one or maybe two double values. These two double values can be lossy-compressed; I don't need to be able to get the original values back, I just need a reasonable mechanism for changing the input data into a small footprint.

An alternative to these two possibilities is to somehow encode a "region" based on some distance from the NPC (so you get the actual values for a "close" cell, and an approximation for a "far" cell). I don't know exactly how I would wire this up, but it at least gets rid of the need to evaluate every cell every turn of the game (given I am looking at about 5 million rounds at approximately 1 second per round, any simplification I can come up with would greatly help).
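
The closest I can get to describing that idea in code is something like this (completely untested, and the radius and block size are plucked out of the air; "map" is the 50x50 array and npcX/npcY are the NPC's coordinates):

r = 2;                                    % "close" radius, arbitrary
nearX = max(1, npcX - r) : min(50, npcX + r);
nearY = max(1, npcY - r) : min(50, npcY + r);
nearValues = map(nearX, nearY);           % exact values around the NPC (at most 5x5)

farValues = zeros(5, 5);                  % one averaged value per 10x10 block
for bi = 1:5
    for bj = 1:5
        block = map((bi-1)*10+1 : bi*10, (bj-1)*10+1 : bj*10);
        farValues(bi, bj) = mean(block(:));
    end
end

inputs = [nearValues(:); farValues(:)];   % roughly 50 inputs instead of 2500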

I apologise if this isn't making much sense, it is quite a difficult problem that has stumped me for a while, and I can't think of an easy way to describe it.

Thank you,

Aidan

EDITED TO ADD (and change title):

Thanks to Chris we've refined what I am looking for. What I am looking for is a way to approximate a line (I can convert the 2D map into a line) in as few parameters as possible. I have used cubic splines for interpolation before, however I need something a lot more feasible for a data-set that varies between 0.0 and 1.0 quite aggressively. What I am really looking for, I suppose, is a "hash" of the map.

I know there are techniques such as cubic splines that I can work out some "key-points" from, and these values are a reasonable analogy for what I am looking for (the sketch below is roughly what I mean). I need a way to take the 2500 values and come up with a small representation of them that I can use for the neural network. I think the NN can be trained to infer the true meaning of these representations, or at least to determine some correlation between the representation and the real world, so it doesn't necessarily need to be a reversible function, but I don't think many one-way functions (such as MD5 or SHA) are actually going to be very helpful either...
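
For example, the simplest version of the "key-points" idea I can think of is something like this (20 knots chosen arbitrarily; "lineValues" is the 2500-value line I get by flattening the map):

xs = round(linspace(1, 2500, 20));        % 20 evenly spaced key points
keyVals = lineValues(xs);                 % the small representation: 20 doubles
approx = spline(xs, keyVals, 1:2500);     % rough reconstruction, if I ever need it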

+2  A: 

Basically, any graphics compression algorithm is going to do what you want. They're heavily optimized for compressing 2D arrays of numbers into the smallest possible footprint.

Edited to add:

The other thing to consider, since you're looking to use compression to reduce processing time, is that getting really high compression ratios generally involves more calculation to compress and decompress the array. You may reach a point where you're spending more time compressing and decompressing the array than running the neural network.

Edited again to add:

Based on your comments, it sounds like what you may want is a space-filling curve. Use the curve to turn your 50x50* array into a 1x2500 line and then come up with a formula that approximates the values you want for each cell of the array.

*Does the array have to be 50x50? It may be much easier to fill with a space-filling curve if it's a square of slightly different dimensions. The Hilbert curve works nicely for dimensions that are powers of two, for instance.
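
The 2D-to-1D walk itself is short enough to sketch here (a rough, untested MATLAB translation of the standard distance-to-coordinates conversion; it assumes you pad your 50x50 map out to 64x64 with zeros, and all the names are mine, not from any library):

% in hilbert_d2xy.m: convert a 0-based distance d along the Hilbert curve
% into 0-based (x, y) coordinates on an n-by-n grid, n a power of two
function [x, y] = hilbert_d2xy(n, d)
    x = 0; y = 0;
    t = d;
    s = 1;
    while s < n
        rx = bitand(1, floor(t / 2));
        ry = bitand(1, bitxor(t, rx));
        if ry == 0                       % rotate/flip the quadrant
            if rx == 1
                x = s - 1 - x;
                y = s - 1 - y;
            end
            tmp = x; x = y; y = tmp;     % swap x and y
        end
        x = x + s * rx;
        y = y + s * ry;
        t = floor(t / 4);
        s = s * 2;
    end
end

% walking the padded map into a single line:
paddedMap = zeros(64, 64);
paddedMap(1:50, 1:50) = map;             % "map" is the 50x50 heuristic array
line1d = zeros(1, 64 * 64);
for d = 0:(64 * 64 - 1)
    [cx, cy] = hilbert_d2xy(64, d);
    line1d(d + 1) = paddedMap(cx + 1, cy + 1);   % +1 for 1-based indexing
end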

Chris Upchurch
Can I compress it down to two or preferably one number though? I thought most graphics compression algorithms focused on keeping the same rough pixel size, just using smarter ways of storing the colour information.
Aidos
I don't think you're going to be able to get it down to one or two doubles and have it be reversible to something recognizable as your original 50x50 array. That's a 1250:1 compression ratio.
Chris Upchurch
I think I may have been misleading by using the term compression. Basically what I need is some sort of calculation that can give me some values that describe a "map". I could convert these to a simple line graph; I need a way of determining the "function" for this line, like a cubic spline's params?
Aidos
The Hilbert curve definitely sounds interesting, do you know of any decent examples of how to use it for what I am trying to do (i.e. a line approximation)?
Aidos
It's used to translate 2d data into 1d data in Geographic Information Systems (GIS). Unfortunately, I can't really find any good online examples. Everything in the first few pages of Googling is either dead tree books or journal articles behind paywalls.
Chris Upchurch
+1  A: 

One thing you can try is taking the FFT of your 1D line and then removing later (high-frequency) terms. For example, in MATLAB I did the following:

x = [1:1000];
y = rand(1,1000);
f = fft(y);
f = f(1:250); % keep only the first 250 (low-frequency) terms
plot(x,y, x,abs(ifft(f, 1000)));

What tended to happen was that the peaks of the iFFT of f were very close to peaks of y. They weren't necessarily the highest points of y, but they were peaks. For instance, on this run there were peaks at x=424, 475, and 725 in the inverted FFT of f, and there were also peaks in y at x=423, 475, and 726. However, y's global max was at x=503, which was a peak in ifft(f), but not a very high one.

However, this only really cuts your data usage in half, because I converted 1000 doubles into 250 complex values. A further reduction can be obtained by using only the real part of the FFT:

x = [1:1000];
y = rand(1,1000);
f = real(fft(y));
f = f(1:250); % real parts only, so 1/4 the space now
plot(x,y, x,abs(ifft(f, 1000)));

This still yielded pretty good results, with each major peak of ifft(f) corresponding to a peak in y that was usually at most a distance of 2 away, and you use 1/4 the space of storing the doubles directly.

However, this still doesn't get you down to "one or two double values". You are now packing 2500 doubles into 625. You can experiment by cutting more terms, but the more terms you cut, the more values you will have to test "up close". Maybe you can keep the first 10% of terms, find the maximum of the reconstruction, and then look within a distance of 3 or 4 of it; this would reduce your 2500 doubles to a "mere" 250. Only testing will find out what works best for your application.
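
In code, that idea looks something like this (a sketch only; "line1d" stands for your flattened 2500-value map, and the search window is a guess):

F = fft(line1d);
F = F(1:250);                            % keep the first 10% (lowest frequencies)
approx = abs(ifft(F, 2500));             % cheap reconstruction of the line
[~, p] = max(approx);                    % peak of the approximation
window = max(1, p-4) : min(2500, p+4);   % check the true values near that peak
[~, k] = max(line1d(window));
bestIndex = window(k);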

If you're really desperate, you can go as low as the lowest 1% of frequencies, and search 5 or 6 positions in either direction for the true peak. But this still leaves you with 25 doubles.

I don't think there's any way to convert 2500 doubles into only 1 or 2 and have it be reversible into anything meaningful. Take a look at information theory texts to see why. I suggest you get MATLAB, GNU Octave, or even Excel, and play around with something like this to find what results are best for you.

rlbond