Hi Guys, just wondering if anyone has any ideas about an issue I'm having.

I have a fair amount of data that needs to be displayed on one graph. Two theoretical lines that are bold and solid are displayed on top, then 10 experimental data sets that converge to these lines are graphed, each using a different marker (e.g. +, o, or a square). These graphs are on a log scale that goes up to 1e6. The first few decades of the graph (< 1e3) look fine, but as all the datasets converge (> 1e3) it's really difficult to see which data is which.

There are over 1000 data points per decade, which I can prune linearly to an extent, but if I prune too much, the resolution at the lower end of the graph will suffer.

What I'd like to do is prune logarithmically, strongest at the high end, working back to 0. My question is: how can I get a logarithmically scaled index vector rather than a linear one?

My initial assumption was that, as my data is linear, I could just use a linear index to prune, which led to something like this (but for all decades):

% grab indices per decade (half-open intervals so no point is counted twice)
ind12 = find(y >= 1e1 & y < 1e2);
indlow = find(y < 1e2);
indhigh = find(y > 1e4);
ind23 = find(y >= 1e2 & y < 1e3);
ind34 = find(y >= 1e3 & y <= 1e4);

% we want length(ind12) points in each higher decade, so find the spacing
tot23 = round(length(ind23)/length(ind12));
tot34 = round(length(ind34)/length(ind12));

% grab the ones to keep (assumes each decade's indices are contiguous)
ind23keep = ind23(1):tot23:ind23(end);
ind34keep = ind34(1):tot34:ind34(end);

indnew = [indlow' ind23keep ind34keep indhigh'];

loglog(x(indnew), y(indnew));

But this obviously makes the pruning jumpy. Each decade has the number of points that I'd like, but because the points are linearly distributed within a decade, they clump at the high end of each decade on the log scale.

Any ideas on how I can do this?

+1  A: 

The way I understand the problem is that your x-values are linearly spaced, so that if you plot them logarithmically, there are way more data points in 'higher' decades, and the markers lie extremely close to one another. For example, if x goes from 1 to 1000, there are 10 points in the first decade, 90 in the second, and 900 in the third. You want to have, say, 3 points per decade instead.

I see two ways to solve the problem. The easier one is to use differently colored lines instead of different markers. Thus, you don't sacrifice any data points, and you can still distinguish everything.

The second solution is to create an unevenly spaced index. Here's how you can do that.

%# create some data
x = 1:1000;
y = 2.^x;

%# plot the graph and see the dots 'coalesce' very quickly
figure,loglog(x,y,'.')

%# for the example, I use a step size of 0.7, which is approximately `log(2)`
xx = 0.7:0.7:log(x(end)); %# this is where I want the data to be plotted

%# find the indices where we want to plot by finding the closest `log(x)`-values
%# run unique to avoid multiples of the same index
indnew = unique(interp1(log(x),1:length(x),xx,'nearest'));

%# plot with fewer points
figure,loglog(x(indnew),y(indnew),'.')
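
If you'd rather think in points per decade (the "3 points per decade" mentioned above), the same nearest-neighbour idea can be sketched with `log10` instead of the natural log. This is only an illustration under the same assumption of linearly spaced x-values; the names `ptsPerDecade` and `targets` are made up for this example.

```matlab
%# sketch: same nearest-neighbour lookup, parameterised in points per decade
x = 1:1000;
y = 2.^x;

ptsPerDecade = 3;                            %# hypothetical parameter
targets = 0:1/ptsPerDecade:log10(x(end));    %# evenly spaced in log10(x)

%# nearest data index for each target; 'extrap' guards against a target that
%# lands a rounding error outside the data range, unique drops repeated indices
indnew = unique(interp1(log10(x),1:length(x),targets,'nearest','extrap'));

%# plot with roughly ptsPerDecade points in each decade
figure,loglog(x(indnew),y(indnew),'.')
```

In the lowest decade several targets can map to the same data point, so you may end up with fewer points there than requested.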
Jonas
The journal I'm submitting to doesn't allow colour, hence the markers - but your solution worked perfectly; thanks!
Geodesic
Good luck with the publication!
Jonas
+3  A: 

I think the easiest way to do this would be to use the LOGSPACE function to generate a set of indices into your data. For example, to create a set of 100 points logarithmically spaced from 1 to N (the number of points in your data), you can try the following:

indnew = round(logspace(0,log10(N),100));  %# Create the log-spaced index
indnew = unique(indnew);                   %# Remove duplicate indices
loglog(x(indnew),y(indnew));               %# Plot the indexed data

Creating a logarithmically-spaced index like this will result in fewer values being chosen from the end of the vector relative to the start, thus pruning values more severely towards the end of the vector and improving the appearance of the log plot. It would therefore be most effective with vectors that are sorted in ascending order.
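
As a rough check of that claim, the following sketch counts how many of the selected points land in each decade. It assumes linearly spaced x-values `x = 1:N` with `N = 1e6` (both chosen just for illustration), and `counts` is a hypothetical helper variable.

```matlab
%# sketch: count how many selected points fall in each decade of x
N = 1e6;                                     %# assumed number of data points
x = 1:N;                                     %# assumed linearly spaced x-values
indnew = unique(round(logspace(0,log10(N),100)));

%# decade k covers 10^(k-1) <= x < 10^k, so the very last point (x = N)
%# is not counted in any bin
xs = x(indnew);
counts = zeros(1,6);
for k = 1:6
    counts(k) = sum(xs >= 10^(k-1) & xs < 10^k);
end
disp(counts)
```

The counts should come out roughly equal from decade to decade, apart from the first decade, where rounding produces duplicate indices that UNIQUE removes.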

gnovice
Very nice! I learn something new every day.
Jonas
Yes! I was trying to use logspace initially, but had no idea how to implement it correctly for this task. Thanks! Although Jonas' solution works, this is probably more elegant.
Geodesic