views:

195

answers:

7

I have a LINQ expression that's slowing down my application. I'm drawing a control, but to do this, I need to know the max width of the text that will appear in my column.

The way I'm doing that is this:

return Items.Max(w => TextRenderer.MeasureText((w.RenatlUnit == null)? "" : 
w.RenatlUnit.UnitNumber, this.Font).Width) + 2;

However, this iterates over ~1000 Items, and takes around 20% of the CPU time that is used in my drawing method. To make it worse, there are two other columns that this must be done with, so this LINQ statement on all the items/columns takes ~75-85% of the CPU time.

TextRenderer is from System.Windows.Forms package, and because I'm not using a monospaced font, MeasureText is needed to figure out the pixel width of a string.

How might I make this faster?

+6  A: 

I don't believe that your problem lies in the speed of LINQ, it lies in the fact that you're calling MeasureText over 1000 times. I would imagine that taking your logic out of a LINQ query and putting it into an ordinary foreach loop would yield similar run times.

A better idea is probably to employ a little bit of sanity checking around what you're doing. If you go with reasonable inputs (and disregard the possibility of linebreaks), then you really only need to measure the text of strings that are, say, within 10% or so of the absolute longest (in terms of number of characters) string, then use the maximum value. In other words, there's no point in measuring the string "foo" if the largest value is "paleontology". There's no font that has widths THAT variable.

Adam Robinson
Yes, I believe the MeasureText is also the problem, but what do you suggest to remedy this situation? I've considered searching for the longest string, and then calling MeasureText once on it, but the longest string character wise may not be the longest string pixel wise
Malfist
@Malfist: See my edit.
Adam Robinson
@Malfist Take the strings with lengths at 90%-100% of the longest string, then find the MeasureText value of those.
Jake
You're assuming some sort of uniform distribution. If all of the strings are close to the longest length this doesn't really buy you anything. You'd need to take a fixed number and then it would be dependent on which strings you took.
tvanfosson
@tvanfosson: Every solution here is a compromise and will be better suited to a particular set of circumstances. Without more information, there's no reason to believe that this--or any other of the suggested options--is any better or worse.
Adam Robinson
Downvoter care to comment?
Adam Robinson
@Adam -- no quibble here. Just wanted to point out an unstated assumption.
tvanfosson
I voted you down when you originally posted the answer with just your first paragraph, since it doesn't actually propose any solution. I'd take it away now, but it won't let me -- I guess you can't change votes after a few minutes anymore. I agree that your second paragraph sounds like a reasonable course of action.
mquander
My profiler says this resulted in a 1.2% increase
Malfist
It's frankly pretty silly for you to post that result without even telling us basic facts about your data, like how long your strings are and the distribution of the lengths, or about your methodology, like how many `MeasureText` calls you eliminated.
mquander
The strings can be between 0-512 characters in length and vary substantially. I used linq to sort and select the top 10% of strings in length, and then did the MeasureText on it.
Malfist
@Malfist: You should post this code. It's likely that your performance degradation is being caused by repeated operations.
Adam Robinson
int top = (int)(Items.Count * .1); var x = (from w in Items orderby (((Tenant)w).RenatlUnit == null ? 0 : ((Tenant)w).RenatlUnit.UnitNumber.Length) select w).Take(top); return x.Max(w => TextRenderer.MeasureText((w.RenatlUnit == null)? "" : w.RenatlUnit.UnitNumber, this.Font).Width) + 2;
Malfist
I redid the testing, that code is slightly different than my earlier testing, it reduced the time spent in the functions to 12.7% of the original.
Malfist
Which is fast enough.
Malfist
+1  A: 

Unfortunately, it doesn't look like LINQ is your problem. If you ran a for loop and did this same calculation, the amount of time would be the same order of magnitude.

Have you considered running this calculation on multiple threads? It would work nicely with Parallel LINQ.

Edit: It seems Parallel LINQ won't work because MeasureText is a GDI function and will simply be marshaled back to the UI thread (thanks @Adam Robinson for correcting me.)

Bryan Watts
@Bryan: `MeasureText` is a GDI function. The calls would just have to be marshaled back to the UI thread.
Adam Robinson
@Adam Robinson: Good point. I will leave this answer to show it is not a valid option.
Bryan Watts
A: 

If you can't figure out how to make MeasureText faster, you could precalculate the width of all the characters in your font size and style and estimate the width of a string like that, although kerning of character pairs would suggest that it would probably be only an estimate and not precise.

mquander
+6  A: 

It's the MeasureText method that takes time, so the only way to increase the speed is to do less work.

You can cache the results of the call to MeasureText in a dictionary, that way you don't have to remeasure strings that already has been measured before.

You can calculate the values once and keep along with the data to display. Whenever you change the data, you recalculate the values. That way you don't have to measure the strings every time the control is drawn.

Guffa
+1; Caching is also a good option, assuming the values change infrequently compared to the number of times the same string is measured.
Adam Robinson
You'll get some initial delay and it doesn't really scale -- say you're displaying 1000 out of maybe 10,000,000 or 100,000,000 items (like tweets) -- but caching would be acceptable under a lot of conditions.
tvanfosson
The values change frequently, but less frequently than they are searched. I'll see what I can do with this.
Malfist
+1  A: 

My guess is the issues is not the LINQ expression but calling the MeasureText several thousand times.

I think you could work around the non-monospaced font issue by breaking the problem into 4 parts.

  1. Find the biggest number in terms of render size
  2. Find the apartment unit with the most digits
  3. Create a string with all values being the value determined in #1 and having size in #2.
  4. Pass the value created in #3 to MeasureText and use that as your basis

This won't yield a perfect solution but it will ensure that you reserve at least enough space for your item and avoids the pitfall of calling MeasureText far too many times.

JaredPar
A: 

You might want to consider as an approximation taking the length of the longest string and then finding the width of a string of that length of 0's (or whatever the widest digit is, I can't remember). That should be a much faster method, but it would only be an approximation and probably longer than necessary.

var longest = Items.Max( w => w.RenatlUnit == null
                                  || w.RenatlUnit.UnitNumber == null)
                              ? 0
                              : w.RenatlUnit.UnitNumber.Length );
if (longest == 0)
{
    return 2;
}
return TextRenderer.MeasureText( new String('0', longest ) ).Width + 2;
tvanfosson
The widest is usually `W`, though there's nothing in particular that *requires* that. `i` and `l` are, incidentally, usually the narrowest (most narrow?)
Adam Robinson
@Adam -- just noticed that it's unit numbers.
tvanfosson
You could probably just do something like this, given that: `var colWidth = "0123456789".Max(d => TextRenderer.MeasureText(d.ToString(), this.Font).Width) * Items.Max(w => w.ReatlUnit == null || w.RenatlUnit.UnitNumber == null ? 0 : w.RenatlUnit.UnitNumber.Length);`
Adam Robinson
UnitNumber is just one of the columns being measured, TenantName and PropertyName are also being measured
Malfist
@Malfist - then use the same technique on each one of those with the appropriate widest letter.
tvanfosson
+3  A: 

Step 0: Profile. Assuming you find that most of the execution time is indeed in MeasureText, then you can try the following to reduce the number of calls:

  1. Compute the lengths of all individual characters. Since it sounds like you're rendering a number, this should be a small set.
  2. Estimate the length numstr.Select(digitChar=>digitLengthDict[digitChar]).Sum()
  3. Take the strings with the top N lengths, and measure only those.
  4. To avoid even most of the cost of the lookup+sum, also filter to include only those strings within 90% of the maximum string-length, as suggested.

e.g. Something like...

// somewhere else, during initialization - do only once.
var digitLengthDict = possibleChars.ToDictionary(c=>c,c=>TextRenderer.MeasureText(c.ToString()));

//...

var relevantStringArray = Items.Where(w=>w.RenatlUnit!=null).Select(w.RenatlUnit.UnitNumber).ToArray();

double minStrLen = 0.9*relevantStringArray.Max(str => str.Length);

return (
    from numstr in relevantStringArray 
    where str.Length >= minStrLen
    orderby numstr.Select(digitChar=>digitLengthDict[digitChar]).Sum() descending
    select TextRenderer.MeasureText(numstr)
    ).Take(10).Max() + 2;

If we knew more about the distribution of the strings, that would help.

Also, MeasureText isn't magic; it's quite possible you can duplicate it's functionality entirely quite easily for a limited set of inputs. For instance, it would not surprise me to learn that the Measured length of a string is precisely equal to the sum of the length of all characters in the string, minus the kerning overhang of all character bigrams in the string. If your string then consists of, say, 0-9, +, -, ,, ., and a terminator symbol, then a lookup table of 14 character widths and 15*15-1 kernel corrections might be enough to precisely emulate MeasureText at a far greater speed, and without much complexity.

Finally, the best solution is to not solve the problem at all - perhaps you can rearchitect the application to not require such a precise number - if a simpler estimate were to suffice, you could avoid MeasureText almost completely.

Eamon Nerbonne
This is a sufficiently complete summary that I upvoted it even though most of the suggestions are covered in other responses.
mquander