Given the following input and a 60% tolerance

"STACKOVERflow is a quesTions and ANSwers weBSITE"

I expect the following output

// Extra spaces just to show %s
// 69%          50%  100%  22%!      33% 42%     14%
"Stackoverflow  Is   A     QuesTions And ANSwers Website"

"quesTions" and "ANSwers" contain uppercase characters, but those make up less than 60% of each word, so their casing should be kept. After that, I want to convert the first character of each word to uppercase.

I'm currently doing it with this method

public static class StringExtender
{
    public static string ToTitleCase(this string str, double preserve)
    {
        return String.Join(" ",
            str.Split(' ')
            .Select(x => (x.Count(y => y.ToString() == y.ToString().ToUpper()) / (double)x.Length * 100) > preserve ? x.ToLower() : x)
            .Select(x =>
                String.Join(String.Empty,
                    x.Select((y, z) => z == 0 ? y.ToString().ToUpper() : y.ToString()).ToArray()
                )
            ).ToArray()
        );
    }
}

The first time it runs I get 15000 ticks (Stopwatch.ElapsedTicks), and subsequent runs take about 300. It seems the first run does some kind of compilation...

  • Is there some way to compile it ahead of time rather than at runtime, so the first run is as fast as the following ones?
  • Is there a way to optimize this code even more?

Full code (to include measurement methods)

using System;
using System.Diagnostics;
using System.Linq;

public static class StopwatchExtender
{
    public static void Timer(this Stopwatch sw, Action x, int iterations, string name)
    {
        sw.Start();
        for (int i = 0; i < iterations; ++i)
        {
            x();
        }
        sw.Stop();

        Console.WriteLine("Name: {0}\nTicks: {1}\n", name, sw.ElapsedTicks);

        sw.Reset();
    }
}

public static class StringExtender
{
    public static string OP(this string str, double preserve)
    {
        return String.Join(" ",
            str.Split(' ')
            .Select(x => (x.Count(y => y.ToString() == y.ToString().ToUpper()) / (double)x.Length * 100) > preserve ? x.ToLower() : x)
            .Select(x =>
                String.Join(String.Empty,
                    x.Select((y, z) => z == 0 ? y.ToString().ToUpper() : y.ToString()).ToArray()
                )
            ).ToArray()
        );
    }

    public static string A01(this string str, double preserve)
    {
        return string.Join(" ",
            str.Split(' ')
                .Select(s => char.ToUpper(s[0]) + ((s.Count(c => char.IsUpper(c)) / (double)s.Length * 100) > preserve ? s.Substring(1).ToLower() : s.Substring(1)))
                .ToArray()
            );
    }
}

public class Program
{
    static void Main()
    {
        var sw = new Stopwatch();

        var str = "STACKOVERflow is a quesTions and ANSwers weBSITE";

        sw.Timer(() =>
        {
            str.OP(60);
            str.A01(60);
        }, 1, "Startup takes more time");

        sw.Timer(() =>
        {
            str.OP(60);
        }, 1000000, "OP solution");

        sw.Timer(() =>
        {
            str.A01(60);
        }, 1000000, "LukeH's answer");

        Console.ReadLine();
    }
}

Results

[Image: results]

+1  A: 

This doesn't really answer your question, but a `ToTitleCase()` method has been in the .NET Framework for some time. Check out the `TextInfo` class: http://msdn.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase.aspx

Kane
For the string `TITLECASEIT` it doesn't work. For `tITLEcaseit` it would lowercase everything and uppercase the first letter. So in short, I guess this method is useless to me, unless the tolerance is `0%`.
BrunoLM
+2  A: 
public static string ToTitleCase(this string str, double preserve)
{
    return string.Join(" ",
        str.Split(' ')
           .Select(s => s.Length == 0 ? s : char.ToUpper(s[0]) + ((s.Count(c => char.IsUpper(c)) / (double)s.Length * 100) > preserve ? s.Substring(1).ToLower() : s.Substring(1)))
           .ToArray());
}

(And don't forget to remove that final ToArray call if you're using .NET 4.)

LukeH
Worse, I'm getting `3000` ticks (after the first time). My method runs at `300` ticks after the first time.
BrunoLM
@Bruno: Really? In my tests it runs 4x *faster* than your version (even after I added the extra check to stop it crashing on empty strings).
LukeH
@Luke: How are you testing? I've posted my full code, which uses an implementation of `Stopwatch with lambda`: http://stackoverflow.com/questions/232848/wrapping-stopwatch-timing-with-a-delegate-or-lambda/232870#232870
BrunoLM
@Luke: Oops, my mistake. I hadn't put your code in a method; I tested it inline, so it ran into the `first time run` problem. Sorry. Thanks for the code.
BrunoLM
@Luke: Do you have any idea why the first run takes so long to execute?
BrunoLM
@Bruno: It's almost certainly because the jitter is compiling the method on first use. Is that really a problem for you? If so you could consider calling the method once at app startup, causing it to be jitted then. Or take a look at NGen: http://msdn.microsoft.com/en-us/library/6t9t5wcf.aspx
LukeH
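The warm-up idea from the comment above could look roughly like the following. This is only a sketch: the `WarmUp` class and its `TicksFor` and `Run` helpers are hypothetical names introduced for illustration, and the assumption (per the comment) is that the extra cost of the first call comes from JIT compilation.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class StringExtender
{
    // The method from the answer above, unchanged.
    public static string ToTitleCase(this string str, double preserve)
    {
        return string.Join(" ",
            str.Split(' ')
               .Select(s => s.Length == 0 ? s : char.ToUpper(s[0]) + ((s.Count(c => char.IsUpper(c)) / (double)s.Length * 100) > preserve ? s.Substring(1).ToLower() : s.Substring(1)))
               .ToArray());
    }
}

public static class WarmUp
{
    // Hypothetical helper: measures a single invocation in Stopwatch ticks.
    public static long TicksFor(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    // Hypothetical helper: call once at app startup so the one-time cost
    // (assumed to be JIT compilation) is paid before the first real use.
    public static void Run()
    {
        "warm up".ToTitleCase(60);
    }
}

public class Program
{
    public static void Main()
    {
        var str = "STACKOVERflow is a quesTions and ANSwers weBSITE";

        // Time two consecutive calls; the first one includes any one-time cost.
        long first = WarmUp.TicksFor(() => str.ToTitleCase(60));
        long second = WarmUp.TicksFor(() => str.ToTitleCase(60));

        Console.WriteLine("First call:  {0} ticks", first);
        Console.WriteLine("Second call: {0} ticks", second);
    }
}
```

Calling `WarmUp.Run()` at the start of `Main` should move the one-time cost out of the first measured call. The actual tick counts depend on the machine and runtime, so none are shown here.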
I don't believe it's the JIT. I believe it's LINQ building the expression for the first time.
Augusto Radtke
@Augusto: This is LINQ-to-Objects. There is no expression that needs building at run-time.
LukeH
Since I can remove extra spaces, I changed the split to `.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)`, which fixed the `IndexOutOfRangeException` crashes caused by empty strings. Thanks for the hints.
BrunoLM
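The split change described in the last comment, combined with the method from the answer, might look like this. It is a minimal sketch rather than the code actually used: the class name `StringExtenderNoEmpty` is made up for illustration, and the empty-string check is dropped on the assumption that `RemoveEmptyEntries` never yields an empty word.

```csharp
using System;
using System.Linq;

public static class StringExtenderNoEmpty
{
    public static string ToTitleCase(this string str, double preserve)
    {
        // RemoveEmptyEntries discards the empty strings that consecutive,
        // leading, or trailing spaces would otherwise produce, so s[0] is
        // always a valid index below.
        var words = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        return string.Join(" ",
            words.Select(s => char.ToUpper(s[0]) + ((s.Count(c => char.IsUpper(c)) / (double)s.Length * 100) > preserve ? s.Substring(1).ToLower() : s.Substring(1)))
                 .ToArray());
    }
}
```

A side effect of this approach is that runs of spaces in the input collapse to a single space in the output: `"  STACKOVERflow   is a  weBSITE ".ToTitleCase(60)` returns `"Stackoverflow Is A Website"`.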