Hello, I am attempting to build a string extension method that trims a string to a certain length without breaking a word. I wanted to check whether there is anything built into the framework, or a more clever method than mine. Here's mine so far (not thoroughly tested):

public static string SmartTrim(this string s, int length)
{
    StringBuilder result = new StringBuilder();

    if (length >= 0)
    {
        if (s.IndexOf(' ') > 0)
        {
            string[] words = s.Split(' ');
            int index = 0;

            while (index < words.Length - 1 && result.Length + words[index + 1].Length <= length)
            {
                result.Append(words[index]);
                result.Append(" ");
                index++;
            }

            if (result.Length > 0)
            {
                result.Remove(result.Length - 1, 1);
            }
        }
        else
        {
            result.Append(s.Substring(0, length));
        }
    }
    else
    {
        throw new ArgumentOutOfRangeException("length", "Value cannot be negative.");
    }

    return result.ToString();
}
+2  A: 

How about a Regex-based solution? You will probably want to test it some more and add some bounds checking, but this is what springs to mind:

using System;
using System.Text.RegularExpressions;

namespace Stackoverflow.Test
{
    static class Test
    {
        private static readonly Regex regWords = new Regex("\\w+", RegexOptions.Compiled);

        static void Main()
        {
            Console.WriteLine("The quick brown fox jumped over the lazy dog".SmartTrim(8));
            Console.WriteLine("The quick brown fox jumped over the lazy dog".SmartTrim(20));
            Console.WriteLine("Hello, I am attempting to build a string extension method to trim a string to a certain length but with not breaking a word. I wanted to check to see if there was anything built into the framework or a more clever method than mine".SmartTrim(100));
        }

        public static string SmartTrim(this string s, int length)
        {
            var matches = regWords.Matches(s);
            foreach (Match match in matches)
            {
                if (match.Index + match.Length > length)
                {
                    int ln = match.Index + match.Length > s.Length ? s.Length : match.Index + match.Length;
                    return s.Substring(0, ln);
                }
            }
            return s;
        }
    }
}
driis
+1  A: 

Try this out. It's null-safe, won't break if length is longer than the string, and involves less string manipulation.

Edit: Per recommendations, I've removed the intermediate string. I'll leave the answer up as it could be useful in cases where exceptions are not wanted.

public static string SmartTrim(this string s, int length)
{
    if(s == null || length < 0 || s.Length <= length)
        return s;

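    // Edit à la Jon Skeet: removes the unnecessary intermediate string. Thanks!
    // string temp = s.Length > length + 1 ? s.Remove(length+1) : s;
    // Search backwards from index `length` (a valid index because s.Length > length),
    // so the result never exceeds `length` characters.
    int lastSpace = s.LastIndexOf(' ', length);
    return lastSpace < 0 ? string.Empty : s.Remove(lastSpace);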
}
kbrimington
Not bad, but still creates one intermediate string in some cases :)
Jon Skeet
I think you can do this too: `s.LastIndexOf(' ', length);` And you don't have to do your `string temp = ...` line.
mlsteeves
@mlsteeves: Agreed. @Jon's solution handles `LastIndexOf` better. I hadn't known about the other overload.
kbrimington
+7  A: 

I'd use string.LastIndexOf - at least if we only care about spaces. Then there's no need to create any intermediate strings...

As yet untested:

public static string SmartTrim(this string text, int length)
{
    if (text == null)
    {
        throw new ArgumentNullException("text");
    }
    if (length < 0)
    {
        throw new ArgumentOutOfRangeException();
    }
    if (text.Length <= length)
    {
        return text;
    }
    int lastSpaceBeforeMax = text.LastIndexOf(' ', length);
    if (lastSpaceBeforeMax == -1)
    {
        // Perhaps define a strategy here? Could return empty string,
        // or the original
        throw new ArgumentException("Unable to trim word");
    }
    return text.Substring(0, lastSpaceBeforeMax);        
}

Test code:

public class Test
{
    static void Main()
    {
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(20));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(3));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(4));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(5));
        Console.WriteLine("'{0}'", "foo bar baz".SmartTrim(7));
    }
}

Results:

'foo bar baz'
'foo'
'foo'
'foo'
'foo bar'
Jon Skeet
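The comment in the code above leaves open what should happen when no space is found before the limit. As a rough sketch only, one way to make that choice explicit is to pass it in. The names `NoBreakBehavior` and `SmartTrimWithFallback` are hypothetical and not part of the answer or the framework:

```csharp
using System;

// Hypothetical names, for illustration only.
public enum NoBreakBehavior { ReturnEmpty, ReturnOriginal, Throw }

public static class SmartTrimFallbackSketch
{
    public static string SmartTrimWithFallback(this string text, int length, NoBreakBehavior behavior)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (text.Length <= length) return text;

        // length is a valid start index here because text.Length > length.
        int lastSpace = text.LastIndexOf(' ', length);
        if (lastSpace >= 0) return text.Substring(0, lastSpace);

        // No space at or before the limit: apply the caller's chosen strategy.
        switch (behavior)
        {
            case NoBreakBehavior.ReturnEmpty: return string.Empty;
            case NoBreakBehavior.ReturnOriginal: return text;
            default: throw new ArgumentException("Unable to trim word", "text");
        }
    }
}
```

With this sketch, `"foobar".SmartTrimWithFallback(3, NoBreakBehavior.ReturnEmpty)` yields an empty string, while `NoBreakBehavior.ReturnOriginal` hands back `"foobar"` unchanged.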
So how do you refactor if the requirement is any word break, not just a space? Specifically, the most common case (a character where a word could break without any white-space around it) is the hyphen... Just curious.
AllenG
@AllenG: If it's still in a small set, `text.LastIndexOfAny(Delimiters)` would be the best option.
Jon Skeet
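A minimal sketch of the `LastIndexOfAny` suggestion from the comment above. The delimiter set (space and hyphen), the method name `SmartTrimAny`, and the choice to return an empty string when nothing matches are all assumptions made for illustration:

```csharp
using System;

public static class SmartTrimDelimitersSketch
{
    // Assumed delimiter set; adjust to whatever counts as a word break.
    private static readonly char[] Delimiters = { ' ', '-' };

    public static string SmartTrimAny(this string text, int length)
    {
        if (text == null) throw new ArgumentNullException("text");
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (text.Length <= length) return text;

        // Searches backwards from index `length`, which is valid because text.Length > length.
        int lastBreak = text.LastIndexOfAny(Delimiters, length);

        // No delimiter found: this sketch returns an empty string.
        return lastBreak < 0 ? string.Empty : text.Substring(0, lastBreak);
    }
}
```

Note that the delimiter itself is dropped, so `"well-known phrase".SmartTrimAny(7)` gives `"well"` rather than `"well-"`. Keeping a trailing hyphen would need an extra check.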
A: 
string strTemp = "How are you doing today";
int nLength = 12;
strTemp = strTemp.Substring(0, strTemp.Substring(0, nLength).LastIndexOf(' '));

I think that should do it. When I ran that, it ended up with "How are you".

So your function would be:

public static string SmartTrim(this string s, int length)
{
    return s.Substring(0, s.Substring(0, length).LastIndexOf(' '));
}

I would definitely add some exception handling though, such as making sure the integer length is no greater than the string length and not less than 0.

XstreamINsanity
This will fail in various cases, e.g. if the length is longer than you need, or is one word of exactly the right length, or can't be successfully trimmed.
Jon Skeet
Yeah, you posted that comment as I was making the edit. :) I figured I would leave the exception handling to him.
XstreamINsanity
+1  A: 

Obligatory LINQ one-liner, if you only care about whitespace as a word boundary:

return new String(s.TakeWhile((ch,idx) => (idx < length) || (idx >= length && !Char.IsWhiteSpace(ch))).ToArray());
driis
A: 

I'll toss in some LINQ goodness even though others have answered this adequately:

public string TrimString(string s, int maxLength)
{
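    // Position of the last whitespace character at or before maxLength (0 if none).
    // LastOrDefault is used because SingleOrDefault throws when more than one element matches.
    var pos = s.Select((c, idx) => new { Char = c, Pos = idx })
        .Where(item => char.IsWhiteSpace(item.Char) && item.Pos <= maxLength)
        .Select(item => item.Pos)
        .LastOrDefault();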

    return pos > 0 ? s.Substring(0, pos) : s;
}

I left out the parameter checking that others have merely to accentuate the important code...

joshperry