Here is what I'm trying to accomplish. I have an object coming back from the database with a string description. The description can be up to 1000 characters long, but we only want to display a short preview of it. I coded up the following, but I'm having trouble cutting the text down to a fixed number of words once the regular expression has counted them. Does anyone have a good way of displaying only the first words found by Regex.Matches, up to some limit?

Thanks!

if (!string.IsNullOrEmpty(myObject.Description))
{
    string original = myObject.Description;
    MatchCollection wordColl = Regex.Matches(original, @"[\S]+");
    if (wordColl.Count < 70) // 70 words?
    {
        uxDescriptionDisplay.Text = 
             string.Format("<p>{0}</p>", myObject.Description);
    }
    else
    {                        
        string shortendText = original.Remove(200); // 200 characters?
        uxDescriptionDisplay.Text = 
              string.Format("<p>{0}</p>", shortendText);
    }
}

EDIT:

So this is what I got working on my own:

else 
{
    int count = 0;
    StringBuilder builder = new StringBuilder();
    string[] workingText = original.Split(' ');
    foreach (string word in workingText)
    {
        if (count < 70)
        {
            builder.AppendFormat("{0} ", word);
        }
        count++;
    }
    string shortendText = builder.ToString();
}

It's not pretty, but it worked. I would call it a pretty naive way of doing this. Thanks for all of the suggestions!

+5  A: 

I would opt to go by a strict character count rather than a word count because you might happen to have a lot of long words.

I might do something like (pseudocode)

if text.Length > someLimit
   find first whitespace after someLimit (or perhaps last whitespace immediately before)
   display substring of text 
else 
   display text

Possible code implementation:

string TruncateText(string input, int characterLimit)
{
    if (input.Length > characterLimit)
    {
        // find last whitespace immediately before limit
        int whitespacePosition = input.Substring(0, characterLimit).LastIndexOf(" ");

        // or find first whitespace after limit (what is spec?)
        // int whitespacePosition = input.IndexOf(" ", characterLimit); 

        if (whitespacePosition > -1)
            return input.Substring(0, whitespacePosition);
    }
    return input;
}
Anthony Pegram
Probably more than what I wanted to do, but well worth knowing about.
Chris
+3  A: 

One method, if you're using at least C# 3.0, would be a LINQ query like the following. This assumes you're going strictly by word count, not character count.

if (wordColl.Count > 70)
{
    foreach (var subWord in wordColl.Cast<Match>().Select(r => r.Value).Take(70))
    {
        //Build string here out of subWord
    }
}

I did a test using a simple Console.WriteLine with your Regex and your question body (which is over 70 words, it turns out).
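The "build string here" step above could be filled in with string.Join. This is only a sketch under a few assumptions: the class name WordTruncator and the method name TakeWords are made up for illustration, and joining with a single space means any line breaks or repeated spaces in the original text are collapsed.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

static class WordTruncator
{
    // Returns the text unchanged if it has at most maxWords words,
    // otherwise the first maxWords words joined by single spaces.
    public static string TakeWords(string text, int maxWords)
    {
        MatchCollection words = Regex.Matches(text, @"\S+");
        if (words.Count <= maxWords)
            return text;

        // ToArray() keeps this usable with the string.Join overloads
        // that existed before .NET 4.
        return string.Join(" ",
            words.Cast<Match>().Select(m => m.Value).Take(maxWords).ToArray());
    }
}
```

With the question's variables this would be called as WordTruncator.TakeWords(myObject.Description, 70).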

ccomet
This is actually the one I went with to replace my kludgey code. Much shorter than what I was working with, and still worked with my collection of words.
Chris
+1  A: 

You can use Regex Capture Groups to hold the match and access it later.

For your application, I'd recommend instead simply splitting the string by spaces and returning the first n elements of the array:

if (!string.IsNullOrEmpty(myObject.Description))
{
    string original = myObject.Description;
    string[] words = original.Split(' ');
    if (words.Length < 70)
    {
        uxDescriptionDisplay.Text = 
             string.Format("<p>{0}</p>", original);
    }
    else
    {                        
        string shortDesc = string.Empty;
        for(int i = 0; i < 70; i++) shortDesc += words[i] + " ";
        uxDescriptionDisplay.Text = 
             string.Format("<p>{0}</p>", shortDesc.Trim());
    }
}
JYelton
Nice score and badges :)
serg
A: 

Do you want to remove 200 characters, or truncate everything from the 200th character onward? Calling original.Remove(200) deletes everything from index 200 to the end of the string, so it keeps only the first 200 characters. To remove a specific number of characters instead, pass a start index and a count:

string shortendText = original.Remove(0,200);

This starts at the first character and removes 200 characters from there. I imagine that's not what you want, since you're shortening a description; it's merely how to use Remove() to delete a given number of characters.
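To make the difference between the two overloads concrete, here is a small sketch. The class and method names (RemoveDemo, KeepFirst, DropFirst) are hypothetical, and the length checks are there because Remove can throw an ArgumentOutOfRangeException when the index falls outside the string.

```csharp
using System;

static class RemoveDemo
{
    // Keeps at most the first `count` characters:
    // Remove(count) deletes everything from index `count` to the end.
    public static string KeepFirst(string text, int count)
    {
        return text.Length > count ? text.Remove(count) : text;
    }

    // Drops the first `count` characters:
    // Remove(0, count) deletes `count` characters starting at index 0.
    public static string DropFirst(string text, int count)
    {
        return text.Length > count ? text.Remove(0, count) : string.Empty;
    }
}
```

For the input "0123456789" and a count of 4, KeepFirst gives "0123" and DropFirst gives "456789". For shortening a description, KeepFirst is the behaviour you would want.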

Instead of using a Regex MatchCollection, why not just split the string? It's a lot easier and more straightforward. You can set the delimiter to a space character and split that way. I'm not sure what the data in your description looks like, so this may not completely cover your need, but it just might. You split this way:

String[] wordArray = original.Split(' ');

From there you can determine the word count with wordArray's Length property value.
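One caveat with Split(' ') is that consecutive spaces produce empty entries, which inflates the count. A possible variation, assuming you want any run of white space to count as one separator, is sketched below. WordCounter and CountWords are illustrative names, not part of the original code.

```csharp
using System;

static class WordCounter
{
    // Counts white-space-delimited words, ignoring empty entries
    // caused by repeated spaces, tabs, or line breaks.
    public static int CountWords(string text)
    {
        if (string.IsNullOrEmpty(text))
            return 0;

        // A null separator array makes Split use white space as the delimiter;
        // RemoveEmptyEntries drops the blanks between consecutive separators.
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

For example, "a  b c" (two spaces after the a) gives 4 entries with Split(' ') but counts as 3 words here.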

jlafay
Actually, I wasn't wanting to deal with the character count. I was really trying to work with the physical word count. But thanks for the suggestion.
Chris
After you split into the wordArray the Length property gets you the word count.
jlafay
A: 

If I were you, I would go by characters, as you may have many one-letter words or many long words in your text.

Go through the text until the character count reaches your limit, then either find the next space and copy everything up to it into a new string (possibly using the Substring method), or cut at the limit and append a few full stops to make the new string. The latter could look unprofessional, I suppose.
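A rough sketch that combines both ideas follows. Note that it cuts at the last space before the limit rather than the next space after it, so the result never runs past the limit, and then it appends the full stops. TextPreview and Shorten are made-up names, and the code assumes characterLimit is a positive number.

```csharp
using System;

static class TextPreview
{
    // Shortens text to roughly characterLimit characters without splitting a word,
    // then appends "..." to show that something was cut off.
    public static string Shorten(string text, int characterLimit)
    {
        if (string.IsNullOrEmpty(text) || text.Length <= characterLimit)
            return text;

        // Search backwards from the limit for the nearest space.
        int cut = text.LastIndexOf(' ', characterLimit);
        if (cut <= 0)
            cut = characterLimit; // no usable space found: hard cut at the limit

        return text.Substring(0, cut).TrimEnd() + "...";
    }
}
```

For example, Shorten("The quick brown fox", 10) gives "The quick...", while text shorter than the limit is returned unchanged.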

Sir Graystar