Hi,

I have this string: " Mimi loves Toto and Tata hate Mimi so Toto killed Tata"

I want to write code that prints only the words that begin with a capital letter, without repetition.

The output should look like this:

Mimi
Toto
Tata

I tried to do this, but I'm sure it's wrong even though no errors are showing.

The code I wrote:

static void Main(string[] args)
        {
            string s = "Memi ate Toto and she killed Tata Memi also hate Biso";
            Console.WriteLine((spliter(s)));
        }



        public static string spliter(string s)
        {

            string x = s;
            Regex exp = new Regex(@"[A-Z]");
            MatchCollection M = exp.Matches(s);

            foreach (Match t in M)
            {

                while (x != null)
                {
                    x = t.Value;  
                }

            }
            return x;
        }


    }
}

Idea:

What if I split the string into an array, then apply a regex to check the words one by one, and then print the results? I don't know. Can anyone help me make this code work?

+6  A: 

I'm not sure why I'm posting this...

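            // RemoveEmptyEntries skips the empty strings that leading or repeated
            // spaces would produce, so word[0] below cannot throw.
            string[] foo = "Mimi loves Toto and Tata hate Mimi so Toto killed Tata"
                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            HashSet<string> words = new HashSet<string>();
            foreach (string word in foo)
            {
                if (char.IsUpper(word[0]))
                {
                    // A HashSet ignores words it already contains.
                    words.Add(word);
                }
            }

            foreach (string word in words)
            {
                Console.WriteLine(word);
            }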
BFree
This might be faster, too. The regex engine is powerful, but it carries a lot of overhead.
Haoest
+2  A: 

I'd suggest using string.Split to separate the string into words, and then just printing the words where char.IsUpper(word[0]) is true.

Something like this

GWLlosa
+1  A: 

Use this regex:

([A-Z][a-z]+)

Explanation:

[A-Z]    [a-z]+
  |        |
Single   Multiple(+)
  |        |
  C      apital   -> Capital

Try out regex here

rizzle
Won't find McDonalds.
David B
Actually, it would, sort of. It would match the Mc and the Donalds.
Joel Coehoorn
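To make the point in the comments above concrete, here is a minimal sketch that runs the unanchored pattern against "McDonalds". The class and method names (McDonaldsDemo, MatchesOf) are made up for illustration; only Regex.Matches and the LINQ calls come from the framework.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class McDonaldsDemo
{
    // Returns every match of the pattern in the input, in order.
    // (Hypothetical helper, not part of any answer in this thread.)
    public static string[] MatchesOf(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // A capital followed by one or more lowercase letters, with no
        // word-boundary anchor, so a match may start in the middle of a word.
        foreach (string piece in MatchesOf(@"[A-Z][a-z]+", "McDonalds"))
            Console.WriteLine(piece);   // prints "Mc", then "Donalds"
    }
}
```

So the word is found, but as two separate matches rather than as one word.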
+1  A: 

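Solution. Notice the use of the built-in string splitter. You could replace the ToUpper comparison by checking whether the first character is between 'A' and 'Z'; as written, it also prints words that start with a digit or punctuation, because ToUpper leaves those characters unchanged. Removing duplicates I leave to you (use a HashSet if you want).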

static void Main(string[] args)
    {
        string test = " Mimi loves Toto and Tata hate Mimi so Toto killed Tata";
        foreach (string j in test.Split(' '))
        {
            if (j.Length > 0)
            {
                if (j.ToUpper()[0] == j[0])
                {
                    Console.WriteLine(j);
                }
            }
        }
        Console.ReadKey(); //Press any key to continue;
    }
Brian
I like this, but it will print duplicates.
xan
duplicates,duplicates,duplicates
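One possible way to handle the duplicates mentioned in the comments is sketched below. It is only a sketch: the names CapitalizedWords and Extract are invented for this example, and it swaps the ToUpper comparison for char.IsUpper so that words starting with a digit or punctuation are not printed.

```csharp
using System;
using System.Collections.Generic;

public static class CapitalizedWords
{
    // Returns the distinct words that start with an uppercase letter,
    // in order of first appearance. (Hypothetical helper for illustration.)
    public static List<string> Extract(string text)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();

        // RemoveEmptyEntries drops the empty strings produced by leading
        // or repeated spaces, so word[0] is always safe.
        string[] words = text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // HashSet.Add returns false when the word was already seen.
            if (char.IsUpper(word[0]) && seen.Add(word))
                result.Add(word);
        }
        return result;
    }

    public static void Main()
    {
        string test = " Mimi loves Toto and Tata hate Mimi so Toto killed Tata";
        foreach (string word in Extract(test))
            Console.WriteLine(word);   // prints Mimi, Toto, Tata (one per line)
    }
}
```

Keeping a separate list next to the set preserves the order in which the words first appear, which a HashSet on its own does not promise.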
+7  A: 

I don't know the C#/.NET regex library at all, but this regex pattern will do it:

\b[A-Z][a-z]+

The \b means the match can only start at the beginning of a word. Change + to * if you want to allow single-letter capitalized words.

Edit: You want to match "McDonald's"?

\b[A-Z][A-Za-z']+

If you don't want to match a ' that appears only at the end of a word, use this:

\b[A-Z][A-Za-z']+(?<!')
ʞɔıu
Like DavidB pointed out for Rizzle above, it won't match McDonald's correctly, though getting closer.
Joel Coehoorn
for McDonnalds \b[A-Z][A-Za-z]+ would be ok, I think
rkj
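For comparison, here is a small sketch that runs the three patterns from the answer above in .NET against one sample sentence. The sentence and the names PatternComparison and MatchesOf are assumptions made up for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Returns every match of the pattern in the input, in order.
    // (Hypothetical helper for illustration.)
    public static string[] MatchesOf(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string input = "McDonald's sold the Smiths' dog";

        // Lowercase letters only after the capital: stops at the inner "D".
        Console.WriteLine(string.Join(" | ", MatchesOf(@"\b[A-Z][a-z]+", input)));
        // prints: Mc | Smiths

        // Letters and apostrophes: keeps the trailing apostrophe too.
        Console.WriteLine(string.Join(" | ", MatchesOf(@"\b[A-Z][A-Za-z']+", input)));
        // prints: McDonald's | Smiths'

        // Negative lookbehind: the match may not end with an apostrophe.
        Console.WriteLine(string.Join(" | ", MatchesOf(@"\b[A-Z][A-Za-z']+(?<!')", input)));
        // prints: McDonald's | Smiths
    }
}
```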
A: 
string foo = "Mimi loves Toto and Tata hate Mimi so Toto killed Tata";
char[] separators = {' '};
IList<string> capitalizedWords = new List<string>();
string[] words = foo.Split(separators);
foreach (string word in words)
{
    char c = char.Parse(word.Substring(0, 1));

    if (char.IsUpper(c))
    {
        capitalizedWords.Add(word);
    }
}

foreach (string s in capitalizedWords)
{
    Console.WriteLine(s);
}
Click the code button (010101) when editing your post.
David B
Tried that but it doesn't come out colorized. Thanks anyway.
It helps if you get rid of all those extra HTML codes: the system handles that for you.
Joel Coehoorn
Highlight your code in Visual Studio (or a text editor capable of indenting a highlighted block), press Tab, then copy it and paste it here.
Michael Buen
+1  A: 

Since others have already posted so much of the answer, I don't feel I'm breaking any homework rules to show this:

//set up the string to be searched
string source =
"First The The Quick Red fox jumped oveR A Red Lazy BRown DOg";

//new up a Regex object.
Regex myReg = new Regex(@"(\b[A-Z]\w*)");

//Get the matches, turn them into strings, de-dupe them
IEnumerable<string> results =
    myReg.Matches(source)
    .OfType<Match>()
    .Select(m => m.Value)
    .Distinct();

//print out the strings.
foreach (string s in results)
    Console.WriteLine(s);
  • For learning the Regex type, you should start here.
  • For learning the Linq in-memory query methods, you should start here.
David B
Your regex will miss the first word, if it's capitalized. Instead of "\s" (whitespace) it should be "\b" (word boundary).
P Daddy
Thanks. Made the change. I missed this case because my first word was repeated. Regex seems to have a lot of special cases.
David B
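The difference between "\s" and "\b" discussed in these comments can be seen with a short sketch. The shortened input and the names BoundaryDemo and DistinctMatches are assumptions for illustration, and the "\s" pattern is a guess at what the earlier version of the answer looked like.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Returns the distinct matches of the pattern, in order of first appearance.
    // (Hypothetical helper for illustration.)
    public static string[] DistinctMatches(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .Distinct()
                    .ToArray();
    }

    public static void Main()
    {
        string input = "First The The Quick";

        // \s needs a whitespace character before the capital, so the first
        // word is skipped and every match includes its leading space.
        foreach (string s in DistinctMatches(@"\s[A-Z]\w*", input))
            Console.WriteLine("[" + s + "]");   // prints [ The] and [ Quick]

        // \b is zero-width and also matches at the start of the string.
        foreach (string s in DistinctMatches(@"\b[A-Z]\w*", input))
            Console.WriteLine("[" + s + "]");   // prints [First], [The], [Quick]
    }
}
```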
+4  A: 

C# 3

        string z = "Mimi loves Toto and Tata hate Mimi so Toto killed Tata";
        var wordsWithCapital = z.Split(' ').Where(word => char.IsUpper(word[0])).Distinct();
        MessageBox.Show( string.Join(", ", wordsWithCapital.ToArray()) );

C# 2

        Dictionary<string,int> distinctWords = new Dictionary<string,int>();
        string[] wordsWithInitCaps = z.Split(' ');
        foreach (string wordX in wordsWithInitCaps)
            if (char.IsUpper(wordX[0]))
                if (!distinctWords.ContainsKey(wordX))
                    distinctWords[wordX] = 1;
                else
                    ++distinctWords[wordX];                       


        foreach(string k in distinctWords.Keys)
            MessageBox.Show(k + ": " + distinctWords[k].ToString());
Michael Buen
Excellent linq-foo :)
Mark Maxham
This code better demonstrates what I meant to say in my answer :)
GWLlosa
A: 

David B's answer is the best one; he takes the word boundary into account. One vote up.

To add something to his answer:

        Func<string,bool,string> CaptureCaps = (source,caseInsensitive) => string.Join(" ", 
                new Regex(@"\b[A-Z]\w*").Matches(source).OfType<Match>().Select(match => match.Value).Distinct(new KeisInsensitiveComparer(caseInsensitive) ).ToArray() );


        MessageBox.Show(CaptureCaps("First The The  Quick Red fox jumped oveR A Red Lazy BRown DOg", false));
        MessageBox.Show(CaptureCaps("Mimi loves Toto. Tata hate Mimi, so Toto killed TaTa. A bad one!", false));


        MessageBox.Show(CaptureCaps("First The The  Quick Red fox jumped oveR A Red Lazy BRown DOg", true));
        MessageBox.Show(CaptureCaps("Mimi loves Toto. Tata hate Mimi, so Toto killed TaTa. A bad one!", true));


class KeisInsensitiveComparer : IEqualityComparer<string>
{
    bool _caseInsensitive;

    public KeisInsensitiveComparer() { }

    public KeisInsensitiveComparer(bool caseInsensitive) { _caseInsensitive = caseInsensitive; }


    // Strings are equal if they match exactly or, when _caseInsensitive
    // is set, if they match after upper-casing both.
    public bool Equals(string x, string y)
    {
        // Same reference (including both null) means equal.
        if (Object.ReferenceEquals(x, y)) return true;

        // Exactly one of them is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        return _caseInsensitive ? x.ToUpper() == y.ToUpper() : x == y;
    }

    // If Equals() returns true for a pair of strings,
    // GetHashCode must return the same value for both.
    public int GetHashCode(string s)
    {
        if (Object.ReferenceEquals(s, null)) return 0;

        // Hash the upper-cased form when ignoring case, so that "Tata" and
        // "TaTa" get the same hash. (XOR-ing two identical hashes, as the
        // earlier version did, always yields 0.)
        return _caseInsensitive ? s.ToUpper().GetHashCode() : s.GetHashCode();
    }

}
Michael Buen
A: 
    static Regex _capitalizedWordPattern = new Regex(@"\b[A-Z][a-z]*\b", RegexOptions.Compiled | RegexOptions.Multiline);

    public static IEnumerable<string> GetDistinctOnlyCapitalizedWords(string text)
    {
        return _capitalizedWordPattern.Matches(text).Cast<Match>().Select(m => m.Value).Distinct();
    }
Konstantin Spirin
A: 

function capitalLetters() {
    var textAreaId = "textAreaId";
    var resultsArray = $(textAreaId).value.match(/\b[A-Z][A-Za-z']+/g);
    displayResults(textAreaId, resultsArray);
}

Sam
This isn't a JavaScript question.
icktoofay