Hi

I have a text file that contains a long list of words. Some of them are duplicates that differ only in case, such as:

  • Honesty
  • honesty

I want to remove the capitalized version and keep only the lowercase one, so that each word is counted once. How can I do that?

Thank you.

A: 

I tried it myself and figured it out:

> cat test
Honesty
World
Hello
world
Hello
honesty

> sort -uf test
Hello
Honesty
World

> sort -uf test | tr A-Z a-z
hello
honesty
world
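
A variant of the same idea, as a rough sketch: lowercasing before sorting means the result does not depend on which of the duplicates `sort -uf` happens to keep. The helper name `unique_lower` is made up for illustration, and the POSIX character classes are used instead of `A-Z a-z` only because ranges can behave differently in some locales.

```shell
# Hypothetical helper: lowercase everything on stdin, then sort and drop exact duplicates.
unique_lower() {
    tr '[:upper:]' '[:lower:]' | sort -u
}

# The same six words as in the sample file above, fed on stdin.
# Prints hello, honesty, world (one per line).
printf '%s\n' Honesty World Hello world Hello honesty | unique_lower
```

On a real file this would be run as `unique_lower < test`.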

Thanks for your help.

jack
A: 
  1. Read a word
  2. Convert it to lower case
  3. Check for duplicates using a set or hash table.

For example, in C++, you could use something like this:

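#include <set>
#include <string>
#include <iostream>
#include <iterator>   // istream_iterator, ostream_iterator, inserter
#include <algorithm>
#include <cctype>

// Function object that returns a lower-cased copy of its argument.
struct lowercase {
    std::string operator()(std::string const &s) const {
        std::string ret(s);
        // Cover the whole string (begin to end) and convert via unsigned char:
        // passing a negative char value to tolower is undefined behavior.
        std::transform(s.begin(), s.end(), ret.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        return ret;
    }
};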

int main() {
    // std::set stores each distinct string once, in sorted order.
    std::set<std::string> items;

    std::transform(
        std::istream_iterator<std::string>(std::cin), 
        std::istream_iterator<std::string>(), 
        std::inserter(items, items.begin()),
        lowercase());

    std::copy(items.begin(), items.end(), 
        std::ostream_iterator<std::string>(std::cout, "\n"));
    return 0;
}
Jerry Coffin