ansaurus

Question

Algorithm for finding characters in the same positions in a list of strings?

Answer 1

+3 A:

Finding a list of characters that are common in ALL strings at a certain position is trivially simple. Just iterate on each string for each character position 1 character position at a time. If any string's character is not the match of it's closest neighbor string's character, then the position does not contain a common character.

For any i = 0 to length -1... Once you find Si[x] != Si+1[x] you can skip to the next position x+1.

Where Si is the ith string in the list. And [x] is the character at position x.

Brian R. Bondy 2008-09-16 01:54:57

Answer 2

+1 A:

Some generic code that has pretty poor performance O(n^2)

str[] = { "Toby", "Tiny", "Tory", "Tily" };
result = null;
largestString = str.getLargestString(); // Made up function
str.remove(largestString)
for (i = 0; i < largestString.length; i++) {
   hits = 0;
   foreach (str as value) {
      if (i < value.length) {
         if (value.charAt(i) == largestString.charAt(i))
            hits++;
      }
   }
   if (hits == str.length)
      result += largestString.charAt(i);
}
print(str.items);

Josh Smeaton 2008-09-16 02:03:49

Answer 3

+1 A:

I can't think of anything especially optimized.

You can do something like this, which shouldn't be too hard:

  //c# -- assuming your strings are in a List<string> named Names
  int shortestLength = Names[0].Length, j;
  char[] CommonCharacters;
  char single;

  for (int i = 1; i < Names.Count; i++)
  {
   if (Names[i].Length < shortestLength) shortestLength = Names[i].Length;
  }

  CommonCharacters = new char[shortestLength];
  for (int i = 0; i < shortestLength; i++)
  {
   j = 1;
   single = Names[0][i];
   CommonCharacters[i] = single;
   while (j < shortestLength)
   {
    if (single != Names[j][i])
    {
     CommonCharacters[i] = " "[0];
     break;
    }
    j++;
   }
  }

This would give you an array of characters that are the same across everything in the list.

theo 2008-09-16 02:07:01

Answer 4

+1 A:

What about something like this?

strings = %w(Tony Tiny Tory Tily)
positions = Hash.new { |h,k| h[k] = Hash.new { |h,k| h[k] = 0 } }
strings.each { |str| 
  0.upto(str.length-1) { |i| 
    positions[i][str[i,1]]+=1 
  }
}

At the end of execution, the result will be:

positions = {
  0=>{"T"=>4},
  1=>{"o"=>2, "i"=>2}, 
  2=>{"l"=>1, "n"=>2, "r"=>1},
  3=>{"y"=>4}
}

hoyhoy 2008-09-16 02:12:24

And that's the reason I love the sound of ruby. Letting you get things done quick.

Josh Smeaton 2008-09-16 02:16:10

Answer 5

+1 A:

Here's an algorithm in 5 lines of ruby:

#!/usr/bin/env ruby
chars = STDIN.gets.chomp.split("")
STDIN.each do |string|
  chars = string.chomp.split("").zip(chars).map {|x,y| x == y ? x : nil }
end
chars.each_index {|i| puts "#{chars[i]}  #{i}" if chars[i] }

Put this in commonletters.rb. Sample usage:

$ commonletters.rb < input.txt
T  0
y  3

Assuming that input.txt contains:

Toby
Tiny
Tory
Tily

This should work with whatever inputs you throw at it. It will break if the input file is empty, but you can probably fix that yourself. This is O(n) (n is total number of chars in the input).

tgamblin 2008-09-16 02:24:45

Answer 6

+1 A:

And here's a trivial version in Python:

items = ['Toby', 'Tiny', 'Tory', 'Tily']
tuples = sorted(x for item in items for x in enumerate(item))
print [x[0] for x in itertools.groupby(tuples) if len(list(x[1])) == len(items)]

Which prints:

[(0, 'T'), (3, 'y')]

Edit: Here's a better version that doesn't require creating a (potentially) huge list of tuples:

items = ['Toby', 'Tiny', 'Tory', 'Tily']
minlen = min(len(x) for x in items)
print [(i, items[0][i]) for i in range(minlen) if all(x[i] == items[0][i] for x in items)]

Nick Johnson 2008-09-19 17:16:16

Answer 7

A:

In lisp:

CL-USER> (defun common-chars (&rest strings)
           (apply #'map 'list #'char= strings))
COMMON-CHARS

Just pass in the strings:

CL-USER> (common-chars "Toby" "Tiny" "Tory" "Tily")
(T NIL NIL T)

If you want the characters themselves:

CL-USER> (defun common-chars2 (&rest strings)
           (apply #'map
                  'list
                  #'(lambda (&rest chars)
                      (when (apply #'char= chars)
                        (first chars))) ; return the char instead of T
                  strings))
COMMON-CHARS2

CL-USER> (common-chars2 "Toby" "Tiny" "Tory" "Tily")
(#\T NIL NIL #\y)

If you don't care about posiitons, and just want a list of the common characters:

CL-USER> (format t "~{~@[~A ~]~}" (common-chars2 "Toby" "Tiny" "Tory" "Tily"))
T y 
NIL

I admit this wasn't an algorithm... just a way to do it in lisp using existing functionality

If you wanted to do it manually, as has been said, you loop comparing all the characters at a given index to each other. If they all match, save the matching character.

Morikal 2008-09-19 18:35:16

Answer 8

+1 A:

#include <iostream>

int main(void)
{
    char words[4][5] = 
    {
        "Toby",
        "Tiny",
        "Tory",
        "Tily"
    };

    int wordsCount = 4;
    int lettersPerWord = 4;

    int z;
    for (z = 1; z < wordsCount; z++)
    {
        int y;
        for (y = 0; y < lettersPerWord; y++)
        {
            if (words[0][y] != words[z][y])
            {
                words[0][y] = ' ';
            }
        }
    }

    std::cout << words[0] << std::endl;

    return 0;
}

EvilTeach 2008-09-21 18:50:03

ansaurus

tags:

views:

answers:

Algorithm for finding characters in the same positions in a list of strings?

related questions