ansaurus

Question

How to determine whether a string is composed only of letters given by a second string.

Answer 1

+1 A:

> If the program is always return false, how to optimize this program?

return false

> If it is always return true, how to optimize it?

return true

EDIT: Seriously, it's a good question what algorithm optimizes for the case of failure, and what algorithm optimizes for the case of success. I don't know what algorithm strstr uses, it may be a generally-good algorithm which isn't optimal for either of those assumptions.

Maybe you'll catch someone here who knows offhand. If not, this looks like a good place to start reading:Exact String Matching Algorithms

profjim 2010-02-16 22:13:43

wow! beat me by 2 seconds.

Alexander Gessler 2010-02-16 22:14:06

Perhaps this is a facetious answer, but I don't think this is really what @skydoor was asking. The example and question title indicates pretty clearly that he wants to determine if all of a particular string's characters appear in another string.

John Feminella 2010-02-16 22:16:44

Answer 2

+3 A:

[How do I] determine [if] the first string [is composed of characters that appear in] the second string?

Here's a quick algorithm:

Treat the first and second strings as two sets of characters S and T.
Perform the set difference S - T. Call the result U.

If U is nonempty, return false. Otherwise, return true.

John Feminella 2010-02-16 22:14:38

@skydoor, you should clarify whether order is important: do the chars of B have to appear in A in the same sequence they appear in B? Or just anywhere.

profjim 2010-02-16 22:20:02

@profjim: Yes, that's a good point. If order is important, you can still do a "set" difference, but now you have to check the elements of the set in a specific order.

John Feminella 2010-02-16 22:21:14

You have essentially reworded the question without answering it

BlueRaja - Danny Pflughoeft 2010-02-17 02:47:13

Answer 3

+5 A:

Sort both strings. Then go through A, and have a pointer going through B. If the character in A is the same as what the B pointer points to, keep looking through A. If the character in A is later in the alphabet than what the B pointer points to, advance the B pointer. If the character in A is earlier in the alphabet than what the B pointer points to, return false. If you run out of A, return true.

David Thornley 2010-02-16 22:21:37

Answer 4

+2 A:

Here is one simple way.

def isComposedOf(A, B):
    bset = set(B)
    for c in A:
        if c not in bset:
            return False
    return True

This algorithm walks each string once, so it runs in O(len(A) + len(B)) time.

When the answer is yes, you cannot do better than len(A) comparisons even in the best case, because no matter what you must check every letter. And in the worst case one of the characters in A is hidden very deep in B. So O(len(A) + len(B)) is optimal, as far as worst-case performance is concerned.

Similarly: when the answer is no, you cannot do better than len(B) comparisons even in the best case; and in the worst case the character that isn't in B is hidden very deep in A. So O(len(A) + len(B)) is again optimal.

You can reduce the constant factor by using a better data structure for bset.

You can avoid scanning all of B in some (non-worst) cases where the answer is yes by building it lazily, scanning more of B each time you find a character in A that you haven't seen before.

Jason Orendorff 2010-02-18 00:02:54

Depending on the implementation of set (is 'in' O(n)?) you may want to delete the character from bset after each success.

Andrew Jaffe 2010-02-18 00:38:29

That would give the wrong answer for the second example in the question (A="aab", B="ab"). Anyway, `set` is a hash set, so queries are O(1) except in when there are extremely many hash collisions.

Jason Orendorff 2010-02-18 14:37:25

ansaurus

tags:

views:

answers:

How to determine whether a string is composed only of letters given by a second string.

related questions