ansaurus

Question

[Haskell] Pattern Matching and Recursion

Answer 1

+8 A:

  sublist [1, 2, 3] [1, 2, 4, 1, 2, 3]
= sublist [2, 3] [2, 4, 1, 2, 3]
= sublist [3] [4, 1, 2, 3]
= sublist [3] [4, 1, 2, 3]
= sublist (3:[]) (4:[1,2,3])     -- Since 3 /= 4, we take sublist (x:xs) ys
= sublist (3:[]) [1,2,3]
= sublist (3:[]) (1:[2,3])
= sublist (3:[]) [2,3]
= sublist (3:[]) (2:[3])
= sublist (3:[]) [3]
= sublist [] []
= True

sublist checks if heads of lists are equal. If yes, then it removes them and proceeds (sublist xs ys). If no, it removes head from the second list (sublist (x:xs) ys). This way it "finds" the following association:

 1 2 3
 | | |
 | | \-----\
 | |       |
 1 2 4 1 2 3

In other words, to check sublist [1,2,3] ys for some list ys it pops elements from ys as long as they are not 1. Then it pops elements as long as they are not 2. Then it pops elements as long as they are not 3. If [1,2,3] is exhausted, then it reports True; if ys is exhausted, it reports False.

sdcvvc 2010-07-09 13:52:05

Yes, this makes sense. My "sublist" function is operating like a set membership function, e.g., 1, 2, 3 are members of {1, 2, 4, 1, 2, 3} (although lists obviously may contain duplicate elements).

danportin 2010-07-09 17:02:08

It's not exactly set membership, for example 1,2 are members of {2,1} but sublist [1,2] [2,1] returns False. See http://en.wikipedia.org/wiki/Subsequence.

sdcvvc 2010-07-09 17:05:01

Answer 2

+11 A:

It is working because:

[3] is matched as x:xs as 3:[],
[4, 1, 2, 3] is matched as y:ys as 4:[1,2,3]
3/=4 so sublist (x:xs) ys is evaluated, which eventually is True

trace:

sublist [1, 2, 3] [1, 2, 4, 1, 2, 3]
   = sublist [2, 3] [2, 4, 1, 2, 3]
   = sublist [3] [4, 1, 2, 3]
   = sublist [3] [1, 2, 3]
   = sublist [3] [2, 3]
   = sublist [3] [3]
   = sublist [] [] = True

YuppieNetworking 2010-07-09 13:56:04

Answer 3

+1 A:

YuppieNetworking and sdcwc have already explained how the matching works. So your sublist searches for a sublist in the same sense as subsequence (not necessarily the same items in a row, there can be anything in between).

I want to note that you can often avoid explicit recursion to remove unnecessary items from the front of the list with dropWhile. Also, I wanted to give an example how to check if two lists have the same prefixes (you need this to check if the second list contains items of the first one in a row).

The first example is similar to your function, it allows for items in between, but it uses dropWhile to remove items in front of ys:

-- Test:
-- map ("foo" `subListOf`) ["kungfoo", "f-o-o!", "bar"] == [True,True,False]

[] `subListOf` _ = True
(x:xs) `subListOf` ys =
  let ys' = dropWhile (/= x) ys     -- find the first x in ys
  in  case ys' of
      (_:rest) -> xs `subListOf` rest
      [] -> False

The second example looks for a "dense" sublist:

-- Test:
-- map ("foo" `denseSubListOf`) ["kungfoo!", "-f-o-o-"] == [True,False]

[] `denseSubListOf` _ = True
_  `denseSubListOf` [] = False
xs `denseSubListOf` ys =
  let ps = zip xs ys in
  (length ps == length xs && all (uncurry (==)) ps) -- same prefix of xs and ys
  || xs `denseSubListOf` (tail ys)                  -- or search further

Please note that to check that the second list contains all the first one in the beginning I compare elements pair by pair (I zip them together to do it).

It is easier to explain by example:

zip [1,2,3] [1,2,3,4,5]  -- gives [(1,1), (2,2), (3,3)], 4 and 5 are truncated
uncurry (==)             -- an equivalent of (\(a,b) -> a == b)
all                      -- gives True iff (uncurry (==)) is True for all pairs
length ps == length xs   -- this is to ensue that the right list is long enough

jetxee 2010-07-09 14:50:27

Answer 4

+3 A:

Debug.Trace is your friend. With sublist instrumented as

sublist [] ys = trace ("A: [] "           ++ show ys) True
sublist xs [] = trace ("B: " ++ (show xs) ++ " []")   False
sublist (x:xs) (y:ys)
   | x == y = trace (info "C" "==")
              sublist xs ys
   | x /= y = trace (info "D" "/=")
              sublist (x:xs) ys
   where info tag op =
           tag ++ ": " ++ (show x) ++ " " ++ op ++ " " ++ (show y) ++
           "; xs=" ++ (show xs) ++ ", ys=" ++ show ys

you see what's happening, namely that it's repeatedly throwing away the head of the second list until it finds a match:

*Main> sublist [1, 2, 3] [1, 2, 4, 1, 2, 3]
C: 1 == 1; xs=[2,3], ys=[2,4,1,2,3]
C: 2 == 2; xs=[3], ys=[4,1,2,3]
D: 3 /= 4; xs=[], ys=[1,2,3]
D: 3 /= 1; xs=[], ys=[2,3]
D: 3 /= 2; xs=[], ys=[3]
C: 3 == 3; xs=[], ys=[]
A: [] []
True

Another tool that will help you implement sublist correctly is Test.QuickCheck, a library that automatically creates test data for use in verifying properties that you specify.

For example, say you want sublist to treat xs and ys as sets and determine whether the former is a subset of the latter. We can use Data.Set to specify this property:

prop_subsetOf xs ys =
  sublist xs ys == fromList xs `isSubsetOf` fromList ys
  where types = (xs :: [Int], ys :: [Int])

This says sublist should be equivalent to the definition on the right. The prop_ prefix is a popular convention for naming test properties to be used with QuickCheck.

Running it also identifies a failure case:

*Main> quickCheck prop_subsetOf 
*** Failed! Falsifiable (after 6 tests):                  
[0,0]
[0]

Greg Bacon 2010-07-09 14:59:16

Answer 5

+2 A:

I think where you may be misunderstanding, is that (when you first wrote the function) you assumed that in your check x /= y = sublist (x:xs) ys you (sub-consciously?) assumed that the function would backtrack and re-do your function with the tail of the original second list, not the tail of whichever piece of the list you're using when it doesn't match.

One nice (and unsettling) thing about Haskell is just how ridiculously powerful it is.

For an example, here's how what you wanted should have looked:

sublist   []     ys   = True
sublist   xs     []   = False
sublist (x:xs) (y:ys) |   x /= y  = False
                      | otherwise = sublist xs ys || sublist (x:xs) ys

which will check for all of the pieces of the first list. The "official" definition for the function (look up "isInfixOf" in your documentation) has a few extra functions that basically means the same thing.

Here's another way write it, that looks more "explanatory" to my eyes:

sublist [] ys = True
sublist xs [] = False
sublist xs ys =  (xs == take (length xs) ys) || sublist xs (tail ys)

BMeph 2010-07-09 18:03:32

This was incredibly helpful. However, in the first piece of code, won't "sublist list_1 list_2" evaluate to False if (x /= y) evaluates to True, i.e., the function won't recurse properly if x and y aren't equivalent?

danportin 2010-07-09 18:53:29

ansaurus

tags:

views:

answers:

[Haskell] Pattern Matching and Recursion

related questions