views:

130

answers:

4

I am trying to parse an input stream where the first line tells me how many lines of data there are. I'm ending up with the following code, and it works, but I think there is a better way. Is there?

main = do
    numCases <- getLine
    proc $ read numCases

proc :: Integer -> IO ()
proc numCases
     | numCases == 0 = return ()
     | otherwise = do
         str <- getLine
         putStrLn $ findNextPalin str
         proc (numCases - 1)

Note: The code solves the Sphere problem https://www.spoj.pl/problems/PALIN/ but I didn't think posting the rest of the code would impact the discussion of what to do here.

+7  A: 

Use replicate and sequence_.

main, proc :: IO ()

main = do numCases <- getLine
          sequence_ $ replicate (read numCases) proc

proc = do str <- getLine
          putStrLn $ findNextPalin str

sequence_ takes a list of actions, and runs them one after the other, in sequence. (Then it throws away the results; if you were interested in the return values from the actions, you'd use sequence.)

replicate n x makes a list of length n, with each element being x. So we use it to build up the list of actions we want to run.

Dave Hinton
+1; I'm annoyed that some of you people are so darn fast, but I'm happy I actually knew the answer (in abstract terms) in this case.
Daniel Pratt
Yeah, I was halfway done writing almost the same answer when I saw this... my habit of running code in GHCi before answering slows me down, I guess.
camccann
Notice you (the op) can clean this up a little once you understand the common functions. Look at replicateM_ (or my answer) for an example.
TomMD
+2  A: 

Dave Hinton's answer is correct, but as an aside here's another way of writing the same code:

import Control.Applicative

main = (sequence_ . proc) =<< (read <$> getLine)

proc x = replicate x (putStrLn =<< (findNextPalin <$> getLine))

Just to remind everyone that do blocks aren't necessary! Note that in the above, both =<< and <$> stand in for plain old function application. If you ignore both operators, the code reads exactly the same as similarly-structured pure functions would. I've added some gratuitous parentheses to make things more explicit.

Their purpose is that <$> applies a regular function inside a monad, while =<< does the same but then compresses an extra layer of the monad (e.g., turning IO (IO a) into IO a).

The interesting part of looking at code this way is that you can mostly ignore where the monads and such are; typically there's very few ways to place the "function application" operators to make the types work.

camccann
A: 

You (and the previous answers) should work harder to divide up the IO from the logic. Make main gather the input and separately (purely, if possible) do the work.

import Control.Monad -- not needed, but cleans some things up
main = do
    numCases <- liftM read getLine
    lines <- replicateM numCases getLine
    let results = map findNextPalin lines
    mapM_ putStrLn results
TomMD
Good point about `replicateM`, that's nicer than `sequence`. I'm not really sure what you mean by dividing up the logic, though. All the real work is clearly happening in `findNextPalin` regardless, and the only additional pure function you've separated out is a single `map`, at the cost of (to my eye) making the function a bit more verbose. You've also changed the behavior of the code in the process.
camccann
A: 

When solving SPOJ problems in Haskell, try not to use standard strings at all. ByteStrings are much faster, and I've found you can usually ignore the number of tests and just run a map over everything but the first line, like so:

{-# OPTIONS_GHC -O2 -optc-O2 #-}

import qualified Data.ByteString.Lazy.Char8 as BS

main :: IO ()
main = do
    (l:ls) <- BS.lines `fmap` BS.getContents
    mapM_ findNextPalin ls

The SPOJ page in the Haskell Wiki gives a lot of good pointers about how to read Ints from ByteStrings, as well as how to deal with a large quantities of input. It'll help you avoid exceeding the time limit.

rtperson
I managed ok with Strings on this problem. I did my math/ordering directly on the strings. But I will learn about ByteStrings just to enhance my skill set. Thanks for the tip. I like your technique "(l:ls) <- ..." to separate out the first line. I should have thought of that....
Tim Perry
I added {-# OPTIONS_GHC -O2 -optc-O2 #-} to my last submission and went from 5.31 seconds to 5.51 seconds. Is that random noise or does the optimization not make much difference?
Tim Perry
@Tim Perry - It's very possibly not doing anything, but I honestly don't know enough about the GHC internals to say for sure. It was just a snippet I picked up from the Haskell SPOJ article I linked to. BTW, I saw your submission for PALIN on SPOJ -- 35M of memory and 5.31 seconds seems bloated to me. Since it's a large I/O problem, you'll definitely want to check out the Haskell wiki and maybe try the mailing lists. (I haven't tried the problem myself, but I'll give it a shot and come back here and let you know how I did.)
rtperson
@Tim Perry - But since it is a large IO problem, you'll find that ByteStrings speed things up a lot. To quote from RWH: "Code written with bytestring can often match or exceed the performance and memory footprint of C, while maintaining Haskell's expressivity and conciseness." http://book.realworldhaskell.org/read/efficient-file-processing-regular-expressions-and-file-name-matching.html
rtperson
@rtperson - I guess I could just read it in, unpack the ByteString, run my existing logic, and then repack the result to write it out. I'll try that just to see what happens and post back. I guess I should fix the other obvious performance flaw in my code first.
Tim Perry
So I switched over to ByteStrings and I reduced my memory consumption by over 50% but didn't reduce my overall time much. I think that my two calls to append are killing me. Still, it was a useful exercise to use the ByteStrings. Thanks for the tip.
Tim Perry
@Tim Perry - I've heard that packing and unpacking ByteStrings is expensive, and ByteString actually has its own append call, so you might be able to squeeze some more performance out of it. I still haven't had a chance to play around with the problem, but I have some ideas about it. Still, a million digits is *a lot*. Looking at the Haskell contributions, I'm amazed at some of those times. Good luck.
rtperson