I have a question about whether a specific way of applying the DRY principle is considered good practice in Haskell. I'm going to present an example, and then ask whether the approach I'm taking is considered good Haskell style. In a nutshell, the question is this: when you have a long formula, and then you find yourself needing to repeat some small subsets of that formula elsewhere, do you always put that repeated subset of the formula into a variable so you can stay DRY? Why or why not?
The Example: Imagine we're taking a string of digits, and converting that string into its corresponding Int value. (BTW, this is an exercise from "Real World Haskell").
Here's a solution that works except that it ignores edge cases:
    import Data.Char (digitToInt)

    asInt_fold string = fst (foldr helper (0,0) string)
      where
        helper char (sum,place) = (newValue, newPlace)
          where
            newValue = (10 ^ place) * (digitToInt char) + sum
            newPlace = place + 1
It uses foldr, and the accumulator is a tuple of the sum so far and the place (the power of ten) for the next digit.
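To make the accumulator easier to follow, here is a minimal tracing sketch. It is not part of my solution: `traceFold` is a hypothetical name, and I renamed `sum` to `total` only to avoid shadowing the Prelude function. It swaps `foldr` for `scanr`, which keeps every intermediate accumulator instead of only the last one.

```haskell
import Data.Char (digitToInt)

-- Hypothetical tracing variant of the fold above (illustration only).
-- scanr returns every intermediate (total, place) pair, final result first.
traceFold :: String -> [(Int, Int)]
traceFold = scanr helper (0, 0)
  where
    -- Same step as in asInt_fold: add this digit at the current place,
    -- then move on to the next place.
    helper char (total, place) =
      (10 ^ place * digitToInt char + total, place + 1)
```

For example, `traceFold "123"` evaluates to `[(123,3),(23,2),(3,1),(0,0)]`: the fold starts from the rightmost digit, and the first element of the head pair is the final value.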
So far so good. Now, when I went to implement the edge case checks, I found that I needed little portions of the "newValue" formula in different places to check for errors. For example, on my machine, there would be an Int overflow if the input was larger than (2^31 - 1), so the max value I could handle is 2,147,483,647. Therefore, I put in 2 checks:
- If the place is 9 (the billions place) and the digit value is > 2, there's an error.
- If sum + (10 ^ place) * (digitToInt char) > maxInt, there's an error.
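One caveat about the second check: written literally as `sum + component > maxInt`, the addition itself can overflow before the comparison happens. A minimal sketch of the rearranged comparison follows, assuming both values are non-negative; `wouldExceed` and `intLimit` are hypothetical names introduced only for illustration.

```haskell
-- Hypothetical helper (illustration only).
-- True when total + component would be larger than limit.
-- Assumes 0 <= total <= limit and component >= 0, so that
-- limit - total cannot overflow.
wouldExceed :: Int -> Int -> Int -> Bool
wouldExceed limit total component = limit - total < component

-- maxBound gives the largest Int on the current platform, which avoids
-- hard-coding 2147483647 (the value differs between platforms).
intLimit :: Int
intLimit = maxBound
```

With a limit of 2147483647 and a running total of 2147483640, `wouldExceed` returns `False` for a component of 7 and `True` for a component of 8.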
Those 2 checks caused me to repeat part of the formula, so I introduced the following new variables:
- digitValue = digitToInt char
- newPlaceComponent = (10^place) * digitValue
The reason I introduced those variables is merely an automatic application of the DRY principle: I found myself repeating those portions of the formula, so I defined them once and only once.
However, I wonder if this is considered good Haskell style. There are obvious advantages, but I see disadvantages as well. It definitely makes the code longer, whereas much of the Haskell code I've seen is pretty terse.
So, do you consider this good Haskell style, and do you follow this practice, or not? Why / why not?
And for what it's worth, here's my final solution that deals with a number of edge cases and therefore has quite a large where block. You can see how large the block became due to my application of the DRY principle.
Thanks.
    import Data.Char (digitToInt)
    import Data.List (isInfixOf)

    asInt_fold "" = error "You can't be giving me an empty string now"
    asInt_fold "-" = error "I need a little more than just a dash"
    asInt_fold string | isInfixOf "." string = error "I can't handle decimal points"
    asInt_fold ('-':xs) = -1 * (asInt_fold xs)
    asInt_fold string = fst (foldr helper (0,0) string)
      where
        helper char (sum,place) | place == 9 && digitValue > 2 = throwMaxIntError
                                | maxInt - sum < newPlaceComponent = throwMaxIntError
                                | otherwise = (newValue, newPlace)
          where
            digitValue = (digitToInt char)
            placeMultiplier = (10 ^ place)
            newPlaceComponent = placeMultiplier * digitValue
            newValue = newPlaceComponent + sum
            newPlace = place + 1
            maxInt = 2147483647
            throwMaxIntError =
              error "The value is larger than max, which is 2147483647"