A pseudo-random number generator can of course be used outside of IO
, by simply storing the current generator value along with the probabilistic pure data structure and updating it when constructing modified versions. The downside to this is that the PRNG will be more obviously deterministic than in an impure program, since nothing outside of the single data structure will update it. If only the statistical properties matter this poses no issue, but could be cause for concern otherwise.
On the other hand, hiding an impure PRNG is arguably a justified use of unsafePerformIO
as in Ganesh Sittampalam's answer. This blatantly violates referential transparency, but only to the extent that the PRNG will return unpredictable, inconsistent values--which is the whole point! Caution is still required, however, as the compiler may make incorrect assumptions about the code because it looks pure.
But really, neither approach is terribly appealing. Using unsafePerformIO
is unsatisfying and potentially dangerous. Threading a PRNG state is easy, but imposes a (potentially spurious) strict sequencing on any computations that use it. Neither safety nor laziness are relinquished lightly by Haskell programmers (and rightly so!), and of course data structures restricted to IO
are of limited utility. So, to answer part of your question, that's why Haskell programmers are likely to avoid such structures.
As for "how Haskell can deal with" such things, I would suggest that this is the wrong question to ask.
What it really comes down to is that many data structures and algorithms implicitly assume (and optimize for) an imperative, impure, strict language, and while it's certainly possible to implement these in Haskell it's rarely desirable, because (even ignoring the internal implementation) using them imposes on your code a structure and approach that is very much not idiomatic. Furthermore, because Haskell violates those implicit assumptions, performance is often degraded (sometimes badly so).
The thing to realize is that algorithms and data structures are a means, not an end. It's rarely the case that a single specific implementation is required--what is required is generally certain performance characteristics. Finding data structures/algorithms that offer the desired characteristics while also being idiomatic Haskell is almost always a better plan, and is likely to perform better than trying to cram a strict imperative peg into a lazy functional hole.
This mistake is perhaps most commonly found in the subset of programmers who never met a problem they couldn't solve with a hash table, but the habit is easy to fall into for many of us. The correct approach is to stop thinking "how do I implement this solution in Haskell", but instead "what is the best way to solve my problem in Haskell". You might be surprised how often the answers differ; I know I often am!