ansaurus

Question

Why is determining if a function is pure difficult?

Answer 1

+3 A:

Here's the first thing that popped into my mind when I read your question.

Class Hierarchies

Determining if a variable is modified includes the act of digging through every single method which is called on the variable to determine if it's mutating. This is ... somewhat straight forward for a sealed type with a non-virtual method.

But consider virtual methods. You must find every single derived type and verify that every single override of that method does not mutate state. Determining this is simply not possible in any language / framework which allows for dynamic code generation or is simply dynamic (if it's possible, it's extremely difficult). The reason why is that the set of derived types is not fixed because a new one can be generated at runtime.

Take C# as an example. There is nothing stopping me from generating a derived class at runtime which overrides that virtual method and modifies state. A static verified would not be able to detect this type of modification and hence could not validate the method was pure or not.

JaredPar 2009-10-22 19:07:55

Oh, good point. Polymorphism would definitely complicate things. Although that could be fixed by putting a "pure constraint" on the base virtual function.

Mason Wheeler 2009-10-22 19:21:24

Answer 2

+10 A:

You also need to annotate every system call, every FFI, ...

And furthermore the tiniest 'leak' tends to leak into the whole code base.

It is not a theoretically intractable problem, but in practice it is very very difficult to do in a fashion that the whole system does not feel brittle.

As an aside, I don't think this makes a good PhD thesis; Haskell effectively already has (a version of) this, with the IO monad.

And I am sure lots of people continue to look at this 'in practice'. (wild speculation) In 20 years we may have this.

Brian 2009-10-22 19:09:10

Computer AI though is only 4-6 years away and sure it will be able to solve this problem for us :)

JaredPar 2009-10-22 19:47:42

In Haskell every function is pure. `IO` doesn't change that. Well, maybe except of those ones that do `unsafePerformIO`.

Andrey Vlasovskikh 2009-10-22 19:59:18

Anootating all the system class is that .NET is doing for their Code Contracts. When in doubt, it assumes the method is not pure.

Jonathan Allen 2009-10-22 20:08:30

Jared: Hasn't computer AI been "only 4-6 years away" for the last 30 years or so? :P

Mason Wheeler 2009-10-22 20:17:03

@Mason or maybe even 50 years or so %)

Vanya 2009-10-22 20:35:31

@Mason - I think Jared is joking.

Chris Lutz 2009-10-22 22:07:41

Answer 3

+5 A:

It is particularly hard in Python. Since anObject.aFunc can be changed arbitrarily at runtime, you cannot determine at compile time which function will anObject.aFunc() call or even if it will be a function at all.

Rafał Dowgird 2009-10-22 19:13:48

Ouch! All right, I can see how that would make it more difficult.

Mason Wheeler 2009-10-22 19:24:50

There's more - eval, setattr(), __gettattr()__ doing weird things, etc. Features like this make languages hard to analyze statically.

Rafał Dowgird 2009-10-22 21:20:38

Answer 4

+4 A:

In addition to the other excellent answers here: Your pseudocode looks only at whether a function modifies variables. But that's not really what "pure" means. "Pure" typically means something closer to "referentially transparent." In other words, the output is completely dependent on the input. So something as simple as reading the current time and making that a factor in the result (or reading from input, or reading the state of the machine, or...) makes the function non-pure without modifying any variables.

Also, you could write a "pure" function that did modify variables.

Craig Stuntz 2009-10-22 19:28:52

+1 Being side-effect free doesn't mean using only immutable values.

Andrey Vlasovskikh 2009-10-22 20:02:46

Answer 5

A:

I think the main problem would be doing it efficiently.

D-language has pure functions but you have to specify them yourself, so the compiler would know to check them. I think if you manually specify them then it would be easier to do.

egon 2009-10-22 21:58:41

Answer 6

A:

Deciding whether a given function is pure, in general, is reducible to deciding whether any given program will halt - and it is well known that the Halting Problem is the kind of problem that cannot be solved efficiently.

Justice 2009-10-22 22:04:17

I know about the halting problem, but why do you say it's equivalent?

Mason Wheeler 2009-10-22 22:25:46

Suppose I write a program P which simulates the execution of some input function F (i.e. P is an interpreter). Suppose P is written such that it halts upon complete evaluation of F and immediately after executing any impure step of F (with an output indicating why it halted). Because some F might have the form `f a = f a` - Haskell syntax for a function with one argument that simply calls itself with the same argument recursively ad infinitum - there is some F for which P executing F will not halt. Thus, the OP's question is reducible to the halting problem.

Justice 2009-10-23 01:32:51

Note that in Haskell, if we get rid of the language's loopholes permitting certain impure code inside otherwise pure code, the compiler can easily determine what code is for sure pure and what code is not for sure pure simply by looking at the types. Haskell represents impurity at the type level.

Justice 2009-10-23 01:35:23

Generally speaking, pure functions in computing don't exclude non-terminating (or exceptionally terminating) functions for this very reason. Haskell actually defines nontermination to be a special kind of return value (the 'bottom') which effectively poisons any operation that attempts to inspect it.

bdonlan 2009-10-23 03:52:53

Answer 7

A:

Note that the complexity depends on the language, too. For the more dynamic languages, it's possible to redefine anything at any time. For example, in Tcl

proc myproc {a b} {
    if { $a > $b } {
        return $a
    } else {
        return $b
    }
}

Every single piece of that could be modified at any time. For example:

the "if" command could be rewritten to use and update global variables
the "return" command, along the same lines, could do the same thing
the could be an execution trace on the if command that, when "if" is used, the return command is redefined based on the inputs to the if command

Admittedly, Tcl is an extreme case; one of the most dynamic languages there is. That being said, it highlights the problem that it can be difficult to determine the purity of a function even once you've entered it.

RHSeeger 2009-10-23 03:49:21

ansaurus

tags:

views:

answers:

Why is determining if a function is pure difficult?

related questions