While playing around with D 2.0 I found the following problem:

Example 1:

pure string[] run1()
{
   string[] msg;
   msg ~= "Test";
   msg ~= "this.";
   return msg;
}

This compiles and works as expected.

When I try to wrap the string array in a class, I cannot get this to work:

class TestPure
{
    string[] msg;
    void addMsg( string s )
    {
       msg ~= s;
    }
};

pure TestPure run2()
{
   TestPure t = new TestPure();
   t.addMsg("Test");
   t.addMsg("this.");
   return t;
}

This code will not compile because the addMsg function is impure. I cannot make that function pure since it alters the TestPure object. Am I missing something? Or is this a limitation?

The following does compile:

pure TestPure run3()
{
    TestPure t = new TestPure();
    t.msg ~= "Test";
    t.msg ~= "this.";
    return t;
}

Wouldn't the ~= operator be implemented as an impure function on the msg array? Why does the compiler not complain about that in the run1 function?

+3  A: 

Hi,

Please review the definition of pure functions:

Pure functions are functions that produce the same result for the same arguments. To that end, a pure function:

  • has parameters that are all invariant or are implicitly convertible to invariant
  • does not read or write any global mutable state

One of the effects of using pure functions is that they can be safely parallelized. However, it's not safe to execute several instances of your function in parallel, because they could both modify the class instance simultaneously, causing a synchronization problem.
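
To make the second bullet concrete, here is a minimal sketch. The names counter, square and nextId are made up for illustration and do not come from the question.

```d
import std.stdio;

int counter; // mutable module-level state (hypothetical name)

// Pure: the result depends only on the argument.
int square(int x) pure
{
    return x * x;
}

// Not pure: reads and writes the mutable global `counter`.
// Adding `pure` to this function would be rejected by the compiler.
int nextId()
{
    return ++counter;
}

void main()
{
    assert(square(4) == 16);
    assert(nextId() == 1);
    assert(nextId() == 2); // same (empty) argument list, different result
    writeln(square(4));
}
```

Calling nextId twice with the same (empty) argument list gives two different results, which is exactly what the definition rules out.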

CyberShadow
"they could both modify the class instance simultaneously" .. what?? how?
hasen j
auto tp = new TestPure; void threadFunc() { tp.addMsg("Hello world!"); } new Thread(&threadFunc).start(); new Thread(&threadFunc).start(); It is impossible to guarantee that the two threads executing threadFunc will not attempt to write to the tp instance at the same time. Thus, addMsg can't be pure.
CyberShadow
Ugh... Above comment except not botched together: http://dump.thecybershadow.net/ca761137c80da1cee3f2657b215f758d/00000039.txt
CyberShadow
I agree that TestPure addMsg cannot be pure in general since it changes its state. I was just hoping that the compiler could figure out somehow that run2 is actually valid as being pure, since there is no way that it changes data that is visible to any other thread: it is the creator of the TestPure object. Is there some other qualifier that could be added to the addMsg function that could help the compiler here?
James Dean
The pure function in question is run2, not addMsg. Given that it is sequentially calling addMsg on a locally constructed object, this should be OK (both from a race condition and pure standpoint).
BCS
The problem is that there's no way for the compiler to know that run2 is the *only* thing calling addMsg; anything else in the program could call it on a different instance. Purity has to hold for all instances, not just a local one.
DK
A: 

Just a hunch, but this function doesn't always return the same result.

See, it returns a reference to some object, and while the object will always contain the same data, the objects returned by several calls to the same function are not identical; that is, they don't have the same memory address.

When you return a reference to the object, you're essentially returning a memory address, which is going to be different across several calls.

Another way to think of it: part of the return value is the memory address of an object, which is dependent on some global state(s), and if the output of a function depends on global state, then it's not pure. Hell, it doesn't even have to depend on it; as long as a function reads a global state, then it's not pure. By calling "new", you are reading global state.

hasen j
Your logic implies that pure functions should only be able to return value types that can fit in the processor's registers...
CyberShadow
The following does compile and shows that different objects are returned for each call:
pure int* test_pure2()
{
    int* p = new int;
    *p = 42;
    return p;
}
int* i1 = test_pure2();
int* i2 = test_pure2();
writeln( i1, i2 );
James Dean
The pure function will return different instances of the same value. It's obvious that the pointers can't be identical for the two instances. I think it's clear what is meant here - you're not supposed to use the results' addresses (otherwise - see my comment above).
CyberShadow
A: 

I think that your code is conceptually correct. However, you may have found a case where the compiler's semantic analysis is not as good as your brain's.

Consider the case where the class's source is not available. In that case the compiler would have no way of telling that addMsg only modifies a member variable, so it can't allow you to call it from a pure function.

To allow it in your case, it would have to have special-case handling for this type of usage. Every special-case rule added makes the language more complicated (or, if left undocumented, makes it less portable).
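
For what it is worth, later D2 compilers relaxed the purity rules (usually described as "weak purity"): as far as I know, a function marked pure may mutate data reachable through its own parameters, including this, as long as it touches no global mutable state. Assuming such a compiler, marking addMsg itself as pure should be enough for run2 to compile. A minimal sketch, not checked against the compiler version used in the question:

```d
import std.stdio;

class TestPure
{
    string[] msg;

    // Weakly pure: mutates only state reachable through `this`,
    // and touches no global mutable state.
    void addMsg(string s) pure
    {
        msg ~= s;
    }
}

// The object is created locally, so the mutation is not
// observable outside of run2 until it returns.
TestPure run2() pure
{
    auto t = new TestPure();
    t.addMsg("Test");
    t.addMsg("this.");
    return t;
}

void main()
{
    auto t = run2();
    assert(t.msg == ["Test", "this."]);
    writeln(t.msg);
}
```

On a compiler without this relaxation, the sketch is expected to be rejected in the same way as the original run2.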

BCS
+3  A: 

Others have already pointed out that addMsg is not pure and cannot be pure because it mutates the state of the object.

The only way to make it pure is to encapsulate the changes you're making. The easiest way to do this is via return mutation, and there are two ways to implement this.

Firstly, you could do it like this:

class TestPure
{
    string[] msg;
    pure TestPure addMsg(string s)
    {
        auto r = new TestPure;
        r.msg = this.msg.dup;
        r.msg ~= s;
        return r;
    }
}

You need to copy the previous array because inside a pure function, the this reference is actually const. Note that you could do the copy better by allocating a new array of the final size and then copying the elements in yourself. You would use this function like so:

pure TestPure run3()
{
    auto t = new TestPure;
    t = t.addMsg("Test");
    t = t.addMsg("this.");
    return t;
}

This way, the mutation is confined to each pure function with changes passed out via return values.
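
The single-allocation copy mentioned above could look roughly like this. It is only a sketch of the idea: it allocates the array at its final size once instead of calling dup and then appending.

```d
import std.stdio;

class TestPure
{
    string[] msg;

    TestPure addMsg(string s) pure
    {
        auto r = new TestPure;
        // Allocate the result at its final size in one go.
        r.msg = new string[](msg.length + 1);
        // Copy the existing elements, then place the new one at the end.
        r.msg[0 .. msg.length] = msg[];
        r.msg[$ - 1] = s;
        return r;
    }
}

void main()
{
    auto t = new TestPure;
    auto u = t.addMsg("Test").addMsg("this.");
    assert(t.msg.length == 0);          // the original object is untouched
    assert(u.msg == ["Test", "this."]);
    writeln(u.msg);
}
```

Each call still allocates a new object and a new array, so this only saves the possible reallocation caused by the append.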

An alternate way of writing TestPure would be to make the members const and do all the mutation before passing it to the constructor:

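class TestPure
{
    const(string[]) msg;
    // The constructors are marked pure so that pure functions can call them.
    pure this()
    {
        msg = null;
    }
    pure this(const(string[]) msg)
    {
        this.msg = msg;
    }
    pure TestPure addMsg(string s)
    {
        // Concatenation builds a new array; this object is left untouched.
        return new TestPure(this.msg ~ s);
    }
}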

Hope that helps.

DK
I guess this is probably the best it gets. I was just hoping we could somehow avoid generating all the garbage and the time spent making copies.
James Dean
It's funny you say that: most functional languages generate tremendous amounts of garbage for exactly the same reasons. It's why their garbage collectors tend to be so efficient. This isn't something that's easily solved. If you need to mutate, don't use pure.
DK