tags:

views:

213

answers:

5

There is a fairly common pattern used in .NET to test for the capabilities of a class. Here I'll use the Stream class as an example, but the issue applies to all classes that use this pattern.

The pattern is to supply a boolean property called CanXXX to indicate that capability XXX is available on the class. For example, the Stream class has CanRead, CanWrite and CanSeek properties to indicate that the Read, Write and Seek methods can be called. If the properties value is false, then calling the respective method will result in a NotSupportedException being thrown.

From the MSDN documentation on the stream class:

Depending on the underlying data source or repository, streams might
support only some of these capabilities. An application can query a stream for its capabilities by using the CanRead, CanWrite, and CanSeek properties.

And doco for the CanRead property:

When overridden in a derived class, gets a value indicating whether the current stream supports reading.

If a class derived from Stream does not support reading, calls to the Read, ReadByte, and BeginRead methods throw a NotSupportedException.

I see a lot of code written along the lines of the following:

if( stream.CanRead )
{
    stream.Read( ... )
}

Note that there is no synchronisation code, say, to lock the stream object in any manner - other threads may be accessing it or objects that it references. There is also no code to catch a NotSupportedException.

The MSDN documentation does not state that the property value can not change over time. In fact, the CanSeek property changes to false when the stream is closed, demonstrating the dynamic nature of these properties. As such, there is no contractual guarantee that call to Read() in the above code snippet will not throw a NotSupportedException.

I expect that there is a lot of code out there that suffers from this potential problem. I wonder how those who have identified this issue have addressed it. What design patterns are appropriate here?

I'd also appreciate comments on the validity of this pattern (the CanXXX, XXX() pairs). To me, at least in the case of the Stream class, this represents a class/interface that is trying to do too much and should be split into more fundamental pieces. The lack of a tight, documented contract makes testing impossible and implementation even harder!

+1  A: 

stream.CanRead just checks whether underlying stream has possibility of reading. It does not say anything about whether actual reading will be possible (e.g. disk error).

There is no need to catch NotImplementedException if you used any of *Reader classes since they all support reading. Only *Writer will have CanRead=False and throw that exception. If you are aware that stream supports reading (e.g. you used StreamReader), IMHO there is no need to make additional check.

You still need to catch exceptions since any error during read will throw them (e.g. disk error).

Also notice that any code that is not documented as thread-safe is not thread-safe. Usually static members are thread safe, but instance members aren't - however, there is need to check documentation for each class.

Josip Medved
"There is no need to catch NotImplementedException if you used any of *Reader classes" (I think you mean NotSupportedExpection). What about a framework that only deals with the abstract Stream class? The only thing it knows is the contract on the Stream interface - no wait, that's not defined at all, so it knows nothing. Hence, the framework can't be written correctly... This was never a question of thread safety, rather, it's about the contract *implied* (since it is clearly not documented) by the Stream base class interface.
Daniel Paull
A: 

This sounds more like a theoretical problem than a practical one. I can't really think of any situations in which a stream would become unreadable/unwritable other than due to it being closed.

There may well be corner cases, but I wouldn't expect them to show up often at all. I don't think the vast majority of code needs to worry about this.

It's an interesting philosophical problem though.

EDIT: Addressing the question of whether or not CanRead etc are useful, I believe they still are - mostly for argument validation. For example, just because a method takes a stream which it's going to want to read at some point doesn't mean it wants to read it right at the start of the method, but that's where the argument validation should ideally be performed. This is really no different to checking whether a parameter is null and throwing ArgumentNullException instead of waiting for a NullReferenceException to be thrown when you first happen to dereference it.

Also, CanSeek is slightly different: in some cases your code may well cope with both seekable and non-seekable streams, but with more efficiency in the seekable case.

This does rely on the "seekability" etc remaining consistent - but as I've said, that appears to be true in real life.


Okay, let's try putting this another way...

Unless you're reading/seeking within memory and you've already made sure there's enough data, or you're writing within a preallocated buffer, there's always a chance things will go wrong. Disks fail or fill up, networks collapse etc. These things do happen in real life, so you always need to code in a way which will survive failure (or consciously choose to ignore the problem when it doesn't really matter).

If your code can do the right thing in the case of a disk failure, chances are it can survive a FileStream turning from writable to non-writable.

If Stream did have firm contracts, they'd have to be incredibly weak - you couldn't use static checking to prove that your code will always work. The best you could do is to prove that it did the right thing in the face of failure.

I don't believe Stream is going to change any time soon. While I certainly accept that it could be better documented, I don't accept the idea that it is "completely broken." It would be more broken if we couldn't actually use it in real life... and if it could be more broken than it is now, it's logically not completely broken.

I have far bigger issues with the framework, such as the relatively poor state of date/time APIs. They've become a lot better in the last couple of versions, but they're still missing a lot of the functionality of (say) Joda Time. The lack of built-in immutable collections, poor support for immutability in the language etc - these are real problems which cause me actual headaches. I'd rather see them addressed than spend ages on Stream which seems to me to be a somewhat intractable theoretical problem which causes few issues in real life.

Jon Skeet
(Adding to what you said.) I believe the problem becomes: "Should we use the CanXXX property for anything?" If the operation is reading a stream and it fails, you have to throw an exception. This isn't like the `TryParse` methods.
280Z28
Will edit to address this.
Jon Skeet
If by "theoretical" you mean "mathematically correct", then yes this is a theoretical problem. To me this is so important - if Microsoft can't document the contract on an abstract class, how can anyone every program against it? You don't know what assumptions you can make, or even worse, what assumptions others have made. Are we all doomed?
Daniel Paull
I mean "theoretical" as "yes, a stream *could* change whether or not it's readable without being closed, but it doesn't tend to happen in real life." Yes, there are plenty of interfaces and abstract classes which aren't as well designed as they might be, unfortunately.
Jon Skeet
Doesn't happen in real life... wow. If it doesn't happen, then it should not be allowed to happen and documented as such. You are right that this abstract class is not as well designed as it might be as it is pretty hard to get worse than completely broken.
Daniel Paull
But, Jon, interfaces are used also so that others can implement them. If I'm using a stream class that someone else wrote, then all bets are off other than what is specified clearly in the contract.
Jesse Pepper
@Jesse: I agree with you in theory - but in practice, we seem to be using streams pretty reasonably in real life, don't we? Put it this way: has this specific possibility *ever* been an issue in any real life software you've seen? I'm all for trying to make things as robust and well-specified as possible, but I'm finding it hard to get too worked up about this case.
Jon Skeet
@Daniel: If people are using streams successfully, I fail to see how you can describe them as "completely broken."
Jon Skeet
@Jon - its all about dumb luck. I have a saying that is appropriate here, "the only thing worse than doing the wrong thing for the right reason is doing the right thing for the wrong reason." As far as I can see, everyone using .Net streams who are not up in arms about the lack of documentation on the Stream interface are doing the right thing for the wrong reasons. That just leads to perpetuating the poor quality of software that is indicative of current software development practices.
Daniel Paull
@Daniel: So I take it you conscientiously avoid ever using Stream in your code, given how it's so "completely broken"? I prefer to have practical code which works in real life than code which is theoretically perfect but actually doesn't do anything. You ask "how can anyone ever program against [Stream]?" but people are doing so successfully every day. Strangely enough we *don't* seem to be completely doomed. I'll ask again: has this possibility ever been an issue in any real life software you've seen?
Jon Skeet
Note that I'm in no way arguing against well-designed interfaces - I'm just arguing that you seem to be making a lot of fuss about something which doesn't *actually* seem to be causing any practical problems.
Jon Skeet
My practical problem was writing an adaptor to a C++ stream library that had very clear contract defined on its interface. I feel like the adaptor is a time bomb.
Daniel Paull
"These things do happen in real life, so you always need to code in a way which will survive failure" - I've never said that you should not catch exceptions for conditions that you can deal with. "Intractable theoretical problem" - can you prove that? Surely writing a little doco is not intractable... isn't that all it needs?
Daniel Paull
"I prefer to have practical code which works in real life than code which is theoretically perfect but actually doesn't do anything." Why do you consider the two to be mutually exclusive? I have many examples when a good theoretical basis has led to extremely practical implementations - they work in the real world and have been driven by pragmatic reasoning.
Daniel Paull
What exactly should those documents say? Why not propose some to Microsoft, instead of just claiming that it's "completely broken"? I suspect it would be reasonable to say that while the stream is open, a stream should not become unreadable, unwritable or unseekable if it initially *does* have that ability - although I wouldn't want to make guarantees the other way round. I agree that there is a fundamental impedance mismatch between a library with firm contracts and one without, and I've agreed that .NET *could* be better specified. I just think you're being overly dramatic about it.
Jon Skeet
@Daniel: Yes, they can work together. You can also get situations where everything sounds ever so elegant, but is very impractical. (The Java DOM API is a classic example of this - so pluggable that it's a pain to actually use.) To answer the question in your title: in all practical situations I've seen, unless another thread closes the stream while you're using it (in which case you've got bigger problems) the CanXXX pattern *is* safe. No, it's not guaranteed... but I've never seen it be a problem.
Jon Skeet
By the way, it sounds like you might enjoy Code Contracts (http://research.microsoft.com/contracts) if you haven't seen it. I doubt that it'll help you in this particular case, but it shows a possible way forward. (There's nothing particularly new in the idea of contracts, of course - but this is the biggest "push" to make it mainstream that I've seen.)
Jon Skeet
For the moment, I would suggest you *assume* a contract of CanRead/CanWrite/CanSeek never going from true to false other than due to the stream closing.
Jon Skeet
(I'm trying to work out whether this answer + comments is actually a constructive conversation or whether I should just delete it as an unhelpful argument. Hmm.)
Jon Skeet
I feel that your answer+comments is useful and constructive. It seems that you required a bit of provoking to get out all you had to say though.
Daniel Paull
@Daniel: Right, I'm starting a new answer to hopefully make things clearer and more constructive, then we can consider ditching this one.
Jon Skeet
@Jon: Your patience is amazing. I would have upvoted for that reason, but wanted to preserve this moment for as long as possible - a Jon Skeet answer with -1 net votes.
John Saunders
@John: It happens. The other day I had an accepted answer with a net -1 :) Hopefully my newer answer is more constructive though.
Jon Skeet
I don't think your new answer captures the entire sentiment of this discussion, be it over dramatised or not. I see no reason why you would delete this answer.
Daniel Paull
+1  A: 

From your question and all the subsequent commentary, I'm guessing that your issue is with the clarity and "correctness" of the stated contract. The stated contract being what is in the MSDN online documentation.

What you've pointed out is that there is something missing from the documentation which forces one to make assumptions about the contract. More specifically, because there is nothing said about the volatility of the readability property of a stream, the only assumption that one can make is that it is possible for a NotSupportedException to be thrown, regardless of what the value of the corresponding CanRead property was a few milliseconds (or more) prior.

I think that one needs to go on the intent of this interface in this case, that is:

  1. If you use more than one thread, all bets are off;
  2. until you go calling something on the interface that will potentially change the state of the stream, you can safely assume that the value of CanRead is invariant.

Notwithstanding the above, Read* methods may potentially throw a NotSupportedException.

The same argument can be applied to all the other Can* properties.

Eric Smith
Eric - your interpretation of the interface's intent is quite good, but I *hate* the idea that contracts in programming are like laws - open to interpretation based on *perceived* intent. Imagine if software developers could not put in place a EULA that completely indemnifies themselves - would you stake your business on developers implying an intent of an interface, or would you rather have a strong mathematically defined interface? We can probably all learn a bit from ADA here, with it's fancy pre and post conditions and what not.
Daniel Paull
+1  A: 

Without knowing the internals of an object, you must assume that a "flag" property is too volatile to rely on when the object is being modified in multiple threads.

I have seen this question more commonly asked about read-only collections than streams, but I feel it's another example of the same design patter, and the same arguments apply.

To clarify, the ICollection interface in .NET has the property IsReadOnly, which is intended to be used as an indicator of whether the collection supports methods to modify its contents. Just like streams, this property can change at any time and will cause InvalidOperationException or NotSupportedException to be thrown.

The discussions around this usually boil down to:

  • Why isn't there an IReadOnlyCollection interface instead?
  • Whether NotSupportedException is a good idea.
  • The pros and cons of having "modes" versus distinct concrete functionality.

Modes are rarely a good thing, as you are forced to deal with more than one "set" of behaviour; having something which can switch modes at any time is considerably worse, as your application now has to deal with more than one "set" of behaviour too. However, just because it's possible to break something down into more discreet functionality does not necessarily mean you always should, particularly when breaking it apart does nothing to reduce the complexity of the task at hand.

My personal opinion is that you have to pick the pattern which is closest to the mental model you perceive the consumers of your class will understand. If you are the only consumer, pick whichever model you like most. In the case of Stream and ICollection, I think that having a single definition of these is much closer to the mental model built up by years of development in similar systems. When you talk about streams, you talk about file streams and memory streams, not whether they're readable or writeable. Similarly, when you talk about collections, you rarely refer to them in terms of "writeability".

My rule of thumb on this one: Always look for a way to break down the behaviours into more specific interfaces, rather than having "modes" of operation, as long as it compliments a simple mental model. If it's hard to think of the separate behaviours as separate things, use a mode-based pattern and document it very clearly.

Programming Hero
This is a well thought out answer, and I agree with most of what is written. Regarding the NotSupportedException, I ranted about that here: http://stackoverflow.com/questions/410719. One thing I would add is that when the mental model of the masses is clearly flawed, influential developers (like Sun and Microsoft) have an obligation to educate the unwashed and move the industry in a better direction. This does not seem to be the trend in Software Development and for some reason we all seem to accept the rubbish that is pushed upon us.
Daniel Paull
Unfortunately, mental models are the ultimate backwards-compatability requirement in legacy system; we unwashed masses don't enjoy having to re-learn something that worked perfectly well before, even if the new way is superior!
Programming Hero
Ok, so what if the legacy system did not work perfectly well? If it indeed worked perfectly well, then there would be no new superior way of doing things.
Daniel Paull
Perhaps I'm just an unreasonable man - "The reasonable man adapts himself to the world; the unreasonable man persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man" - From 'Maxims For Revolutionists' by George Bernard Shaw. Mind you, you are promoting a man adapting the world to other people - that is just plain weird. What is your aversion to doing things better? The great unwashed who 'don't enjoy having to re-learn something' should not be trying to write software or work in any other engineering discipline.
Daniel Paull
I like that phrase "adapting the world to other people" quite a lot. In my ongoing work in software, with user interfaces particularly, I find that trying to adapt my systems to people not accustomed to it, is one of the major goals.I personally believe that the user interface with any product is the single most important part of that product, be it GUI, command-line or programming API. Creating interfaces that users don't have to adapt to would be a huge accomplishment of that goal.
Programming Hero
+2  A: 

Okay, here's another attempt which will hopefully be more useful than my other answer...

It's unfortunate that MSDN doesn't give any specific guarantees about how CanRead/CanWrite/CanSeek may change over time. I think it would be reasonable to assume that if a stream is readable it will continue to be readable until it is closed - and the same will hold for the other properties

In some cases I think it would be reasonable for a stream to become seekable later - for instance, it might buffer everything it reads until it reaches the end of the underlying data, and then allow seeking within it afterwards to let clients reread the data. I think it would be reasonable for an adapter to ignore that possibility, however.

This should take care of all but the most pathological cases. (Streams pretty much designed to cause havoc!) Adding these requirements to the existing documentation is a theoretically breaking change, even though I suspect that 99.9% of implementations will obey it already. Still, it might be worth suggesting on Connect.

Now, as for the discussion between whether to use a "capability-based" API (like Stream) and an interface-based one... the fundamental problem I see is that .NET doesn't provide the ability to specify that a variable has to be a reference to an implementation of more than one interface. For example, I can't write:

public static Foo ReadFoo(IReadable & ISeekable stream)
{
}

If it did allow this, it might be reasonable - but without that, you end up with an explosion of potential interfaces:

IReadable
IWritable
ISeekable
IReadWritable
IReadSeekable
IWriteSeekable
IReadWriteSeekable

I think that's messier than the current situation - although I think I would support the idea of just IReadable and IWritable in addition to the existing Stream class. That would make it easier for clients to declaratively express what they need.

With Code Contracts, APIs can declare what they provide and what they require, admittedly:

public Stream OpenForReading(string name)
{
    Contract.Ensures(Contract.Result<Stream>().CanRead);

    ...
}

public void ReadFrom(Stream stream)
{
    Contract.Requires(stream.CanRead);

    ...
}

I don't know how much the static checker can help with that - or how it copes with the fact that streams do become unreadable/unwritable when they're closed.

Jon Skeet