views:

210

answers:

8

I'm trying to understand REST. Under REST a GET must not trigger something transactional on the server (this is a definition everybody agrees upon, it is fundamental to REST).

So imagine you've got a website like stackoverflow.com (I say "like" so that if I got the underlying details of SO wrong, it doesn't change anything about my question), where every time someone reads a question, using a GET, there's also some display showing "This question has been read 256 times".

Now someone else reads that question. The counter is now at 257. The GET is transactional because the number of views got incremented and is now incremented again. The "number of views" is incremented in the DB, there's no arguing about that (for example on SO the number of times any question has been viewed is always displayed).

So, is a REST GET fundamentally incompatible with any kind of "number of views" like functionality in a website?

So if it wants to be "RESTFUL", should the SO main page either stop displaying plain HTML links that are accessed using GETs or stop displaying the "this question has been viewed x times"?

Because incrementing a counter in a DB is transactional and hence "unrestful"?

EDIT just so that people Googling this can get some pointers:

From http://www.xfront.com/REST-Web-Services.html :

4. All resources accessible via HTTP GET should be side-effect free. That is, the request should just return a representation of the resource. Invoking the resource should not result in modifying the resource.

Now to me, if the representation contains the "number of views", it is part of the resource [and in SO the "number of views" a question has is very important information] and accessing it definitely modifies the resource.

This is in sharp contrast with, say, a true RESTFUL HTTP GET like the one you can make on an Amazon S3 resource, where your GET is guaranteed not to modify the resource you get back.

But then I'm still very confused.

+1  A: 

Just because the page is accessed via a GET, doesn't mean that there isn't a way to increment the counter. You could use an AJAX POST for example.

I also think that this kind of "passive" transaction could probably be safely ignored. There's a big difference between visiting a URL and deleting an object somewhere, and visiting a URL and incrementing a visit counter. I'd be interested to hear other views on the subject though.

Edit: I think Håvard S and I are basically in agreement, that a GET that triggers a counter may be technically un-RESTful, but isn't something worth worrying about.

Skilldrick
@Skilldrick: I don't get this... The GET is initiated from the client side. The number of views must be saved on the backend when any GET happens. The server can trivially increment a counter upon any GET (and must do so if it wants to keep track of the number of views). My question is really to know if a server that increments a counter upon a GET initiated from the client side is fundamentally unrestful.
cocotwo
@Skilldrick: to be more precise: my question is not about how to increment a counter or not. My question is really this: if you increment a counter on a GET, are you fundamentally unrestful?
cocotwo
What I meant in the first part of my answer was that you can have an entirely un-transactional GET request, which is followed by a transactional POST via AJAX. If you want to maintain pure REST, then this would be one method.
Skilldrick
@Skilldrick: oh I see: it's interesting but at first it looks a bit weird :) Then it would fail to account for people who have Javascript turned off (I do, I'm using the "NoScript" extension for most websites I visit, only whitelisting the websites that absolutely need it).
cocotwo
Any reason for the downvote?
Skilldrick
@Skilldrick: I understand that... But then we can take the reasoning further: *"should we worry about REST at all?"* if something as trivial as a GET that updates a counter is non-RESTFUL?
cocotwo
@cocotwo I think REST is worth worrying about, but I think you need to set a sensible threshold for side-effects.
Skilldrick
@Skilldrick A GET that triggers a counter is not un-RESTful, technically or otherwise.
Darrel Miller
Oh and BTW it wasn't me that downvoted you :-)
Darrel Miller
+6  A: 

IMO avoiding a statistics update in a GET request because "someone said so" is being dogmatic about ReST. Do what is pragmatic. If that involves updating a counter when responding to a GET request, so be it.

To elaborate further, what is really important (and the reason the advice is there) is that the resource a consumer is accessing is not updated or altered in any manner when the consumer's intent is to read it. However, updating other data, in particular stuff like logs and statistics, is not a problem. In short, reading a resource should not have side-effects on the resource being read.

EDIT: To answer your case of a self-incrementing counter, ask yourself what the context you apply is. Clearly, if you define a resource called counterThatIncrementsItselfWhenBeingRead, then it either:

  • Breaks ReSTfulness since a read-incrementing counter is a self-contradictory resource if the only rule is that GET can never have side-effects, or
  • Is just fine given a different context, where you for instance take a very short resource lifespan into account, and choose to view the increment as something that happens after you have read the resource (or more generally at the resource owner's discretion)

Regardless of the resolution you choose to apply, the issue is really about what the expected behavior is. IMO, a counter that increments itself when being read should be incrementing itself when being read. I still access a representation of a resource, albeit one with a very short lifespan, which I know will be changed an instant after I have read it. There's nothing non-ReSTful about that.

Håvard S
@Havard S: it is not about being dogmatic or not. It is about being REST or not being REST, following the REST definition. My webapps are not RESTFUL and I'm perfectly fine with it. What I want to know is if true RESTFUL applications are allowed to transactionally modify a counter upon a GET.
cocotwo
@cocotwo: See elaborated answer, which hopefully clarifies a bit.
Håvard S
@Håvard S: I don't disagree with your answer, I welcome your elaborated answer, and I see what you mean, but still, "number of views" on SO is more than logs and statistics: it's about determining active/hot questions and earning special badges for questions read a lot of times, etc. I also disagree about the dogmatic/pragmatic and "because someone said so" comment that got you +votes, because it's not really addressing the question. REST has a very precise definition and I want to know what that definition is. I'm certainly not planning on avoiding statistics, I'm planning on doing not-REST ;)
cocotwo
@Håvard S: Great answer. @cocotwo: You have to consider the purpose behind not triggering a data change due to a GET. The purpose is to not update the particular item you are retrieving; which has nothing at all to do with ancillary data. Also, ALL definitions, standards, etc, are simply guidelines. They are NOT commandments. The reality is that programming is much more fluid than any definition can handle.
Chris Lively
@Chris Lively: I agree about the 'particular item' but it is completely wrong and misleading to write with an uppercase 'all' that definitions and standards are simply guidelines. Standards are standards and an RFC is an RFC, and any programmer who does not respect them is writing not fluid but broken code. A standard *is* a commandment. An RFC *is* a commandment.
cocotwo
@cocotwo: Sorry, but even RFCs are not commandments. They are specs developed to try and ease communication between various systems. These specs can be (and usually are) implemented in a number of different ways. Email is a prime example. There are several RFCs governing how email servers should communicate and yet every single server out there has its own quirks/differences from those specs. My point is, those things can get you close, but there is always divergence.
Chris Lively
@cocotwo Updated to elaborate on the case of the counter that increments when it is being read. Does that clear things up for you? There really is no conflict between ReST and the counter, given a proper context.
Håvard S
@Chris Lively: divergences are due to poor specs. And there are several mail servers which are provably broken because their implementors didn't understand the 'MUST', 'MUST NOT', 'SHOULD', 'MAY' etc. terms as defined by RFC2119. And at least you can point fingers at the culprit (which is all too often Microsoft). All our JavaDocs *start* with a pointer to RFC2119. These words have a meaning and they MUST be followed. I know this is something Microsofties have a problem understanding but I'm sorry, you cannot say that when an RFC states "MUST" you can interpret it as "MAY".
cocotwo
A: 

One point you'd need to look out for: GET requests can be triggered by bots, e.g. search engines. These would skew your statistics if you don't allow for them.

jammycakes
+4  A: 

What matters is that from a client point of view GET is safe (has no side effects) by definition and that a client therefore can safely call GET any number of times without considering any side effect that might have.

What a server does is the server's responsibility. In the case of the view counter, the server has to decide whether it considers the update of the counter a side effect. Usually it won't, because the counter is part of the semantics of the resource in the first place.

However, the server might decide NOT to increment the counter for certain requests, such as a GET by a crawler.
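
As a rough illustration of that last point, here is a minimal sketch of a GET handler that counts a view unless the request looks like it comes from a crawler. The `Question` class, the `BOT_MARKERS` tuple and the substring check on the user agent are all assumptions made up for this example; real crawler detection is more involved.

```python
from dataclasses import dataclass

# Hypothetical markers; a substring match is only a crude heuristic.
BOT_MARKERS = ("bot", "crawler", "spider")


@dataclass
class Question:
    title: str
    views: int = 0


def is_crawler(user_agent: str) -> bool:
    """Return True if the user agent contains one of the assumed markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def get_question(question: Question, user_agent: str) -> str:
    """Serve a GET; whether the view is counted is the server's own decision."""
    if not is_crawler(user_agent):
        question.views += 1
    return f"{question.title} (viewed {question.views} times)"
```

The client calls the same GET in both cases and never asks for the increment; the server alone decides whether the request counts as a view.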

Jan

Jan Algermissen
Right: "Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them." -- http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p2-semantics-08.html#safe.methods
Julian Reschke
@Jan: but that doesn't tell the whole story... The whole idea about a GET, if RESTFUL, is that the same GET performed by, say, a hundred different users, could be processed by potentially a hundred different servers (scalability being a huge benefit of REST and the reason why highly scalable services like Amazon's S3 are RESTFUL). And this can't work if there is any side-effect!? (I'm actually getting more and more confused).
cocotwo
@Julian Reschke: You're pointing to documentation that describes a regular HTTP GET, not a REST GET. The REST HTTP GET definition AFAIK is strengthened compared to a regular HTTP GET, which is kinda the whole point of my question. So a regular HTTP GET may or may not have side effects and you can't decide. But if REST defines that a GET cannot modify anything then you can be sure that it won't, right?
cocotwo
@Jan Algermissen: IBM article on the subject: ... *there are many cases of unattractive Web APIs that use HTTP GET to trigger something transactional on the server... In these cases the GET request URI is not used properly or at least not used RESTfully* https://www.ibm.com/developerworks/webservices/library/ws-restful/ I think side-effects are not RESTFUL.
cocotwo
@cocotwo The Uniform Interface constraint in REST says you should use HTTP methods exactly as described in RFC2616. REST does not "strengthen" or change in any way the definition of a GET. When reading stuff on REST you really have to look at the source of the information, because there is lots of misinformation around. If Julian or Jan tell you something, I personally would take that as fact, unless Roy himself told me differently.
Darrel Miller
Jan is perfectly right, and this is IMO the only "RESTful" answer. A client can't be held responsible for any change occurring as a result of a GET, that's what "safe" means, nothing more, nothing less. Incrementing a counter or writing a log entry on a GET is perfectly RESTful.
Stefan Tilkov
@Darrel Miller: and @Stefan Tilkov: ok thanks all, I'm accepting Jan's answer and I'll be reading what Stefan wrote on the subject. It's still not 100% clear but at least now I understand it a bit better.
cocotwo
+4  A: 

You are mixing a couple of issues here. A single request to a REST interface CAN trigger a back-end transaction. However, that transaction must start and finish within the scope of the single request.
What a REST interface should not do is have multiple independent requests participate in the same back-end "two phase commit" transaction.
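
A minimal sketch of the first point, assuming an in-memory SQLite database and a made-up `question` table: the transaction that bumps the counter opens and commits inside the one function that serves the GET, so nothing is left pending for a later request to finish.

```python
import sqlite3

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE question (id INTEGER PRIMARY KEY, title TEXT, views INTEGER)"
)
conn.execute(
    "INSERT INTO question VALUES (2363294, 'Is a view counter RESTful?', 256)"
)
conn.commit()


def get_question(qid: int) -> str:
    """Handle one GET; the transaction starts and finishes inside this call."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE question SET views = views + 1 WHERE id = ?", (qid,)
        )
        row = conn.execute(
            "SELECT title, views FROM question WHERE id = ?", (qid,)
        ).fetchone()
    if row is None:
        raise KeyError(qid)
    return f"{row[0]} (viewed {row[1]} times)"
```

By the time the response is returned, the connection is no longer inside a transaction, which is the property the paragraph above asks for.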

The second issue is whether a GET request can do updates. As Jan points out in his answer, the GET is allowed to have side effects under certain conditions. He says it much better than I could, so read his answer for why.

Darrel Miller
@Darrel Miller: +1, I don't know who modded you down. Thanks for clearing up my misconception about REST, this is very interesting!
cocotwo
+2  A: 

GET is only safe and idempotent with regards to the resource identified by the request - that is all the client needs to, and should, be concerned with.

The easiest way to think about this is to consider any mechanism performing such counts as an intermediary (i.e. you are leveraging the layered system constraint) which monitors the requests/responses and updates some other view count resource, rather than the actual resource itself.
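
One way to picture this, sketched here as WSGI middleware: the origin handler only returns the representation, while a wrapping layer observes GETs and records them in a separate store. The names `question_app`, `counting_layer` and `view_counts` are invented for this example, and an in-process wrapper is only a stand-in for a real intermediary such as a proxy.

```python
from collections import Counter
from typing import Callable, Iterable

# Separate "view count" resource, kept apart from the questions themselves.
view_counts: Counter = Counter()


def question_app(environ: dict, start_response: Callable) -> Iterable[bytes]:
    """Origin handler: returns the representation and writes nothing."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Is a view counter RESTful?"]


def counting_layer(app: Callable) -> Callable:
    """Intermediary that watches GETs and records them elsewhere."""
    def wrapped(environ: dict, start_response: Callable) -> Iterable[bytes]:
        if environ.get("REQUEST_METHOD") == "GET":
            view_counts[environ.get("PATH_INFO", "")] += 1
        return app(environ, start_response)
    return wrapped


application = counting_layer(question_app)
```

The resource handler stays free of side effects; the count lives in a different resource that the layer maintains.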

Mike
+1  A: 

A POST is for sending information the client supplies to the server. That is not happening here, so POST is unnecessary.

In order to maintain the statelessness of the HTTP interaction (which I believe is the purpose of REST), it is not necessary for GETs to avoid all state changes; what is required is that they don't hide state from the client; i.e. any amount of state with future consequences for the HTTP interaction will need to be encoded into the URL space so the client can use it to address future requests.

The counter is part of the state iff its value will affect future interactions - for instance, if after every millionth increment, the "please book a bus tour on which we will try to sell you real estate property in Orlando" subsystem kicks in. REST basically says that in such cases, it should be part of the URL space, so the state can be maintained explicitly as part of the addressing - for instance, you might generate a GET to a URL to which a string ?counter=$cnt is appended (with $cnt the value of the counter).

If not, it is just part of the view - there is no reason for the client ever to feed it, or any other information based on it, back to the server, so there is no need for it to be present in a URL (or anywhere else). You display it and discard it.
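
A small sketch of what carrying the counter in the URL space could look like, using Python's standard `urllib.parse`. The helper names `next_url` and `counter_from` are made up for this example; only the `?counter=$cnt` shape comes from the text above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def next_url(base: str, counter: int) -> str:
    """Build the URL the client should use next, carrying the counter."""
    return f"{base}?{urlencode({'counter': counter})}"


def counter_from(url: str) -> int:
    """Read the counter back out of a URL produced by next_url."""
    return int(parse_qs(urlsplit(url).query)["counter"][0])
```

For instance, `next_url("/question/2363294", 257)` gives `/question/2363294?counter=257`, and the server can recover the value from the request alone, without keeping hidden state for the client.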

reinierpost
A: 

One more thought:

I think the "read count" must not be incremented. Think of all sorts of bots/crawlers/caching applications that do not represent a "read". Instead there could be some way to trigger a real read (by the client).

/question/2363294 --> GET returns the question but does not increment the counter
/question/2363294/readCount --> GET returns the current read count
/question/2363294/readCount --> POST updates the read count

Then my nice client app just posts the read as soon as I have really taken a look at it, and not when it has merely been downloaded, e.g. because of prefetching ...
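
The three routes could be sketched roughly like this. It is an in-memory dispatcher rather than a real web server, and `QUESTIONS`, `READ_COUNTS`, `handle` and the choice of status codes are assumptions for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical in-memory data; a real site would use a database.
QUESTIONS: Dict[int, str] = {2363294: "Is a view counter RESTful?"}
READ_COUNTS: Dict[int, int] = {2363294: 256}


def handle(method: str, path: str) -> Tuple[int, str]:
    """Dispatch /question/<id> and /question/<id>/readCount requests."""
    parts = path.strip("/").split("/")
    if len(parts) not in (2, 3) or parts[0] != "question" or not parts[1].isdigit():
        return 404, "not found"
    qid = int(parts[1])
    if qid not in QUESTIONS:
        return 404, "not found"
    if len(parts) == 2:
        # Reading the question never touches the counter.
        if method == "GET":
            return 200, QUESTIONS[qid]
        return 405, "method not allowed"
    if parts[2] != "readCount":
        return 404, "not found"
    if method == "GET":
        return 200, str(READ_COUNTS[qid])
    if method == "POST":
        # The client reports a real read explicitly.
        READ_COUNTS[qid] += 1
        return 200, str(READ_COUNTS[qid])
    return 405, "method not allowed"
```

A prefetching client or a crawler that only issues GETs leaves the count untouched; only an explicit POST to `readCount` changes it.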

cedarsoft