I am working on a project that is in transition from proof-of-concept to something worthy of a pilot project. One of the key improvements for this phase of development is to move away from the current "persistence" mechanism, which uses a hash table held in memory and periodically dumped to a file, to a more traditional database back-end.

The application itself is designed with ReSTful principles in mind. It uses Jersey to provide access to resources and jQuery to present these resources to the user and enable basic interactions (create, update, delete). Pretty straightforward.

I have used JPA and Hibernate successfully in the past for persisting server-side state in a relational database, so it seemed a natural choice in this case. With a few minor changes to the model entities, I was able to get basic read, create, and delete operations working in reasonable time. However, the update operation has proved more difficult.

The client-side part of the application automatically sends changes to the server a couple seconds after the user modifies the resource. In addition, there is a "save and close" button that the user can press to send the latest version of the resource to the server immediately before returning to the home page.
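
In case it helps, the save logic is roughly the following (jQuery style; names like `sendUpdate`, `resourceId`, and `currentResourceState` are placeholders rather than the real code):

```javascript
// Rough shape of the client-side save logic (placeholders, not the real code).
var autoSaveTimer = null;

function sendUpdate() {
    $.ajax({
        url: '/resource/' + resourceId,            // placeholder URL scheme
        type: 'PUT',
        contentType: 'application/json',
        data: JSON.stringify(currentResourceState())
    });
}

function onResourceModified() {
    // Restart the countdown on every change; fires ~2s after the last edit.
    clearTimeout(autoSaveTimer);
    autoSaveTimer = setTimeout(sendUpdate, 2000);
}

$('#save-and-close').click(function () {
    clearTimeout(autoSaveTimer);   // cancels a *pending* auto-save, but an
    sendUpdate();                  // auto-save already in flight can still
    window.location.href = '/';    // race with this immediate PUT
});
```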

My first problem was how to update the managed entity from the database with the unmanaged object coming from the client. Since the data sent to the client deliberately omits database keys, this was a bit tedious as it came down to explicitly "merging" elements from the unmanaged object into the Hibernate object. This isn't my question, but if anyone knows of an elegant way to do this, I'd be very interested to hear about it.

My second problem, and the object of this post, occurs when I press the "save and close" button around the same time as the auto-save operation I mentioned earlier is going on. More often than not, I get an optimistic locking exception. This is because the second update operation is being handled in a separate thread while the first is still being processed.

The resource has an "updated" time-stamp that is set each time an update is processed, so an update to the database is pretty much guaranteed each time, even if nothing has changed. This in itself is probably an issue, but even after I fix it, there is still a "window of opportunity" where the user could make a modification and send it to the server while an auto-save is in progress, triggering the same problem.

The simplest approach I can think of to address this issue is to rework some of the client-side JavaScript to ensure that there is only ever one outstanding "update" operation from that client at any point in time. (Note that if a different client happens to update the same resource at the same time, an optimistic locking exception is perfectly fine.) However, I'm concerned that forcing this limitation on the client may be deviating from the spirit of ReST. Is it reasonable to expect a given client to have no more than one outstanding "update" (PUT) request to a particular resource at any point in time in a ReSTful application?
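
Concretely, I imagine something like this on the client, assuming a `sendRaw` helper that performs the actual PUT and invokes a callback when the request completes (success or failure):

```javascript
// At most one outstanding PUT; changes made while a request is in
// flight are coalesced into a single follow-up request.
var putInFlight = false;
var resendNeeded = false;

function sendUpdate() {
    if (putInFlight) {
        resendNeeded = true;       // remember to send the latest state later
        return;
    }
    putInFlight = true;
    sendRaw(function () {          // completion callback
        putInFlight = false;
        if (resendNeeded) {
            resendNeeded = false;
            sendUpdate();          // one follow-up PUT with the latest state
        }
    });
}
```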

This seems like a fairly common scenario, but I haven't been able to find a definitive answer about how best to handle it. Other ideas I've considered and discarded include somehow serializing requests from the same client (perhaps based on HTTP session) so that they are handled in order, implementing a "queue" of updates for a JPA/Hibernate worker thread, and inserting new "versions" of the resource while keeping track of the latest one rather than updating any single record.

Any thoughts? Does the "one outstanding update at a time" limitation on the client seem reasonable, or is there a better approach?

A: 

My first problem was how to update the managed entity from the database with the unmanaged object coming from the client. Since the data sent to the client deliberately omits database keys, this was a bit tedious as it came down to explicitly "merging" elements from the unmanaged object into the Hibernate object. This isn't my question, but if anyone knows of an elegant way to do this, I'd be very interested to hear about it.

Either include the id in the resource representation the client sends, or put it in the URL (e.g. PUT /object/{objectId}). Then you don't need to merge, you just "replace". I prefer to always return and expect complete resources; it avoids the yucky merging.
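
For example, the client-side call might look like this (the URL scheme and the `refreshView` hook are placeholders for illustration):

```javascript
// PUT the complete representation to /object/{objectId}; the server
// replaces its copy wholesale instead of merging field by field.
function replaceObject(objectId, fullRepresentation) {
    $.ajax({
        url: '/object/' + objectId,
        type: 'PUT',
        contentType: 'application/json',
        data: JSON.stringify(fullRepresentation),
        success: function (updated) {
            // Server returns the complete resource; refresh the client copy.
            refreshView(updated);
        }
    });
}
```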

My second problem, and the object of this post, occurs when I press the "save and close" button around the same time as the auto-save operation I mentioned earlier is going on. More often than not, I get an optimistic locking exception. This is because the second update operation is being handled in a separate thread while the first is still being processed.

As you mentioned, you can do this on the client side in JavaScript. You can introduce a 'dirty' flag: both the auto-save and the save-and-close button only transmit a request to the server when the 'dirty' flag is set, and the flag is set whenever the user changes something. I would not let the server serialize requests; it complicates things.
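
A rough sketch of the flag (reusing a hypothetical `sendUpdate` helper that performs the PUT):

```javascript
// Only transmit when something actually changed since the last save.
var dirty = false;

function onResourceModified() {
    dirty = true;
}

function saveIfDirty() {           // called by the auto-save timer and the button
    if (!dirty) {
        return;                    // nothing changed: no request, no race
    }
    dirty = false;                 // clear before sending; a change made while
    sendUpdate();                  // the request is in flight sets it again
}
```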

[...] However, I'm concerned that forcing this limitation on the client may be deviating from the spirit of ReST. Is it reasonable to expect a given client to have no more than one outstanding "update" (PUT) request to a particular resource at any point in time in a ReSTful application?

Part of the spirit of REST is to decouple things, and the client is free to decide when to perform its CRUD operations.

BTW: there is a great frontend JavaScript framework which goes very well with REST APIs: extJS. It worked well for us for internal applications (I wouldn't build a public website with extJS because of the extJS-style look and feel).

manuel aldana
To clarify: the URL _does_ include an "objectId" that uniquely identifies the resource in the database. However, the JSON objects initially retrieved from the server (and thereafter re-submitted by the client back to the server following modifications) omit the actual database IDs. I think the reasons for this should be obvious: not relying on the client to preserve values which are effectively meaningless as far as it is concerned. It makes merging more difficult, as I said, but it seems the correct approach.
C P1R8
"Some of the spirit of REST is to decouple stuff and client can well decide when to do CRUD operations." -- I'm not sure I understand what you are saying here. If I go with the simple approach I described, then it means that the client is _not_ free to use the ReST API however it wishes without regard for this new (perhaps non-obvious) restriction.If restricting the client to no more than one outstanding PUT at a time for a given resource is indeed the best solution, then I would have expected that someone somewhere would have explicitly made this statement. This _can't_ be a new problem...
C P1R8
1st comment: if the URL contains the id, yes, you can discard the id inside the resource representation (in your case, JSON). Still, you can make it a contract of your REST interface that the client is supposed to pass "complete representations" of the resource. So far this has not been a problem with the APIs (client + server) I have worked with or implemented.
manuel aldana
Regarding your 2nd comment: it is difficult to make explicit statements about a design approach (which is what REST is)... PUT is supposed to be idempotent, but there can be race conditions like in your case. To tackle this, I would avoid making unnecessary calls from the client side (the easiest solution). An alternative is to let the server return 400 Bad Request so that the client retries the call. The highest-effort option is to serialize requests on the server side; depending on the problem, I would only use that if I were, for example, exposing event resources which must never get lost.
manuel aldana
(Back to the first comment.) Are you suggesting that I should wipe the previous version of the resource from the database and insert a new one with the same "objectId" for the PUT operation? This was actually the first approach I tried, but it seemed to generate way too much traffic to the database, since the resources in question are moderately complex documents with multiple sub-tables.
C P1R8
A: 

I've run into the same problem of clients performing simultaneous PUT operations on the same URI. It doesn't make sense for a client to do so, but I don't think you'll find any hard documentation that prohibits it or says it is any less RESTful. Indeed, a server can't do anything other than serialize such requests and complain about optimistic concurrency when collisions happen. A well-behaved client would be wise to avoid them by synchronizing all operations on each resource, even safe operations like GET or HEAD, so you don't pollute your caches with stale data.
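
A minimal sketch of such per-resource synchronization in a jQuery client (the queue structure is illustrative, not a reference implementation):

```javascript
// Serialize all requests per URI: each operation waits for the
// previous one on the same resource to complete before it is sent.
var queues = {};                   // URI -> requests waiting their turn

function enqueue(uri, ajaxOptions) {
    var queue = queues[uri] = queues[uri] || [];
    queue.push(ajaxOptions);
    if (queue.length === 1) {
        runNext(uri);              // nothing in flight: start immediately
    }
}

function runNext(uri) {
    var options = queues[uri][0];
    $.ajax($.extend({ url: uri }, options, {
        complete: function (xhr, status) {
            queues[uri].shift();   // finished (success or error)
            if (options.complete) {
                options.complete(xhr, status);
            }
            if (queues[uri].length > 0) {
                runNext(uri);      // start the next request in line
            }
        }
    }));
}

// Usage: enqueue('/object/42', { type: 'PUT', contentType: 'application/json',
//                                data: JSON.stringify(representation) });
```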

mogsie
You win by default, mogsie. I guess this is the way of things with ReST applications -- there really isn't a spec as such beyond RFC 2616. If the behavior isn't explicitly defined for HTTP 1.1, then it's anyone's guess as to what is the "right" way to do things. As you suggest, it is probably just "common sense" for the client to avoid sending multiple PUT requests without waiting for a response in between. I was sort of hoping for some more discussion from others who encountered the same issue, but I guess it isn't exactly the most exciting topic. :) Thanks for taking the time to respond.
C P1R8
I have run into the problem, actually. We had a client that did a background refresh (conditional `GET`) of resources every now and then, and sometimes that would happen at the same time as the _same_ client did a `PUT` because the user hit "save". So we did the only sane thing and that was to introduce a mutex in the client so that all interaction with one URI would be queued. I can't recall if the server already had that, but it makes even more sense to do it there too.
mogsie