I'm working in a research group where we intend to publish implementations of some of the algorithms we develop on the web via a RESTful API. Most of these algorithms work on small to medium-sized datasets, and in many cases a user of our services might want to run multiple queries (with different parameters) on the same dataset, so it seems reasonable to me to let users upload their datasets in advance and refer to them in later queries. In this sense, a dataset could be one resource in my API and an algorithm could be another.
My question is: how should I let users upload their own datasets? I cannot simply let users upload their data to /dataset/dataset_id, as letting users invent their own dataset_ids might result in ID collisions and users overwriting each other's datasets by accident. (I suspect one of the most frequently used dataset IDs would be test.) I think an ideal way would be to have a dedicated URL (like /dataset/upload) where users can POST their datasets, with the response containing a unique ID under which the dataset was stored, but I'm not sure whether this violates the basic principles of REST. What is the preferred way of dealing with such scenarios?
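To make the idea concrete, here is a minimal sketch of the flow I have in mind. It is not a real server: the function names (post_dataset, get_dataset), the in-memory DATASETS store, and the choice of a 201 status with a Location header are all assumptions of mine for illustration, and the handlers only simulate what the endpoints would return.

```python
import json
import uuid

# Hypothetical in-memory store; a real service would use a database or disk.
DATASETS = {}


def post_dataset(payload: bytes):
    """Simulate POST /dataset/upload: store the payload under a server-chosen ID."""
    # The server generates the ID, so two users can never pick the same one.
    dataset_id = uuid.uuid4().hex
    DATASETS[dataset_id] = payload
    # Assumption: answer with 201 and point to the new resource via Location.
    headers = {"Location": f"/dataset/{dataset_id}"}
    body = json.dumps({"id": dataset_id})
    return 201, headers, body


def get_dataset(dataset_id: str):
    """Simulate GET /dataset/<dataset_id>: return the stored payload, or 404."""
    if dataset_id not in DATASETS:
        return 404, {}, b""
    return 200, {}, DATASETS[dataset_id]


if __name__ == "__main__":
    status, headers, body = post_dataset(b"x,y\n1,2\n")
    print(status, headers["Location"], body)
```

With this shape, two users who both upload something they think of as "test" would simply receive two different IDs, and later queries would refer to the dataset by the returned ID.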