I'm working on a file management system and would like to add automated versioning, along the lines of Bates numbering, for when a file with the same name already exists. My idea is to insert a "-v0001" between the filename and the extension and count up as further versions come in.

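$info     = pathinfo($filename);
$fname    = $info['filename'];
$ext      = isset($info['extension']) ? '.' . $info['extension'] : '';
$basename = $info['basename'];

while (filenameExists($basename)) {
    // look for an existing -vnnnn at the end of the file name
    if (preg_match('/^(.+)-v(\d{4,})$/', $fname, $m)) {
        // roll the number ahead
        $fname   = $m[1];
        $version = (int) $m[2] + 1;
    } else {
        // start bates numbering at 1
        $version = 1;
    }
    // insert the zero-padded version number, then re-check the new name
    $fname    = $fname . '-v' . str_pad($version, 4, '0', STR_PAD_LEFT);
    $basename = $fname . $ext;
}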

I'm thinking I would use a regex pattern to check if the versioning exists.
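As a rough illustration of that regex check, here is a minimal sketch. The helper name `splitVersionSuffix` and the exact pattern are assumptions made for this example, not part of any existing code; the pattern expects the "-v" suffix with at least four digits at the very end of the name (extension already removed).

```php
<?php
// Hypothetical helper: split "report-v0007" into array("report", 7).
// Returns array($fname, null) when no version suffix is present.
function splitVersionSuffix($fname)
{
    // Anchored with $ so a "-v2" in the middle of a name is ignored.
    if (preg_match('/^(.+)-v(\d{4,})$/', $fname, $m)) {
        return array($m[1], (int) $m[2]);
    }
    return array($fname, null);
}
```

One thing this makes visible: a file that was legitimately uploaded as "report-v0007" cannot be told apart from a system-generated version 7 of "report".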

My questions are:

  • What are the potential problems of using a numbering system like this?
  • And what alternatives are there for dealing with filename versioning?

I intend this to be a mass import system, so I don't want to bug the user for unique filenames if I don't have to, and I do have the option of offering a selection of other versioning schemes. My system has tags, so the filename matters less, but I would think it still has some importance.

A: 

Alternative approach
May I suggest exploring a true version control system such as Subversion as a transparent backend? You could use svn hooks to automate commits and so on. This may be simpler and more robust.

The "Autoversioning" chapter of the Subversion documentation may be a good starting point for that.

AlberT
I'm just versioning the filename only. This approach might be good for version control of documents, but I need to present all versions of the file as unique files. I guess what I'm saying is that my reason for versioning is to disallow identical filename collisions unobtrusively, not necessarily keep strict control of versions of the same document.
jjclarkson
Ah OK, I missed your real goal. Sorry.
AlberT
`s/robuster/more robust/;`
Brad Gilbert
@Brad Gilbert, thanks for the English grammar lesson. Btw, the spell checker does not dislike "robuster" :)
AlberT
+1  A: 

In the past I have always just tacked the result of mktime() onto the file name, before the extension, whenever a file with that name already exists on the system. There is no need to parse for the current version number, and you get a timestamp in the file name that tells you which file came first and when it was created. If you are worried that someone else might save a file with the same name in the same second, check for the timestamped name before saving. If that really is a concern, you should do the same with your own scheme: increment the number, then check again whether a file with that name already exists.

The timestamp has the added benefit that it is far less likely someone will upload a file whose name already looks like your version suffix: compare bob_321235678.jpg with bob_1.jpg.

The one downside to all this is that you can end up with a bunch of files that are more or less the same but have different names, so you may want to go through that data periodically looking for files that are no longer in use in the system.
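The timestamp approach might look roughly like the sketch below. The function name `timestampedName` and the `$exists` callback are hypothetical stand-ins for whatever lookup the system uses, and `$filename` is assumed to be a bare name without a directory. It uses time() for the current Unix timestamp, and stepping the timestamp forward on a same-second clash is just one possible way to handle the re-check.

```php
<?php
// Hypothetical helper: append a Unix timestamp before the extension
// when $filename is already taken. $exists is an assumed callback that
// returns true if a given name is in use. $now can be injected for testing.
function timestampedName($filename, $exists, $now = null)
{
    if (!call_user_func($exists, $filename)) {
        return $filename; // no collision, keep the original name
    }
    $info = pathinfo($filename);
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    $time = ($now === null) ? time() : $now;

    // Re-check with the timestamp included; step forward on a clash.
    do {
        $candidate = $info['filename'] . '_' . $time . $ext;
        $time++;
    } while (call_user_func($exists, $candidate));

    return $candidate;
}
```

Note that a check followed by a later save can still race with another upload, so the final write should also refuse to overwrite an existing file.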

catfarm
Good solution. But I'd improve it by using an ISO 8601 date format, which has a number of advantages (such as sorting correctly). See http://en.wikipedia.org/wiki/ISO_8601 for a reference and a brief description of its advantages, including a comparison with other standard formats.
AlberT
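A minimal sketch of the ISO 8601 variant, assuming the "basic" format (no dashes or colons, since colons are not valid in Windows file names) and UTC time. The function name `isoStampedName` is made up for this example, and `$filename` is assumed to be a bare name without a directory.

```php
<?php
// Hypothetical helper: insert a UTC ISO 8601 basic-format stamp
// (YYYYMMDDThhmmssZ) before the extension.
function isoStampedName($filename, $timestamp)
{
    $info = pathinfo($filename);
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    // Backslashes keep the literal T and Z from being read as format characters.
    $stamp = gmdate('Ymd\THis\Z', $timestamp);
    return $info['filename'] . '_' . $stamp . $ext;
}
```

Because every field is fixed-width and ordered from year down to second, sorting the resulting names as plain strings also sorts them chronologically.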