views:

727

answers:

3

Are there different methods to store binary files in SVN? If so, what are they, and how do I modify the storage options?

I read that there are 4 ways to store binary files in SVN:

  1. Compressed tar - import - export.
  2. Tar - import - export.
  3. import - export.
  4. Efficient check-in.

Which of those is the most time-efficient? And how do I set SVN to use any of those methods?

Thanks, Oded.

+2  A: 

You don't set Subversion to use any of those methods, you specify which method to use when putting files into the repository. And by "method", I don't mean any of the 4 you mention, but rather just "import" or "commit", and you'll have to keep telling Subversion about the method chosen each time you want to store a new revision of that file into the repository.

See Performance tuning Subversion.

As you can see from the description there, to use "method 1" (compress to tar, then import) you have to compress all the binary files into a .tar file yourself, and then use Subversion's import command to add the archive to the repository.

Also note the caveat there: the import command stores files as new files, not as deltas against a previous revision, so it might be time-efficient but not space-efficient when only a few changes have been made to a big file.

Subversion by itself only does commits and imports. A commit is a new revision to an existing file, stored as a sequence of deltas (or the first revision of a new file, which isn't), and an import is just a new file. Anything else you'll have to do yourself.

If the binary files are only changed now and then, this might be worth looking more into, but if they are changed regularly, I'd suggest just using Subversion as normal, with the commit command.

Also note that the typical advice for binary files is to store, if possible, the source for whatever produces them instead of the binary files themselves, and then re-run the tools to reproduce the actual binary files. Only if the binary files are time-consuming or space-consuming to reproduce do you also store the binary files in question.

Binary files also have the problem that they cannot be meaningfully compared or merged. If developers A and B both retrieve the latest version, and developer A commits a new revision before developer B tries to do the same, some type of conflict will occur. Developer B might be left with no option but to work out the changes manually.


Edit: Let me emphasize what I mean by COMMIT and IMPORT.

The main difference is that COMMIT will, assuming the file is already in the repository, try to diff the file in your working copy against the previous repository version and store only the changes. Working out those differences takes time and memory, but typically results in a small revision changeset in your repository. In other words, disk space on your Subversion server will be less impacted than with the IMPORT command.

IMPORT, on the other hand, will import the new file as though you just gave it a new file and said "forget about the previous one, just store this file", and thus no time or memory will be spent on working out the differences, but the resulting changeset in the repository will be larger. In other words, disk space on your Subversion server will be more impacted than with the COMMIT command, but IMPORT will typically run much faster.

Any other workflow you want to impose has to be done outside of Subversion. This includes the TAR command and the compression options available in your operating system. If you want to go with "method 1", you have to manually compress the file(s) you want to import into a single .tar file before you give it to Subversion. You cannot ask Subversion to do any of that for you. You can of course write scripts that automate the process somewhat, but still, it's not a Subversion problem.
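As a rough illustration of that kind of automation, here is a minimal Python sketch of "method 1". The function names, file names, and repository URL are all made up for the example; the only Subversion feature assumed is the standard `svn import PATH URL -m MESSAGE` form. The script packs the files itself and only builds the command line, so running it is left to you.

```python
import tarfile
from pathlib import Path


def make_compressed_tar(files, archive_path):
    """Pack the given files into a gzip-compressed tar archive."""
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            f = Path(f)
            # Store each file under its bare name, without parent directories.
            tar.add(f, arcname=f.name)
    return archive_path


def svn_import_command(archive_path, repo_url, message):
    """Build the argument list for `svn import` (hypothetical helper).

    repo_url is a placeholder for the destination path in your repository.
    """
    return ["svn", "import", str(archive_path), repo_url, "-m", message]
```

Usage might look like the following, where the URL is a placeholder and not a real repository:

```python
import subprocess

archive = make_compressed_tar(["a.bin", "b.bin"], "binaries.tar.gz")
cmd = svn_import_command(archive, "http://svn.example.com/repo/bin/binaries.tar.gz", "Import binaries")
subprocess.run(cmd, check=True)
```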

I would do some serious tests with this to figure out if the gains are actually worth the extra work you will impose on your Subversion workflow.
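One way to run such a test is to time each workflow on your own files. The sketch below is only an assumption about how you might measure it: `time_command` is a hypothetical helper that runs any command once and reports wall-clock time. It runs the command a single time because a repeated commit or import would not be doing the same work the second time.

```python
import subprocess
import sys
import time


def time_command(cmd, cwd=None):
    """Run cmd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command fails,
    # so a failed run is never mistaken for a fast one.
    subprocess.run(cmd, cwd=cwd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start
```

For example, `time_command(["svn", "commit", "-m", "update binaries"], cwd="my_working_copy")` would time a plain commit, where `my_working_copy` is a placeholder directory. Compare that against the time for the tar step plus the import, and check the repository size on the server afterwards.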

Lasse V. Karlsen
Are you basically saying that the only two options are IMPORT and COMMIT, and that IMPORT can take either a (manually) compressed tar file or the regular files? Which of the three is the most time-efficient? Thanks :)
Oded
IMPORT will import whatever it is that you give it, be it the original file, a tar file, a gzip file, a zip file, a rar file, whatever. COMMIT as well. The difference is that COMMIT will try to diff the file against its previous repository revision, whereas IMPORT will not, so it will be faster, but take more space if there are in fact changes from the previous revision, as opposed to radical new file contents.
Lasse V. Karlsen
I think I understand now. Can I configure the storage method for binary files in VisualSVN, or is there no such option other than doing this manually or writing scripts?
Oded
+1  A: 

Could you describe your situation in more detail?

Do you have several smallish binary files that all change together? A few large binary files that change independently? Do your files change frequently?

Have you actually found that the defaults aren't good enough? I've always just added binary files in the same way as normal and found it to just work. Like any performance problem, I wouldn't try to make things complicated unless you've got a good reason to - in which case, please share that reason with us.

Jon Skeet
A: 

I have many small binary files and a few large ones. All are changed frequently. I'm currently working with CVS and switching to SVN soon, and I wanted to know about ways to store binaries.

I read Performance tuning Subversion (mentioned above) and found it useful, but it gives no examples, so I didn't exactly understand how to carry out each of the 4 ways the author suggested.

My basic question is whether or not the defaults are good (and what are they?). My first consideration is time efficiency, and then space. Thanks :)

Oded