I am working on a system that will contain several hundred thousand XML files, ranging from 2 KB to 1 MB in size. Does anyone have experience using version control with more than 300k files? Will SVN or git become problematic?

I am familiar with SVN but have no experience with any other version control system.

EDIT: I have tried both SVN and git with 120,000 XML files weighing 1.2 GB. git works much better; SVN becomes very slow with this many files. On a Mac, both SvnX and GitX choke on the repos, so it's the command line all the way.

+1  A: 

Windows or Unix? In my personal experience, single directories with that many files can cause performance issues on Windows that are unrelated to source control. If possible, I'd separate those XML files into subdirectory groupings.
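
For example, one way to shard a flat directory is to bucket files by a hash of the filename. This is a rough sketch; the directory names, the two-character prefix, and the choice of MD5 are illustrative assumptions, not anything from your setup:

    # Sketch: shard a flat directory of XML files into hash-prefixed
    # subdirectories so that no single directory holds too many entries.
    import hashlib
    import shutil
    from pathlib import Path

    flat_dir = Path("flat_dir")        # hypothetical: existing flat directory
    sharded_dir = Path("sharded_dir")  # hypothetical: destination root

    for xml_file in flat_dir.glob("*.xml"):
        # Two hex characters give 256 buckets, roughly 1,200 files
        # per bucket at 300k files total.
        bucket = hashlib.md5(xml_file.name.encode("utf-8")).hexdigest()[:2]
        dest = sharded_dir / bucket
        dest.mkdir(parents=True, exist_ok=True)
        shutil.move(str(xml_file), str(dest / xml_file.name))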

As far as source control goes, I haven't had any issues with either SVN or TFS repositories containing 10k+ files, so I'd guess they will handle 100k+ files.

Hope that helps.

Brian Hasden
Thanks. They will probably live on a Mac, and they're already divided up into folders. FYI, I edited 100k+ to 300k+.
sakabako
It looks like on a Mac you can run into slowdowns around 20k files if you browse the directory through the GUI, but the command line seems to handle more. Overall, I wouldn't worry as much about the number of files in source control as about how the OS you'll actually be consuming those files on handles it.
Brian Hasden
+1  A: 

How about just trying it? There are many factors involved (disk, memory, caches), and it depends on how you want to check files out (all at once vs. a few at a time)... On top of that, your definition of acceptable performance might differ. For example, you might be willing to wait two minutes for a checkout that only happens every six months, but not for one that happens every five minutes.

No substitute...
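
As a rough way to put numbers on it, you could time a full checkout of each candidate. A minimal sketch follows; the repository URLs are placeholders you'd replace with your own:

    # Sketch: time a fresh svn checkout vs. git clone of the same content.
    # Both URLs below are placeholders, not real repositories.
    import subprocess
    import tempfile
    import time

    def timed(cmd):
        start = time.time()
        subprocess.run(cmd, check=True)
        return time.time() - start

    with tempfile.TemporaryDirectory() as tmp:
        svn_secs = timed(["svn", "checkout",
                          "http://example.com/svn/repo", tmp + "/svn-wc"])
        git_secs = timed(["git", "clone",
                          "http://example.com/git/repo.git", tmp + "/git-wc"])
        print("svn checkout: %.0fs   git clone: %.0fs" % (svn_secs, git_secs))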

Jilles
+2  A: 

I'm working on a project that involves somewhere around 300K XML (and other) files. Subversion (hosted on a Linux VM) seems to handle it just fine. The only caveat is that commits touching large subsets (around 50,000 files) can take a very long time. I have had to parcel them out (e.g., run an svn commit for each subdirectory instead of the whole tree, something like the sketch below) in order to get them to work.
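
The parceling can be scripted. A minimal sketch, where the working-copy path and the commit message are assumptions, that runs one svn commit per top-level subdirectory:

    # Sketch: commit one top-level subdirectory at a time so each
    # svn commit stays at a manageable size.
    import subprocess
    from pathlib import Path

    working_copy = Path("working_copy")  # hypothetical working-copy root

    for subdir in sorted(p for p in working_copy.iterdir() if p.is_dir()):
        if subdir.name == ".svn":
            continue  # skip Subversion's own metadata directory
        subprocess.run(["svn", "commit",
                        "-m", "Batch commit: %s" % subdir.name,
                        str(subdir)],
                       check=True)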

hcayless