views:

261

answers:

3

I've been asked to put every single file in my project under source control, including the database file (not the schema, the complete file).

This seems wrong to me, but I can't explain it. Every resource I find about source control tells me not to put generated output files in a source control system. And I understand, it's not "source" files.

However, I've been presented with the following reasoning:

  • Who cares? We have plenty of bandwidth.
  • I don't mind having to resolve a conflict each time I get the latest revision, it's just one click
  • It's so much more convenient than having to think about good ignore files
    • Also, if I have to add an external DLL file in the bin folder now, I can't forget to put it in source control, as the bin folder is not being ignored now.

The simple solution for the last bullet-point is to add the file in a libraries folder and reference it from the project.

Please explain if and why putting generated output files under source control is wrong.

+3  A: 

You haven't explained what "the database file" is.

I would certainly include 3rd party libraries in source control, as they're necessary for the build and it's good to have a way of reproducing a build at a later time with the library versions you used at that particular moment. But yes, those libraries should be included from a "libraries" folder rather than the output directory.

I wouldn't generally include my own libraries built from the sources elsewhere in the same repository - although I have been in situations where that's been worth doing, where some projects didn't use the "latest and greatest" version of a common library, but just occasionally updated.

The most important practical argument I'd give against including everything, in a world where disk, processor and network are considered free and instantaneous, is that it makes it harder to tell what really changed for any given commit. It's easier to look down a list of 3 source files than 3 source files and 150 binaries from the obj/bin directories.

Jon Skeet
The database files are the MS SQL .mdf and .ldf files. The library files are not actually the main focus of this question; I'm particularly interested in problems with generated files. The changeset readability is a good one, thanks!
sebastiaan
That's said the types of files, but not what's in them. Are they empty database files which make it easier for developers to get up and running? If so, that's absolutely fine. If they're including custom data, that's clearly a different matter.
Jon Skeet
Prepare to be shocked: it's the complete current working db (we don't yet have a central server and we're working from different locations). Currently this makes it impossible to work concurrently. Luckily, this is going to be fixed fairly soon. I do appreciate the suggestion of including a quick-start db, great idea!
sebastiaan
+3  A: 

Generated output files (in general) are "dangerous" in a VCS because:

  • what you need to version is how to regenerate them: the day you actually need to update them, chances are you won't remember how to do it
  • they can contain private, machine-specific generated files which make them work on the committer's desktop, but not on a client's (the "works on my machine" TM syndrome)
  • some generated files (binaries especially) are not easily stored as deltas, so they consume a lot of space (and the topic of cleaning up that space will come up someday...)
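To illustrate the first point, here is a minimal sketch of versioning the recipe rather than the output. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and the schema, seed data, and the `build_database` helper are all hypothetical names made up for the example. The idea is that the scripts go into the VCS and any developer can rebuild the database file from them.

```python
import sqlite3

# Hypothetical schema and seed data. In a real project these would be
# versioned .sql files stored next to the source code.
SCHEMA_SQL = """
CREATE TABLE customers (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
"""

SEED_SQL = """
INSERT INTO customers (name) VALUES ('Alice'), ('Bob');
"""


def build_database(path=":memory:"):
    """Recreate the database from the versioned scripts and return a connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA_SQL)  # create the tables
    conn.executescript(SEED_SQL)    # load the quick-start data
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_database()
    count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
    print("customers:", count)
```

With something like this in place, the generated database file itself can stay out of the repository, since it can be recreated on demand.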

External libraries are not generated directly by your project, and can be put in a VCS, although external repositories like a public Maven repo are better at this kind of management.

VonC
+1  A: 

Do we also put in compiled object files such as class files, executables, and DLLs built from our source? What about when we're doing serious volume testing and that database becomes many gigabytes or terabytes in size?

The clue is in the name: it's a Source Code Management System.

I can understand the simplicity of putting everything in: it makes it less likely that a developer forgets some important file. But if you're doing regular automated builds, then surely a missing file gets picked up anyway?

I think the key phrase is here:

It's so much more convenient than having to think about good ignore files

Are you explicitly forbidden from having good ignore files? My guess is that you are already excluding .exe and .class (or whatever) files. Suppose you did take the trouble to exclude your database: would that be a problem? Why? It's a conscious action that you are choosing to take for the common good. In Eclipse it's a couple of seconds' work to add a new file type to the workspace's CVS ignore rules for all projects.

A rule of "No Ignore Files" is almost self-evidently absurd. Once you have the freedom to have some ignore files, then why not just use them intelligently to exclude the DB? Who is inconvenienced? Only yourself, if anyone, and you're prepared to do the extra work.
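As a rough illustration of how little an ignore list needs to contain, here is a small Python sketch. The patterns and file names are assumptions for the example, not the rules of any particular VCS, and real ignore files have their own matching syntax. It filters a list of changed paths the way a simple ignore list would.

```python
from fnmatch import fnmatchcase

# Hypothetical ignore patterns: build output folders plus the SQL Server
# data and log files mentioned in the question.
IGNORE_PATTERNS = ["bin/*", "obj/*", "*.mdf", "*.ldf", "*.exe", "*.class"]


def is_ignored(path, patterns=IGNORE_PATTERNS):
    """Return True if the path matches any ignore pattern."""
    normalized = path.replace("\\", "/")  # treat Windows separators alike
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


def files_to_commit(paths):
    """Keep only the paths that are not ignored, preserving their order."""
    return [p for p in paths if not is_ignored(p)]


if __name__ == "__main__":
    changed = [
        "src/Program.cs",
        "bin/Debug/app.exe",
        "obj/Debug/app.pdb",
        "App_Data/site.mdf",
        "App_Data/site_log.ldf",
        "lib/ThirdParty.dll",
    ]
    for path in files_to_commit(changed):
        print(path)
```

Note that the third-party DLL under the hypothetical `lib/` folder is still committed, while everything under `bin/` and `obj/` and the database files are left out.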

djna
I suppose the thinking is that we don't have the time to set up source control properly, so "let's just act like the project is a giant zip file". Resolving a conflict doesn't take much time, but after x months of conflict resolving you've spent as much time on it as setting up proper source control would have taken.
sebastiaan
@sebastiaan, yes, I agree that this is likely to be what's happening. I really do dispute that setting up a few ignore files is excessive effort.
djna