I do a lot of solo data analysis using a combination of tools: R, Python, PostgreSQL, and whatever else gets the job done. I use version control software (currently Subversion, though I'm playing with Git on the side) to manage all of my scripts, but the data is a perpetual challenge. My scripts tend to run for long periods (hours, occasionally days) and generate datasets, small and large, which I in turn use as input for more scripts.

The challenge I face is how to roll back what I do if I want to check out my scripts from an earlier point in time. Getting the old scripts is easy. Getting the old data would also be easy if I put it under version control, but conventional wisdom says to keep data out of version control because it's so darned big and cumbersome.

My question: how do you manage your processed data alongside the version control system that holds your code?

A: 

Subversion (and perhaps other VCSs, distributed or not) supports symbolic links. The idea is to store the raw data, well organized, on the filesystem, while tracking the relation between 'script' and 'generated data' with symbolic links under version control.

data -> data-1.2.3
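
A minimal sketch of the workflow might look like this (the dataset name, the analysis script, and its flag are hypothetical; ln and the svn commands are standard):

# write each generated dataset into its own immutable directory
Rscript analyze.R --out data-1.2.3    # hypothetical script and flag

# repoint the symbolic link at the new dataset and commit it
ln -sfn data-1.2.3 data
svn add data                          # first time only; svn versions the link itself
svn commit -m "scripts now read dataset 1.2.3"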

All of your scripts retrieve data through the symbolic link, and since the link itself is versioned, every revision of your code is tied to the specific dataset it was run against.
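
This is also what makes rollback work: checking out an earlier revision restores the old scripts and repoints the link in one step. Something like (the revision number and dataset name are hypothetical):

svn update -r 1234
ls -l data    # e.g. data -> data-1.1.0, the dataset those scripts used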

Using this approach, code and calculated datasets are tracked within one tool, without bloating your repository with binary data.

zellus