What makes a repository different from a database, a filesystem or any other kind of storage? How exactly can I tell that something is a repository, judging by the set of features that it has or does not have?

When I say 'repository', I mean version control first of all. But there are other examples of repositories, such as digital libraries. There might be more, of course, but all of them assume that a repository is 'the place where you can store something'. It's not really clear, though, what exact differences allow one to distinguish it from other 'places where you can store something'.

+1  A: 

I would complement "Places where you can store something" with "... for you and other people to retrieve it". Or maybe reword that as "Places where you can store a collection of related things for you and other people to retrieve them." The meaning is really that generic.

In contrast, file system and database have more technical definitions. "In computing, a file system is a method of storing and organizing computer files and the data they contain to make it easy to find and access them" (see the Wikipedia entry). A database is a collection of logically related data structured in a way that makes it easy to access, manage, and update.

Bruno Rothgiesser
+5  A: 

Repository is just a descriptive term the authors chose.

I'm not sure why you'd ask what it means. It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff".

*What makes a repository different from a database, filesystem or any other kind of storage?*

Nothing. It's storage. It's a filesystem. It's a database. It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff". They shortened it to "repository".

Usually, we reserve "filesystem" for the underlying OS features that give us persistent storage. A repository probably has some more organization than just random files. But it might not.

Usually, we reserve "database" for a discrete product that has a more formal API, a query language, and locking and some reliability features like backups and logs.

*How can I exactly tell that this or that is a repository judging by some set of features that it has or does not have?*

You can't. Something is a repository because the folks that wrote the software decided to call it a "repository". The application developers could call anything a repository -- database, filesystem, individual file. Anything "stateful" can be a repository.
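To make that concrete, here is a rough sketch in Python. Everything in it is made up for illustration (the `Repository` protocol, the `save`/`load` methods, the `items` table, the `round_trip` helper); it isn't taken from any particular product. The point is only that a directory of files and a database table can both sit behind the same "repository" label.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Protocol


class Repository(Protocol):
    """Hypothetical interface: anything that can keep your stuff and give it back."""

    def save(self, key: str, content: str) -> None: ...
    def load(self, key: str) -> str: ...


class DirectoryRepository:
    """A 'repository' that is really just files in a directory."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    def save(self, key: str, content: str) -> None:
        (self._root / key).write_text(content, encoding="utf-8")

    def load(self, key: str) -> str:
        return (self._root / key).read_text(encoding="utf-8")


class SqliteRepository:
    """A 'repository' that is really just one table in a database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, content TEXT)"
        )

    def save(self, key: str, content: str) -> None:
        with self._conn:  # commits on success
            self._conn.execute(
                "INSERT OR REPLACE INTO items (key, content) VALUES (?, ?)",
                (key, content),
            )

    def load(self, key: str) -> str:
        row = self._conn.execute(
            "SELECT content FROM items WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]


def round_trip(repo: Repository) -> str:
    """The caller cannot tell (and does not care) what is underneath."""
    repo.save("readme.txt", "hello")
    return repo.load("readme.txt")


if __name__ == "__main__":
    print(round_trip(DirectoryRepository(Path(tempfile.mkdtemp()))))
    print(round_trip(SqliteRepository(sqlite3.connect(":memory:"))))
```

Both calls to `round_trip` behave the same way, which is the whole argument: the name describes the role, not the storage technology.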

It's just a word they picked so they wouldn't have to say "the file system locations in which we keep your stuff".

*it's not really clear what exact differences does it have*

Why does that matter? Who actually cares? What problem do you have?

Why does it matter which files are a "repository", which files are a "database" and which files are just files?

You can have files that are a "backup" or a "vault". You can have files that are a "collection" or anything the developers want to call it.

They're free to use any descriptive term they want to replace "the file system locations in which we keep your stuff".

S.Lott
A: 

In terms of database you have to be more precise. Is it an RDBMS, an ODBMS or a big persisted hash table? To me, a filesystem is also a kind of "implementation of a database" (hierarchical and directory/file based).

manuel aldana
+1  A: 

I worked on repository software many years ago. Back then, the difference between (general purpose) databases and repositories was the difference between "data" and "meta-data".

So, a database stores data. A repository is a special class of database which is designed to store meta-data, that is, data that describes other data.

Any general purpose database software could be used as a repository, but there are some characteristics of meta-data that make it desirable to use a special-purpose tool. Generally, the granularity of the data is small, with lots of cross-references to other data. The number of records is likely to be tractable. There is often a requirement for version control and/or diffs of the contents.
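As a rough illustration of those characteristics (small records, cross-references, version history, diffs), here is a toy sketch in Python. The class `MetadataRepository` and the record names are hypothetical and invented for this example; real repository products were far more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRepository:
    """Toy store for small metadata records with version history."""

    # record name -> list of versions, each a dict of attribute -> value
    _history: dict = field(default_factory=dict)

    def put(self, name, attributes):
        """Store a new version of a record and return its 1-based version number."""
        versions = self._history.setdefault(name, [])
        versions.append(dict(attributes))
        return len(versions)

    def get(self, name, version=None):
        """Return the latest version, or a specific 1-based version."""
        versions = self._history[name]
        return versions[-1] if version is None else versions[version - 1]

    def references(self, name):
        """Other records whose names appear as attribute values of this record."""
        latest = self.get(name)
        return sorted(
            value
            for value in latest.values()
            if value in self._history and value != name
        )

    def diff(self, name, old, new):
        """Map each changed attribute to its (old value, new value) pair."""
        before, after = self.get(name, old), self.get(name, new)
        return {
            key: (before.get(key), after.get(key))
            for key in sorted(before.keys() | after.keys())
            if before.get(key) != after.get(key)
        }


if __name__ == "__main__":
    repo = MetadataRepository()
    repo.put("customer_table", {"kind": "table", "owner": "sales"})
    repo.put("customer_id", {"kind": "column", "type": "int",
                             "belongs_to": "customer_table"})
    repo.put("customer_id", {"kind": "column", "type": "bigint",
                             "belongs_to": "customer_table"})
    print(repo.references("customer_id"))
    print(repo.diff("customer_id", 1, 2))
```

The records describe a table and a column rather than holding customer data themselves, which is the data versus meta-data distinction from above.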

Because of these special requirements, database manufacturers were tempted into writing special-purpose DBMSs to support the needs of repository builders. (Does anyone remember Microsoft Repository or Unisys's UREP?) I am no longer in that field, and couldn't tell you about the progress in the past decade.

Oddthinking
A: 

A database is where the data is stored, preferably in tabular form, and it might contain numerous tables that are (or may be) joined or linked together. The database can be used to generate reports through a query language (mostly SQL) to make your work easier. Note that a database generally undergoes "STRUCTURAL" modifications when needed.

A repository would essentially be linked to at least one database, but in general it is linked to many. The repository would offer a selection of databases from which it picks the information for the user (through a drop-down menu, an automatic selection based on your login credentials or IP address, or anything else that helps the repository identify you and your scope). The repository essentially has a logical interpreter working between the user and the database, so an amateur with no knowledge of the database can also operate it. Experts may still use query languages to carry out specific operations, but that is not an essential part. The repository would not store the information itself; it offers a means to input information into, or display information from, the linked database. Note that a repository would undergo a "LOGICAL" modification, or version control, to offer a modified view, added features, etc.
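A minimal sketch of that arrangement in Python, using the standard `sqlite3` module with in-memory databases. The `Repository` class, the `documents` table, and the user and scope names are all assumptions made up for the example; the sketch only shows a layer that picks a database from the user's identity and hides the SQL.

```python
import sqlite3


def make_database():
    """Create one in-memory database with a hypothetical documents table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (title TEXT, body TEXT)")
    return conn


class Repository:
    """Layer in front of several databases; it stores no records itself."""

    def __init__(self, databases, scope_by_user):
        self._databases = databases          # scope name -> sqlite3.Connection
        self._scope_by_user = scope_by_user  # user name -> scope name

    def _connection_for(self, user):
        # Automatic selection of the database based on who is asking.
        return self._databases[self._scope_by_user[user]]

    def add_document(self, user, title, body):
        conn = self._connection_for(user)
        with conn:  # commits on success
            conn.execute(
                "INSERT INTO documents (title, body) VALUES (?, ?)",
                (title, body),
            )

    def list_titles(self, user):
        # The user never writes SQL; the repository translates the request.
        conn = self._connection_for(user)
        rows = conn.execute(
            "SELECT title FROM documents ORDER BY title"
        ).fetchall()
        return [title for (title,) in rows]


if __name__ == "__main__":
    repo = Repository(
        databases={"hr": make_database(), "eng": make_database()},
        scope_by_user={"alice": "hr", "bob": "eng"},
    )
    repo.add_document("alice", "Leave policy", "Draft text")
    repo.add_document("bob", "Build guide", "Draft text")
    print(repo.list_titles("alice"))
    print(repo.list_titles("bob"))
```

Each user only sees titles from the database mapped to their scope, and an expert could still be handed a raw connection for custom queries.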

Vishal Tomar