views:

2679

answers:

22

I was thinking the other day on normalization, and it occurred to me, I cannot think of a time where there should be a 1:1 relationship in a database.

Name:SSN? I'd have them in the same table PersonID:AddressID? Again, same table.

I can come up with a zillion examples of 1:many or many:many (with appropriate intermediate tables), but never a 1:1.

Am I missing something obvious?

+48  A: 

One reason is database efficiency. Having a 1:1 relationship allows you to split up the fields which will be affected during a row/table lock. If table A has a ton of updates and table b has a ton of reads (or has a ton of updates from another application), then table A's locking won't affect what's going on in table B.

Others bring up a good point. Security can also be a good reason depending on how applications etc. are hitting the system. I would tend to take a different approach, but it can be an easy way of restricting access to certain data. It's really easy to just deny access to a certain table in a pinch.

Kevin
Yes I wish the OP could mark this as a second accepted answer. My first thought, when reading this question, was "row locks".
Ricket
+7  A: 

In terms of pure science, yes, they are useless.

In real databases it's sometimes useful to keep a rarely used field in a separate table: to speed up queries using this and only this field; to avoid locks, etc.

Quassnoi
"Pure science"? WTF does that mean? You think NaCl doesn't come in a 1:1 relationship?
Computer science. No relationship to either reality or coding.
chaos
@Mark Brady: no, it doesn't, as there are Na2SO4 and KCl. When making a universe database, you should create t_atom (id INT, protons INT, neutrons INT), t_molecule (id INT, atom_id INT) and make joins. It's a one-to-many relationship.
Quassnoi
On a second thought, INT might not suffice here, as there are 10^78 atoms in the universe. Even two GUID's glued together can hold only 1/10th of the atoms. We'll need a RAW(40) primary key — just in case our database will grow too fast. Dark matter, you know, of something.
Quassnoi
Oh, so only relational theory is pure science. Didn't realize that chemistry wasn't a pure science unless we built a database for it.
@Mark Brady: couldn't you please just say what is that you don't agree with? I don't get your sarcasm, really :)
Quassnoi
Quassnoi (and Mark Brady) You get an upvote for the LOL you just gave me.
Pulsehead
@Mark Brady: Actually, NaCl is better described as a cubic crystal of alternating Na and Cl atoms. The NaCl formula is a handy abbreviation, and works on chemical equations, but it's not actually the true stuff. I don't think that anyone has actually seen a distinct NaCl molecule.
Adriano Varoli Piazza
@Adriano: not a problem, it will just take a `CONNECT BY` query.
Quassnoi
Well if you really really really meant pure science, then a proper quantum mechanical characterization of the Na and Cl atoms would be sufficient to specify the structure of the NaCL molecule as well as it's chemical constituents. This may be a very small set of numbers... so your database would perhaps not grow too fast.
Eric M
Who is Mark Brady? How come ther's no link to his profile?
Cape Cod Gunny
+2  A: 

If you're using the data with one of the popular ORMs, you might want to break up a table into multiple tables to match your Object Hierarchy.

Jason Punyon
Sad, sad reason. If an ORM can't handle a divergence between the object model and the physical storage then.... just sad.
+5  A: 

Rather than using views to restrict access to fields, it sometimes makes sense to keep restricted fields in a separate table to which only certain users have access.

eleven81
+44  A: 

A 1:1 relationship typically indicates that you have partitioned a larger entity for some reason. Often it is because of performance reasons in the physical schema, but it can happen in the logic side as well if a large chunk of the data is expected to be "unknown" at the same time (in which case you have a 1:0 or 1:1, but no more).

As an example of a logical partition: you have data about an employee, but there is a larger set of data that needs to be collected, if and only if they select to have health coverage. I would keep the demographic data regarding health coverage in a different table to both give easier security partitioning and to avoid hauling that data around in queries unrelated to insurance.

An example of a physical partition would be the same data being hosted on multiple servers. I may keep the health coverage demographic data in another state (where the HR office is, for example) and the primary database may only link to it via a linked server... avoiding replicating sensitive data to other locations, yet making it available for (assuming here rare) queries that need it.

Physical partitioning can be useful whenever you have queries that need consistent subsets of a larger entity.

Godeke
A perfect example of this might be a table that contains files. You may (for obvious reasons) want to have one table that contains only the file's meta data (file name, mime type, etc) and another table, mapped 1:1, that contains the actual blob data. This would reduce overhead in querying/sorting the files in some instances.
Kevin Peno
Yes. It depends on the database (modern designs are storing blobs in the filesystem just by using the correct type) and even with such support one has to be careful to exclude the columns (in SQL explicit column lists are normal, but some ORMs want to drag the entire record). The trick is to know your use pattern: if most of the time the actual data is ignored I would use a 1:1 blob table. If most accesses are downloads of the data I would use the native storage mechanism.
Godeke
+1  A: 

Most frequently it is more of a physical than logical construction. It is commonly used to vertically partition a table to take advantage of splitting I/O across physical devices or other query optimizations associated with segregating less frequently accessed data or data that needs to be kept more secure than the rest of the attributes on the same object (SSN, Salary, etc).

The only logical consideration that prescribes a 1-1 relationship is when certain attributes only apply to some of the entities. However, in most cases there is a better/more normalized way to model the data through entity extraction.

JohnFx
+2  A: 

extended information that is only needed in certain scenarios. in legacy applications and programming languages (such as RPG) where the programs are compiled over the tables (so if the table changes you have to recompile the program(s)). Tag along files can also be useful in cases where you have to worry about table size.

Ryan Guill
+2  A: 

I have found that when I do a 1:1 relationship its totally for a systemic reason, not a relational reason.

For instance, I've found that putting the reserved aspects of a user in 1 table and putting the user editable fields of the user in a different table allows logically writing those rules about permissions on those fields much much easier.

But you are correct, in theory, 1:1 relationships are completely contrived, and are almost a phenomenon. However logically it allows the programs and optimizations abstracting the database easier.

DevelopingChris
phenomenon: is any fact or event of scientific interest that is susceptible of being scientifically described and/or explained. I do not think you meant to use that word.
I did mean to use it, in its connotation, of being a rare thing. Typically phenomenon is used to describe outliers, even though its definition, defines your very need to correct every word in my post.
DevelopingChris
@DevelopingChris I agree that 1:1's are a phenomomenon. Except for school theory there are very few real world situations that exist.
Cape Cod Gunny
+16  A: 

Sparseness. The data relationship may be technically 1:1, but corresponding rows don't have to exist for every row. So if you have twenty million rows and there's some set of values that only exists for 0.5% of them, the space savings are vast if you push those columns out into a table that can be sparsely populated.

chaos
"but corresponding rows don't have to exist for every row" Then that's not 1:1. You're talking about 1:0,1.
Yeah. Dunno if the OP draws the distinction.
chaos
Well, I'd assume they did because 1:0,1 has lots of uses, including yours but 1:1 has far fewer. And they were struggling to find uses, so I'd say the OP was drawing the distinction.
My working assumption was the opposite because he enumerated 1:1, 1:many, and many:many without mentioning 1:0,1.
chaos
+2  A: 

I can also think of situations where you have an OO model in which you use inheritance, and the inheritance tree has to be persisted to the DB.

For instance, you have a class Bird and Fish which both inherit from Animal. In your DB you could have an 'Animal' table, which contains the common fields of the Animal class, and the Animal table has a one-to-one relationship with the Bird table, and a one-to-one relationship with the Fish table.

In this case, you don't have to have one Animal table which contains a lot of nullable columns to hold the Bird and Fish-properties, where all columns that contain Fish-data are set to NULL when the record represents a bird.

Instead, you have a record in the Birds-table that has a one-to-one relationship with the record in the Animal table.

Frederik Gheysels
A: 

1:1 relationships don't really make sense if you're into normalization as anything that would be 1:1 would be kept in the same table.

In the real world though, it's often different. You may want to break your data up to match your applications interface.

Eppz
I disagree with the one table theory. There are times when a SuperType SubType relationship is best expressed with 1:1 relationships.
Cape Cod Gunny
Actually, if you went for extreme normalization then you would have a lot more 1:1 relationships. Early forms of normalization proposed by Date included the rule that no column should allow NULL. That meant that if a column might be NULL for a table then it should be in its own table and a row added only when it was valued. This rule has been mostly discarded, including by Codd.
Tom H.
A: 

Most of the time, designs are thought to be 1:1 until someone asks "well, why can't it be 1:many"? Divorcing the concepts from one another prematurely is done in anticipation of this common scenario. Person and Address don't bind so tightly. A lot of people have multiple addresses. And so on...

Usually two separate object spaces imply that one or both can be multiplied (x:many). If two objects were truly, truly 1:1, even philosophically, then it's more of an is-relationship. These two "objects" are actually parts of one whole object.

Mark Canlas
+3  A: 

It's also a way to extend a table which is already in production with less (perceived) risk than a "real" database change. Seeing a 1:1 relationship in a legacy system is often a good indicator that fields were added after the initial design.

Boofus McGoofus
Why equivocate? Do you think that a brand spankin new table has as much risk as altering an existing table. Please list the things that break when new tables are added. The only thing I can think of is code that operates over metadata i.e. SELECT * From USER_TABLES loop <something> end loop.
It's less that things will break than that it often takes more work to add a 1:1 table properly than to add an extra field. Now a new record means updating two tables instead of just one. Deleting a record? Two queries as well. Now we're maintaining more code instead of changing it once.
Boofus McGoofus
@Mark Brady, strongly typed datasets can be a hell to manage when adding a couple of fileds in the database. If a tableadapter in the dataset has many querys and so on, its much more easy to just drag in the new table and then its done.
Stefan
@Boofus McGoofus, Cascade Delete...
Stefan
@Stefan: There's no Cascade insert.
Boofus McGoofus
@Boofus, well.. sometimes we do have to use the keyboard and program a little for the money we earn. ;)
Stefan
+6  A: 

I use them primarily for a few reasons. One is significant difference in rate of data change. Some of my tables may have audit trails where I track previous versions of records, if I only care to track previous versions of 5 out of 10 columns splitting those 5 columns onto a separate table with an audit trail mechanism on it is more efficient. Also, I may have records (say for an accounting app) that are write only. You can not change the dollar amounts, or the account they were for, if you made a mistake then you need to make a corresponding record to write adjust off the incorrect record, then create a correction entry. I have constraints on the table enforcing the fact that they cannot be updated or deleted, but I may have a couple of attributes for that object that are malleable, those are kept in a separate table without the restriction on modification. Another time I do this is in medical record applications. There is data related to a visit that cannot be changed once it is signed off on, and other data related to a visit that can be changed after signoff. In that case I will split the data and put a trigger on the locked table rejecting updates to the locked table when signed off, but allowing updates to the data the doctor is not signing off on.

Another poster commented on 1:1 not being normalized, I would disagree with that in some situations, especially subtyping. Say I have an employee table and the primary key is their SSN (it's an example, let's save the debate on whether this is a good key or not for another thread). The employees can be of different types, say temporary or permanent and if they are permanent they have more fields to be filled out, like office phone number, which should only be not null if the type = 'Permanent'. In a 3rd normal form database the column should depend only on the key, meaning the employee, but it actually depends on employee and type, so a 1:1 relationship is perfectly normal, and desirable in this case. It also prevents overly sparse tables, if I have 10 columns that are normally filled, but 20 additional columns only for certain types.

ShaneD
@ShaneD +1 for real word example. I also like the "Read Only" distinction.
Cape Cod Gunny
+6  A: 

The most common scenario I can think of is when you have BLOB's. Let's say you want to store large images in a database (typically, not the best way to store them, but sometimes the constraints make it more convenient). You would typically want the blob to be in a separate table to improve lookups of the non-blob data.

Mystere Man
Seriously? Typically not the best way? The single largest online music vendor stores it's, ahem, MP3's, in a database.
@Mark, there are a few questions dealing with the best way to store images, either in a database or out, and the consensus seems be to that for the most part the file system is faster. I imagine if that is in fact true, then the same would be hold true for MP3s.
James McMahon
"You would typically want the BLOB in a separate table" - Usually BLOB's are not stored inline (if they exceed a certain db-specific row-length). BLOBs if not inline, are usually stored as a 'pointer' to their physical location in the DB-pages.
blispr
It is arguable whether it makes sense to store large data (files) in a database, some are for some are against all have valid reasons, but +1 for the example of a 1:1.
Kevin Peno
This should really be its own question that I would like to see. With input from smart people who measure the time/prestanda for different harddrives/OS/RDBMS. There SHOULD be an definitive answer to this question, it should not be arguable, if measured correctly. Have'nt anyonw done it allready?
Stefan
+2  A: 

1-1 relationships are also necessary if you have too much information. There is a record size limitation on each record in the table. Sometimes tables are split in two (with the most commonly queried information in the main table) just so that the record size will not be too large. Databases are also more efficient in querying if the tables are narrow.

HLGEM
@HLGEM I agree.
Cape Cod Gunny
A: 

Yes, you are missing something obvious. Several people can live at one address.

jandersson
Wouldn't that be a one to many relationship?
Eugene M
Exactly my point.
jandersson
And one person can have many addresses. Or indeed be homeless/travelling. And an address can be empty also.Person:Address is definitely not 1:1 unless there is a strict business requirement for it, which is usually *not* the case.
Peter Boughton
I was looking at it from the other angle. Let's say that my address is stored in the database as Address_ID = 1. 15 years later, you get another customer who happens to live at my current address. Will you insert Address_ID of 1, or create a new one?
Pulsehead
If you were talking Geographic location, perhaps you would preserve ADDRESS_ID of 1 , 15 years later.I think the context here for address, is we have a person, and what are his address(es). In that case, (and in the case with several room-mate person entitites), we'd have multiple PERSON_IDs each of which have their own unique address_IDs, but in the case of room-mates, those distinct address_IDs will all have the same street, house # - i.e be the same postal address.
blispr
A: 

Anywhere were two entirely independent entities share a one-to-one relationship. There must be lots of examples:

person <-> dentist (its 1:N, so its wrong!)

person <-> doctor (its 1:N, so it's also wrong!)

person <-> spouse (its 1:0|1, so its mostly wrong!)

EDIT: Yes, those were pretty bad examples, particularly if I was always looking for a 1:1, not a 0 or 1 on either side. I guess my brain was mis-firing :-)

So, I'll try again. It turns out, after a bit of thought, that the only way you can have two separate entities that must (as far as the software goes) be together all of the time is for them to exist together in higher categorization. Then, if and only if you fall into a lower decomposition, the things are and should be separate, but at the higher level they can't live without each other. Context, then is the key.

For a medical database you may want to store different information about specific regions of the body, keeping them as a separate entity. In that case, a patient has just one head, and they need to have it, or they are not a patient. (They also have one heart, and a number of other necessary single organs). If you're interested in tracking surgeries for example, then each region should be a unique separate entity.

In a production/inventory system, if you're tracking the assembly of vehicles, then you certainly want to watch the engine progress differently from the car body, yet there is a one to one relationship. A care must have an engine, and only one (or it wouldn't be a 'car' anymore). An engine belongs to only one car.

In each case you could produce the separate entities as one big record, but given the level of decomposition, that would be wrong. They are, in these specific contexts, truly independent entities, although they might not appear so at a higher level.

Paul.

Paul W Homer
A dentist on has one patient? A doctor has only one patient?Person to spouse in only 1:1 if you put 1 spouse in one table and the other spouse in another (I was about to say men in one and women in the other. oh well)
Mark, you could just do a "Couples" table where rows always comes in pairs. But otherwise I agree with your, ehm... rant. :P
JohannesH
Yea, I agree with Mark too. :-) (I'm I allowed to dis my own stupid answer?)
Paul W Homer
Yeh, owning up to the "dumbness" proves how smart you are. The real cowards delete their dumb answers... good thing they aren't doctors, erasing medical records. Actually, it's a good thing we aren't either; reboot would be a bad medical treatment.
Duh, I could have deleted it? :-)
Paul W Homer
I've always figured that even the world's smartest person (whoever that may be at this time) still has their moments of udder stupidity. It's a human curse :-) Whereas my computer has its moments of near intelligence.
Paul W Homer
+5  A: 

Your question can be interpreted in several ways, because of the way you worded it. The responses show this.

There can definitely be 1:1 relationships between data items in the real world. No question about it. The "is a" relationship is generally one to one. A car is a vehicle. One car is one vehicle. One vehicle might be one car. Some vehicles are trucks, in which case one vehicle is not a car. Several answers address this interpretation.

But I think what you really are asking is... when 1:1 relationships exist, should tables ever be split? In other words, should you ever have two tables that contain exactly the same keys? In practice, most of us analyze only primary keys, and not other candidate keys, but that question is slightly diferent.

Normalization rules for 1NF, 2NF, and 3NF never require decomposing (splitting) a table into two tables with the same primary key. I haven't worked out whether putting a schema in BCNF, 4NF, or 5NF can ever result in two tables with the same keys. Off the top of my head, I'm going to guess that the answer is no.

There is a level of normalization called 6NF. The normalization rule for 6NF can definitely result in two tables with the same primary key. 6NF has the advantage over 5NF that NULLS can be completely avoided. This is important to some, but not all, database designers. I've never bothered to put a schema into 6NF.

In 6NF missing data can be represent by an omitted row, instead of a row with a NULL in some column.

There are reasons other than normalization for splitting tables. Sometimes split tables result in better performance. With some database engines, you can get the same performance benefits by partitioning the table instead of actually splitting it. This can have the advantage of keeping the logical design easy to understand, while giving the database engine the tools needed to speed things up.

Walter Mitty
+1 for actually discussing normalization levels
Kevin Peno
A: 

Hi

In my opinion a 1:1 relationship maps a class Inheritance on a RDBMS. There is a table A that contains the common attributes, i.e. the partent class status Each inherited class status is mapped on the RDBMS with a table B with a 1:1 relationship to A table, containing the specialized attributes. The table namend A contain also a "type" field that represents the "casting" functionality

Bye Mario

A: 

In SQL it is impossible to enforce a 1:1 relationship between two tables that is mandatory on both sides (unless the tables are read-only). For most practical purposes a "1:1" relationship in SQL really means 1:0|1.

The inability to support mandatory cardinality in referential constraints is one of SQL's serious limitations. "Deferrable" constraints don't really count because they are just a way of saying the constraint is not enforced some of the time.

dportas
A: 

The best reason I can see for a 1:1 relationship is a SuperType SubType of database design. I created a Real Estate MLS data structure based on this model. There were five different data feeds; Residential, Commercial, MultiFamily, Hotels & Land.

I created a SuperType called property that contained data that was common to each of the five separate data feeds. This allowed for very fast "simple" searches across all datatypes.

I create five separate SubTypes that stored the unique data elements for each of the five data feeds. Each SuperType record had a 1:1 relationship to the appropriate SubType record.

If a customer wanted a detailed search they had to select a Super-Sub type for example PropertyResidential.

Cape Cod Gunny