What are the pros and cons, from a performance, indexing, and data management perspective, of implementing a one-to-one relationship by making the child's primary key double as the foreign key to the parent, versus giving the child a pure surrogate primary key? The first approach seems to reduce redundancy and implicitly enforces the one-to-one constraint, while the second seems to be favored by DBAs, even though it requires a second index.
primary key as foreign key:
create table parent (
    id integer primary key,
    data varchar(50)
);
create table child (
    id integer primary key references parent(id),
    data varchar(50)
);
pure surrogate key:
create table parent (
    id integer primary key,
    data varchar(50)
);
create table child (
    id integer primary key,
    parent_id integer unique references parent(id),
    data varchar(50)
);
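As a quick check that both layouts really do allow at most one child row per parent, here is a minimal sketch meant for a scratch database. The table names `parent_demo`, `child_shared_pk`, and `child_surrogate` are hypothetical, chosen only so both layouts can coexist in one script; the statements marked as expected to fail should be rejected by the respective constraint.

```sql
-- Hypothetical demo tables so both layouts can live side by side.
create table parent_demo (
    id integer primary key,
    data varchar(50)
);
-- Layout 1: the child's primary key is also the foreign key.
create table child_shared_pk (
    id integer primary key references parent_demo(id),
    data varchar(50)
);
-- Layout 2: surrogate primary key plus a unique foreign key column.
create table child_surrogate (
    id integer primary key,
    parent_id integer unique references parent_demo(id),
    data varchar(50)
);

insert into parent_demo (id, data) values (1, 'p1');

-- Layout 1: a second child for parent 1 would need the same primary key.
insert into child_shared_pk (id, data) values (1, 'first');
insert into child_shared_pk (id, data) values (1, 'second');  -- expected to fail: duplicate primary key

-- Layout 2: a second child for parent 1 violates the unique constraint.
insert into child_surrogate (id, parent_id, data) values (10, 1, 'first');
insert into child_surrogate (id, parent_id, data) values (11, 1, 'second');  -- expected to fail: duplicate parent_id
```

In both cases the constraint is backed by a unique index; the surrogate layout simply carries one more index than the shared-key layout.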
The platforms of interest here are PostgreSQL and Microsoft SQL Server.
Edit:
So here is the basic idea from an actual DBA. The main concern is index fragmentation on the child table. Suppose rows with primary keys 1-1000000 are inserted into the parent table and nothing is inserted into the child table. Later, ad hoc operations begin to populate the child table with rows that correspond to rows in the parent table, but in random order. The concern is that this will cause page splits on inserts, index fragmentation, and a "swiss cheese" effect on deletes. I will admit that I am not deeply familiar with these terms, and when I google them, the hits all seem to relate to Microsoft SQL Server. Are these MS-specific concerns (i.e., do PG's ANALYZE and the like mitigate the issue on PG)? If so, this is yet another reason to use a database like PostgreSQL.
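To make the scenario concrete, here is a rough PostgreSQL-only sketch of it against the first layout (child primary key doubling as the foreign key). It assumes the `parent` and `child` tables from that layout exist and are empty. `generate_series()` and `random()` are PostgreSQL built-ins, the `'p'`/`'c'` values are placeholder data, and a single bulk insert in random order only approximates what separate ad hoc inserts would do.

```sql
-- Step 1: fill the parent with keys 1..1000000; the child stays empty.
insert into parent (id, data)
select g, 'p' || g::text
from generate_series(1, 1000000) as g;

-- Step 2: later, populate the child with the same keys in random order,
-- so new keys arrive at arbitrary positions in the child's primary key index
-- instead of always at the high end.
insert into child (id, data)
select id, 'c' || id::text
from parent
order by random();

-- Rough size check only; this reports bytes used, not fragmentation.
-- Assumes PostgreSQL's default primary key index name, child_pkey.
select pg_size_pretty(pg_relation_size('child_pkey'));
```

Comparing the reported index size with the size after loading the child in key order (`order by id` instead of `order by random()`) would give a first impression of whether the insert order matters on PG.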