views:

171

answers:

9

Is there a way to have "pointers to rows" in a database?

for example I have X product rows, all these rows represent distinct products but many have the same field values except their "id" and "color_id" are different.

I thought of just duplicating the rows but this could be error prone, plus making a small change would have to be done on several rows, again buggy.


Question: Is there a way to fill some rows fully, then use a special value to "point to" certain field values?

For example:

id | field1   | field2   | field3   | color_id
-----------------------------------------------
1  | value1   | value2   | value3   | blue
2  | point[1] | point[1] | point[1] | red    (same as row 1, except id and color)
3  | point[1] | point[1] | point[1] | green  (same as row 1, except id and color)
4  | valueA   | valueB   | valueC   | orange
5  | point[4] | point[4] | point[4] | brown  (same as row 4, except id and color)
6  | valueX   | valueY   | valueZ   | pink
7  | point[6] | point[6] | point[6] | yellow (same as row 6, except id and color)


I'm using MySQL, but this is more of a general question. Also if this is goes completely against database theory, some explanation of why this is bad would be appreciated.

A: 

well this would be called normalization under normal circumstances .. the whole point of it is to deal with that kinda scenarios .. so no it cant be done the way u want to do it.. u will need to normalize the data properly.

Sabeen Malik
A: 

Create separate tables for the field1, field2 and field three values. Put existing values there, and reference them by putting their id's into your current table.

Zed
A: 

If you're using common string values, it's good to store the strings in a separate table and refer to them with foreign keys. If you're storing anything like an integer, it wouldn't be worth it - the size of the pointer would be comparable to the size of the data itself.

Mark P Neyer
A: 

One way of doing this is to separate the columns that seem to have repeated data and put that in a separate table. Give each of the rows in this new table a unique id. Add a column to the original table which contains the id in the new table. Then use a FOREIGN KEY relationship between the original table and the new table's id column.

Jack
A: 

It does go against database theory because you're throwing the relational part of databases out the window.

The way to do it is to make an ObjectID column that contains the key of the row you want to point to.

id | field1   | field2   | field3   | color_id | object_id |
------------------------------------------------------------
1  | value1   | value2   | value3   | blue
2  | null     |  null    | null     | red      | 1         |
3  | null     |  null    | null     | green    | 1         |
4  | valueA   | valueB   | valueC   | orange
5  | null     |  null    | null     | brown    | 4         |
6  | valueX   | valueY   | valueZ   | pink
7  | null     |  null    | null     | yellow   | 6         |


But remember: This is a bad idea. Don't do it. If you did want to do it, that would be how.

There are instances where it's required; but after dealing with a system that was pervasive in this, I'd always try to find another way, even if it means duplicating data and letting your business layer keep everything straight.

I work in a system where this was done throughout the system, and it's maddening when you have to recreate the functionality of relationships because someone wanted to be clever.

George Stocker
If you had to do that I would make the table you have displayed here a view over the normalised tables
Mark
A: 

The way you would want to implement this in a database would be to create two tables:

object_id | field1 | field2 | field3

and

instance_id | object_id | colour

And then the rows of the second would point to the first, and you could generate the full table of data you want on the fly by

select t1.*, t2.colour from t1 join t2 on (t1.object_id=t2.object_id)
njk
+3  A: 

This does go against database design look for descriptions of normalization and relational algebra. It is bad mainly because of the comment ypou have made "duplicating the rows but this could be error prone, plus making a small change would have to be done on several rows, again buggy."

The idea of relational databases is to act on sets of data and find things by matching on primary and forign keys and absolutely not to use or think of pointers at all.

If you have common data for each product then create a product table

create table product ( product_id int, field1 ..., field2 ..., field3 )

with primary key on product_id

The main table would have fields id, color_id and product_id

if product table is like

product_id | field1   | field2   | field3  
-----------------------------------------------
1          | value1   | value2   | value3  
2          | valueA   | valueB   | valueC  
3          | valueX   | valueY   | valueZ

The main table would be like

id | product_id | color_id
--------------------------------
1  | 1          | blue
2  | 1          | red
3  | 1          | green  
4  | 2          | orange
5  | 2          | brown
6  | 3          | pink
7  | 3          | yellow
Mark
A: 

You should probably have two tables with a foreign key relationship.

Example

Products:
  Id
  field1
  field2
  field3

ProductColors:
  Id
  ProductId
  Color
erikkallen
A: 

Sure there is a way to have pointers to rows in a database. Just don't use a relational DBMS. In the 1960s and 1970s, there were several very successful DBMS products that were based entirely on linking records together by embedding pointers to records inside other records. Perhaps the most well known of these was IMS.

The down side of having pointers to records in other records was that the resulting database was far less flexible than relational databases ended up being. For predeterimned access paths, a database built on a network of pointers is actually faster than a relational database. But when you want to combine the data in multiple ways, the lack of flexibility will kill you.

That is why relational DBMSes took over the field in the 1980s and 1990s, although hierarchical and network databases still survive for fairly specialized work.

As others have suggested, you should learn normalization. When you do, you will learn how to decompose tables into smaller tables with fewer coulmns (fields) in each table. When you need to use the data in joined fashion, you can use a relational join to put the data back together. Relational joins can be almost as fast as navigating by pointers, especially if you have the right indexes built.

Normalization will help you avoid harmful redundancy, which is the problem you highlighted in your question.

Walter Mitty