views:

168

answers:

7

I have company, customer, supplier etc tables which all have address information related columns.

I am trying to figure out if I should create a new table 'addresses' and separate all address columns to that.

Having address columns on all tables is easy to use and query but I am not sure if it is the right way of doing it from a good design perspective, having these same columns repeat over few tables is making me curious.

Content of the address is not important for me, I will not be checking or using these addresses on any decision making processes, they are purely information related. Currently I am looking at 5 tables that have address information

+1  A: 

If you are using those addresses only within the scope of their own tables, there may be no real benefit to moving them to their own tables.

Basically, it doesn't sound like it's worth the effort.

John MacIntyre
A reason for the down vote would be appreciated. thx
John MacIntyre
No real benefit to moving them to their own tables ignores the value of maintaining a single coherent address schema in a single place, as opposed to maintaining it in several places. (I've seen companies go crazy over bugs related to this, when the length of an address field ended up being different for a company and a customer.) Also, the double negative is confusing.
McWafflestix
@McWaffle: but this is where the "it depends" comes in. If you have only 2 or 3 entities that have addresses, it really isn't much of a threat to your system. If you have 10 entities that use addresses, and you need a generic way of dealing with the addresses themselves as first class objects, I'd tend to agree with you.
Dave Markle
@DaveMarkle: I don't disagree with your point; I just think that in most systems of this type, expansion over time gets you in a situation where you'll wish you'd done the normalization initially.
McWafflestix
@McWafflestix-It really comes down to the business requirements. If you were a GIS company or Canada Post, you'd want it, but for most other companies tying it to the customers table is fine. Also, there are times when you definitely want it tied to individual tables, for example when you ship, do you really want delivery records changing if the customer moves? In my experience, the reasons to do this are usually lacking. BTW-Thanks for the typo comment.
John MacIntyre
@Dave Markle - I am big fan of refactoring code. Start simple and only create complex designs when the situation warrants itself. However, database design is different. Refactoring a database is not te be taken lightly. It is MUCH harder to refactor a database when it is in production than it is to refactor code. It is just not worth it since putting a database in 3rd normal form is really easy to begin with once you know how.
Nemi
@JohnMacIntyre: your point about records changing when the customer moves is a perfect one. What do you do about historical data? If you have a separate address table, you just keep the old address, add the new, blam you're done; but if you don't have a separate address table, what do you do? Keep a copy of the entire customer record for every edit they make to their address, in case something refers to that instance of the customer record? And you're welcome on the typo comment; I figured it was just that.
McWafflestix
@Nemi: Exactly! it's really really really worth getting the database design right the first time.
McWafflestix
+1  A: 

Yes, you should separate the addresses to a table of their own. It's a smart thing to know to ask. The key here is that general format of addresses is the same, regardless of who it is; a customer, a company, a supplier... they all have the same fields for addresses.

What makes this worthwhile is the ability to treat addresses as an atomic element; that is, you can generalize all the functionality related to addresses and have it deal with just one table, as opposed to having to worry about it dealing with several tables, and the associated schema drift that can occur.

McWafflestix
This isn't always necessary. There's usually not a lot of schema drift with addresses.
Dave Markle
I disagree. Normalization for its own sake isn't always the right way to go. It depends on the purpose of the data and realistic proportion of repeated data. If there's nil chance of repeated data, I'd say there's no purpose to normalizing the schema.
BenAlabaster
That's correct; there's not usually a lot of schema drift, but I've seen a company nearly go out of business because of a schema difference between the length of an address field between a customer and a supplier (varchar(80) vs. varchar(120)). IMO, the relatively small amount of work involved in the normalization is worth the potentially high risk at a later time.
McWafflestix
+6  A: 

The answer to all design questions is this:

It depends.

So basically, in the Address case it depends on whether or not you will have more than 1 address per customer. If you will have more than 1, put it in a new Addresses table and give each address a CustomerID. It's overkill (most times, it depends!) to create a generic Address table and map it to the company/customer/supplier tables.

It's also often overkill (and dangerous) to map addresses in a many-to-many relationship between your objects (as addresses can seem to magically change on users if you do this).

The one big rule is: Keep it simple!

Dave Markle
can you give an example of how you use the address on many to many relationship
kaivalya
Thank god for someone thinking before they spout "normalization rules should be followed blindly"
BenAlabaster
@korki: Sure. You might decide that a customer and supplier can be located at the same place, and that they might have more than one address (billing address, receiving address). You'd then create a table to map the customer/supplier record to the addresses. This is scary because it's generally more complicated than you want, and updating one of the addresses might have unintended consequences.
Dave Markle
@ balabasterFor the record, I voted Dave up also. Non-normalized databases cause so many issues that a beginner should not be deciding whether or not to fudge, imho. Only after they realize the reasons for doing it correctly should they be cutting corners.
Nemi
@Nemi: I completely agree...
McWafflestix
Ah, design questions. It's like listening to a Supreme Court judge go through the nomination process. "Senator, I can not answer a hypothetical..." LOL
Dave Markle
@Dave: My situation is much simpler. I only store the address info for information purposes and don't care if it is a duplicate with any other entity or not. I m not yet sure if i will use separate billing and shipping address option also but why i considered table separation in the first place was to have this such different address options for future if I need to...
kaivalya
@korki: I'll still stand by my contention that it's better to have this in place if you need it in the future than it is to not have it in place. Yes, there's extra work involved in implementing this, but that work is specifically to mitigate the risk further down the line, and I think that work to mitigate the risk is (very) worthwhile.
McWafflestix
A: 

If there's an overlap between tables (i.e. the same organization is entered in both the company and supplier tables), and the address should always be the same in both tables, then it's probably worth moving address off in to its own table and having foreign keys to it from your other three tables. That way, you only have to update it in one spot when it changes.

If the three tables are entirely independent from each other, then there's not really much to gain from moving the data to another table, so you might as well leave it alone.

Warren Pena
+4  A: 

This is called Database Normalization. And yes, you want to split them up, if for no other reason because if you need to in the future it will be much harder when you have code and queries in place.

As a rule, you should always design your database in 3rd Normal Form, even for simple apps (there will be a few cases where you won't for performance or logistic reasons, but starting out I would always try to make it 3rd Normal Form, and then learn to cheat after you know the right way of doing it).

EDIT: To expand on this and add some of the comments I have made on other's posts, I am a big believer in starting with a simple design when it comes to code and refactoring when it becomes clear that it is becoming too complex and more indepth object oriented principles would be appropriate. However, refactoring a database that is in production is not so simple. It is all about ROI. It is just too easy to design a normalized database from the outset to justify not doing it. The consequences of a poorly designed database can be catastrophic and it is usually too late before you come to that realization.

Nemi
A: 

It is very simple

1° Take a table and execute a SELECT clause

If your result set has many and many NULL values, it is a signal your table IS NOT normalized and if you want, you can create new tables in order to normalize your database

regards,

Arthur Ronald F D Garcia
A: 

I think it entirely depends on the purpose of the database. Admittedly all address information is structurally the same and from a theoretical standpoint should all be in a single table linked from the parent table by a key.

However from a performance and query perspective, keeping them in their respective tables does simplify things from a reporting standpoint.

I have a situation with my current company [logistics] where the addresses are actually logically the same - they're all locations regardless of whether they're a pickup location, delivery location, customer etc.

In my case, I'd say that they should most definitely all be in one table. But if it's looking at it from a supplier, customer, contact information standpoint, I'd say that while theoretically it's nice to have the addresses in one table, in practice it won't buy you a whole lot as the data is unlikely to be repeated.

BenAlabaster