I have some ideas, but I really want to know what makes things go smoothly for you when modeling database:

  1. Table name matches Primary Key name and description key
  2. Schemas are by functional area
  3. Avoid composite primary keys where possible (use unique constraints)
  4. Camel Case table names and field names
  5. Do not prefix tables with tbl_, or procs with SP_ (no hungarian notation)
  6. OLTP databases should be atleast in BCNF / 4NF
+1  A: 

Good database design and normalization.

Lance Roberts
That's not much of a standard. A standard would be saying "databases must normalized to 3NF minimum" or something of that sort.
This is like saying "programmers must write good code" and calling it a standard.
If you don't want to standardly normalize your database that's too bad. It will be my standard irregardless.
Lance Roberts
These are my standards. If you don't like them, I have different ones.
+2  A: 

Everyone writes SQL queries (views, stored procedures, etc) in the same basic format. It really helps development/maintenance efforts down the road.

This is just not how it works. You will acquire another company, or integrate another product, and their format will be all different
:)Let's say each company has 10 developers. I think the way Lance meant it, we get _two_ standard styles instead of 20 (and that's if people don't change their coding style on the way).
Raj More
+6  A: 

I always try not to use the type in the field name - "sFirstName", "sLastName", or "iEmployeeID". While they match at first, if something changes, they'll be out of sync, and it's a huge headache to change those names later, since you have to change the dependant objects as well.

Intellisense and the GUI tools make it trivial to find out what type a column is, so I don't feel this is necessary.

+19  A: 
  • Name similarly targetted stored procs with the same prefix, for instance if you've got 3 stored procedures for Person. That way everything for person is grouped in one place and you can find them easily without having to look through all your procs to find them.
    • PersonUpdate
    • PersonDelete
    • PersonCreate
  • Do similar things for tables when you have groups of tables with related data. For instance:
    • InvoiceHeaders
    • InvoiceLines
    • InvoiceLineDetails
  • If you have the option of schemas within your database, use them. It's much nicer to see:
    • Invoice.Header
    • Invoice.Line.Items
    • Invoice.Line.Item.Details
    • Person.Update
    • Person.Delete
    • Person.Create
  • Don't use triggers unless there's no other reasonable approach to achieve that goal.
  • Give field names a meaningful prefix so you can tell what table they come from without someone needing to explain. That way when you see a field name referenced, you can easily tell which table it's from.
  • Use consistent data types for fields containing similar data, i.e. don't store phone number as numeric in one table and varchar in another. In fact, don't store it as numeric, if I come across a negative phone number I'll be mad.
  • Don't use spaces or other obscure characters in table/field names. They should be entirely alphanumeric - or if I had my druthers, entirely alphabetic with the exception of the underscore. I'm currently working on an inherited system where table and field names contain spaces, question marks and exclamation marks. Makes me want to kill the designer on a daily basis!
  • Don't use syntax keywords as object names it'll cause headaches trying to retrieve data from them. I hate having to wrap object names as [index] that's two needless chars I didn't need to type damn you!
@balabaster: I do this as well. Helps keep things nicely organized. More often than not, you have multiple searches PersonGetByPersonId, PersonGetByName, PersonGetByOderId, PersonGetByCity.. Prefixing code with Entity name keeps it very tight.
Raj More
@balabaster: question marks and exclamation marks in table and column names? i haven't seen that in ages. i feel for you, mate!
Raj More
@Tapori: Yes, it makes me wonder about some people, lol
+1 Nothing like opening a database to see 100 usp_Insert... sprocs
why the negative comment on triggers ??
@oo A number of reasons: they're a pain to debug, they cause un-necessary confusion and unless you know they're there and/or go looking for them, they cause side-effects that can be otherwise unexplainable. Thus, I only use the in situations they're meant to be used in to prevent phenomenon that would cause confusion for future developers/DBAs. It's not that I particularly dislike triggers, I'm just cognesant of the developer that's going to come in after me and I want to make their life as simple as possible.
I like these though I disagree with the "give field names a meaningful prefix" standard. If you have enough tables, the prefixes either get abbreviated down to uselessness or are long enough that you resent them every time you type column names. The query will tell you what table the column is from. Or, you can put a column alias in the query that takes care of this on a case by case basis.
Jeff Siver
+2  A: 

In addition to normalization to 3NF or BCNF (more about that in this question), I have found the following to be useful:

  • Name tables as plural nouns
  • Name columns as sigular

So a "People" table has a "PersonID" column.

  • There is nothing wrong with composite keys, so long as the rules of 3NF or BCNF still hold. In many cases (such as the "many-to-many" case) this is entirely desirable.
  • Avoid repeating the table name in the column names. peoplePersonID is better written as table.column anyway, and much more readable and therefore self-documenting. People.PersonID is better, to me at least.
  • ON DELETE CASCADE should be used very carefully.
  • Remember that NULL means one of two things: Either it's unknown or it's not applicable.
  • Remember also that NULLs have interesting affects on joins, so practice your LEFT, RIGHT, and FULL outer joins.
sorry but plural nouns for table names are my enemy (am working with a db where someone thought this a good idea...)
Peter Perháč
To be a little more precise: I advocate plural nouns for strong entities; i.e., those tables whose rows can exists without any dependancies in other tables. For weak entities (associations, subtypes, etc.), plural nouns are less suited. Verbs are probably better.All this is to help improve self-documentation and communicate the intentionality of the design.What sorts of trouble are you seeing in the DB that you mention?
I think you mentioned one of the problems already- the "people" table has a "personid" column. At least for us who name primary keys by appending "_id" (or "ID") to the table name, that's broken.Also in query, the predicates act on a row from a table, not the table itself. So for the query to read naturally, you have to realias all the tables to singular names to avoid "people.first_name" instead of the (imho clearer) "person.first_name". Again this ties into one of my standards: only use table aliases in queries if they're needed.
Also, tables aren't just the physical grouping of rows- they're a definition of a type, analagous to classes in the dev world. And in that context, it is conventional to name classes and types singularly (e.g. "typedef struct person" or "class Person").
@araqnid - Interesting points. re: Primary keys - I agree that if your convention is to name the PK as the table name + ID then the plural noun scheme is broken. But I'm not a fan of that convention. For one thing, tables with composite keys also break it.
re: Predicates in the queries - They act on the collection of (individual) rows in the table. Consider a WHERE predicate such as (People.age > 21). That considers the the entire set of "persons" (i.e., the "people") and returns those from whom the predicate is true. However, I also see your point that one could interpret that as WHERE any person.age > 21I agree that aliasing tables in queries should be used only when absolutely necessary.
re: Tables == Types - I strongly disagree. Allow me to channel CJ Date from his Third Manifesto: The relational counterpart to a OO class is a domain, NOT a table. Date writes, "A domain is a data type, system or user-defined, of arbitrary complexity, whose values are manipulable solely by the operators defined on them. A class is a data type, system or user-defined, of arbitrary complexity, whose values are manipulable solely by the operators defined on them." So class == data type. In this context, I think of rows as a collection of related truths expressed as an instance of the domain.
Great points, and a fascinating topic of conversation. It's great to rethink this stuff from time to time.

I agree with just about everything you have put there except for #5. I often use prefixes for tables and stored procedures because the systems that we develop have lots of different functional areas, so I will tend to prefix the tables and sprocs with an identifier that will allow for them to group nicely in Management Studio based on what area they belong to.

Example: cjso_Users, cjso_Roles, and then you have routing_Users, routing_Roles. This may sound like replication of data, but in reality the two different user/roles tables are for completely separate functions of the system (cjso would be for the customer-based ecommerce app while the routing would stand for employees and distributors who use the routing system).

I work with MS SQL Server. I use schemas to do this for me.
Raj More
+10  A: 

My standards for Oracle are:

  • Keywords are always in UPPERCASE;
  • Database object names are always in lowercase;
  • Underscores will replace spaces (ie there won't be any camel case conventions that are common on, say, SQL Server);
  • Primary keys will pretty much always be named 'id';
  • Referential integrity will be enforced;
  • Integer values (including table ids) will generally always be NUMBER(19,0). The reason for this is that this will fit in a 64-bit signed integer thus allowing the Java long type to be used instead of the more awkward BigInteger;
  • Despite the misnomer of appending "_number" to some column names, the type of such columns will be VARCHAR2 not a number type. Number types are reserved for primary keys and columns you do arithmetic on;
  • I always use a technical primary keys; and
  • Each table will have its own sequence for key generation. The name of that sequence will be _seq.

With SQL Server, the only modification is to use camel case for database object names (ie PartyName instead of party_name).

Queries will tend to be written multi-line with one clause or condition per line:

SELECT field1, field2, field2
FROM tablename t1
JOIN tablename2 t2 ON = t2.tablename_id
WHERE t1.field1 = 'blah'
AND t2.field2 = 'foo'

If the SELECT clause is sufficiently long I'll split it out one field per line.

working almost exclusively with Oracle, I must agree with you in every point. However, I do find the use of CamelCase more finger-friendly as opposed to typing a lot of underscores. We have a third-party system that has about 200 tables in its schema and they chose not to use underscores at all. e.g. PROJECTTASKTYPE instead of PROJECT_TASK_TYPE, and although it's somewhat harder to read, I have to admit typing queries like so is easier.
Peter Perháč
We use "old_school_names" for SQL Server too. Apart from that, our main difference would be that I prefer to use "tablename_id" rather than just "id" as the primary key- redundant, but useful from time to time; it also means that most of the time, the link between table A and table B is made on identically named columns, e.g. purchase.purchase_type_id=purchase_type.purchase_type_id. We also specify that table names should be singular, not pluralised ("purchase", not "purchases").
My architect says he wants to use underscores for join tables where the name is not apparent. For example, if Customer and Product join into Order with a PK of OrderId, we are set! But if Products and Category join, and there is no business name for the join, then we get Product_Category with a Product_Category_Id. (SQL Server)
Raj More
+3  A: 

Consistent naming standards. Having everyone on the same page, using the same format (whether it be Camel Case, specific prefixes, etc..) helps in being able to maintain a system accurately.

consistency is arguably more important than any particular standard. (so long as it's done my way, I'm happy :p)

I like our table naming convention:

People Table

Which helps make larger querys a bit more readable. and joins make a bit more sense:

Select * -- naughty!
From People
Join Orders on PEO_PersonID = ORD_PersonID

i guess rather than what the naming convention is, is the consistency of the naming.

SELECT * FROM PEOPLE PEO JOIN ORDER ORD ON PEO.PersonID=ORD_PersonID. Use aliases instead of prefixes.
Peter Perháč
Good idea, if starting from scratch i would defiantly consider it. However this is a long existing database, and not likely to have its conventions changed (which is a good thing, consistency is always a plus)
Schemas (mssql 2005 +) would be a better bet here!
I'm with ScottE on this one, schemas in mssql 2005 + and oracle make a lot more sense I think.If you are going to use prefixes, then make the prefix acronym make sense.
+7  A: 

don't forget to back up your databases on a regular basis.

Stan R.
I'd take this a step further. **Backup daily, restore weekly**. Always double check that you can actually restore backups otherwise you will find out at a difficult time that your backups are not good enough.
Raj More
+7  A: 
  1. Don't use type names in the field names. The older guys will remember the old MS standard of lpszFieldName and the stupidity that ensued.

  2. Use descriptive field names That follow normal language conventions. For example "FirstName" instead of "NameFirst"

  3. Each word in the field name is capitalized

  4. No underscores

  5. Do not use normal keywords such as "Index"

  6. Do not prefix ANYTHING with the object type. For example we do NOT use tblCustomers or spCustomersGet. These don't allow for good sorting and provide zero value.

  7. Use schemas to define separate areas of the database. Such as sales.Customers and hr.Employees. This will get rid of most of the prefixes people use.

  8. Loops of any kind should be viewed with suspicion. There's usually a better set based way.

  9. Use views for complicated joins.

  10. Avoid complicated joins when possible. It may be more astheticaly pleasing to have a CustomerPhoneNumbers table; but honestly, how many phone numbers do we really need to store? Just add the fields to the Customers table. Your DB queries will be faster and it's much easier to understand.

  11. If one table calls a field "EmployeeId" then EVERY SINGLE TABLE that references it should use that name. It doesn't need to be called CustomerServiceRepId just because it's in an extension table.

  12. Almost all tables have the "s" ending. For example: Customers, Orders, etc. After all the table holds many records...

  13. Evaluate your queries, indexes and foreign key relationships with an analysis tool. Even those that may be generated for you. You might be surprised.

  14. Linking tables which support many to many relationships have both linked tables in the name. For example, SchoolsGrades. It's very easy to tell by the table name what it does.

  15. Be CONSISTENT. If you start down one path with your conventions, don't change horses halfway unless you are willing to refactor all of the previous work. This should put the brakes on any "wouldn't it be great if.." ideas that end up causing confusion and vast amounts of rework.

  16. Think before you type. Do you really need that table, field, sproc, or view? Are you sure it isn't covered somewhere else? Get concensus before adding it. And if for some reason you have to take it out, talk to your team first. I've been at places where the DBA's make daily breaking changes without regard for the devs. This isn't fun.

Chris Lively
Strongly disagree with #10, this is a very bad practice most of the time, you have to change table structure to add a new phone type. You'd be surprised how many phone numbers you need to store for some people.I strongly agree with #8 though.
@HLGEM: most apps need between 1 and 3 phone numbers. Usually 2 is sufficient. Exactly how often have you ever needed to capture more? Or, more to the point, exactly how many numbers for a given person would actually be used by the business? Usually, 2: business and cell/home.
Chris Lively
+1 for #10. You can not exaggerate the cost of overnormalization.
We have 12 different types of phone number, although any one entity may typically have zero to 3 or four. Storing an auxiliary set of (supplier_id,phone_type_id,phone) makes sense. Not to mention that the process by which new types are created is murky.
@Chris Lively: WRT #10, IMHO depends on the usage.If you expect to call a customer using a phone number, and have the history show that you used a particular nubmer, then it makes sense to normalize. If you ship a product to a customer and the then you must normalize the address, because addresses change, and when you pa your taxes at the end of the year, the new home address state/city would get your tax money but you didnt pay the one you were supposed to!
Raj More
For #11, what do you do if you reference the same foreign column more than once in the same table?
@Chris, I worked on a system where we needed to store many different types of contact information, we used one table to handle phone numbers (home, cell, work, etc...), fax numbers, email addresses, IM information all sorts of things. It wasn't hard to work with and it made other parts of our system a heck of a lot easier to develop.
Nathan Koop
@Joe, I'd prefix with some meaningful information. IE, responsibleEmployeeId, or assistantEmployeeId
Nathan Koop
@Joe, Ditto what Nathan said.
Chris Lively
@Tapori, I agree that #10 probably depends somewhat on the system requirements. However, I don't think your examples are good ones. For call history, it would be better to put the actual phone number in the history table itself, the same about the address. It should be in the Order table. The reason is that this keeps your history intact even if someone deletes the records from your "phone" or "addresses" table. Duplicate data? Possibly, but your data is much easier to report on AND the data in those fields should never change.
Chris Lively
CONT: By keeping the phone number in the same record as the actual call history, then your UI / data layer can manage phone numbers with the main customer account without regard for damaging history. The same with addresses and orders.
Chris Lively
@araqnid: I'd call into question the idea of 12 different types of phone numbers. At the end of the day, does it really matter if it's a home, work, or cell number? Usually not. (I concede special cases) Those numbers are collected in case you need to call that person. Do you care where they happen to be when the call is made? Fax numbers are slightly different in that you *may* actually be faxing the person; and the only reason to know that a phone is a cell is if you are texting them.
Chris Lively
@Chris Lively: Not sure that I completely agree with you there. Storing a phone number with the dialed record is one thing, but storing the billing and shipping address with every order may be bit much. Also, when you normalize and have foreign keys from the address to the billing detail and shipping detail table, Referential Integrity will take care of the deletes! I have found that treating such data as a slowly changing dimension is much more to my liking.
Raj More
@Joe: For double joins (for example - lets keep a list of products to display on shelve side by side / in proximity) - I use ProductId_MainProduct, ProductId_KeepBesideMainProduct.
Raj More
@Tapori: It's the referential integrity that's the problem. When dealing with historical transactions you generally want to always keep the data as it is at the point in time that it occurred. Orders / OrderHistory is a perfect example of that. If the customer comes back and wants to change their address, it's a simple thing. Unless you have to maintain the previous address history in the same address table (to preserve the order shipping information). At that point your address storage logic has to become a bit more complex thereby introducing the potential of errors.
Chris Lively
+1  A: 

Document everything; wiki type documentation is easy to setup and the software is free.

Make sure you understand the interface first and design the database second. Most of the time its a lot better to know how the data you are going to use needs to work and then engineer the database. Most bad DB design happens as things evolve not upfront.

Then define the database standard and version you are going to work to. Define standards for the code elements (views, functions etc), database naming; naming conventions for columns, tables; type conventions for columns; coding templates.

Spend time considering how you define types having standard database types for fields or bespoke types are a good thing to sort out upfront.

As part of your documentation include a list of don'ts as well as dos for the application which include your prefered hated functionality cursors, triggers.

Review it regularly.

-1 I've never ever seen a real company which documented anything, much less everything. We're ISO 9001 certified. And we don't document anything important.
Might be a culture thing; at my current company we write a bible at the start of the development which covers db and development software standards. Last company teh development standards were like your ISO documents; me i prefer the documentation.
+5  A: 

The WITH clause really helps break queries down into manageable parts.

It also really helps for efficiency on the execution plans of the queries.

Tom Hubbard
+9  A: 
  • Name all constraints
Edward Shtern
I like the idea of giving meaningful names to the constraints.
Raj More
+2  A: 

Some others (albeit small) comments to throw against the wall...

SQL Server database schemas can be useful for both organizing tables and stored procedures as well as controlling security.

Every transactional table should always track both who and when created the record as well as updated the record in separate columns. I've seen implementation that simply used "update date" which can lead to auditing challenges in the future.

Use GUID's for row identifiers for all rows for projects with offline/synchronization requirements.

Cody C
+12  A: 

One thing I haven't seen mentioned yet:

Never use database keywords as object names. You do not want to have to qualify them every time you use them

If you misspell something when you create it, fix it as soon as you notice it. Don't spend years having to remember that in this table UserName is really Usernmae. It's a whole lot easier to fix when there isn't much code written against it.

Never use implied joins (the comma syntax), always specify the joins.

like the Branches.barnchID in our schema... I already got used to this but caused me great headache in the beginning. The schema is used by two quite different products, so renaming it wasn't really an option. But I agree with you. FIX MISSPELLING AS SOON AS POSSIBLE!
Peter Perháč
I think we still have a "care_hire_driver_id" primary key in one of our tables... several years later :o
+1 for the implied joins, I'm tired of fixing queries like that.
Honestly I find "implied joins" much simpler and easier to use. You can see instantly which tables you're selecting from because they're listed right next to each other e.g. "FROM table1, table2". As long as you specify the join fields first in the where clause there's no problem, e.g. "WHERE AND ..."
Oh and don't forget, table/field names that deviate from your naming system - for example plural for "Users" table but singular for "Message" table - count as misspellings, too, and should be fixed ASAP.
+1  A: 

13- Evaluate your queries

Thats true. Sometimes you don't get what you wanted.

For me, it's always useful to name the tables and fields with their exact content and (for us) in clear spanish and using Upper Camel Case, with no whitespaces:

User Name: NombreUsuario

First Last Name: ApellidoPaterno

Second Last Name: ApellidoMaterno

etc etc

+2  A: 
  • Tables are named in the singular, lowercase, no underscores, no prefix
  • Fields also lowercase, no underscores, no prefix
  • Stored procedures prefixed with "st_" (sorts nicely)
  • Views that are treated like tables have no prefix
  • Views created for special reports, etc. have a "v" prefix
  • Indexed views created for performance have an "ixv" prefix
  • All indexes have purposeful names (no auto-naming)
  • Strongly prefer uniqueidentifier (with sequential increment) over int IDENTITY for surrogate keys
  • Don't artificially limit VARCHAR/NVARCHAR fields to 100 or 255. Give them room to breath. This isn't the 1980s, fields are not stored padded to their max length.
  • 3NF minimum standard
  • Prefer joining tables to column-level foreign keys: many 1:m assumptions are challenged as a system grows over time.
  • Always use surrogate keys, not natural keys, as the primary key. All assumptions about "natural" keys (SSNs, usernames, phone numbers, internal codes, etc.) will eventually be challenged.
@richardtallent: Not sure what you mean here "Prefer joining tables to column-level foreign keys". can you please elaborate?
Raj More
@Tapori - Assume specs underestimate the fuzziness of reality. If someone says a 1:m relationship is appropriate, implement it as a joining table (thus allowing for m:m eventually) instead. (Obviously, only if performance will be ok.)Example:Today: "Each individual product in the warranty database should be assigned to a single customer." So you implement as a foreign key in the Product table: Product.CustomerID --> Customer.ID.Months from now: "Whoops, warranties are transferrable. We don't want to overwrite Product.CustomerID or falsify Product.SerialNumber. Make it many:many."
+5  A: 

Ensure that every varchar/nvarchar choice is appropriate.

Ensure that every NULLable column choice is appropriate - avoid NULLable columns where possible - allowing NULL should be the justifiable position.

Regardless of any other rules you might use from the suggestions here, I would create a stored procedure in the database that can be run on a regular basis to determine system health for any rules or standards you do have (some of this is a little SQL-Server specific):

  • Look for orphaned records in any cases where the DBMS system's referential integrity cannot be used for some reason (in my system I have a table of processes and a table of tests - so my system_health SP looks for processes without tests, since I only have a one-way FK relationship)

  • Look for empty schemas

  • Look for tables without primary keys

  • Look for tables without any indexes

  • Look for database objects without documentation (we use SQL Server Extended properties to put the documentation in the database - this documentation can be as granular as the column).

  • Look for system-specific issues - tables which need to be archived, exceptions which are not part of normal monthly or daily processing, certain common column names with or without defaults (CreateDate, say).

  • Look for non-deterministic UDFs

  • Look for TODO comments to ensure that code in the DB does not somehow have untested or pre-release code.

All this can be automated to give you an overall picture of system health.

Cade Roux
+2  A: 

Tabular formatted SQL.

select a.field1, b.field2
from       any_table   a
inner join blah        b on b.a_id       = a.a_id
inner join yet_another y on y.longer_key = b.b_id
where a.field_3         > 7
and   b.long_field_name < 2;

Part of this is to use uniformly long alias names (in the example, here, a, b, and y are all length 1).

With this kind of formatting, I can more quickly answer common questions like, "what table is aliased by 'a'?" and "which fields join table T into the query?" The structure doesn't take long to apply or to update, and I find that it saves a lot of time. We spend more time reading code than writing it.

Carl Manaster
@carl: my mentor hammered into my head that a, b, c's were bad, _bad_, _BAD_ as prefixes. so i use the same short name as prefix across the board.
Raj More
Until you get into a tabs vs spaces debate? :)
Andrew Coleson
@Tapori: This contrived example didn't have a basis for real names with real abbreviations. Probably I most commonly use two-character abbreviations for aliases, and in the short context of a simple query, it's enough. It frees me to use long, easy-to-understand actual table names.
Carl Manaster
+3  A: 

A few likes and dislikes.

My opinion is prefixes are horrible in every aspect. I currently work on a system where the tables are prefixed, and the columns within the tables are prefixed with 2 letter table name acronyms, I waste at least 30 mins each day working on this database because the acronym isn't logical. If you want to denote something with a prefix use a schema owner instead.

Using NVarchar from the start of a project if there is even a slight hint that down the line the text data will need to support multi lingual chars. Upgrading large databases because of lack of forward planning and thinking is a pain and wastes time.

Splitting each condition within a where clause onto a new line for readability (in and not in statements wrapped in brackets and tabbed in.) I think this is the important standard for me.

I worked at one company where a standard was that comma's must always be placed at the start of a line when performing parameter or variable declarations. This apparently made it more readable however I found it a complete nightmare.

A comma at the front of a parameter (instead of the end) is hard to get used to, but is handy when you want to comment something out. No need to comment out that specific line and the comma on the line before.
I will admit as well, the guy who was in charge of the company we were outsourced too was a total idiot who would reject a project for 1 comma being out of place. The rejection process wasted 3 days of everyones time, for 1 comma. Then he tried to sell us a code generator which created stored procs to the letter of his law. So thats another non technical reason why I hate that comma standard!
+7  A: 

If a database is for a particular application, have a version table so that the database releases can be checked against the code releases (amongst other reasons).

+1  A: 

Taking "database" to mean "SQL product", my answer is, "Too many to mention. You could write a whole book on the subject." Happily, someone has.

We use Joe Celko's SQL Programming Style (ISBN 978-0120887972): "this book is a collection of heuristics and rules, tips, and tricks that will help you improve SQL programming style and proficiency, and for formatting and writing portable, readable, maintainable SQL code."

Advantages of this approach is include:

  • the guy knows more about this kind of thing than me (is there another book on SQL heuristics?!);
  • the work has already been done e.g. I can give the book to someone on the team to read and refer to;
  • if someone doesn't like my coding style I can blame someone else;
  • I recently got a load of rep on SO by recommending another Celko book :)

In practice we do deviate from the prescriptions of The Book but surprisingly rarely.

+1  A: 

In MS-SQL, I've always had objects owned by dbo., and I prefix calls to those objects with dbo.

Too many times I've seen our devs wonder why they can't call their objects that they inadvertainly owned.

+6  A: 

Putting everybody's input together into one list.

Naming Standards

  • Schemas are named by functional area (Products, Orders, Shipping)
  • No Hungarian Notation: No type names in object names (no strFirstName)
  • Do not use registered keywords for object names
  • No spaces or any special characters in object names (Alphanumber + Underscore are the only things allowed)
  • Name objects in a natural way (FirstName instead of NameFirst)
  • Table name should match Primary Key Name and Description field (SalesType – SalesTypeId, SalesTypeDescription)
  • Do not prefix with tbl_ or sp_
  • Name code by object name (CustomerSearch, CustomerGetBalance)
  • CamelCase database object names
  • Column names should be singular
  • Table names may be plural
  • Give business names to all constraints (MustEnterFirstName)

Data Types

  • Use same variable type across tables (Zip code – numeric in one table and varchar in another is not a good idea)
  • Use nNVarChar for customer information (name, address(es)) etc. you never know when you may go multinational

In code

  • Keywords always in UPPERCASE
  • Never use implied joins (Comma syntax) - always use explicit INNER JOIN / OUTER JOIN
  • One JOIN per line
  • One WHERE clause per line
  • No loops – replace with set based logic
  • Use short forms of table names for aliases rather than A, B, C
  • Avoid triggers unless there is no recourse
  • Avoid cursors like the plague (read


  • Create database diagrams
  • Create a data dictionary

Normalization and Referential Integrity

  • Use single column primary keys as much as possible. Use unique constraints where required.
  • Referential integrity will be always enforced
  • OLTP must be at least 4NF
  • Evaluate every one-to-many relationship as a potential many-to-many relationship
  • Non user generated Primary Keys
  • Build Insert based models instead of update based
  • PK to FK must be same name (Employee.EmployeeId is the same field as EmployeeSalary.EmployeeId)
  • Except when there is a double join (Person.PersonId joins to PersonRelation.PersonId_Parent and PersonRelation.PersonId_Child)

Maintenance : run periodic scripts to find

  • Schema without table
  • Orphaned records
  • Tables without primary keys
  • Tables without indexes
  • Non-deterministic UDF
  • Backup, Backup, Backup

Be good

  • Be Consistent
  • Fix errors now
  • Read Joe Celko's SQL Programming Style (ISBN 978-0120887972)
Raj More
+1  A: 

Avoid silly abbreviation conventions, such as comprehensive dictionaries of abbreviations that actively encourage monstrosities like EMP_ID_CONV_FCTR_WTF_LOL_WAK_A_WAK_HU_HU. This rule is inspired a real set of guidelines I've seen before.

+1  A: 

Table name matches Primary Key name and description key

I have just recently, after years of agreeing with this, jumped ship, and now have an "ID" column on every table.

Yes I know, when linking tables it's abiguous! But so is linking ProductID to ProductID, so uhh, why the extra typing?


SELECT p.Name, o.Quantity FROM Products p, Orders o WHERE o.ProductID = p.ID

Is slightly better than this:

SELECT p.Name, o.Quantity FROM Products p, Orders o WHERE o.ProductID = p.ProductID

Note that both will require table or alias prefixes. But not only am I typing slightly less (multiply that across dozens of tables with long descriptive names and it adds up fast in a data intensive application) but it also makes it easier to know which table is the parent table in every join, which, when joining 8-10 tables in a query, can help quite a bit.

Neil N
+1  A: 

I follow a lot of the same conventions as others here, but I wanted to say a few things that haven't been said yet.

Regardless of whether you like plural names or singular names for your tables, be consistent. Choose one or the other, but don't use both.

The primary key in a table has the same name as the table, with the suffix _PK. Foreign keys have their same name as their corresponding primary key, but with a suffix of _FK. For example, the Product table's primary key is called Product_PK; in the Order table the corresponding foreign key is Product_FK. I picked this habit up from another DBA friend of mine and so far I'm liking it.

Whenever I do an INSERT INTO...SELECT, I alias all the columns in the SELECT portion to match the names of the columns from the INSERT INTO portion to make it easier to maintain and see how things match up.

+1  A: 

The most important standard is: don't have a database by default. I find too many developers grabbing a database for projects where life would have been much easier without one (at least yet). It is just a tool in the toolbox, and not every problem is a nail.

Inappropriate use of a database leads to anemic domain models, badly testable code and unneeded performance problems.

Stephan Eggermont