I'm not convinced you need to distinguish between companies and persons, only between registered and unregistered authors.
I added the lookup table below for clarity. You could instead use a check constraint on the Authors table to limit the values to U and R.
Create Table dbo.AuthorRegisteredStates
(
Code char(1) not null Primary Key Clustered
, Name nvarchar(15) not null
, Constraint UK_AuthorRegisteredState Unique ( [Name])
)
Insert dbo.AuthorRegisteredStates(Code, Name) Values('U', 'Unregistered')
Insert dbo.AuthorRegisteredStates(Code, Name) Values('R', 'Registered')
GO
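For comparison, here is a rough sketch of the check-constraint alternative mentioned above. It assumes the Authors table defined below keeps its AuthorStateCode column and that you drop the lookup table and its foreign key. The constraint name is one I made up for illustration.

```sql
-- Alternative to the lookup table: restrict the column values directly.
-- CK_Authors_AuthorStateCode is a hypothetical name.
Alter Table dbo.Authors
    Add Constraint CK_Authors_AuthorStateCode
        Check ( AuthorStateCode In ( 'U', 'R' ) )
```

The trade-off is that adding a new state later means altering the constraint rather than inserting a row.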
The key in any database system is data integrity, so we want to ensure that usernames are unique and, perhaps, that names are unique. Do you want to allow two people with the same name to publish an article? How would the reader differentiate them? Notice that I don't care whether the author represents a company or a person: whoever registers can enter a first name and last name if they want. What is required is that everyone enter a name (think of it as a display name). We would never search for authors based on anything other than name.
Create Table dbo.Authors
(
Id int not null identity(1,1) Primary Key Clustered
, AuthorStateCode char(1) not null
, Name nvarchar(100) not null
, Email nvarchar(300) null
, Username nvarchar(20) not null
, PasswordHash nvarchar(50) not null
, FirstName nvarchar(25) null
, LastName nvarchar(25) null
...
, Address nvarchar(max) null
, City nvarchar(40) null
...
, Website nvarchar(max) null
, Constraint UK_Authors_Name Unique ( [Name] )
, Constraint UK_Authors_Username Unique ( [Username] )
, Constraint FK_Authors_AuthorRegisteredStates
Foreign Key ( AuthorStateCode )
References dbo.AuthorRegisteredStates ( Code )
-- Optional: use this if you really want to ensure that an unregistered author
-- has a first name and last name. However, I'd recommend enforcing this in the GUI,
-- if anywhere, as it really does not matter whether they
-- enter a first name and last name.
-- All that matters is whether they are registered and entered a name.
, Constraint CK_Authors_RegisteredWithFirstNameLastName
Check ( AuthorStateCode = 'R' Or ( AuthorStateCode = 'U' And FirstName Is Not Null And LastName Is Not Null ) )
)
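To illustrate what the unique constraints buy you, here is a small sketch. The sample values are invented, and it assumes the columns elided with "..." above are nullable or have defaults.

```sql
-- Hypothetical sample rows; '<hash>' stands in for a real password hash.
Insert dbo.Authors ( AuthorStateCode, Name, Username, PasswordHash )
Values ( 'R', N'Acme Publishing', N'acme', N'<hash>' )

-- This second insert should be rejected by UK_Authors_Username,
-- because the username 'acme' is already taken.
Insert dbo.Authors ( AuthorStateCode, Name, Username, PasswordHash )
Values ( 'R', N'Acme Press', N'acme', N'<hash>' )
```

The same applies to Name through UK_Authors_Name, so readers never see two authors with the same display name.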
Can a single author publish two articles on the same date and time? If not (as I've guessed here), then we add a unique constraint. The question is whether you might need to identify an article. What information might you be given to locate an article besides the general date it was published?
Create Table dbo.Articles
(
Id int not null identity(1,1) Primary Key Clustered
, AuthorId int not null
, PublishedDate datetime not null
, Headline nvarchar(200) not null
, Content nvarchar(max) null
...
, Constraint UK_Articles_PublishedDate Unique ( AuthorId, PublishedDate )
, Constraint FK_Articles_Authors
Foreign Key ( AuthorId )
References dbo.Authors ( Id )
)
In addition, I would add an index on PublishedDate to improve searches by date.
Create Index IX_Articles_PublishedDate On dbo.Articles ( PublishedDate )
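A search by date of the kind that index is meant to help might look like the following. The date range is just an example.

```sql
-- Hypothetical date range. The half-open interval ( >= start, < end )
-- avoids missing articles published late on the last day of the month.
Select Id, AuthorId, PublishedDate, Headline
From dbo.Articles
Where PublishedDate >= '20240101'
    And PublishedDate < '20240201'
Order By PublishedDate
```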
I would also enable full-text search so that users can search the contents of articles.
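As a rough sketch of what that could involve in SQL Server, assuming the Full-Text Search feature is installed: the Key Index clause needs the name of a unique, single-column, non-nullable index, so this assumes the primary key on dbo.Articles was declared with an explicit name, PK_Articles (the table definition above leaves it system-named). The catalog name and the search phrase are also my own inventions.

```sql
-- ArticlesCatalog is a hypothetical catalog name.
Create Fulltext Catalog ArticlesCatalog As Default

-- Assumes the primary key constraint was named PK_Articles.
Create Fulltext Index On dbo.Articles ( Headline, Content )
    Key Index PK_Articles
    On ArticlesCatalog
    With Change_Tracking Auto

-- Find articles whose content contains the exact phrase.
Select Id, Headline
From dbo.Articles
Where Contains( Content, N'"database design"' )
```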
I think concerns about "empty space" are probably premature optimization. The effect on performance will be nil. This is a case where a small amount of denormalizing costs you nothing in terms of performance and gains you in terms of development. However, if it really concerned you, you could move the address information into a 1:1 table like so:
Create Table dbo.AuthorAddresses
(
AuthorId int not null Primary Key Clustered
, Street nvarchar(max) not null
, City nvarchar(40) not null
...
, Constraint FK_AuthorAddresses_Authors
Foreign Key ( AuthorId )
References dbo.Authors( Id )
)
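With the address split out like this, reading an author together with their address needs a join. A minimal sketch, with a made-up username:

```sql
-- Left Join keeps authors that have no row in AuthorAddresses.
Select a.Id, a.Name, aa.Street, aa.City
From dbo.Authors As a
    Left Join dbo.AuthorAddresses As aa
        On aa.AuthorId = a.Id
Where a.Username = N'acme'  -- hypothetical username
```

Saving an author likewise becomes two statements (one per table), ideally wrapped in a transaction.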
This will add a small amount of complexity to your middle tier. As always, the question is whether the benefit of eliminating some empty space outweighs the cost in terms of coding and testing. Whether you store this information as columns in your Authors table or in a separate table, the effect on performance will be nil.