views:

586

answers:

8

I have a table which has two varchar(Max) columns

Column 1      Column 2
-----------------------
URLRewitten   OriginalURL

its part of my url re-writing for an asp.net webforms site.

when a url comes in I do a check to see if its in the table if it is i use the OriginalURL.

My question is, if all I'm doing is querying the table for urls and no other table in the database will ever link to this table does it need a dedicated primary key field? like an auto-number? will this make queries faster?

and also how can I make the query's run as faster?

Edit: I do have a unique constraint on URLRewitten.

Edit: ways i'm using this table..

  • Query when a new Request comes in.. search on URLRewitten to find OriginalURL
  • When needing to display a link on the site, i query on the OriginalURL to find the URLRewitten url i should use.
  • When adding a new url to the table i make sure that it doesn't already exist.

thats all the querys i do.. at the moment.

Both columns together would be unique.

+1  A: 

You don't NEED one, but it is best to have one in your table for future scalability. Even though you said you won't expand the usage of the DB further, it's good to be prepared.

FiberOptick
Maybe specify how a primary key would help scalability? Even partitioning does not require one.
Andomar
+12  A: 

Do you need a primary key? Yes. Always. However, it looks like in your case OriginalURL could be your primary key (I'm assuming that there wouldn't be more than one value for URLRewritten for a given value in OriginalURL).

This is what's known as a "natural key" (where a component of the data itself is, by its nature, unique). These can be convenient, though I have found that they're generally more trouble than they're worth under most circumstances, so yes, I would recommend some sort of opaque key (meaning a key that has no relation to the data in the row, other than to identify a single row). Whether or not you want an autonumber is up to you. It's certainly convenient, though identity columns come with their own set of advantages and disadvantages.

For now I suppose I would advise creating two things:

  1. A primary key on your table of an identity column
  2. A unique constraint on OriginalURL to enforce data integrity.
Adam Robinson
-1 primary keys are not required
Andomar
+1 Never, EVER make a table without a primary key. End of story.
Adam McKee
@Andomar: I hope any prospective employer reads that statement. There is no excuse not to have a primary key of SOME kind on any permanent table.
Adam Robinson
Primary keys add overhead, which can hurt if you're logging hundreds of page referrals per second. Same for a unique constraint.
Andomar
I don't believe in natural primary keys. In Poland we have unique id number (something like SSN in US) and there are few people with the same id :)
Grzegorz Gierlik
@Grzegorz: I agree 100%. Natural keys are problems waiting to happen.
Adam Robinson
@Andomar: Overhead for a primary key? I'll grant you the unique constraint, but the overhead for a primary key would be immeasurably small (assuming that it's an opaque key, and not something that's potentially going to cause the table to be reorganized on an insert or an update). Again, never an excuse not to have a primary key in some form or fashion.
Adam Robinson
Andomar's first comment is absolutely right. He just says, that generally speaking a primary key is not required, which is correct. You should have a good reason to leave it out, though.
rudolfson
@rudolf: I don't think anyone here is under the impression that primary keys are a technical requirement and that the application or database will not function without one, so I'm not seeing any other way to take the statement other than "you don't always need to add a primary key".
Adam Robinson
+1 Agreed, always have a PK
Otávio Décio
Agree with @rudolfson. @Adam Robinson: a primary key on an identity column is small overhead indeed. But a primary key on an URL would add the same overhead as a unique constraint.
Andomar
@Andomar: We're only talking in generalities here. If we're referring to his specific scenario, then that won't matter because this is a static lookup table. It isn't a log, and isn't going to be highly transactional.
Adam Robinson
I go one step further and say that (almost) all permanent tables should have both an artificial surrogate key and a unique constraint. Which one you make clustered depends on the usage. Consider the simple case of a code/lookup table -- you don't want to have the same item twice, EG two rows in a state table with Texas as the state name or TX as the abbreviation.
JohnOpincar
In the logical model, a relation should have at least one candidate key. In the physical model, a relation is implemented as a table and every candidate key should be implemented with something that will ensure uniqueness: could be a PRIMARY KEY, could be a UNIQUE, could even be a trigger (a valid-state 'history' table is a common example of a table whose required sequenced key cannot be implemented using a PRIMARY KEY constraint). In a SQL product which uses contiguous storage, every table should have a clustered index.
onedaywhen
+1  A: 

An identity column can help when you search for recent events:

select top 100 * from table order by idcolumn desc

We'd have to know what kind of queries you are running, before we can search for a way to make them faster.

Andomar
+5  A: 

I'd put one in there anyway... it'll make updating alot easier or duplicating an existing rule...

i.e. this is easier

UPDATE Rules SET OriginalURL = 'http://www.domain.com' WHERE ID = 1

--OR

INSERT INTO Rules SELECT OriginalUrl, NewUrl FROM Rules WHERE ID = 1

Than this

this is easier

UPDATE Rules SET OriginalURL = "http://www.domain.com" WHERE OriginalURL = 'http://old.domain.com'

--OR

INSERT INTO Rules SELECT OriginalUrl, NewUrl FROM Rules WHERE OriginalURL = 'http://old.domain.com'

In terms of performance, if your going to be searching by OriginalURL, you should add an index to that column,

Eoin Campbell
The question specifies there is no need for updates: "if all I'm doing is querying the table for urls"
Andomar
There is always a need for updates. What if he has a typo in one of his URLs and notices a moment after he has inserted. I would assume he meant "Won't need to be regularly updated"
Eoin Campbell
+1  A: 

As you are doing your query on the URLRewritten column I don't think adding an auto-generated primary key would help you.

Have you got an index on your URLRewritten column? If not, create one: that should see a big increase in the speed of your queries (perhaps just make URLRewritten your primay key?).

Ian Oxley
A: 

I don't think adding auto generated primary key will make your query faster.

However there are are a few things to consider:

  1. I would not be so sure, that never ever nothing will link to this table :(.
  2. I've seen a lot of people asking about how to i.e. remove duplicates from table like that -- with primary key it is much easier.
  3. To make this query faster we need to know more about this table and ways of using it...

In my opinion, every table, must have auto generated primary key (i.e. identity in MSSQL).

I don't believe in unique natural keys.

Grzegorz Gierlik
A: 

Yes there should be a Primary Key Because you can set INDEX on that Primary Key for Fast Access

Access Denied
+2  A: 

I would use the OriginalURL as your primary key as I would assume this is unique. Assuming your are using SQL-Server you could create an index on RewrittenURL with OrigionalURL as an "Included column" to speed up the performance of the query.

Joe Swan