I'm building a commenting mechanism into an application that allows a programmer/plugin author to implement comment threads in a simple manner.

I want to do this by giving each comment thread a unique string key. That takes the hard work away from the developer, who can place code like this anywhere in the application.

Let's say the programmer wants to add comments to an image upload plugin he calls "my images". In the code he can then call something like:

insertCommentThread('myimages:340');

Another developer might have a more complicated case and want to add comments to a wiki entry:

insertCommentThread('wiki-entry-page-name-it-could-be-long');

So the developer can call the threads any name they like.

I'm a bit worried about lookup speed if the keys become long, so I'd like to store the keys in some other format.

So my question is...

Is there a way to store a string key in some encoded form so that it becomes smaller and faster to look up?

(I could hash the strings, but then readability of the database gets lost...)

Btw, I'm using MySQL.

A: 

I think you would want this column to be a unique, indexed field, not the primary key.

Fosco
This way, you would rarely be using it as a lookup... (If my assumptions are correct)
Fosco
sorry, that's what I meant. will edit :P
arnorhs
corrected. thanks for the help
arnorhs
Since you made the change... is there an actual issue or just a theoretical concern? Are you having speed issues or are you just planning for something which may or may not become an issue?
Fosco
A: 

Why do you need to make your search string a primary key?

I would use a numeric primary key for speed, and a separate unique lookup field for the long string.

You will most likely have to do a duplicate check before you insert the record anyway, and find a substitute if the check fails. I'm not sure how much MySQL's UNIQUE constraint will help you here.
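
A minimal sketch of the "numeric primary key plus unique lookup field" idea, for illustration only. It assumes Python's built-in sqlite3 as a stand-in for MySQL so it runs anywhere, and the table name, column names and the get_or_create_thread helper are all hypothetical. It lets the unique constraint do the duplicate check (SQLite spells this INSERT OR IGNORE; MySQL has INSERT IGNORE):

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")

# id is the short numeric key other tables would reference;
# thread_key is the readable string chosen by the plugin author.
conn.execute(
    """
    CREATE TABLE comment_thread (
        id         INTEGER PRIMARY KEY,
        thread_key VARCHAR(255) NOT NULL UNIQUE
    )
    """
)


def get_or_create_thread(conn, thread_key):
    """Return the numeric id for thread_key, creating the row on first use."""
    # The UNIQUE constraint rejects duplicates; OR IGNORE turns that
    # rejection into a no-op instead of an error.
    conn.execute(
        "INSERT OR IGNORE INTO comment_thread (thread_key) VALUES (?)",
        (thread_key,),
    )
    row = conn.execute(
        "SELECT id FROM comment_thread WHERE thread_key = ?",
        (thread_key,),
    ).fetchone()
    return row[0]


first = get_or_create_thread(conn, "myimages:340")
again = get_or_create_thread(conn, "myimages:340")
other = get_or_create_thread(conn, "wiki-entry-page-name-it-could-be-long")
```

Comments would then reference the numeric id, so the long string is only compared once per page load, when the thread is resolved.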

Unicron
accident. fixed
arnorhs
+1: Exactly. Plus, what if the title changes - how do you refer the comment to the correct thread?
OMG Ponies
A: 

I would suggest building the function with two parameters: one for the comment type and one for the unique comment name itself:

insertCommentThread('images', '340');
insertCommentThread('wiki', 'entry-page-name-it-could-be-long');

I would then design the table like this:

ID (int) - GROUP (varchar) - NAME (varchar)
PriKey: ID
Unique: (Group + Name)

This way you can limit your query to the module (comment type) currently loaded, and it is also possible to have the same comment name within different groups.
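
Roughly, the layout above could look like the sketch below. It is only an illustration under assumptions: Python's built-in sqlite3 stands in for MySQL, the column is called thread_grp because GROUP is a reserved word in SQL, and the table name and the insert_comment_thread helper are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")

# One composite UNIQUE constraint covers (group, name) together, so a single
# index serves lookups on both columns.
conn.execute(
    """
    CREATE TABLE comment_thread (
        id         INTEGER PRIMARY KEY,
        thread_grp VARCHAR(64)  NOT NULL,
        name       VARCHAR(255) NOT NULL,
        UNIQUE (thread_grp, name)
    )
    """
)


def insert_comment_thread(conn, group, name):
    """Create the thread if it is new and return its numeric id."""
    conn.execute(
        "INSERT OR IGNORE INTO comment_thread (thread_grp, name) VALUES (?, ?)",
        (group, name),
    )
    row = conn.execute(
        "SELECT id FROM comment_thread WHERE thread_grp = ? AND name = ?",
        (group, name),
    ).fetchone()
    return row[0]


images_id = insert_comment_thread(conn, "images", "340")
wiki_id = insert_comment_thread(conn, "wiki", "entry-page-name-it-could-be-long")
# The same name is allowed again because it lives in a different group.
wiki_340_id = insert_comment_thread(conn, "wiki", "340")

# Limiting a query to the module currently loaded.
wiki_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM comment_thread WHERE thread_grp = ? ORDER BY name",
        ("wiki",),
    )
]
```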

JochenJung
will it not be slower to query two separate varchar keys instead of a single one, when you've inserted a lot of data?
arnorhs
When you have an index over both varchar keys, this should not matter.
JochenJung
A: 

You can create a hashed index on a column without leaving the column unreadable; it is the index that is hashed, not the data. That would seem to be the way to go if you don't want to search on ranges.
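
One caveat, as far as I know: MySQL only honours USING HASH on MEMORY and NDB tables, and InnoDB and MyISAM build a B-tree index regardless. A common workaround, which is a different technique from a true hash index, is to store a small integer hash next to the readable key and index that instead. The sketch below illustrates that workaround under assumptions: Python's built-in sqlite3 stands in for MySQL, CRC32 is just one possible hash, and the table, columns and helper functions are hypothetical.

```python
import sqlite3
import zlib


def key_hash(thread_key):
    """32-bit CRC of the key; collisions are possible, so it is never used alone."""
    return zlib.crc32(thread_key.encode("utf-8"))


# Assumption: SQLite stands in for MySQL; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE comment_thread (
        id         INTEGER PRIMARY KEY,
        key_crc    INTEGER NOT NULL,
        thread_key VARCHAR(255) NOT NULL
    )
    """
)
# Only the small integer column is indexed; the readable key stays as-is.
conn.execute("CREATE INDEX idx_key_crc ON comment_thread (key_crc)")


def add_thread(conn, thread_key):
    """Insert a thread and return its numeric id."""
    cur = conn.execute(
        "INSERT INTO comment_thread (key_crc, thread_key) VALUES (?, ?)",
        (key_hash(thread_key), thread_key),
    )
    return cur.lastrowid


def find_thread(conn, thread_key):
    """Look up by hash first, then compare the full key to rule out collisions."""
    row = conn.execute(
        "SELECT id FROM comment_thread WHERE key_crc = ? AND thread_key = ?",
        (key_hash(thread_key), thread_key),
    ).fetchone()
    return row[0] if row else None


wiki_id = add_thread(conn, "wiki-entry-page-name-it-could-be-long")
found_id = find_thread(conn, "wiki-entry-page-name-it-could-be-long")
missing_id = find_thread(conn, "no-such-thread")
```

The data stays readable in the table, and the trade-off is that this kind of lookup only supports equality, not ranges or prefix searches.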

Brian Hooper