views:

100

answers:

3

Hello everybody!

I am not an expert of MySQL, and I have search a lot about the following problem without finding the solution.

So, I'm using a MySQL table with this structure:

CREATE TABLE photos (
  file varchar(30) NOT NULL default "",
  place tinytext NOT NULL default "",
  description tinytext NOT NULL default "",
  type char(1) NOT NULL default "",
  taken year NOT NULL default "0000",
  modified tinyint NOT NULL default 0,
  PRIMARY KEY (file)
);

The first parameter is the relative path to a file. It is used as a primary key.

Currently, the rows are not sorted (which means that the order I observe in phpMyAdmin, by default, is the order in which the items were inserted in the table).

As 99% of the accesses to that table are SELECT (INSERT and UPDATE are rare in my program), I suppose that I should add an INDEX to the column 'file' (almost all the SELECT use only this column).

However, my question is "wouldn't it be more efficient to always store the rows by alphabetical order of 'file', instead of creating an index?" and so, "is it possible to inform MySQL that the data is sorted in that order and force it to keep that order when inserting elements in the table?".

I'm not sure my questions really make sense, because maybe this is just a particular case in which there is already a kind of index in the table... But I would really like to have an answer...

Thanks in advance!

+3  A: 

If file is your primary key, it will almost certainly already have an index for it. That's how primary keys work. You don't have to specify another index for that particular column.

And, in response to your question as to whether you can store the rows sorted in the DBMS, you can't. SQL is a relational algebra that will extract rows in any order it wants to, unless you specifically use an order by clause.

When you execute the query:

select * from photos;

there's no guarantee in what order the rows will be delivered to you. On the other hand:

select * from photos order by file;

does guarantee the order, and the file primary key index will be used to extract records in that order. There should be no sorting or other performance penalty since the primary key index is built for ordered access to the rows.

You're best to think of a primary key as just a special type of index. You can certainly index other columns if you think it's necessary (although it sounds like that's not the case here).

paxdiablo
+1  A: 

However, my question is "wouldn't it be more efficient to always store the rows by alphabetical order of 'file', instead of creating an index?" and so, "is it possible to inform MySQL that the data is sorted in that order and force it to keep that order when inserting elements in the table?".

That is called clustered index on MSSQL parlance. Though doing ORDER BY on file field(that has clustered index) shall speed up your query, inserting records could slow down due to physical re-ordering(see page-splitting) of records; as incoming file values are not sequential in nature. Clustered index is ideal on auto-incremented primary key or any id which are entered sequentially in the database. Or perhaps you could increase the page size of your clustered index on your file field so page-splitting won't occur much. There could be only one clustered index per table for reason that is obvious.

[EDIT: November 23, 2009 5:55 PM CN]

Hi Peter, Mysql has clustered index. Mysql, though it supports clustered index, it can only do so on primary key. If your filename field is really a primary key then you can flag clustered index on it. Clustered index on MySQL is for primary key only, while MSSQL will allow you on any field. Clustered index other than primary key is useful, for example contacts database; though account id (some sort of internal number that is not user-facing) is the primary key, it's more prudent to assign clustered index on contact name, because that's how you frequently present the contacts list to users, think iPhone or any smartphone contact database, or CRM, etc, they frequently list the contacts using name, not their telephone number nor their internal id(primary key(think guid or some integer identity))

Michael Buen
That's great, if the OP was using SQL Server...
OMG Ponies
@OMG Ponies: First: What's wrong in knowing about clustered index even if you use RDBMS that doesn't support them? OP apparently needs some help with basics of relational databases. Second: While MyISAM engine doesn't support clustered indices at all, InnoDB organizes *all* tables as B-Tree based clustered index.
Tomek Szpakowicz
Thank you Michael! I will try clustered index. I indeed need it on my primary key!
Peter Potrowl
A: 

The short answer: You shouldn't care.

In pure SQL, as pax has noted, there is no notion of how the data is stored. If you want to use a column for access, then specify an index - which will be almost always automatically done if you specify a Primary key.

There are optimizations, but this will Be Limited to special circumstances that you shouldn't Be concerned with unless you are a DBA.

IronGoofy