views:

64

answers:

4

I was a MySQL user. Now I'm migrating to SQL Server. But I have a problem. I can not find any way for specifiying the kind of index a table has. In MySQL I could easily say to build a BTree Index or Hash Index. How can I do that here?

The main problem is that I have two tables. one of them (named "posts") has a foreign key to the other (named "users") which has a primary key constraint on "id". In my java program for inserting posts I have to check whether the user of this post has been inserted or not and if not, insert it (I can not insert all users first!).

This code was inserting about 1000 posts in each 10 seconds in MySQL. But In SQL Server the search part takes too much time and 1000 posts takes more than 1 minute.

This is the slow SQL query:

select * from users u where u.id = "UserName"

This is the user table:

CREATE TABLE [dbo].[users](
[id] [varchar](50) NOT NULL,
[type] [char](1) NULL,
[name] [nvarchar](150) NULL,
[reserved] [char](8) NULL,
[description] [nvarchar](4000) NULL,
[text] [ntext] NULL,
 CONSTRAINT [PK__users__3213E83F00551192] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

What's the problem?

Thank you.

+1  A: 

SQL Server does not support hash indexes. The list of available indexes can be found here.

If you are trying to insert thousands of records per second it sounds like you would be better off using a BULK INSERT. Ideally you would do this:

  • Bulk insert all users.
  • Bulk insert all messages.

Given your requirement to not insert all the users first I would suggest something like this:

  • Remove your foreign key constraint.
  • Bulk insert all the messages.
  • Query to find which users are missing and insert them (this can be done in one statement).
  • Add the constraint again.
Mark Byers
It sounds rational.
Shayan
+2  A: 

In MySQL I could easily say to build a BTree Index or Hash Index.

Indexes are not ANSI standard, the similarity between database vendors is amazing for that fact. Additionally, MySQL and SQL Server are more similar than other pairs... While SQL Server doesn't allow you to specify indexes as BTREE or HASH, there are other controls exposed that are not provided by MySQL.

You'll have to post your query for us to comment on how it might be optimized. Assuming the following is what you're referring to:

select * from users u where u.id = "UserName"
  1. SELECT * is only valid if you are using the data from all the columns returned. Columns containing large amounts of text can negatively impact query performance. Ideally, never use SELECT * - always be explicit about what columns are being used. There's numerous questions on SO about SELECT * performance and how it should be avoided.
  2. Storing a string as the id value will not be as fast as if it were an INT
  3. Use single quotes, not double, to indicate a string in SQL
OMG Ponies
select * from users u where u.id = "UserName"
Shayan
But this query was very fast in MySQL
Shayan
@Shayan: MySQL is an entirely different database - at a high level you're comparing apples to apples, but it still really like comparing Macintosh apples to Crab apples.
OMG Ponies
+1  A: 

The first way you should create an index is to create your primary key. This will have an associated unique index. The u.id column may be a candidate for the primary key if u.id contains unique values.

If u.id is not unique, or you don't want it to be the primary key, you can create an index on it with the following:

CREATE INDEX ix_users_id
ON users (id)
;

That will probably give you better performance on the query you show.

Of course, as others indicated, you will need to do more than this to fully design the index strategy.

bobs
I forgot to write. I have a primary key constraint on id.
Shayan
By default, SQL Server defines a CLUSTERED index on the primary key unless a CLUSTERED key has already been defined for the table. A CLUSTERED index doesn't necessarily mean it's attached to the primary key column(s), but it's a likely possibility.
OMG Ponies
There is a clustered index on it.
Shayan
Agreed. Identifying the CLUSTERED index would be appropriate as well, particularly where you (@OMG Ponies) mentioned that all columns are returned. I avoided that part of the issue in my answer.
bobs
A: 

Are you sure the query in your post is exactly the query you're running? Could it be by any chance you're running select * from users u where u.id = @userName?

An extreamly common problem in application development is pasisng in Unicode parameters for seeking ASCII columns. This can sneak into the code by a simple mistype in the mapping type in NHibernate, or a parmeter added with SqlCommand.Parameters.AddWithValue("@userName", Request["userName"]); or somehtig similar. The result is that the query will do a scan instead of the a seek because of the rules of SQL data type precendence.

I'm not saying this is your problem. I'm saying is that there could be all sort of little things like this that may had laded you on a wrong side of a query plan. Luckly, there are tools to troubleshot and investigate any problem. Here is a good starter: Troubleshooting Performance Problems in SQL Server 2005.

As a side note, BTree indexes are blazingly fast, you don't need Hash indexes ;)

Remus Rusanu