views:

56

answers:

2

Hi ,

I have a table as below

dbo.UserLogs

-------------------------------------
Id | UserId |Date | Name| P1 | Dirty
-------------------------------------

There can be several records per userId[even in millions]

I have clustered index on Date column and query this table very frequently in time ranges. The column 'Dirty' is non-nullable and can take either 0 or 1 only so I have no indexes on 'Dirty'

I have several millions of records in this table and in one particular case in my application i need to query this table to get all UserId that have at least one record that is marked dirty.

I tried this query - select distinct(UserId) from UserLogs where Dirty=1

I have 10 million records in total and this takes like 10min to run and i want this to run much faster than this.

[i am able to query this table on date column in less than a minute.]

Any comments/suggestion are welcome.

my env 64bit,sybase15.0.3,Linux

+2  A: 

my suggestion would be to reduce the amount of data that needs to be queried by "archiving" log entries to an archive table in suitable intervals.

You can still access all entries if you provide a union-view over current and archived log data, but accessing current logs would be much reduced.

lexu
thanks for the suggestion..i did consider this before but atm i am looking at other possible options
daedlus
+1  A: 

Add an index containing both the UserId and Dirty fields. Put UserId before Dirty in the index as it has more unique values.

Guffa