This could turn out to be the dumbest question ever.

I want to track groups and group members via SQL.

Let's say I have 3 groups and 6 people.

I could have a table such as:

[image not preserved: a table of GroupID / PersonID rows]

Then if I wanted to find which PersonIDs are in GroupID 1, I would just do

select * from Table where GroupID=1

(Everyone knows that)

My problem is that millions of rows get added to this table, and I would like it to do some sort of presorting by GroupID to make lookups as fast as possible. I'm thinking of a scenario with nested tables, where each sub-table would contain one GroupID's members (illustrated below).

[image not preserved: one nested sub-table of members per GroupID]

This way, when I wanted to select each group's members, the structure in SQL would already be nested, and a lookup would not be as expensive as trawling through rows.

Does such a structure exist: in essence, a table that would pivot around the GroupID? Or is indexing the table on GroupID the best/only option?

+1  A: 

If the inserts will typically be incremental (in other words, each row you add will typically have a groupid + personid greater than the last row's), you can create a clustered index on groupid + personid. That makes SQL physically store the rows in that order, which makes a lookup on that key very fast.
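A rough, runnable sketch of the idea, not a definitive implementation. It uses Python's built-in sqlite3 module only so it can be run anywhere; SQLite has no CREATE CLUSTERED INDEX, so a WITHOUT ROWID table with a composite primary key stands in for the clustered index here. The table name GroupMembers and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# WITHOUT ROWID makes SQLite store the rows inside the primary-key B-tree,
# i.e. ordered by (GroupID, PersonID). This is only an analogue of a
# clustered index on (GroupID, PersonID), chosen so the sketch is runnable.
conn.execute(
    """
    CREATE TABLE GroupMembers (
        GroupID  INTEGER NOT NULL,
        PersonID INTEGER NOT NULL,
        PRIMARY KEY (GroupID, PersonID)
    ) WITHOUT ROWID
    """
)

# Made-up sample data: 3 groups, 6 people, inserted out of order on purpose.
rows = [(3, 6), (1, 2), (2, 4), (1, 1), (3, 5), (2, 3)]
conn.executemany(
    "INSERT INTO GroupMembers (GroupID, PersonID) VALUES (?, ?)", rows
)

query = "SELECT PersonID FROM GroupMembers WHERE GroupID = ? ORDER BY PersonID"
members = [r[0] for r in conn.execute(query, (1,))]

# The last column of EXPLAIN QUERY PLAN describes how the rows are located.
plan = [r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))]

print(members)
print(plan)
```

The plan should report a search on the primary key rather than a full scan, which is the "presorted by group" behaviour the question asks about. On SQL Server the equivalent would be a clustered index on the same two columns.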

Josh Einstein
Sounds like what I need -- will try it out.
Matt
+2  A: 

Perhaps you see it otherwise at the moment, but what you are asking for is nothing other than an index on GroupId. There are many more shades of gray, though: a lot depends on how you plan to use the table (the actual queries you're going to run) and on the cardinality of the expected data.

  • Should the table be clustered by (PersonId) with a nonclustered index on (GroupId)?
  • Should it be clustered by (GroupId, PersonId) with a nonclustered index on (PersonId)?
  • Or should it be clustered by (PersonId, GroupId) with a nonclustered index on (GroupId, PersonId)?
  • ...

All are valid choices, depending on your requirements, and the choice you make is pretty much going to make or break your application.
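To make one of these options concrete, here is a minimal sketch of the third layout (clustered by person, with a secondary index by group). It is an assumption-laden illustration: it runs on SQLite through Python's sqlite3 module, where a WITHOUT ROWID table stands in for the clustered index, and the names GroupMembers, IX_Group_Person and access_path, as well as the sample data, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Rows are stored in (PersonID, GroupID) order: the "clustered" key.
    CREATE TABLE GroupMembers (
        PersonID INTEGER NOT NULL,
        GroupID  INTEGER NOT NULL,
        PRIMARY KEY (PersonID, GroupID)
    ) WITHOUT ROWID;

    -- Secondary (nonclustered) index for lookups in the other direction.
    CREATE INDEX IX_Group_Person ON GroupMembers (GroupID, PersonID);
    """
)

# Made-up sample data as (PersonID, GroupID) pairs.
conn.executemany(
    "INSERT INTO GroupMembers (PersonID, GroupID) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3)],
)


def access_path(sql):
    """Return the planner's description of how it locates rows for sql."""
    return " | ".join(r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))


by_group = access_path("SELECT PersonID FROM GroupMembers WHERE GroupID = 1")
by_person = access_path("SELECT GroupID FROM GroupMembers WHERE PersonID = 1")

print(by_group)
print(by_person)
```

With this layout, "who is in group X" should be answered from the secondary index and "which groups is person Y in" from the primary key. Swapping the two keys gives the second option in the list; which one is better depends on which query dominates and on the insert pattern.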

Approaching this problem from the point of view of what EF or another ORM layer gives you will likely result in a bad database design. Ultimately your whole app, as fancy and carefully coded as it is, is nothing but a thin shell around the database. Consider approaching this from a sound data modeling point of view: create a good table schema design, and then write your code on top of it, not the other way around. I understand this goes against everything the preachers on the street recommend today, but I've seen too many applications designed in the various Visual Studio data context editors fail in deployment...

Remus Rusanu
"Approaching this problem from the point of view of what EF or other ORM layer gives you will likely result in a bad database design. Ultimately your whole app, as fancy and carefully coded as it is, is nothing but a thin shell around the database." I love this statement.
HLGEM