tags:

views:

274

answers:

3

I have several entities that I require users be able to add custom fields to.

If I had an entity called customer with base variables like {Name, DateOfBirth, StoreId} and another one called Store with {Name}

Then I would want it so that the owner of that store could login and add a new variable for all their customers called favourite colour which is a dropdown with red, green or blue as options.

Now I have had a look at EAV and come up with a solution that looks like this

Attribute {StoreId, Name, DataType}, Value {AttributeId, EntityName, EntityId, Value}

I'm wondering is there some solution that will work best for SQL Server 2008 especially given that I'll want to be able to view and query this information easily.

I've heard that you can query within the xml datatype. Is that a better way to go?

I will also probably want users to be able to add custom fields that are foreign keys at some point too.

Will be looking at this all day so will ask questions quickly.

+6  A: 

EAV in general is an anti-pattern that results in dismal performance and chokes the scalability. Now if you decide to go with EAV, the SQL Server Customer Advisory Team has published a white paper with common pitfalls and problems and how to avoid them: Best Practices for Semantic Data Modeling for Performance and Scalability.

Querying an XML data type is possible in SQL, but if your XML has no schema then querying it it will be slow. If it has a schema and the schema is EAV, then it will have all the problems of relational EAV plus some of its own for XML performance. again the good folks of the CAT team have published a couple of white papers on the topic: XML Best Practices for Microsoft SQL Server 2005 and Performance Optimizations for the XML Data Type in SQL Server 2005. They are valid for SQL 2008 too.

Remus Rusanu
+4  A: 

I've been using the XML features of SQL 2005 / 2008 for a while. I've come to rely on XML columns quite a bit. What you want to do sounds like the perfect candidate for XML. For instance, the following snippet defines your 2 entities (@customers and @stores), with a column called "attrs" that can be expanded to include more attributes. I hope this helps!

declare @customers as table ( id int, attrs xml);
INSERT INTO @customers VALUES 
    (1,'<Attrs Name="Peter" DateOfBirth="1996-01-25" StoreId="10" />'),
    (2,'<Attrs Name="Smith" DateOfBirth="1993-05-02" StoreId="20" />')
;
declare @stores as table ( id int, attrs xml);
insert into @stores VALUES
    (10, '<Attrs Name="Store1" />'),
    (20, '<Attrs Name="Store2" />')
;
With c as (
    select id as CustomerID, 
     attrs.value('(/Attrs[1])/@Name', 'nvarchar(100)') as Name,
     attrs.value('(/Attrs[1])/@DateOfBirth', 'date') as DateOfBirth,
     attrs.value('(/Attrs[1])/@StoreId', 'int') as StoreId
    from @customers
), s as (
    select id as StoreID, 
     attrs.value('(/Attrs[1])/@Name', 'nvarchar(100)') as Name
    from @stores
)
select * 
from c left outer join s on (c.StoreId=s.StoreID);
Nestor
+2  A: 

Excellent answers already. I'll only add the suggestion that you maintain metadata for the custom fields as well. This would make a UI for entering the custom fields easier - you'd be able to limit the set of custom fields for a Customer, for instance, and to specify that DateOfBirth is to be a date, and that StoreID is meant to match the ID of an actual store.

Some of this metadata could be maintained as XML schemas. I've seen that done, with the schemas stored in the database, and used to validate custom fields being input. I do not know if those schemas can also be used to strongly-type the XML data.

John Saunders