views:

265

answers:

6

I apologise for the long-winded, rantyness of this but I'm up at 3am dreading having to go to work tomorrow and deal with this database... it just feels wrong but maybe I'm wrong and just like things done my way... Please tell me what you think.

Our database schema looks something like:

page: id, label, contentid, parentpageid

content: id, xmldata

We have about 4000 entries in the page table...

I have included a vastly simplified page hierarchy which looks something like the code listing at the bottom of this post.

The source of the problem is the fact the database is fraught with things like this:

  • For every single 'page' with 'label' SET (there are 3) there are 16 children, each labeled with SECTION numbers.
  • Every single SECTION has a child with the name 'About' and a child with the name 'Info'
  • Every single 'About' page has a set of children labeled: 'About Things', 'About Stuff' and 'About Other Stuff'

Note although the children are labeled the same, they will likely have different (but similar) content.

I have a few major problems with this schema, considering the data, and I'd like to know if I'm justified, and if there's a good solution or something I should read.

The main problem is: There are a -lot- of duplicate entries and duplicate hierarchies.

edit: Another issue is, if a page has children, it's contentid is ignored. Wasted space?

I'm also having trouble with the fact that it's hard to figure out WHERE I am in the tree... what type of object am I looking at?

For example, the TASK nodes need to be displayed differently than other nodes... and the only way to figure out if the node is a task, is if it's parent's label is "Tasks" (or "Jobs" or "ToDo"). I tried using the 'depth' of the node, but this doesn't work because sometimes other items requiring a different display on the same depth.

The solution proposed was the addition of a "Type" field... and then the UI decides how to display a node (eg 'Nav' nodes go in the main tab bar (eg About) and the Task type nodes a dropdown in the sidebar), but again, this doesn't feel right either.

Also, currently there is NO method to sort the items. We've just been lucky so far that the data went into the database in the right order they want to display it, and there have been no deletions so far.

They want to add a 'sort order' field to fix this. it seems like a lot of manual work to set up, especially if they ever want to reorder a specific menu set... (eg insert a new child to every single About node), although I guess a simple script could be written do it... though my argument is that i should be able to change it in one location and then it's just done.

I've never seen a database set up like this, but they claim, this is how all databases are set up. It's what they are for. Am i going crazy or is this a ridiculous way to set up any database with this much duplicated hierarchy (also keep in mind the below hierarchy has many less duplicate entries and duplicate trees than the real page hierarchy).

Honestly, I'd like to suggest a better solution but I'm not really sure what that is because in the real hierarchy there are 8 different types of 'About' style nodes, some of which contain more nodes with more children... but the children's labels and ordering are always in the same pattern. Do I need to create a table for each of these types of pages?

I've noticed this also appears to be a common problem for websites, where they might have a bunch of pages, all with the exact same set of children... but no way to say: ok i want this page to inherit this set of children, but i want them to all have different content. now I want to rearrange the children in one location and have them all update. Is there a robust, simple way to solve this?

SET1
    SECTION1
        -About 
             -About Things
             -About Stuff
             -About Other Stuff
        -Info
        -Tasks
             -A
             -B
             -C
              ...
             -H
        -Data
    SECTION2
        -About
             -About Things
             -About Stuff
             -About Other Stuff
        -Info
        -Tasks
             -A
             -B
             -C
              ...
             -H
        -Data
...
    SECTION16

SET2
    SECTION1
        -About
             -About Things
             -About Stuff
             -About Other Stuff
        -Info
        -Jobs
             -1
             -2
             -3
              ...
             -8

        -ToDo
             -A
             -B
             -C
              ...
             -H
        -Other Data
    SECTION2
        -About
             -About Things
             -About Stuff
             -About Other Stuff
        -Info
        -Jobs
             -1
             -2
             -3
              ...
             -8

        -ToDo
             -A
             -B
             -C
              ...
             -H
        -Other Data
SET3 (Exact same setup as SET2)
       ...

THANKS!

+1  A: 

What database and web development platform are you using (e.g. SQL Server with ASP.net)?

SidC
mysql, php, flex/flash... though I think I'd be able to change server technologies if there was a good solution
secoif
Are you familiar with ASP.net master pages and web.sitemap files? Master pages allow you to provide the same page layout and apply different content to each page. Thus, all your Section 1 pages can have same master page while all your Section 2 pages can have a separate master page. Web.sitemap file defines the page layout of the site similar to your example.
SidC
So these are like templates eh? I'd like to be able to insert a template into a template into a template, etc.... possible? So if i update one section, it will update that one section in all templates
secoif
Nested master pages are allowed in ASP.net 3.5. Please see msdn.microsoft.com and search for ASP.net nested master pages for details.
SidC
+1  A: 

I think you're beating yourself to death with the content.
If no ER-diagram of the database exists, draw one up - you seem to have a reasonable grasp of the content already, and normalize it.
Using this ERD you should be able to resolve most of the problems you mention above quite easily.

slashmais
Yeah that's a good idea but I have a suspicion coworkers will balk at the 'added complexity' that comes with using more than one table. :/
secoif
As the other comments pointed out, it's the only sane solution. If your co-workers cannot grasp that, maybe you should try to find other co-workers elsewhere.
slashmais
A: 

You could create your tables like so:

MenuItem: Id, Name
MenuItemContent: Id, Content
MenuItemHierarchy: ParentMenuItemId, MenuItemId, ContentId, SortKey

Normally, though, I do:

MenuItem: Id, Name, ContentId
MenuItemContent: Id, Content
MenuItemHierarchy: ParentMenuItemId, MenuItemId, SortKey

And not re-use my menu items to display different content.

Jacob G
+1  A: 

secoif there is a reason why they call it relational database. Without multiple tables and using joins then what is the point of using a database, might as well use a flat file. Your so called "other employees" can point and laugh all they want but the true power and effectiveness of a database driven application is when you begin using joins and normalizing your data.

JonH
Ha, wait, the kicker on that point, is that they manually run a script that renders all the navigation data to an xml file, then we're loading that xml into our flex app and using that... "To save hits to the DB" and also to make the application 'snappy'... though I don't think they have really thought it through, as any saving they make will be redundant: each time the user clicks they are going to have to query the db anyway to get the content, unless they want to store 400 pages of content in the xml too...
secoif
You poor soul..I can't say the company I work at is any better when it comes to such things. I love having to defend database driven apps more then excel "databases" *sigh*
JonH
+1  A: 

Just a few thoughts regarding some of the things you stated :

"The solution proposed was the addition of a "Type" field... "

If there is an apparent need for some of your business things to be 'typed' because somewhere somehow there exists some relevant distinction between obejcts of such distinct "types", then from a relational database persepective that is a key indication that you want/need to set up multiple tables (one for each type), AT LEAST at the logical level.

How you organise that PHYSICALLY, is a different matter, and it is a sad but true reality that SQL fails to make a proper distinction between logical/physical, and it is an even sadder and truer reality that because of SQL's failure on that level, lots of developers don't have a clue on that level either.

"Also, currently there is NO method to sort the items."

Keep in mind that at the logical level, relational database do not know any concept of ordering. Ordering is a presentation issue. The dbms/database is only implicated in that its physical design characteristics (indexes, notably) CAN be used to provide the queried items ordering that the client/user asks for. Whether or not such indexes are included in the design is a physical database design decision.

"They want to add a 'sort order' field to fix this."

Sounds anywhere between "pretty silly" and "totally insane" to me (depending on what this sort order field is supposed to contain).

"but they claim, this is how all databases are set up. It's what they are for."

Sounds like fundamental ignorance to me. But I only read your part of the story, which is possibly biased.

"the below hierarchy has many less duplicate entries and duplicate trees than the real page hierarchy".

I'll tell you "what databases are for" : they are for registering statements/assertions of fact. Each row in a database represents a statement of fact, which is believed to be true. And now for the duplicates : "If something is true, then saying it twice will not make it any truer." Copyright E.F.Codd. No database should ever hold any duplicates (that are intended to mean the same thing, of course).

Erwin Smout
+2  A: 

I think Erwin's analysis indicates that there is a whole lot of confusion amongst members of your group as to how a relational database is to be used. Judging by your schema, it seems like your'e basically re-creating a filesystem in your database. This is rather silly, your webserver already has one. I humbly suggest you need to rethink your design or consult with a site design consultant before going any further with your system. If you keep heading down the direction your'e doing right now, I think the project might be in trouble and might waste a lot of money/time.

I think you have a better grasp of the data and how it needs to be modelled, but you might have to go through a couple of iterations until you get it mostly nailed to fit your requirements and what you need out of a database-driven site system.

Best of luck on your database work!

sheepsimulator