Link to home
Start Free TrialLog in
Avatar of Da_Weasel
Da_WeaselFlag for United States of America

asked on

Database design for a dynamic category system

I am currently planning the development of a inhouse Document Managment System.
My orginal idea was this for the overall relationship of data:
Categories>--\
.......................>--SubCategories-->--Documents>--Files
Keywords>----------------------/

But then I thought "I don't want to be stuck with only 2 levels of categories, I would like the system to be much more flexible.  So I came up with this overall relationship concept:

Categories>--\
.......................>--Documents>--Files
Keywords>----/

Its not obvious in this due to my lack of ASCII art skills, but SubCategories was removed, and a new field was added to the Categories table called ParentID so that it looks like this:
ID     Name     Description     ParentID
1      Cat1     Cat Desc1       0
2      Cat2     Cat Desc2       0
3      Cat3     Cat Desc3       0
4      Cat4     Cat Desc4       0
5      Cat5     Cat Desc5       1
6      Cat6     Cat Desc6       1
7      Cat7     Cat Desc7       3
8      Cat8     Cat Desc8       9
9      Cat9     Cat Desc9       7

Categories records now have a relationship setup with each other.  Any category with a ParentID of 0 is a top level category, and all others refer to another category as its parent.  i.e. Cat1 is a top level, and Cat5, and Cat6 are subcategories of Cat1.
This at first seemed like my solution.  But as I began thinking about the various logic involved, and program flow I relized that this would require tons of code to maintain the integrity of the data.
Examples:
1. No parent level item (Categories, or Documents) can be deleted unless all of its children have first been removed.  This would be difficult to program given the dynamic aspect of the Categories.
2. It would take lots of code to keep Categories from having circular references.  This would cause confusion for the user and infinite loops for code that examines the child level items.

Can anyone think of a better database design for this, that still offers a dynamic category system with out using multiple tables for each category level?
Point will be given to the best idea, and additional point will be give to other who contribute ideas based on how useful they were.

Thanks...
Avatar of KC_Speedball
KC_Speedball

Its the way i did it, too. you have to climb from categorie to categorie when you are showing them but you can handle this
Put in a root cayegory with ID 0 ParentID null and make ParentID referencing ID. This will solve the problem about leaving orphan categories.

You can enforce a rule that the child ID must be greater than the parent ID to enforce no circular references.
This will only cause a problem if you move categories between paraents but ...
The code for both these rules isn't as complicated as you might think if you want to do it like that.

This is the usual way to model this sort of requirement.

Another variation is if you want subcategories to be linked to two parents then you would need:

ID Name Description

ParentID, ChildID

Avatar of Da_Weasel

ASKER

that root category idea is why i have to 0's in the parentID field for the first four.  Anything with a ParentID of zero is a root category.  (BTW, ID is a autonumber field)
I had thought about the idea of enforcing the rule: ParentID must be less than ID, but should a existing category need to be moved later, it would require renumbering the ID field.

Another idea I thought might help was to add a new field called Level which would store a number that represents the child/parent level.  i.e. root categories are 0 and any direct children are 1, thus when a category is clicked, you can select categories with parentID = [selected cat ID] WHERE level is [selected cat+1].  This would clear up most of the issues involved with dynamic categories and make the logic of the application much easier.  This also has a problem when it comes to moving categories.  Should it switch levels, then it would need to be updated to have a level = parent-level+1 and each of its children on down the line would need to be adjusted accordingly.

I will leave this open till tomorrow in hopes of drawning in more discussion, and then award points.

nigelrivett: what would be stored in the ChildID field?  I don't follow...

Thanks for the idea....
ASKER CERTIFIED SOLUTION
Avatar of nigelrivett
nigelrivett

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ok, i see what your are saying, setups a seperate table to maintain the Child/Parent relationship in.  Cool, I could then get all of the category info that I need with a simple JOIN.
So this would be a One-to-Many relationship.  One Parent has many children.
Looks like a solution to me ;), will award point...I think the rest of the data is ok, but if you have any ideas about the rest of it, i would love to hear it.
posted a new question regarding the UI for this, if your interested...
https://www.experts-exchange.com/questions/20534800/User-Interface-for-Document-Managment-System.html
I just relized that the ChildID field will never repeat a value, which mean that it will present the same data that is already available in the ID field of the Categories table.  So making this new table to hold parent/child relationships has only UN-Normalized the data.

Just a note for anyone that stumbles upon this, the above answer did not actually solve the problem.  I failed to throughly review the answer offered, before accepting it.
Yep as I said

Another variation is if you want subcategories to be linked to two parents then you would need:
Right...thats might prove interesting....I will work out the code to get the menu based interface (example below) for the current setup working first and then try this, and post my finding here.

Menu interface for Categories with each item being a link, and each SubCategory being indented, something like a simple tree view menu:

Main1
  SubMain1
  SubMain2
  SubMain3
    SubSub1
    SubSub2
  SubMain4
Main2
Main3
Main4
Main5

The difficult part was, figuring out that the currently open branch of Categories had to be built in reverse.  This required making a function that finds the TopLevelParent for any given category and also one to get a count of the number of levels a category is from its TopLevelParent.
I've got some code to return hierarchies here

www.nigelrivett.com
Retrieve Tree Hierarchy


There is also a lot more discussion at

www.sqlteam.com