DB design question: multi-level attribute boolean search implementation?
Posted on 2004-09-10
Okay, just for the information, the question is sort of DB independant, but the implementation will most probably be on MySQL.
So, there will be a database of documents, which each will have a number of properties and categories.
properties: TITLE, DESCRIPTION, EXPIRY_DATE, LANGUAGE, ...
categories: this document has let us say 3 levels of categories, like:
5 - Food - Textiles - Clothing - Watches - Household
5.18.33 Shoes, shoe parts
so it has :
category level 1 = 5,
category level 2 = 18,
category level 3 = 33
Document B is of different type. It has some of the properties the same as document A, but some are different:
properties: TITLE, DESCRIPTION, RECIPIENTS, ...
categories: this document uses a different category system, which has let's say 5 category levels:
3 - general document
3.1 - memo
A - important
A.1.B.2.F - secrecy level: high
so it has:
categories of level 1 = 3; A
category of level 2 = 1
category of level 5 = F
The second example is totally random, just to explain that category system can be different.
Also, of course, any document can be in more than one category per level (ie. doc B is in top level categories 3 _and_ A).
Typical queries might be:
give me all documents which are in category "C" in this classification system, or in any of subcategories of it OR ( documents that have EXPIRY_DATE more than year in future AND NOT have subcategory "5.18")
So, the questions are:
1. What would be a good general table design be for such document storage, that would allow cross-category search on arbitrary number of attributes and categories, considering that additional classification systems will be added later.
2. Any ideas about performance of such a design?
The query builder and parser in programming language (PHP) will be a separate pain, but that's something I've already done in past, so I could reimplement it and tweak a bit.
(Wanted to give 1000 points, but said max is 500, however, for a good and concise answer, I will give 1000 total somehow)