binary masks or other solution for query

Posted on 2009-12-26
Last Modified: 2012-05-08
Hello, I was wondering someone could help me with this problem:

A person has the desire to find products matching his needs on ingredients: vitamines, proteines, sodium. He selects these options using a website(php) out of a list of 500 ingredients, no use of db so far, the user isn't in the db.

There is a db with 3 products
Product 1 consists of vitamines, proteines and sodium
Product 2 consists of vitamines and sodium
Product 3 consists of vitamines, proteines, sodium and fat

How can I query that only products 1 and 2 are in the result?
(since product 3 also contains fat as ingredient)

The database hasn't been created yet so I can design this for optium usage.
I expect a load of 5 user request a minute with 500,000 products containing each 1-30 ingredients. These ingredients are dynamic and change/add or get deleted.

I have found some solution in using subqueries
| Field | Type         | Null | Key | Default | Extra |
| id    | int(11)      | YES  |     | NULL    |       |
| name  | varchar(255) | YES  |     | NULL    |       |
| Field | Type         | Null | Key | Default | Extra |
| id    | int(11)      | YES  |     | NULL    |       |
| name  | varchar(255) | YES  |     | NULL    |       |
| Field   | Type    | Null | Key | Default | Extra |
| prod_id | int(11) | YES  |     | NULL    |       |
| ig_id  | int(11) | YES  |     | NULL    |       |

select * from PRODUCT p where exists(select * FROM PRODUCT_VITAMINE pv,INGEDIENT i where AND = pv.prod_id and in('vitamines','sodium','proteines')) AND not exists (select * FROM PRODUCT_VITAMINE pv,INGEDIENT i where AND = pv.prod_id and  not in('vitamines','sodium','proteines'))

But I rather have 1 query and more simplier and faster. Maybe using binary masks. Anyone has an idea about how to arrange that for this problem or some other solution?

Regards and merry christmas!
Question by:sheep7
    LVL 39

    Accepted Solution

    You can use a long data type to keep track of 30 ingredients since a long in 32 bits.  In this case you would assign:
    vitamines = 1
    proteines = 2
    sodium     = 4
    fat            = 8
    etc and
    item 30    = 2 ^ 30 = 1,073,741,824

    Calling the long value lngValue, lngValue would be for:
    Product 1: 7   (vitamines: 1 + proteines: 2 + sodium: 4)
    Product 2: 5   (vitamines: 1 + sodium: 4)
    Product 3: 15 (vitamines: 1 + proteines: 2 + sodium: 4 + fat: 8)

    To find items that have vitamines, proteines and sodium but not fat, you would "And" the long with 7 (1 + 2 + 4):
    lngValue And 7
    LVL 31

    Expert Comment

    For Product 1,       (lngValue And 7) = (0111b And 0111b) = 0111b
    But for Product 3, (lngValue And 7) = (1111b And 0111b) = 0111b

    So, how did we eliminate the fat? The result of (lngValue And 7) is the same for Product 1 and 3. Maybe, use (IngValue And 8) = (IngValue And 1000b). This expression will be 0 if there is no fat, and not 0 if there is fat.
    LVL 31

    Expert Comment

    re: "matching his needs on ingredients: vitamines, proteines, sodium"
    Since Product 2 is a good return, then it sounds like you want "at least" one of these ingredients (i.e., even one of these is good enough).
    If this is the case, then how about using:
    ((lngValue And 7) !=0)  AND  (IngValue And 8) = 0
    LVL 5

    Expert Comment

    This does rather depend on the size of your database. If you have only 32 nutrients or less and a few hundred products, don't use a database at all. Save it as XML. I get the idea that this data will be fairly static: it isn't crucial that a new product that is added to the database will show up the next millisecond in user queries.

    But if you really do need to use a database, then don't use a bitmapped field. It works well in memory, but in a SQL Server or Sybase database you will force a table scan on every query because, as far as I know, only Oracle implements bitwise operators at the database level. So your query will read the bitmap on every single row in the table, mask off the bits, and keep or not keep the row depending on the result.

    The fundamental issue is that you have a many-to-many relationship between nutrients (vitamins, sodium, protein, etc) and products. The classical way to implement a many-to-many relationship is to have one table of nutrients:

    ID                     Description
    --                      ----------------
    0                      Protein
    1                      Sodium
    2                      Vitamin B12

    and another table of products (which I won't bother to illustrate) and a link table with products and nutrients:

    ProductID         NutrientID
    --------------         --------------
    12345              0
    12345              1
    12345              2
    12367              1
    12369              2

    Then your queries will mostly be of 3 types:

    1. All products with vitamin B12:  select distinct ProductID from linktable where NutrientID = 2 (this is the obvious way to do the query but you could treat is as a degenerate case of type 3)

    2. All sodium-free products: select distinct ProductID from linktable where not exists (select * from linktable sodium where sodium.NutrientID = 1 and sodium.ProductID = linktable.ProductID)

    3. All products with only protein and vitamin B12 (silly example, I know, but never mind that): select distinct ProductID from linktable
    where exists (select * from linktable protein where protein.NutrientID = 0 and protein.ProductID = linktable.ProductID)
    and exists (select * from linktable b12 where b12.NutrientID = 2 and b12.ProductID = linktable.ProductID)
    and not exists (select * from linktable other where other.NutrientID not in (0,2) and other.ProductID = linktable.ProductID)

    As you can see, if you allow your users free-form queries with an unknown number of options of nutrients to include or exclude, you will have to generate the SQL in your program on the fly: there is no way to just slot numbers into a where-clause. You could do that with a bitmapped field but, as I've explained, you really shouldn't unless your database is a toy with only a couple of hundred products. And if it's that small you shouldn't be bothering with a database at all.

    But you can probably also see that generating the SQL won't be especially difficult.

    LVL 2

    Author Closing Comment

    Thank you very much, products can have 1-30 ingredients, but there are in fact more than 30 ingredients, about 200. A query example would be helpful as well, but I'll figure it out, thanks for the start and idea's. Regards

    Featured Post

    How to run any project with ease

    Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
    - Combine task lists, docs, spreadsheets, and chat in one
    - View and edit from mobile/offline
    - Cut down on emails

    Join & Write a Comment

    One of Google's most recent algorithm changes affecting local searches is entitled "The Pigeon Update." This update has dramatically enhanced search inquires for the keyword "Yelp." Google searches with the word "Yelp" included will now yield Yelp a…
    Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
    In this sixth video of the Xpdf series, we discuss and demonstrate the PDFtoPNG utility, which converts a multi-page PDF file to separate color, grayscale, or monochrome PNG files, creating one PNG file for each page in the PDF. It does this via a c…
    Internet Business Fax to Email Made Easy - With eFax Corporate (, you'll receive a dedicated online fax number, which is used the same way as a typical analog fax number. You'll receive secure faxes in your email, fr…

    746 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    16 Experts available now in Live!

    Get 1:1 Help Now