andieje
asked on
MySQL indexes on SETS and enums
Hello
I need to create some indexes on some mysql fields. One of them is of type SET. I remember reading somewhere that these are bad for indexes and I should convert them to another type. Is this correct? They are text values so should I convert them to CHAR, TEXT or ENUM?
I also have some other fields of type CHAR which only take a fixed set of values. Should I convert these to ENUM before i make them into indexes?
Thanks in advance for your help
I need to create some indexes on some mysql fields. One of them is of type SET. I remember reading somewhere that these are bad for indexes and I should convert them to another type. Is this correct? They are text values so should I convert them to CHAR, TEXT or ENUM?
I also have some other fields of type CHAR which only take a fixed set of values. Should I convert these to ENUM before i make them into indexes?
Thanks in advance for your help
ASKER
Really? Im not doubting your expertise - would just like to understand. Wouldn't having another table for the values in the set create an additional join every time I needed to do a query and slow things down? Couldnt mysql just use an index on the set column if the SET column was in a WHERE clause? thanks for any clarification
yes, additional join. but not necessarily a "slow down". it all comes down to properly design the tables and the queries
While I would wait to see the benchmarks, I'm inclined to agree with Guy Hengel.
MySQL implements sets as a bit-flag field. For example, in the set ('a','b','c'), 'a' has a bit value of 1, 'b' is 2, and 'c' is 4. Translated into storage, this is likely some kind of integer field. My untested hypothesis is that MySQL handles this in the same manner as an integer index. In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.
Following that idea, you end up with two possible scenarios:
1) Looking for a collection of set values: an index would probably work better here
2) Looking for any record including a specific value: a JOIN table would probably work better here
Again, this is untested mental leg-work. A use-case benchmark would provide your answer easily enough.
MySQL implements sets as a bit-flag field. For example, in the set ('a','b','c'), 'a' has a bit value of 1, 'b' is 2, and 'c' is 4. Translated into storage, this is likely some kind of integer field. My untested hypothesis is that MySQL handles this in the same manner as an integer index. In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.
Following that idea, you end up with two possible scenarios:
1) Looking for a collection of set values: an index would probably work better here
2) Looking for any record including a specific value: a JOIN table would probably work better here
Again, this is untested mental leg-work. A use-case benchmark would provide your answer easily enough.
ASKER
Hi - please could you explain this comment
"In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag. "
I am looking for all vehicles that have a certain value for a flag. So the SET field has 6 values. I need to be able to select all vehicles that have one particular value for that SET.
I am confused becuase you say you agree with Kevin (ie. use a join) but say an index is best if you are looking for a collection of certain flags.
My requirement is the latter. I apologise if my question didnt make my requirement clear enough. If it helps, I probably should have put the query I am trying to optimise
select *
from tablex where
(display = 'yes' or (display in ('normal','sale') and status = 'instock'))
The display and status fields are SETs.
"In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag. "
I am looking for all vehicles that have a certain value for a flag. So the SET field has 6 values. I need to be able to select all vehicles that have one particular value for that SET.
I am confused becuase you say you agree with Kevin (ie. use a join) but say an index is best if you are looking for a collection of certain flags.
My requirement is the latter. I apologise if my question didnt make my requirement clear enough. If it helps, I probably should have put the query I am trying to optimise
select *
from tablex where
(display = 'yes' or (display in ('normal','sale') and status = 'instock'))
The display and status fields are SETs.
in that case, your field/index looks like it's appropriate.
and I presume that with "bad" it was meant that it referred to queries that are not using the "set" of values like you do
and I presume that with "bad" it was meant that it referred to queries that are not using the "set" of values like you do
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
sorry for the delay in replying. will look into this more and award points soon
ASKER
Hi - i think the question got a bit sidetracked which is probably my fault for not posting the query in question. Given the query would it be better (quicker) if the display and status fields were sets or enums?
thanks
thanks
In general, ENUM fields are faster than SET field. ENUM types can only have one value. SET types can have multiple. So which is faster - checking for a single value, or checking for multiple possible values?
ASKER
i didnt write this database so i dont even know if a set type is required. How could i count the number of values in the fields of type set? Presumbably if the field only has one value for each row and i simply do an alter table from set to enum type? thanks a lot
You can use the BIT_COUNT() function to determine how many set members are in any particular data point.
http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#function_bit-count
http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#function_bit-count
you should have a dedicated table with 1 value per row instead of a single column.
then, you can also properly index and query that table, as needed, with JOINS, EXISTS or any other technique that is needed to achieve the query results.