MySQL indexes on SETS and enums

Hello

I need to create some indexes on some mysql fields. One of them is of type SET. I remember reading somewhere that these are bad for indexes and I should convert them to another type. Is this correct? They are text values so should I convert them to CHAR, TEXT or ENUM?

I also have some other fields of type CHAR which only take a fixed set of values. Should I convert these to ENUM before i make them into indexes?

Thanks in advance for your help
andiejeAsked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
Steve BinkConnect With a Mentor Commented:
I'd like to preface this explanation with another reminder that I'm making some substantial assumptions about how set indexes work in MySQL.  Benchmarking is going to provide your absolute answer.

My assertion is that MySQL, which is documented as handling sets as bit-flag fields, will store the index of a set as something similar to a standard integer field index.  Given my set ('a','b','c'), the bit-flag values of those set members will be 1, 2, and 4, respectively.  If I am looking for all records which include set value 'a', and ONLY set value 'a', then MySQL can match an integer-based index very quickly - it will look for all rows where the `set_field` bit-flag value is equal to 1.  The same holds true if I'm looking for all rows containing 'a' and 'b', but not 'c'.  MySQL can search the index for `set_field`=3.  In these cases, an index is likely to have much better performance than a join.

On the other hand, look at a case where I'm looking for all rows that contains 'a', and it does not matter if 'b' and 'c' are included or not.  I'm assuming that MySQL would have to do some bit-level math on the index to determine if it should be included in the result set (i.e., `set_field` AND 1 = 1).  I'm guessing you would see better performance from a joined table for this scenario.

In your last comment, I noted that the status field is a SET type.  Can a record in tablex have more than one associated status?  If not, you should change that to an ENUM type.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
not sure if a index on a SET columns is bad or not, but if you need a SET columns to be indexes (hence searched), it means that your table implementation should be changed.
you should have a dedicated table with 1 value per row instead of a single column.
then, you can also properly index and query that table, as needed, with JOINS, EXISTS or any other technique that is needed to achieve the query results.
0
 
andiejeAuthor Commented:
Really? Im not doubting your expertise - would just like to understand. Wouldn't having another table for the values in the set create an additional join every time I needed to do a query and slow things down? Couldnt mysql just use an index on the set column if the SET column was in a WHERE clause? thanks for any clarification
0
Learn to develop an Android App

Want to increase your earning potential in 2018? Pad your resume with app building experience. Learn how with this hands-on course.

 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
yes, additional join. but not necessarily a "slow down". it all comes down to properly design the tables and the queries
0
 
Steve BinkCommented:
While I would wait to see the benchmarks, I'm inclined to agree with Guy Hengel.  

MySQL implements sets as a bit-flag field.  For example, in the set ('a','b','c'), 'a' has a bit value of 1, 'b' is 2, and 'c' is 4.  Translated into storage, this is likely some kind of integer field.  My untested hypothesis is that MySQL handles this in the same manner as an integer index.  In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.  

Following that idea, you end up with two possible scenarios:

1) Looking for a collection of set values: an index would probably work better here
2) Looking for any record including a specific value: a JOIN table would probably work better here

Again, this is untested mental leg-work.  A use-case benchmark would provide your answer easily enough.
0
 
andiejeAuthor Commented:
Hi - please could you explain this comment

"In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.  "

I am looking for all vehicles that have a certain value for a flag. So the SET field has 6 values. I need to be able to select all vehicles that have one particular value for that SET.

I am confused becuase you say you agree with Kevin (ie. use a join) but say an index is best if you are looking for a collection of certain flags.

My requirement is the latter. I apologise if my question didnt make my requirement clear enough. If it helps, I probably should have put the query I am trying to optimise

select *
from tablex where
(display = 'yes' or (display in ('normal','sale') and status = 'instock'))

The display and status fields are SETs.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
in that case, your field/index looks like it's appropriate.
and I presume that with "bad" it was meant that it referred to queries that are not using the "set" of values like you do
0
 
andiejeAuthor Commented:
sorry for the delay in replying. will look into this more and award points soon
0
 
andiejeAuthor Commented:
Hi - i think the question got a bit sidetracked which is probably my fault for not posting the query in question. Given the query would it be better (quicker) if the display and status fields were sets or enums?

thanks
0
 
Steve BinkCommented:
In general, ENUM fields are faster than SET field.  ENUM types can only have one value.  SET types can have multiple.  So which is faster - checking for a single value, or checking for multiple possible values?
0
 
andiejeAuthor Commented:
i didnt write this database so i dont even know if a set type is required. How could i count the number of values in the fields of type set? Presumbably if the field only has one value for each row and i simply do an alter table from set to enum type? thanks a lot
0
 
Steve BinkCommented:
You can use the BIT_COUNT() function to determine how many set members are in any particular data point.

http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#function_bit-count
0
All Courses

From novice to tech pro — start learning today.