?
Solved

MySQL indexes on SETS and enums

Posted on 2014-01-14
12
Medium Priority
?
285 Views
Last Modified: 2014-02-02
Hello

I need to create some indexes on some mysql fields. One of them is of type SET. I remember reading somewhere that these are bad for indexes and I should convert them to another type. Is this correct? They are text values so should I convert them to CHAR, TEXT or ENUM?

I also have some other fields of type CHAR which only take a fixed set of values. Should I convert these to ENUM before i make them into indexes?

Thanks in advance for your help
0
Comment
Question by:andieje
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 5
  • 4
  • 3
12 Comments
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39783192
not sure if a index on a SET columns is bad or not, but if you need a SET columns to be indexes (hence searched), it means that your table implementation should be changed.
you should have a dedicated table with 1 value per row instead of a single column.
then, you can also properly index and query that table, as needed, with JOINS, EXISTS or any other technique that is needed to achieve the query results.
0
 

Author Comment

by:andieje
ID: 39783758
Really? Im not doubting your expertise - would just like to understand. Wouldn't having another table for the values in the set create an additional join every time I needed to do a query and slow things down? Couldnt mysql just use an index on the set column if the SET column was in a WHERE clause? thanks for any clarification
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39783806
yes, additional join. but not necessarily a "slow down". it all comes down to properly design the tables and the queries
0
Get free NFR key for Veeam Availability Suite 9.5

Veeam is happy to provide a free NFR license (1 year, 2 sockets) to all certified IT Pros. The license allows for the non-production use of Veeam Availability Suite v9.5 in your home lab, without any feature limitations. It works for both VMware and Hyper-V environments

 
LVL 51

Expert Comment

by:Steve Bink
ID: 39784533
While I would wait to see the benchmarks, I'm inclined to agree with Guy Hengel.  

MySQL implements sets as a bit-flag field.  For example, in the set ('a','b','c'), 'a' has a bit value of 1, 'b' is 2, and 'c' is 4.  Translated into storage, this is likely some kind of integer field.  My untested hypothesis is that MySQL handles this in the same manner as an integer index.  In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.  

Following that idea, you end up with two possible scenarios:

1) Looking for a collection of set values: an index would probably work better here
2) Looking for any record including a specific value: a JOIN table would probably work better here

Again, this is untested mental leg-work.  A use-case benchmark would provide your answer easily enough.
0
 

Author Comment

by:andieje
ID: 39786126
Hi - please could you explain this comment

"In other words, the index would be useful only if you are looking for a collection of certain flags, as opposed to the inclusion of any single flag.  "

I am looking for all vehicles that have a certain value for a flag. So the SET field has 6 values. I need to be able to select all vehicles that have one particular value for that SET.

I am confused becuase you say you agree with Kevin (ie. use a join) but say an index is best if you are looking for a collection of certain flags.

My requirement is the latter. I apologise if my question didnt make my requirement clear enough. If it helps, I probably should have put the query I am trying to optimise

select *
from tablex where
(display = 'yes' or (display in ('normal','sale') and status = 'instock'))

The display and status fields are SETs.
0
 
LVL 143

Expert Comment

by:Guy Hengel [angelIII / a3]
ID: 39786278
in that case, your field/index looks like it's appropriate.
and I presume that with "bad" it was meant that it referred to queries that are not using the "set" of values like you do
0
 
LVL 51

Accepted Solution

by:
Steve Bink earned 2000 total points
ID: 39786543
I'd like to preface this explanation with another reminder that I'm making some substantial assumptions about how set indexes work in MySQL.  Benchmarking is going to provide your absolute answer.

My assertion is that MySQL, which is documented as handling sets as bit-flag fields, will store the index of a set as something similar to a standard integer field index.  Given my set ('a','b','c'), the bit-flag values of those set members will be 1, 2, and 4, respectively.  If I am looking for all records which include set value 'a', and ONLY set value 'a', then MySQL can match an integer-based index very quickly - it will look for all rows where the `set_field` bit-flag value is equal to 1.  The same holds true if I'm looking for all rows containing 'a' and 'b', but not 'c'.  MySQL can search the index for `set_field`=3.  In these cases, an index is likely to have much better performance than a join.

On the other hand, look at a case where I'm looking for all rows that contains 'a', and it does not matter if 'b' and 'c' are included or not.  I'm assuming that MySQL would have to do some bit-level math on the index to determine if it should be included in the result set (i.e., `set_field` AND 1 = 1).  I'm guessing you would see better performance from a joined table for this scenario.

In your last comment, I noted that the status field is a SET type.  Can a record in tablex have more than one associated status?  If not, you should change that to an ENUM type.
0
 

Author Comment

by:andieje
ID: 39801608
sorry for the delay in replying. will look into this more and award points soon
0
 

Author Comment

by:andieje
ID: 39807236
Hi - i think the question got a bit sidetracked which is probably my fault for not posting the query in question. Given the query would it be better (quicker) if the display and status fields were sets or enums?

thanks
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 39807261
In general, ENUM fields are faster than SET field.  ENUM types can only have one value.  SET types can have multiple.  So which is faster - checking for a single value, or checking for multiple possible values?
0
 

Author Comment

by:andieje
ID: 39808656
i didnt write this database so i dont even know if a set type is required. How could i count the number of values in the fields of type set? Presumbably if the field only has one value for each row and i simply do an alter table from set to enum type? thanks a lot
0
 
LVL 51

Expert Comment

by:Steve Bink
ID: 39809057
You can use the BIT_COUNT() function to determine how many set members are in any particular data point.

http://dev.mysql.com/doc/refman/5.5/en/bit-functions.html#function_bit-count
0

Featured Post

Back Up Your Microsoft Windows Server®

Back up all your Microsoft Windows Server – on-premises, in remote locations, in private and hybrid clouds. Your entire Windows Server will be backed up in one easy step with patented, block-level disk imaging. We achieve RTOs (recovery time objectives) as low as 15 seconds.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When table data gets too large to manage or queries take too long to execute the solution is often to buy bigger hardware or assign more CPUs and memory resources to the machine to solve the problem. However, the best, cheapest and most effective so…
In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…

800 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question