What are the benefits of integer indexed fields compared to non integer indexed fields?

What are the benefits of integer indexed fields compared to non integer indexed fields?

For example, in my MySQL database, I have a table for video games and one of the fields is developer.  Right now those fields are filled in with the developer names but I was wondering if it'd be faster for my database if I made a separate table for developers and than referred to the developer using the id.

If doing this is better and/or faster, by how much?  And what other benefits are there?
Who is Participating?
PortletPaulConnect With a Mentor Commented:
>> if I made a separate table for developers and then referred to the developer using the id.
sounds like something that should be done anyway (normalization)

as a "general rule" integers are faster than strings (particularly long strings) and indexed   integers faster than indexed strings

if your data volumes aren't huge you may not notice any or much difference, but as data volumes increase the difference can become noticeable.

>>"by how much"?
unable to quantify

>>what other benefits are there?
also hard to say, but "immutability" is often a positive side effect
e.g. you can reference a Developer via his/her unique integer (which does not change) but change/correct the Developer's 'code' or 'name' (just an example!)
PortletPaulConnect With a Mentor Commented:
Keep in mind that eventually everything is stored and computed via 1s and 0s

this set of tests is sql server, but interesting:

Is the use of integer data types better for join columns than strings? It certainly does appear so, and not insignificantly either. Bear in mind though that what I was doing here was a bit extreme. 2.5 million rows in a a query with no filters applied. This shouldn’t be something that ever gets done in a real system. So it’s not a case that YMMV, it’s a case that it almost certainly will. You probably will see different results, but it is definitely something worth testing when planning data types for a DB

While this may not the final nail in the coffin for natural keys, it is worth keeping in mind when choosing between natural and artificial keys for a system, especially one likely to process large numbers of rows, such as a decision support/datawarehouse system. Test carefully with expected data volumes and expected loads if you’re considering natural keys and decide based on the result of those tests.

Repo code: Int vs String joins

(1) YMMV = ‘Your mileage may vary’, colloquialism that means that you may get a different result from me.
this is based on sql server, but is relevant:

less detailed, but both about the topic and MySQL:

Most Data Warehouses designs extensively employ integers for joins (and may even do it for things like "dates" in extreme cases). If you consider that every character is stored as a number ('a' has a number 'A' is a different number, and 'abcd' is a series of numbers which is different to 'Abcd') then not only is each string larger it is also a larger set of numbers and larger in storage terms (leading to some more I/O for example).

Think it would be very hard to find many references arguing that strings are faster than integers, but it's extremely easy to find information on the benefits of integers over strings just using a casual web search.

btw: some other possible advantages just hit me: "case sensitivity", "spelling error", "internationalization" (i.e. impacts of "natural languages" are avoided or minimized by using integers)
Ray PaseurConnect With a Mentor Commented:
Make a Google search for the exact phrase Should I Normalize My Database and read the very thoughtful and interesting arguments on both sides of the question.  In your case, the answer is "Yes" and a bit of study of normalization will show you why.  With 400,000 of anything optimization is important and is a good idea!

One of our colleagues here at EE has written an article that may be helpful, too.

Best regards, ~Ray
Tomas Helgi JohannssonConnect With a Mentor Commented:

It highly depends on what your data requirements are. As Ray says you should consider if you need to normalize your table design. Then you will need to look at your sql queries and index your tables properly according to the where clauses of your queries.

If you have one or more developer working on many video games then I would go with many to many relationship between
developer_table (id integer , name varchar,....)
developer_video_games_table ( d_id, vg_id)
video_games(id, video_game_name)

Index it properly and you have fast and easy to manage/query design.

     Tomas Helgi
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.