MySQL large table design question

Posted on 2006-05-05
Last Modified: 2008-02-01

I am using MySQL 5 and designing some tables that will hold quite a large amount of data; in time some will probably exceed 10,000 rows and others will grow to over a million rows.

These tables have some optional data, such as a “description”, that applies to some rows but not all. I have designed the tables so that each row carries an INT id that links to a content table where the optional data is stored. If the row has no description, the content id is set to 0; if data has been stored for it, it is set to the key id of the relevant row in the content table.

This means the large tables are essentially a bunch of INTs, linking to other tables for the different data they hold. Since much of the data is optional, this avoids storing TEXT and large VARCHAR columns in the big table. Also, because different parts of the system require different parts of the data, they aren’t dealing with one big table – they just fetch data from the relevant sub-tables as needed. For example, when calculating stock, the description, images, title and so on are not required.
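For illustration, here is a stripped-down sketch of the kind of layout I mean (table and column names are made up, and most columns are omitted):

-- Illustrative only: a "heavy" table of INTs, with the optional text
-- pushed out into a separate content table.
CREATE TABLE product (
    product_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    category_id INT UNSIGNED NOT NULL,
    content_id  INT UNSIGNED NOT NULL DEFAULT 0,  -- 0 = no optional content
    stock_level INT NOT NULL DEFAULT 0
);

CREATE TABLE content (
    content_id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title       VARCHAR(255) NOT NULL,
    description TEXT
);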

My reasoning is to keep the heavily used tables as light as possible, so they don’t become physically huge in both row size and row count.

I would like to check that this is the right thing to do. It means a little more work on the server (a few extra short joins, and in some cases a few extra queries), but I hope that keeping the tables lean will help with performance.

Is this correct? Is this a favourable way of designing large databases?

Also, I plan on using transactions to ensure everything stays valid and reliable, and that linked rows don’t get separated from their parents/children. I know I will need to use InnoDB tables for this, but should I use InnoDB for all tables, or MyISAM for everything except the important transaction-based tables?
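For example, something roughly like this, so the parent and child rows are written together or not at all (names as in the sketch above, just illustrative):

START TRANSACTION;
INSERT INTO content (title, description)
    VALUES ('Widget', 'Optional long description...');
INSERT INTO product (category_id, content_id, stock_level)
    VALUES (3, LAST_INSERT_ID(), 0);
COMMIT;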

Is InnoDB much less efficient than MyISAM?

Many Thanks,
Question by: SamWitney

    Expert Comment

    Your approach is a valid one; however, because of the drastically varying ways in which databases optimise storage and queries, you may find that your attempt to improve performance at the schema level doesn't actually give any benefit. It depends on the queries you will be executing and their volumes.

    I used an equivalent table layout to yours when I built a large database with conditional relations, and it worked well.
    Sticking to a normalised form is considered good practice from an information-architecture point of view, but once you account for database performance, your approach is arguably better.
    Any large pieces of data are best kept in separate tables unless they are used almost as heavily as the rest of the data; your example above suggests that separation is a good thing in your case.

    Your main priority for performance should be ensuring that you create the relevant indexes; this largely gets around the issues of variable-length records, though fixed-length rows will always be the fastest.
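    For example, using the illustrative names from the sketch in your question, the linking column in the big table would get something like this (the primary keys are indexed automatically):

    -- Index the linking column so the join to the content table can use an index
    CREATE INDEX idx_product_content ON product (content_id);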

    Unless you have large, unimportant tables that will be heavily used, I would keep everything as InnoDB to ensure concurrency throughout. The efficiency difference is very small compared with the increased robustness, which should be a priority in almost all business systems.
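    Converting an existing table is a one-liner, e.g. (table name as in your sketch):

    -- Rebuilds the table using the InnoDB storage engine
    ALTER TABLE product ENGINE=InnoDB;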

    Author Comment

    Thanks for your reply, that helps a lot.

    If I use views to make de-normalised versions of certain groups of tables (such as "products in stock" with all the product info, including content, stock level etc.), will this improve performance when querying the view rather than running large join queries each time? I.e. are views cached, and updated when the underlying tables are updated...?
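    For example, something like this (simplified; the stock table here is hypothetical):

    CREATE VIEW products_in_stock AS
    SELECT p.product_id, c.title, c.description, s.quantity
    FROM product p
    LEFT JOIN content c ON c.content_id = p.content_id
    JOIN stock s ON s.product_id = p.product_id
    WHERE s.quantity > 0;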

    That query would involve about 8 tables, since stock is quite complex and so is some of the product info.

    Also, based on my design above, is MySQL 5 good at doing the joins that link the "content" to the other tables, given that both tables are correctly indexed on the PKs/FKs?

    Accepted Solution

    I don't believe views work the way you describe; there would be a huge overhead in keeping them up to date if that were the case. The only caching is of the optimised query used to "build" the view.
    The main purpose of views is to make queries easier to write; optimisations are made at the query level, not at the data-storage level. Again, make sure the underlying tables have appropriate indexes for the queries behind the views.

    In the past I've actually gone down the route of duplicating data in order to achieve acceptable performance; however, this should be avoided wherever possible and was only a last resort.

    JOINs across 8 tables will have quite a high overhead. Be sure to use the most appropriate JOIN type for the task at hand, and use subqueries where appropriate as well (they are evaluated separately and can sometimes give a performance boost).
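    It's also worth running EXPLAIN on the big query to check that every table is reached via an index rather than a full scan, e.g. (using the illustrative table names from above):

    EXPLAIN
    SELECT p.product_id, c.title, s.quantity
    FROM product p
    LEFT JOIN content c ON c.content_id = p.content_id
    JOIN stock s ON s.product_id = p.product_id
    WHERE s.quantity > 0;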

    Make sure you allow MySQL to keep a reasonably large query cache in memory, and you should notice a good performance boost.
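    You can check and enlarge the cache at runtime (the size below is only an example; tune it to your server's RAM, and put the setting in my.cnf to make it permanent):

    SHOW VARIABLES LIKE 'query_cache%';
    SET GLOBAL query_cache_size = 67108864;  -- 64 MB
    SET GLOBAL query_cache_type = 1;         -- cache eligible SELECT results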
