I am in the middle of a project where overall speed of the system is highly important. Part of this system involves, of course, reading data stored in an RDBMS -- in this case, MySQL. At the current moment, I am researching the "best" schema to handle storage and retrieval of application settings. There are at least a couple of good ways to do what I need, but I need to ensure that the table(s) in question will perform well, even while handling many, many simultaneous read operations (writes will be less frequent). Here are the basics of the situation:
1. The system has many components, each requiring its own set of settings (keys/values)
2. The number and type of settings for each component varies (but is consistent from component to component)
3. Component settings are specific to their owners (a user)
4. Schema needs to handle hundreds of simultaneous reads
5. Settings table(s) could reach into the million(s) of rows
For example:
User 1 has Component A (isactive = true, color = brown) and Component B (width = 12, weight = 15.0, description = '')
User 2 has Component A (isactive = true, color = grey) and Component B (width = 8, weight = 25.0, description = 'lorem ipsum foo bar hello world and some other stuff too')
I could store this data in one table, like so:
--------------------------
--------
TABLE: SETTINGS
--------------------------
--------
settings_id (int, primary key)
user_id (int)
key (varchar(50))
value (blob)
--------------------------
--------
Using this method, I retain a great deal of flexibility because I can add new settings for any component virtually "on the fly". Lookups for a specific setting can be sped up through creating an index on the "key" field. The table can even be further normalized by creating an additional table to store a unique set of "keys" so that the key field in the SETTINGS table is a numeric, indexed to the lookup table, instead of a varchar. The down sides include having to set the "value" field as a blob for the few values that need it, even though most will be very small. Also, because all settings data is in one table, the number of rows and lookups in/to the one table might cause it to slow down.
I could also store the data in multiple tables, like so:
--------------------------
----------
-----
TABLE: COMPONENTA_SETTINGS
--------------------------
----------
-----
settings_id (int, primary key)
user_id (int)
isactive (tinyint)
color (varchar(25))
--------------------------
----------
-----
--------------------------
----------
-----
TABLE: COMPONENTB_SETTINGS
--------------------------
----------
-----
settings_id (int, primary key)
user_id (int)
width (smallint)
weight (decimal)
description (blob)
--------------------------
----------
-----
By using this method, key fields are specifically defined and are typed more appropriately. Also, instead of having hundreds of lookups a minute to one table, now those lookups are spread across multiple tables. However, this method means that I give up some flexibility in the case where I want to quickly add new settings -- now I'd have to alter my table to add settings fields instead of doing it programatically/dynamicall
y as the situation arises.
Basically, speed and the ability to scale is my biggest concern, but if there is little difference between the speed of these two methods, then I want the most flexible option. Are there better ways to do this that I haven't explored? Which method is "best" for my needs? Any advice, suggestions, or better ways to do this would be much appreciated. If possible, please validate your answer with external reference. Thank you!
Start Free Trial