Faith Victory asked:

MySQL table open definition cache alert

Percona Monitoring and Management (PMM) is alerting that MySQL is opening too many table definitions.


Labels:
  alertname = MySQLTableOpenDefinition
  node_name = prod-sql02
  service = mysql
  severity = critical

Annotations:
  description = MySQL is opening too many table defintions, try increasing table_open_definition_cache
  summary = CRITICAL: MySQL is opening too many table definitions on (instance prod-sql02)


Here is my output from when I checked how long it has been since the database was last rebooted:


Uptime        Open_tables   Openned_tables   OpenTableFactor
110402767     8000          208634981        398.88869834344797
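
(For reference, these counters come straight from the server; OpenTableFactor is a derived ratio, not a built-in status variable:)

SHOW GLOBAL STATUS WHERE Variable_name IN ('Uptime', 'Open_tables', 'Opened_tables');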


It has been 120 days since the database was restarted. It is a production server with master-slave replication. This alert is coming from the master node.  


How do you resolve this without rebooting the production box? 


my.cnf settings 


# InnoDB Settings
innodb_file_per_table = 1
innodb_flush_method = O_DIRECT
innodb_log_file_size = 1G
innodb_log_buffer_size = 8M
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_buffer_pool_size = 16G
innodb_buffer_pool_instances = 4
innodb_buffer_pool_dump_at_shutdown = ON
innodb_buffer_pool_load_at_startup = ON
innodb-defragment = 1
innodb_autoinc_lock_mode  = 2
innodb_io_capacity = 200
innodb_io_capacity_max = 4000
#innodb_file_format=barracuda
#innodb_file_format_max=barracuda


## server settings:
collation-server          = utf8_general_ci
character-set-server      = utf8
max_allowed_packet        = 12M
innodb_strict_mode        = 0
max_prepared_stmt_count   = 16384
#optimizer_search_depth    = 8

## Replication
report_host         = prod-sql02
server-id               = 2
binlog_format      = ROW
log_slave_updates
log-bin                 = mysql-bin
expire_logs_days        = 1
relay_log_recovery      = ON
sync_binlog             = 1000
sync_master_info        = 0
sync_relay_log          = 0
sync_relay_log_info     = 0


# MySQL settings
ignore_db_dirs = BACKUP
#max_connections = 1500
max_connections =  3000
default_storage_engine = InnoDB
transaction-isolation = READ-COMMITTED
event_scheduler= ON
character-set-server = utf8
collation-server = utf8_general_ci
max_allowed_packet = 16M
sql_mode ="TRADITIONAL"
wait_timeout  = 28800
log_output="FILE"
log_queries_not_using_indexes = 0
long_query_time = 5
min_examined_row_limit  = 0
thread_cache_size  = 400
thread_pool_size  = 64
table_open_cache  = 8000
key_buffer_size  = 16
join_buffer_size  = 128
table_definition_cache  = 2000

gr8gonzo:

I don't use Percona, but are you certain that someone hasn't been manually modifying that alert? Unless Percona has defined it, the setting table_open_definition_cache doesn't exist...
Faith Victory (Asker):

@arnold
I don't know how to resolve the too-many-open-tables alert. This is the production server, and the current settings are table_open_cache = 8000 and table_definition_cache = 2000.
It gives you the solution: try increasing table_open_definition_cache
You probably have a lot of tables in the database and parallel queries...
Hi,

Both table_open_cache and table_definition_cache can be increased dynamically, that is, without restarting, by issuing:

SET GLOBAL table_open_cache = <some value>;
SET GLOBAL table_definition_cache = <some value>;

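You can verify the change took hold with something like:

SHOW GLOBAL VARIABLES LIKE 'table%cache';
-- Note: SET GLOBAL does not survive a restart, so mirror the final
-- values in my.cnf once you have settled on them.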


Going by your configuration and the information given, your system has opened 26079 tables daily.
You could increase both the table_open_cache and table_definition_cache but do it gradually.
You could start with something like

SET GLOBAL table_open_cache = 16000;
SET GLOBAL table_definition_cache = 4000;



https://mariadb.com/kb/en/server-system-variables/#table_open_cache
https://mariadb.com/kb/en/server-system-variables/#table_definition_cache
https://mariadb.com/kb/en/optimizing-table_open_cache/

Best regards,
     Tomas Helgi
Faith, you replied to @arnold, but he's not in this thread, so I'm assuming you were responding to me.

My concern here is that the alert suggests a value called table_open_definition_cache, which doesn't exist in standard MySQL nor in MariaDB. Additionally, "Opened" is misspelled as "Openned", which is why I'm asking whether this alert was manually created by someone, because it is bad to have an alert telling you to change a setting that doesn't exist.

As others have said, you have two settings:

table_open_cache
table_definition_cache

The table_definition_cache is fairly easy to increase - its main limitation is simply memory. I would suggest setting the table_definition_cache to slightly exceed the number of tables you have.
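
A quick way to get that table count, assuming you can query information_schema:

SELECT COUNT(*) AS table_count
FROM information_schema.TABLES
WHERE TABLE_TYPE = 'BASE TABLE';
-- Set table_definition_cache slightly above this number.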

The table_open_cache is trickier. Database tables are stored as separate files on the filesystem. When MySQL has a table open, it has to ask the operating system / filesystem for that file. This "access" to the file is called a handle. The operating system and filesystem enforce their own limits on how many handles can exist at the same time.

If you have 5 connections at the same time, and all 5 connections are each running 1 query, and each query references 3 tables, like:

SELECT * FROM table1 JOIN table2 ... JOIN table3 ...

...then the number of handles it requires is 5 connections * 1 query * 3 tables = 15 open tables = 15 handles.

So picking the right number for table_open_cache is about figuring out the average number of tables referenced per query, and multiplying that by your max connections.
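
As a rough, hypothetical sizing sketch using your numbers (the 3-tables-per-query average is purely an assumption; measure your own workload):

-- Hypothetical sizing sketch (avg tables per query is an assumed figure):
-- table_open_cache ~ max_connections * avg_tables_per_query
--                  = 3000 * 3 = 9000
SHOW GLOBAL STATUS LIKE 'Open_tables';  -- yours is pinned at the 8000 cache limit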

Tomas already linked to the optimizing article for table_open_cache, but you should really read this to understand how this all works (remember that MariaDB is just a fork of MySQL):

MySQL :: MySQL 8.0 Reference Manual :: 8.4.3.1 How MySQL Opens and Closes Tables 

So you need to ensure that you're NOT increasing table_open_cache too high or else you're going to run into database errors when the server cannot open a table. 

Do I set the values on both the master node and the slave node? The alert is only coming from the master node, though.

Please, what formula did you use to arrive at these values?

SET GLOBAL table_open_cache = 16000; 
SET GLOBAL table_definition_cache = 4000; 
table_open_cache = the maximum number of tables the server can keep open in any one table cache instance.
table_definition_cache = the maximum number of table definitions that can be stored in the cache.

- The table open cache can hold multiple entries for one table, based on the number of threads accessing it, while the table definition cache holds a single entry per table.
- Tuning the table open cache takes a bit of consideration: threads connected, the maximum number of tables a query uses (joins, etc.)...
- The best way to track it is by observing the "Opened_tables" status variable. If you see that number increasing rapidly, it means the cache is not sufficiently provisioned; consider raising the table cache.

Since you're using PMM, there is a chart for "MySQL table open cache status" that notes "Table Open Cache Hit Ratio". If it is near 100%, we're good.
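
If you'd rather pull the same ratio straight from the server (the Table_open_cache_hits/misses counters exist in MySQL 5.6+ and MariaDB 10.2+; adjust for anything older):

SHOW GLOBAL STATUS LIKE 'Table_open_cache_%';
-- hit ratio = hits / (hits + misses); a growing 'overflows' count
-- means entries are being evicted from the cache.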

Thanks.
> Do I set the values on both master node and slave node?

You don't -have- to make them match here. These settings are all about usage. If 95% of your users access the master node and the other 5% access your slave node, then you probably don't need to increase the values on the slave because you won't have that many connections so the number of open tables will be lower.

Regarding the values Tomas suggested, they're simply twice what you have. You can keep increasing them until the errors go away, but doing it that way is always a bit risky, since you don't know what the values -should- be and, as I mentioned before, there are risks to having them set too high.

You can always use mysqltuner (a perl script) to analyze your statistics and give you some recommended settings. Just make sure MySQL has been running for several days before you run it, so that there are good statistics.
Can I execute FLUSH TABLES in production without a performance impact, and not even bother increasing the cache size? Would FLUSH TABLES free up the open tables?
That would not help you. In fact, it would likely create worse problems.

Bear in mind you don't NEED to increase the cache size. Everything will continue to run even if you don't make any changes at all.

The point of the cache is simply to improve performance by keeping tables open so they can be accessed faster. Increasing the cache is simply increasing the number of table handles you can keep open so you don't have to have the performance hit of opening a table.

If you flush tables, you're doing exactly what you want to avoid. You're telling MySQL to close all table handles so that it needs to reopen them the next time a query asks for them.

So the goal of the cache is to avoid an extra step that takes a long time to run (relatively speaking). Ideally the cache will be large enough that it can keep all the tables open for all connections.

Let's say you have one database with 50 tables and 100 max connections. You could have a table_open_cache of 5000 and you would pretty much cover everything.

However, chances are that most people will not access all 50 tables. You might only have 20 tables that are part of your average, day-to-day traffic. So your size could easily be 20 x 100 = 2000 open tables and you would still probably be in good shape.

So as I said earlier, if you want to get the best numbers for your cache size, you should know what your average queries look like, then take the number of tables they open, and multiply it by your max connections. That should be your starting point.

Then check the limits on your database server OS to see what the overall max number of open file handles is. Just make sure your open table cache size doesn't get near that number.
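
MySQL's own ceiling is visible from SQL as well (the value is negotiated against the OS limit at startup):

SHOW GLOBAL VARIABLES LIKE 'open_files_limit';
-- Keep table_open_cache comfortably below this, since data files, logs,
-- and connections consume file descriptors too.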

I mentioned mysqltuner before because it will gather statistics, like the number of cache misses vs. your connection count, and make fairly intelligent recommendations.
@theGhost_k8 @gr8gonzo @Tomas Helgi Johannsson
A couple of issues relating to this that I’d like to resolve.

The first one relates to HA (MaxScale): how do we deal with an automatic failover from the current Primary to its Secondary if the above settings differ between the servers of the HA pair?

How do we maintain control and consistency across environments, and ensure that we keep sufficient control of these settings?

We are moving to Infrastructure as Code, and these settings need to be controlled carefully via provisioning scripts to ensure that future maintenance does not overwrite previous changes.

ASKER CERTIFIED SOLUTION
gr8gonzo