We've been chasing a MariaDB problem for months and have eliminated everything we could, to the point of staring blankly whenever everything stops dead and load on the clients climbs to 100.
I have been monitoring with PMM for a week or two now and nothing obvious has come up, yet we keep watching things slow to a crawl.
No locks were found, but today I finally spotted something in the log. It's beyond me that this was never looked at, but here we are.
The server is MariaDB 10.5 running on CentOS 7.9, and I found that the server got to the point where it actually restarted. But what caused this, and what IS still causing this?
(I corrected the above from the 10.6 originally posted.)
Here is some of the logging; note that there are a huge number of these entries.
[Warning] Aborted connection 1 - Got an error reading communication packets
May 1 09:05:28 db01 mariadbd: number of readers 0, waiters flag 1, lock_word: 0
May 1 09:05:28 db01 mariadbd: Last time write locked in file mtr0mtr.inl line 146
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139651702830848 has waited at row0ins.cc line 2633 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7f186411cb78 created in file buf0buf.cc line 1226
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode exclusive
May 1 09:05:28 db01 mariadbd: number of readers 0, waiters flag 1, lock_word: 0
May 1 09:05:28 db01 mariadbd: Last time write locked in file mtr0mtr.inl line 146
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139651687778048 has waited at row0ins.cc line 2633 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7f186411cb78 created in file buf0buf.cc line 1226
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode exclusive
May 1 09:05:28 db01 mariadbd: number of readers 0, waiters flag 1, lock_word: 0
May 1 09:05:28 db01 mariadbd: Last time write locked in file mtr0mtr.inl line 146
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139627240601344 has waited at btr0cur.cc line 1487 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:05:28 db01 mariadbd: number of readers 136, waiters flag 1, lock_word: fffff78
May 1 09:05:28 db01 mariadbd: Last time write locked in file btr0cur.cc line 1487
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139650269484800 has waited at row0ins.cc line 2633 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7f186411cb78 created in file buf0buf.cc line 1226
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode exclusive
May 1 09:05:28 db01 mariadbd: number of readers 0, waiters flag 1, lock_word: 0
May 1 09:05:28 db01 mariadbd: Last time write locked in file mtr0mtr.inl line 146
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139651696072448 has waited at btr0cur.cc line 1487 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:05:28 db01 mariadbd: number of readers 136, waiters flag 1, lock_word: fffff78
May 1 09:05:28 db01 mariadbd: Last time write locked in file btr0cur.cc line 1487
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139626700867328 has waited at row0ins.cc line 2633 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7f186411cb78 created in file buf0buf.cc line 1226
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode exclusive
May 1 09:05:28 db01 mariadbd: number of readers 0, waiters flag 1, lock_word: 0
May 1 09:05:28 db01 mariadbd: Last time write locked in file mtr0mtr.inl line 146
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139626718992128 has waited at btr0cur.cc line 1487 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:05:28 db01 mariadbd: number of readers 136, waiters flag 1, lock_word: fffff78
May 1 09:05:28 db01 mariadbd: Last time write locked in file btr0cur.cc line 1487
May 1 09:05:28 db01 mariadbd: 2022-05-01 9:04:17 0 [Note] InnoDB: A semaphore wait:
May 1 09:05:28 db01 mariadbd: --Thread 139764948297472 has waited at btr0cur.cc line 1487 for 252.00 seconds the semaphore:
May 1 09:05:28 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:05:28 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:06:03 db01 mariadbd: --Thread 139650272556800 has waited at btr0cur.cc line 1487 for 357.00 seconds the semaphore:
May 1 09:06:03 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:06:03 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:06:03 db01 mariadbd: number of readers 136, waiters flag 1, lock_word: fffff78
May 1 09:06:03 db01 mariadbd: Last time write locked in file btr0cur.cc line 1487
May 1 09:06:03 db01 mariadbd: 2022-05-01 9:06:02 0 [Note] InnoDB: A semaphore wait:
May 1 09:06:03 db01 mariadbd: --Thread 139650530449152 has waited at btr0cur.cc line 1487 for 357.00 seconds the semaphore:
May 1 09:06:03 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:06:09 db01 kernel: INFO: task xfsaild/dm-0:664 blocked for more than 120 seconds.
May 1 09:06:09 db01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 1 09:06:09 db01 kernel: xfsaild/dm-0 D ffff9ebffa4c9660 0 664 2 0x00000000
May 1 09:06:09 db01 kernel: Call Trace:
May 1 09:06:09 db01 kernel: [<ffffffffa798c3e9>] schedule+0x29/0x70
May 1 09:06:09 db01 kernel: [<ffffffffc0438727>] xfs_log_force+0x157/0x2e0 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffa72dadf0>] ? wake_up_state+0x20/0x20
May 1 09:06:09 db01 kernel: [<ffffffffc0444a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffc0444c00>] xfsaild+0x190/0x780 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffc0444a70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffa72c5e61>] kthread+0xd1/0xe0
May 1 09:06:09 db01 kernel: [<ffffffffa72c5d90>] ? insert_kthread_work+0x40/0x40
May 1 09:06:09 db01 kernel: [<ffffffffa7999df7>] ret_from_fork_nospec_begin+0x21/0x21
May 1 09:06:09 db01 kernel: [<ffffffffa72c5d90>] ? insert_kthread_work+0x40/0x40
May 1 09:06:09 db01 kernel: INFO: task mariadbd:18328 blocked for more than 120 seconds.
May 1 09:06:09 db01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 1 09:06:09 db01 kernel: mariadbd D ffff9ea942cfe8e0 0 18328 1 0x00000080
May 1 09:06:09 db01 kernel: Call Trace:
May 1 09:06:09 db01 kernel: [<ffffffffa798c3e9>] schedule+0x29/0x70
May 1 09:06:09 db01 kernel: [<ffffffffa798a0b1>] schedule_timeout+0x221/0x2d0
May 1 09:06:09 db01 kernel: [<ffffffffa7553f29>] ? __blk_run_queue+0x39/0x50
May 1 09:06:09 db01 kernel: [<ffffffffa7557c5c>] ? blk_queue_bio+0x31c/0x400
May 1 09:06:09 db01 kernel: [<ffffffffa726d39e>] ? kvm_clock_get_cycles+0x1e/0x20
May 1 09:06:09 db01 kernel: [<ffffffffa798bc9d>] io_schedule_timeout+0xad/0x130
May 1 09:06:09 db01 kernel: [<ffffffffa798ca1d>] wait_for_completion_io+0xfd/0x140
May 1 09:06:09 db01 kernel: [<ffffffffa72dadf0>] ? wake_up_state+0x20/0x20
May 1 09:06:09 db01 kernel: [<ffffffffa75596a4>] blkdev_issue_flush+0xb4/0x110
May 1 09:06:09 db01 kernel: [<ffffffffc04310f9>] xfs_blkdev_issue_flush+0x19/0x20 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffc0418e99>] xfs_file_fsync+0x199/0x1c0 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffa7484227>] do_fsync+0x67/0xb0
May 1 09:06:09 db01 kernel: [<ffffffffa7999ec9>] ? system_call_after_swapgs+0x96/0x13a
May 1 09:06:09 db01 kernel: [<ffffffffa7484533>] SyS_fdatasync+0x13/0x20
May 1 09:06:09 db01 kernel: [<ffffffffa7999f92>] system_call_fastpath+0x25/0x2a
May 1 09:06:09 db01 kernel: [<ffffffffa7999ed5>] ? system_call_after_swapgs+0xa2/0x13a
May 1 09:06:09 db01 kernel: INFO: task mariadbd:18806 blocked for more than 120 seconds.
May 1 09:06:09 db01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 1 09:06:09 db01 kernel: mariadbd D ffff9ebffd9f5860 0 18806 1 0x00000080
May 1 09:06:09 db01 kernel: Call Trace:
May 1 09:06:09 db01 kernel: [<ffffffffa798c3e9>] schedule+0x29/0x70
May 1 09:06:09 db01 kernel: [<ffffffffc0438c03>] xfs_log_force_lsn+0x1a3/0x370 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffa72dadf0>] ? wake_up_state+0x20/0x20
May 1 09:06:09 db01 kernel: [<ffffffffc0418dfd>] xfs_file_fsync+0xfd/0x1c0 [xfs]
May 1 09:06:09 db01 kernel: [<ffffffffa7484227>] do_fsync+0x67/0xb0
May 1 09:06:09 db01 kernel: [<ffffffffa7999ec9>] ? system_call_after_swapgs+0x96/0x13a
May 1 09:06:09 db01 kernel: [<ffffffffa7484533>] SyS_fdatasync+0x13/0x20
May 1 09:06:09 db01 kernel: [<ffffffffa7999f92>] system_call_fastpath+0x25/0x2a
May 1 09:06:09 db01 kernel: [<ffffffffa7999ed5>] ? system_call_after_swapgs+0xa2/0x13a
May 1 09:06:09 db01 kernel: INFO: task kworker/3:0:17452 blocked for more than 120 seconds.
May 1 09:06:09 db01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 1 09:06:09 db01 kernel: kworker/3:0 D ffff9eb22df65860 0 17452 2 0x00000080
May 1 09:06:09 db01 kernel: Workqueue: xfs-cil/dm-0 xlog_cil_push_work [xfs]
May 1 09:06:09 db01 kernel: Call Trace:
May 1 09:06:55 db01 pmm-agent: #033[36mINFO#033[0m[2022-05-01T09:06:55.745-07:00] 2022-05-01T16:06:29.836Z#011error#011VictoriaMetrics/lib/promscrape/scrapework.go:354#011error when scraping "http://127.0.0.1:42000/metrics?collect%5B%5D=buddyinfo&collect%5B%5D=cpu&collect%5B%5D=diskstats&collect%5B%5D=filefd&collect%5B%5D=filesystem&collect%5B%5D=loadavg&collect%5B%5D=meminfo&collect%5B%5D=meminfo_numa&collect%5B%5D=netdev&collect%5B%5D=netstat&collect%5B%5D=processes&collect%5B%5D=standard.go&collect%5B%5D=standard.process&collect%5B%5D=stat&collect%5B%5D=textfile.hr&collect%5B%5D=time&collect%5B%5D=vmstat" from job "node_exporter_agent_id_42c51671-7d6b-4426-be45-59f78a59dbf7_hr-5s" with labels {agent_id="/agent_id/42c51671-7d6b-4426-be45-59f78a59dbf7",agent_type="node_exporter",instance="/agent_id/42c51671-7d6b-4426-be45-59f78a59dbf7",job="node_exporter_agent_id_42c51671-7d6b-4426-be45-59f78a59dbf7_hr-5s",machine_id="/machine_id/cc4015d81dd64439ae059694373f1630",node_id="/node_id/34b0314f-1f30-42b2-94f4-0950ace8320e",node_name="db01.phx.loc",node_type="generic"}: error when scraping "http://127.0.0.1:42000/metrics?collect%5B%5D=buddyinfo&collect%5B%5D=cpu&collect%5B%5D=diskstats&collect%5B%5D=filefd&collect%5B%5D=filesystem&collect%5B%5D=loadavg&collect%5B%5D=meminfo&collect%5B%5D=meminfo_numa&collect%5B%5D=netdev&collect%5B%5D=netstat&collect%5B%5D=processes&collect%5B%5D=standard.go&collect%5B%5D=standard.process&collect%5B%5D=stat&collect%5B%5D=textfile.hr&collect%5B%5D=time&collect%5B%5D=vmstat" with timeout 4.5s: timeout #033[36magentID#033[0m=/agent_id/b0fbda86-3080-411e-805c-1be676a7c9f8 #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=vm_agent
May 1 09:09:11 db01 mariadbd: 2022-05-01 9:09:11 0 [Warning] InnoDB: A long semaphore wait:
May 1 09:09:11 db01 mariadbd: --Thread 139645429868288 has waited at btr0cur.cc line 1487 for 546.00 seconds the semaphore:
May 1 09:09:11 db01 mariadbd: SX-lock on RW-latch at 0x7efdfc031760 created in file dict0dict.cc line 2165
May 1 09:09:11 db01 mariadbd: a writer (thread id 139764553910016) has reserved it in mode SX
May 1 09:09:11 db01 mariadbd: number of readers 136, waiters flag 1, lock_word: fffff78
May 1 09:09:11 db01 mariadbd: Last time write locked in file btr0cur.cc line 1487
May 1 09:09:11 db01 mariadbd: 2022-05-01 9:09:11 0 [Warning] InnoDB: A long semaphore wait:
May 1 09:09:11 db01 mariadbd: --Thread 139626710697728 has waited at row0ins.cc line 2633 for 546.00 seconds the semaphore:
May 1 09:09:11 db01 mariadbd: SX-lock on RW-latch at 0x7f186411cb78 created in file buf0buf.cc line 1226
May 1 09:11:22 db01 rsyslogd: imjournal: 8218 messages lost due to rate-limiting
May 1 09:14:41 db01 systemd: mariadb.service: main process exited, code=killed, status=6/ABRT
May 1 09:14:41 db01 systemd: Unit mariadb.service entered failed state.
May 1 09:14:41 db01 systemd: mariadb.service failed.
May 1 09:14:46 db01 systemd: mariadb.service holdoff time over, scheduling restart.
May 1 09:14:46 db01 systemd: Stopped MariaDB 10.5.15 database server.
May 1 09:14:46 db01 systemd: Starting MariaDB 10.5.15 database server...
May 1 09:14:46 db01 mariadbd: 2022-05-01 9:14:46 0 [Note] /usr/sbin/mariadbd (mysqld 10.5.15-MariaDB) starting as process 20403 ...
May 1 09:14:46 db01 mariadbd: 2022-05-01 9:14:46 0 [Warning] Could not increase number of max_open_files to more than 32768 (request: 81075)
May 1 09:14:46 db01 mariadbd: 2022-05-01 9:14:46 0 [Note] Using unique option prefix 'myisam_recover' is error-prone and can break in the future. Please use the full name 'myisam-recover-options' instead.
May 1 09:14:46 db01 mariadbd: 2022-05-01 9:14:46 0 [Note] mariadbd: Aria engine: starting recovery
May 1 09:14:47 db01 mariadbd: tables to flush: 1 0
May 1 09:14:47 db01 mariadbd: (0.5 seconds);
May 1 09:14:46 db01 mariadbd: 2022-05-01 9:14:46 0 [Warning] Could not increase number of max_open_files to more than 32768 (request: 81075)
The log indicates that the database is waiting for locks, and that may be because you have reached the open_files_limit. Check it with:

SHOW GLOBAL VARIABLES LIKE 'open_files_limit';
May 1 09:14:46 db01 systemd: Stopped MariaDB 10.5.15 database server.
May 1 09:14:46 db01 systemd: Starting MariaDB 10.5.15 database server...
After you did the binary upgrade from 10.5 to 10.6, did you run the command?

I'll check that link right now. Yes, it's what I did before. Can it be run after the fact, now that the db has been running this way for maybe a couple of months?

Yes, it can be. It is mostly the catalog that is updated to reflect the binary changes, and maybe some other tables.
SHOW GLOBAL VARIABLES LIKE 'open_files_limit';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
show status where variable_name = 'threads_connected';
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'open_files_limit';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| open_files_limit | 32768 |
+------------------+-------+
1 row in set (0.001 sec)
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 1000 |
+-----------------+-------+
1 row in set (0.001 sec)
MariaDB [(none)]> show status where variable_name = 'threads_connected';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_connected | 11 |
+-------------------+-------+
1 row in set (0.001 sec)
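The 32768 shown above is the cap the earlier startup warning complained about (request: 81075), and on a systemd host it usually comes from the service unit, not limits.conf. A minimal drop-in to raise it could look like this (the path and value are illustrative assumptions, not from this thread):

```ini
# /etc/systemd/system/mariadb.service.d/limits.conf  (assumed path)
[Service]
LimitNOFILE=100000
```

After creating it, run `systemctl daemon-reload` and restart mariadb, then re-check with SHOW GLOBAL VARIABLES LIKE 'open_files_limit';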
mysqltuner says no problem with files open:
-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 2d 5h 38m 26s (290M q [1K qps], 17M conn, TX: 6601G, RX: 71G)
[--] Reads / Writes: 89% / 11%
[--] Binary logging is disabled
[--] Physical Memory : 122.8G
[--] Max MySQL memory : 123.7G
[--] Other process memory: 0B
[--] Total buffers: 105.3G global + 18.9M per thread (1000 max threads)
[--] P_S Max memory usage: 72B
[--] Galera GCache Max memory usage: 0B
[!!] Maximum reached memory usage: 118.7G (96.63% of installed RAM)
[!!] Maximum possible memory usage: 123.7G (100.76% of installed RAM)
[!!] Overall possible memory usage with other process exceeded memory
[!!] Slow queries: 16% (48M/290M)
[OK] Highest usage of available connections: 72% (725/1000)
[OK] Aborted connections: 0.00% (5/17936966)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 1% (232K temp sorts / 16M sorts)
[!!] Joins performed without indexes: 47041862
[!!] Temporary tables created on disk: 33% (47M on disk / 142M total)
[OK] Thread cache hit rate: 94% (944K created / 17M connections)
[OK] Table cache hit rate: 99% (284M hits / 284M requests)
[OK] table_definition_cache(4096) is upper than number of tables(302)
[OK] Open file limit used: 0% (62/32K)
[OK] Table locks acquired immediately: 100% (44K immediate / 44K locks)
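For what it's worth, mysqltuner's "maximum possible memory" figure above is just the global buffers plus the per-thread buffers times max_connections. A quick sketch with the (rounded) numbers from the report, so the result lands a hair off the tool's 123.7G:

```python
# mysqltuner-style worst-case memory estimate, figures from the report above
global_buffers_gib = 105.3   # "Total buffers: 105.3G global"
per_thread_mib = 18.9        # "+ 18.9M per thread"
max_threads = 1000           # "(1000 max threads)"

max_possible_gib = global_buffers_gib + per_thread_mib * max_threads / 1024
print(f"{max_possible_gib:.1f}G")  # ~123.7-123.8G, i.e. above the 122.8G of installed RAM
```

This is why the tool flags 100.76% of installed RAM: if all 1000 connections were busy at once, the box would swap or invoke the OOM killer.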
tuning-primer:
MAX CONNECTIONS
Current max_connections = 1000
Current threads_connected = 14
Historic max_used_connections = 725
The number of used connections is 72% of the configured maximum.
Your max_connections variable seems to be fine.
OPEN FILES LIMIT
Current open_files_limit = 32768 files
The open_files_limit should typically be set to at least 2x-3x
that of table_cache if you have heavy MyISAM usage.
Your open_files_limit value seems to be fine
Joins performed without indexes: 47041862

This is crazy and indicates you are missing indexes.

Maximum possible memory usage: 123.7G (100.76% of installed RAM)

This requires attention: you need to be under 100%, including leaving enough room for the OS. Maybe limit the number of connections or the buffers.

Joins performed without indexes: 47041862

You should log all queries not using indexes, look at them, and add suitable indexes to your tables.

Yes, this has been an ongoing thing :). I've brought it up to the db dev many, many times and he assures me that everything is indexed. Every report I run says otherwise, but he insists all is indexed, so I'm kind of stuck on this one.

The best way to address this is to log slow queries with a proper configuration, set log_queries_not_using_indexes to ON in my.cnf, gather all the logged queries into a report, and present it to the db dev.
log_slow_verbosity=query_plan,explain
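Putting that advice together with the log_slow_verbosity setting above, a minimal slow-log section for my.cnf might look like this (the log path is taken from the commented-out lines in the config quoted below; adjust to taste):

```ini
[mysqld]
slow_query_log                = 1
slow_query_log_file           = /var/log/mariadb/mariadb_slow.log
long_query_time               = 1
log_queries_not_using_indexes = ON
log_slow_verbosity            = query_plan,explain
```

With that in place for a day or two of normal traffic, the slow log itself becomes the report to hand to the db dev.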
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
user=mysql
collation-server = utf8_unicode_ci
init-connect='SET NAMES utf8'
character-set-server = utf8
#slow_query_log = 1
#slow_query_log_file = /var/log/mariadb/mariadb_slow.log
log-queries-not-using-indexes = 1
long_query_time = 1
performance_schema = On
# CACHES AND LIMITS
max_connections = 1000 # Changed from 800 to 1000 9-24-2021
# Effective in-memory tmp_table_size is limited to max_heap_table_size
# Changed to 1GB from 536870912 4/2/22
# SET max_heap_table_size=1073741824;
# SET tmp_table_size=1073741824;
# Larger blew up the server by filling /tmp
max_heap_table_size = 1024M # From 512M to 1024M 4/3/22
tmp_table_size = 1024M # Changed together. Done in memory on this date.
open_files_limit = 65535
query_cache_type = Off
query_cache_size = 0
table_definition_cache = 4096
table_open_cache = 5000
thread_cache_size = 50
# TIMEOUTS
#connect_timeout = 10
#interactive_timeout = 3600
wait_timeout = 120 # Changed from 8 days to 120 10/17/21
# MyISAM
#key_buffer_size = 32M # Commented out Jan/29/2021
myisam_recover = FORCE,BACKUP
# SAFETY
innodb = FORCE
innodb_strict_mode = 1
# This doesn't need to be any larger than the max blob/packet/message size.
max_allowed_packet = 16M # Changed from 1G to 16M 4/3/22
max_connect_errors = 1000000
skip_name_resolve # Enabled 9-24-2021
# INNODB
innodb_buffer_pool_size = 104G # 80% of 256B
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 0
#innodb_file_per_table = 1 # Commented out 9-24-2021
#innodb_log_files_in_group = 2 # Commented out 9-24-2021
# Was 16GB. Changed to 1GB 4/2/22
innodb_log_file_size = 1G
default-storage-engine = INNODB
[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid
!includedir /etc/my.cnf.d
| Issue | Recommendation |
|---|---|
| There is a lot of slow queries compared to the overall amount of Queries. | You might want to increase long_query_time or optimize the queries listed in the slow query log |
| The query cache is not enabled. | The query cache is known to greatly improve performance if configured correctly. Enable it by setting query_cache_size to a 2 digit MiB value and setting query_cache_type to 'ON'. Note: If you are using memcached, ignore this recommendation. |
| Too many sorts are causing temporary tables. | Consider increasing sort_buffer_size and/or read_rnd_buffer_size, depending on your system memory limits |
| There are lots of rows being sorted. | While there is nothing wrong with a high amount of row sorting, you might want to make sure that the queries which require a lot of sorting use indexed columns in the ORDER BY clause, as this will result in much faster sorting |
| There are too many joins without indexes. | This means that joins are doing full table scans. Adding indexes for the columns being used in the join conditions will greatly speed up table joins |
| The rate of reading the first index entry is high. | This usually indicates frequent full index scans. Full index scans are faster than table scans but require lots of CPU cycles in big tables; if those tables have or had high volumes of UPDATEs and DELETEs, running 'OPTIMIZE TABLE' might reduce the amount of and/or speed up full index scans. Other than that, full index scans can only be reduced by rewriting queries. |
| The rate of reading data from a fixed position is high. | This indicates that many queries need to sort results and/or do a full table scan, including join queries that do not use indexes. Add indexes where applicable. |
| The rate of reading the next table row is high. | This indicates that many queries are doing full table scans. Add indexes where applicable. |
| Many temporary tables are being written to disk instead of being kept in memory. | Increasing max_heap_table_size and tmp_table_size might help. However some temporary tables are always being written to disk, independent of the value of these variables. To eliminate these you will have to rewrite your queries to avoid those conditions (within a temporary table: presence of a BLOB or TEXT column, or presence of a column bigger than 512 bytes), as mentioned in the beginning of an Article by the Pythian Group |
| MyISAM key buffer (index cache) % used is low. | You may need to decrease the size of key_buffer_size, re-examine your tables to see if indexes have been removed, or examine queries and expectations about what indexes are being used. |
| The rate of opening tables is high. | Opening tables requires disk I/O which is costly. Increasing table_open_cache might avoid this. |
| There are too many threads that are slow to launch. | This generally happens in case of general system overload, as launching a thread is a pretty simple operation. You might want to monitor your system load carefully. |
| Too many clients are aborted. | Clients are usually aborted when they did not close their connection to MySQL properly. This can be due to network issues or code not closing a database handler properly. Check your network and code. |
| The InnoDB log file size is inadequately large. | It is usually sufficient to set innodb_log_file_size to 25% of the size of innodb_buffer_pool_size. A very big innodb_log_file_size slows down the recovery time after a database crash considerably. See also this blog entry |
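For reference, the 25% rule of thumb in the last row works out like this against the buffer pool size from the my.cnf above (just the arithmetic, not a recommendation for this server):

```python
# innodb_log_file_size rule of thumb: ~25% of innodb_buffer_pool_size
buffer_pool_gib = 104          # innodb_buffer_pool_size = 104G in the config
log_file_gib = 0.25 * buffer_pool_gib
print(log_file_gib)  # 26.0 -- versus the 1G currently configured
```

The tension the table notes is real: a log that small can throttle write-heavy workloads with checkpoint flushing, while a very large one stretches crash recovery time.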
The largest db is 1GB.

1 GB or 1 TB?
May 5 15:20:00 cp01 kernel: lockd: server tn01 not responding, still trying
**snip**
May 5 15:20:31 cp01 kernel: lockd: server tn01 OK
May 5 15:20:32 cp01 kernel: lockd: server tn01 not responding, still trying
**snip**
May 5 15:22:11 cp01 kernel: lockd: server tn01 OK
May 5 15:22:12 cp01 kernel: lockd: server tn01 not responding, still trying
**snip**
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 15:25:31 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 15:25:31 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 15:25:31 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 15:25:31 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 15:25:31 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 15:25:32 cp01 kernel: lockd: server tn01 OK
May 5 15:25:32 cp01 kernel: lockd: server tn01 OK
May 5 15:26:32 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:26:32 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:27:38 cp01 kernel: lockd: server tn01 OK
**snip**
May 5 15:27:38 cp01 kernel: lockd: server tn01 OK
May 5 15:28:38 cp01 kernel: lockd: server tn01 not responding, still trying
**snip**
May 5 15:28:38 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 15:29:44 cp01 kernel: call_decode: 90 callbacks suppressed
May 5 15:29:44 cp01 kernel: lockd: server tn01 OK
May 5 15:30:18 cp01 systemd-logind[1019]: New session 6 of user root.
May 5 15:30:18 cp01 systemd[1]: Started Session 6 of user root.
May 5 15:30:59 cp01 kernel: lockd: server tn01 OK
May 5 15:33:38 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 15:33:38 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 15:33:38 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 15:33:39 cp01 kernel: lockd: server tn01 OK
May 5 16:07:06 cp01 systemd[1]: Reloading The Apache HTTP Server.
May 5 16:07:06 cp01 systemd[1]: Reloading The PHP FastCGI Process Manager.
May 5 16:07:06 cp01 systemd[1]: Reloaded The PHP FastCGI Process Manager.
May 5 16:07:06 cp01 systemd[1]: Reloaded The Apache HTTP Server.
May 5 16:07:06 cp01 httpd[1442]: Server configured, listening on: port 443, port 80
May 5 16:10:18 cp01 kernel: rpc_check_timeout: 89 callbacks suppressed
May 5 16:10:18 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 16:10:18 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 16:10:18 cp01 kernel: lockd: server tn01 not responding, still trying
**snip**
May 5 16:10:18 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 16:10:55 cp01 kernel: rpc_check_timeout: 182 callbacks suppressed
May 5 16:10:55 cp01 kernel: lockd: server tn01 not responding, still trying
May 5 16:14:11 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 16:21:38 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
May 5 16:21:43 cp01 systemd[1]: Starting Cleanup of Temporary Directories...
May 5 16:21:43 cp01 systemd[1]: systemd-tmpfiles-clean.service: Succeeded.
May 5 16:21:43 cp01 systemd[1]: Started Cleanup of Temporary Directories.
May 5 16:22:13 cp01 systemd[1]: Starting dnf makecache...
May 5 16:22:14 cp01 dnf[62326]: CentOS Linux 8 - AppStream 20 kB/s | 4.3 kB 00:00
May 5 16:22:14 cp01 dnf[62326]: CentOS Linux 8 - BaseOS 20 kB/s | 3.9 kB 00:00
May 5 16:22:15 cp01 dnf[62326]: CentOS Linux 8 - Extras 20 kB/s | 1.5 kB 00:00
May 5 16:22:15 cp01 dnf[62326]: Extra Packages for Enterprise Linux Modular 8 - 52 kB/s | 19 kB 00:00
May 5 16:22:15 cp01 dnf[62326]: Extra Packages for Enterprise Linux 8 - x86_64 53 kB/s | 16 kB 00:00
May 5 16:22:16 cp01 dnf[62326]: Remi's Modular repository for Enterprise Linux 5.5 kB/s | 3.5 kB 00:00
May 5 16:22:17 cp01 dnf[62326]: Safe Remi's RPM repository for Enterprise Linux 5.8 kB/s | 3.0 kB 00:00
May 5 16:22:19 cp01 dnf[62326]: Webmin Distribution Neutral 1.4 kB/s | 2.9 kB 00:02
May 5 16:22:19 cp01 dnf[62326]: Zabbix Official Repository - x86_64 39 kB/s | 2.9 kB 00:00
May 5 16:22:19 cp01 dnf[62326]: Zabbix Official Repository non-supported - x86_ 82 kB/s | 3.0 kB 00:00
May 5 16:22:20 cp01 dnf[62326]: Module yaml error: Unexpected key in data: static_context [line 9 col 3]
May 5 16:22:20 cp01 dnf[62326]: Module yaml error: Unexpected key in data: static_context [line 9 col 3]
May 5 16:22:20 cp01 dnf[62326]: Metadata cache created.
May 5 16:22:20 cp01 systemd[1]: dnf-makecache.service: Succeeded.
May 5 16:22:20 cp01 systemd[1]: Started dnf makecache.
May 5 16:25:23 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
**snip**
May 5 16:25:23 cp01 kernel: xs_tcp_setup_socket: connect returned unhandled error -107
What do you mean by knowing the server specs? Is there something else I should be sharing and haven't?

Are you running a Galera cluster? Is this a VM? What is your disk setup, including RAID levels?
Yes, fully protected.
load of 100

Are you talking about load average? If so, and CPU usage is still low, you have an I/O issue. If it is the CPU that is reaching 100, you probably have a loop in whichever program is using 100% CPU.
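A quick way to tell those two cases apart on the box while it's slow, using standard Linux proc files (nothing MariaDB-specific):

```shell
# Load averages: high numbers here with mostly-idle CPUs point at I/O or D-state tasks
cat /proc/loadavg

# CPU time split in jiffies: user nice system idle iowait irq softirq ...
# a large, growing iowait column relative to the others confirms an I/O bottleneck
grep -m1 '^cpu ' /proc/stat
```

Tools like vmstat or iostat -x (from sysstat) show the same split more readably if they are installed.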
kernel: xs_tcp_setup_socket: connect returned unhandled error -107
kernel: lockd: server tn01 OK
Considering that the NFS share is disconnected, it is strange to see these errors in the log now.

I dumped the database.
# time tar cvf dump.tar dump.sql
dump.sql
I lost connection to Proxmox. Load was only showing 28 from the CLI of the DB server.
Memory seems to stay around the same, 80% used according to proxmox.
^C^C^C^C^C^@^C
Message from syslogd@db01 at May 6 12:31:28 ...
kernel:NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [mariadbd:38124]
Looking on the net, there are some posts talking about a kernel error in Proxmox that doesn't seem to be fixed yet. My version is 7.1.7.
So maybe I have two problems: Proxmox, and whatever this gateway timeout thing is from the last server as well.
Looks like the pmm agent is causing some sort of problem.
May 7 16:54:08 db01 pmm-agent: #033[36mINFO#033[0m[2022-05-07T16:54:08.513-07:00] 2022-05-07T23:54:08.512Z#011error#011VictoriaMetrics/lib/promscrape/scrapework.go:354#011error when scraping "http://127.0.0.1:42001/metrics?collect%5B%5D=buddyinfo&collect%5B%5D=cpu&collect%5B%5D=diskstats&collect%5B%5D=filefd&collect%5B%5D=filesystem&collect%5B%5D=loadavg&collect%5B%5D=meminfo&collect%5B%5D=meminfo_numa&collect%5B%5D=netdev&collect%5B%5D=netstat&collect%5B%5D=processes&collect%5B%5D=standard.go&collect%5B%5D=standard.process&collect%5B%5D=stat&collect%5B%5D=textfile.hr&collect%5B%5D=time&collect%5B%5D=vmstat" from job "node_exporter_agent_id_5d859f79-ba0b-4aea-af91-d51a5a4a4fb5_hr-5s" with labels {agent_id="/agent_id/5d859f79-ba0b-4aea-af91-d51a5a4a4fb5",agent_type="node_exporter",instance="/agent_id/5d859f79-ba0b-4aea-af91-d51a5a4a4fb5",job="node_exporter_agent_id_5d859f79-ba0b-4aea-af91-d51a5a4a4fb5_hr-5s",machine_id="/machine_id/cc4015d81dd64439ae059694373f1630",node_id="/node_id/2fc49a42-7c1a-459b-9d8c-5c6311b78ec2",node_name="db01.x.loc",node_type="generic"}: error when scraping "http://127.0.0.1:42001/metrics?collect%5B%5D=buddyinfo&collect%5B%5D=cpu&collect%5B%5D=diskstats&collect%5B%5D=filefd&collect%5B%5D=filesystem&collect%5B%5D=loadavg&collect%5B%5D=meminfo&collect%5B%5D=meminfo_numa&collect%5B%5D=netdev&collect%5B%5D=netstat&collect%5B%5D=processes&collect%5B%5D=standard.go&collect%5B%5D=standard.process&collect%5B%5D=stat&collect%5B%5D=textfile.hr&collect%5B%5D=time&collect%5B%5D=vmstat" with timeout 4.5s: timeout #033[36magentID#033[0m=/agent_id/1b0d973f-f754-43ce-9731-4c04909f3331 #033[36mcomponent#033[0m=agent-process #033[36mtype#033[0m=vm_agent
From the MySQL manual on join_buffer_size: There is no gain from setting the buffer larger than required to hold each matching row, and all joins allocate at least the minimum size, so use caution in setting this variable to a large value globally. It is better to keep the global setting small and change to a larger setting only in sessions that are doing large joins. Memory allocation time can cause substantial performance drops if the global size is larger than needed by most queries that use it.
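As a sketch of that advice, the buffer can be raised for a single session that runs a known large join while the global default stays small (the 4M value here is illustrative, not a recommendation):

```sql
-- Keep the global setting small; raise join_buffer_size only in the
-- session about to run a known large join (4M is an arbitrary example).
SET SESSION join_buffer_size = 4 * 1024 * 1024;
-- ... run the large join here ...
SET SESSION join_buffer_size = DEFAULT;
```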
mysqltuner-1-9-8.pl
-------- Storage Engine Statistics -----------------------------------------------------------------
[--] Status: +Aria +CSV +InnoDB +MEMORY +MRG_MyISAM +MyISAM +PERFORMANCE_SCHEMA +SEQUENCE
[--] Data in InnoDB tables: 11.5G (Tables: 109)
[OK] Total fragmented tables: 0
-------- Performance Metrics -----------------------------------------------------------------------
[--] Up for: 4d 6h 42m 47s (579M q [1K qps], 29M conn, TX: 12505G, RX: 123G)
[--] Reads / Writes: 91% / 9%
[--] Binary logging is disabled
[--] Physical Memory : 255.9G
[--] Max MySQL memory : 106.0G
[--] Other process memory: 0B
[--] Total buffers: 96.8G global + 18.9M per thread (500 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 98.9G (38.66% of installed RAM)
[OK] Maximum possible memory usage: 106.0G (41.42% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (736/579M)
[OK] Highest usage of available connections: 23% (118/500)
[OK] Aborted connections: 0.00% (3/29471697)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 1% (450K temp sorts / 30M sorts)
[!!] Joins performed without indexes: 77652387
[!!] Temporary tables created on disk: 33% (78M on disk / 234M total)
[OK] Thread cache hit rate: 99% (28K created / 29M connections)
[OK] Table cache hit rate: 99% (575M hits / 575M requests)
[OK] table_definition_cache(4096) is upper than number of tables(302)
[OK] Open file limit used: 0% (38/80K)
[OK] Table locks acquired immediately: 100% (15K immediate / 15K locks)
-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 96.0G/11.5G
[!!] Ratio InnoDB log file size / InnoDB Buffer pool size (1.04166666666667 %): 1.0G * 1/96.0G should be equal to 25%
[--] Number of InnoDB Buffer Pool Chunk : 768 for 1 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 100.00% (888622438101 hits/ 888622768310 total)
[OK] InnoDB Write log efficiency: 98.08% (18256605 hits/ 18613479 total)
[OK] InnoDB log waits: 0.00% (0 waits / 356874 writes)
-------- Aria Metrics ------------------------------------------------------------------------------
[--] Aria Storage Engine is enabled.
[OK] Aria pagecache size / total Aria indexes: 128.0M/2.5M
[OK] Aria pagecache hit rate: 95.9% (1B cached / 78M reads)
-------- Recommendations ---------------------------------------------------------------------------
We will suggest raising the 'join_buffer_size' until JOINs not using indexes are found.
Temporary table size is already large - reduce result set size
Reduce your SELECT DISTINCT queries without LIMIT clauses
Performance schema should be activated for better diagnostics
Variables to adjust:
join_buffer_size (> 256.0K, or always use indexes with JOINs)
performance_schema=ON
innodb_log_file_size should be (=24G) if possible, so InnoDB total log files size equals to 25% of buffer pool size.
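For what it's worth, the 24G figure is just 25% of the 96G buffer pool. On 10.5 innodb_log_file_size is not dynamic, so the change needs a my.cnf edit and a restart; a sketch:

```sql
-- Check the current value (1G, per the tuner output above):
SHOW GLOBAL VARIABLES LIKE 'innodb_log_file_size';
-- Target is 25% of the 96G buffer pool = 24G. Not dynamic on 10.5,
-- so set it in my.cnf and restart:
--   [mysqld]
--   innodb_log_file_size = 24G
```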
tuning-primer.sh
Uptime = 4 days 6 hrs 59 min 32 sec
Avg. qps = 1566
Total Questions = 580782849
Threads Connected = 7
SLOW QUERIES (slow/long logging enabled via CLI for testing)
The slow query log is NOT enabled.
Current long_query_time = 10.000000 sec.
You have 736 out of 580782919 that take longer than 10.000000 sec. to complete
Your long_query_time seems to be fine
WORKER THREADS
Current thread_cache_size = 50
Current threads_cached = 47
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine
MAX CONNECTIONS
Current max_connections = 500
Current threads_connected = 8
Historic max_used_connections = 118
The number of used connections is 23% of the configured maximum.
Your max_connections variable seems to be fine.
INNODB STATUS
Current InnoDB index space = 3.37 G
Current InnoDB data space = 8.77 G
Current InnoDB buffer pool free = 87 %
Current innodb_buffer_pool_size = 96.00 G
Depending on how much space your innodb indexes take up it may be safe
to increase this value to up to 2 / 3 of total system memory
MEMORY USAGE
Max Memory Ever Allocated : 96.47 G
Configured Max Per-thread Buffers : 1.42 G
Configured Max Global Buffers : 96.14 G
Configured Max Memory Limit : 97.56 G
Physical Memory : 256.00 G
Max memory limit seem to be within acceptable norms
KEY BUFFER
No key reads?!
Seriously look into using some indexes
Current MyISAM index space = 0 bytes
Current key_buffer_size = 128 M
Key cache miss rate is 1 : 0
Key buffer free ratio = 81 %
Your key_buffer_size seems to be fine
QUERY CACHE
Query cache is supported but not enabled
Perhaps you should set the query_cache_size
SORT OPERATIONS
Current sort_buffer_size = 2 M
Current read_rnd_buffer_size = 256 K
Sort buffer seems to be fine
JOINS (digging into this tomorrow)
Current join_buffer_size = 260.00 K
You have had 77878027 queries where a join could not use an index properly
OPEN FILES LIMIT
Current open_files_limit = 80579 files
The open_files_limit should typically be set to at least 2x-3x
that of table_cache if you have heavy MyISAM usage.
Your open_files_limit value seems to be fine
TABLE CACHE
Current table_open_cache = 5000 tables
Current table_definition_cache = 4096 tables
You have a total of 219 tables
You have 847 open tables.
The table_cache value seems to be fine
TEMP TABLES
Current max_heap_table_size = 512 M
Current tmp_table_size = 512 M
Of 235007339 temp tables, 25% were created on disk
Perhaps you should increase your tmp_table_size and/or max_heap_table_size
to reduce the number of disk-based temporary tables
Note! BLOB and TEXT columns are not allow in memory tables.
If you are using these columns raising these values might not impact your
ratio of on disk temp tables.
TABLE SCANS
Current read_buffer_size = 128 K
Current table scan ratio = 16 : 1
read_buffer_size seems to be fine
TABLE LOCKING
Current Lock Wait ratio = 0 : 580786125
Your table locking seems to be fine
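The on-disk temporary table ratio that both tools flag above can be watched directly from the server status counters:

```sql
-- Created_tmp_disk_tables / Created_tmp_tables gives the on-disk
-- ratio (around 25-33% in the tuner output above):
SHOW GLOBAL STATUS LIKE 'Created_tmp%';
```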
You have had 77878027 queries where a join could not use an index properly
This is another indication that indexing is not done correctly.
Total buffers: 96.8G global + 18.9M per thread (500 max threads)
Maybe I have to do something with the configuration to get more threads, like checking the thread pool settings.
mysqltuner-1-9-8
[--] Up for: 7d 7h 25m 50s (926M q [1K qps], 52M conn, TX: 21570G, RX: 182G)
[--] Reads / Writes: 91% / 9%
[--] Binary logging is disabled
[--] Physical Memory : 255.9G
[--] Max MySQL memory : 115.2G
[--] Other process memory: 0B
[--] Total buffers: 96.8G global + 18.9M per thread (1000 max threads)
[--] P_S Max memory usage: 0B
[--] Galera GCache Max memory usage: 0B
[OK] Maximum reached memory usage: 99.1G (38.74% of installed RAM)
[OK] Maximum possible memory usage: 115.2G (45.03% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (339K/926M)
[OK] Highest usage of available connections: 12% (129/1000)
[OK] Aborted connections: 0.00% (4/52855737)
[OK] Query cache is disabled by default due to mutex contention on multiprocessor machines.
[OK] Sorts requiring temporary tables: 1% (765K temp sorts / 53M sorts)
[!!] Joins performed without indexes: 89649292
[!!] Temporary tables created on disk: 33% (91M on disk / 270M total)
[OK] Thread cache hit rate: 99% (46K created / 52M connections)
[OK] Table cache hit rate: 99% (867M hits / 867M requests)
[OK] table_definition_cache(4096) is upper than number of tables(305)
[OK] Open file limit used: 0% (59/80K)
[OK] Table locks acquired immediately: 100% (16K immediate / 16K locks)
-------- InnoDB Metrics ----------------------------------------------------------------------------
[--] InnoDB is enabled.
[--] InnoDB Thread Concurrency: 0
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 96.0G/12.4G
[!!] Ratio InnoDB log file size / InnoDB Buffer pool size (1.04166666666667 %): 1.0G * 1/96.0G should be equal to 25%
[--] Number of InnoDB Buffer Pool Chunk : 768 for 1 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 100.00% (1526151447460 hits/ 1526151777669 total)
[OK] InnoDB Write log efficiency: 98.26% (34430040 hits/ 35041321 total)
[OK] InnoDB log waits: 0.00% (0 waits / 611281 writes)
Variables to adjust:
join_buffer_size (> 256.0K, or always use indexes with JOINs) (it was set to 512 the other day)
performance_schema=ON
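performance_schema is not a dynamic variable, so enabling it takes a my.cnf change and a restart. Once it is on, it can also point at the index-less joins the tuner keeps flagging; a sketch (column names are from the standard events_statements_summary_by_digest table):

```sql
-- Enable in my.cnf and restart (not dynamic):
--   [mysqld]
--   performance_schema = ON
-- Afterwards, the statement digests doing full joins can be listed:
SELECT digest_text, count_star, sum_no_index_used
  FROM performance_schema.events_statements_summary_by_digest
 WHERE sum_no_index_used > 0
 ORDER BY sum_no_index_used DESC
 LIMIT 10;
```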
tuning-primer.sh
Uptime = 7 days 7 hrs 36 min 2 sec
Avg. qps = 1467
Total Questions = 927673293
Threads Connected = 13
SLOW QUERIES
Current long_query_time = 2.000000 sec.
You have 339769 out of 927673547 that take longer than 2.000000 sec. to complete
Your long_query_time seems to be fine
WORKER THREADS
Current thread_cache_size = 50
Current threads_cached = 44
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine
MAX CONNECTIONS
Current max_connections = 1000
Current threads_connected = 11
Historic max_used_connections = 129
The number of used connections is 12% of the configured maximum.
Your max_connections variable seems to be fine.
INNODB STATUS
Current InnoDB index space = 3.43 G
Current InnoDB data space = 9.00 G
Current InnoDB buffer pool free = 86 %
Current innodb_buffer_pool_size = 96.00 G
Depending on how much space your innodb indexes take up it may be safe
to increase this value to up to 2 / 3 of total system memory
MEMORY USAGE
Max Memory Ever Allocated : 96.50 G
Configured Max Per-thread Buffers : 2.84 G
Configured Max Global Buffers : 96.14 G
Configured Max Memory Limit : 98.98 G
Physical Memory : 256.00 G
Max memory limit seem to be within acceptable norms
KEY BUFFER
Current MyISAM index space = 0 bytes
Current key_buffer_size = 128 M
Key cache miss rate is 1 : 0
Key buffer free ratio = 81 %
Your key_buffer_size seems to be fine
QUERY CACHE
Query cache is supported but not enabled
Perhaps you should set the query_cache_size
SORT OPERATIONS
Current sort_buffer_size = 2 M
Current read_rnd_buffer_size = 256 K
Sort buffer seems to be fine
JOINS
Current join_buffer_size = 260.00 K
You have had 89649292 queries where a join could not use an index properly
OPEN FILES LIMIT
Current open_files_limit = 80579 files
The open_files_limit should typically be set to at least 2x-3x
that of table_cache if you have heavy MyISAM usage.
Your open_files_limit value seems to be fine
TABLE CACHE
Current table_open_cache = 5000 tables
Current table_definition_cache = 4096 tables
You have a total of 222 tables
You have 866 open tables.
The table_cache value seems to be fine
TEMP TABLES
Current max_heap_table_size = 512 M
Current tmp_table_size = 512 M
Of 270641808 temp tables, 25% were created on disk
TABLE SCANS
Current read_buffer_size = 128 K
Current table scan ratio = 13 : 1
read_buffer_size seems to be fine
TABLE LOCKING
Current Lock Wait ratio = 0 : 927676855
Your table locking seems to be fine
So as we can see, things have gotten better and we'll make some additional changes next week.
It may or may not have recovered by itself if left alone.
This looks like a deadlock, but the log lines are a bit different from the usual.
Googling suggests the lock may occur in the adaptive hash index; you should probably try disabling it to check whether it is indeed the root cause.
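innodb_adaptive_hash_index is dynamic, so this can be tried without a restart:

```sql
-- Disable the adaptive hash index at runtime to test the hypothesis:
SET GLOBAL innodb_adaptive_hash_index = OFF;
-- To make the change survive a restart, also set it in my.cnf:
--   [mysqld]
--   innodb_adaptive_hash_index = 0
```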
Other actions include searching for the actual deadlock, starting with the InnoDB status page.
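For the status page, the SEMAPHORES and LATEST DETECTED DEADLOCK sections are the ones to read first:

```sql
SHOW ENGINE INNODB STATUS\G
```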