Rossonero224
asked on
Percona cluster member randomly crashes
Hi, I have a Percona cluster with 3 nodes. Every 3 or 4 days, one node in the cluster crashes at random. The log shows the following:
05:05:00 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster
key_buffer_size=8388608
read_buffer_size=4194304
max_used_connections=45
max_threads=802
thread_count=27
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6590372 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0xa09eba0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa617fedd38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8f97d5]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6655c4]
/lib64/libpthread.so.0(+0xf710)[0x7fa78c98e710]
/usr/sbin/mysqld(_Z11ull_get_keyPKhPmc+0x14)[0x5fb2b4]
/usr/sbin/mysqld(my_hash_first_from_hash_value+0x6b)[0x8e2c4b]
/usr/sbin/mysqld(my_hash_search+0x11)[0x8e2e31]
/usr/sbin/mysqld(_ZN22Item_func_release_lock7val_intEv+0x10f)[0x60068f]
/usr/sbin/mysqld(_ZN4Item4sendEP8ProtocolP6String+0x1c4)[0x5b06d4]
/usr/sbin/mysqld(_ZN8Protocol19send_result_set_rowEP4ListI4ItemE+0xc7)[0x65ef47]
/usr/sbin/mysqld(_ZN11select_send9send_dataER4ListI4ItemE+0x67)[0x6ae287]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x521)[0x6c9e81]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x250)[0x7120e0]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x187)[0x712967]
/usr/sbin/mysqld[0x6e836d]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3cdb)[0x6ed50b]
/usr/sbin/mysqld(_ZN18Prepared_statement7executeEP6Stringb+0x40e)[0x7002ae]
/usr/sbin/mysqld(_ZN18Prepared_statement12execute_loopEP6StringbPhS2_+0xde)[0x7044ae]
/usr/sbin/mysqld(_Z22mysql_sql_stmt_executeP3THD+0xbe)[0x70500e]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1324)[0x6eab54]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x658)[0x6f0338]
/usr/sbin/mysqld[0x6f0491]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19d5)[0x6f2675]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x22b)[0x6f3b5b]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6bc30f]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x6bc4f7]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xaf38ba]
/lib64/libpthread.so.0(+0x79d1)[0x7fa78c9869d1]
/lib64/libc.so.6(clone+0x6d)[0x7fa78ae8a8fd]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fa4dc004178): is an invalid pointer
Connection ID (thread ID): 6222895
Status: NOT_KILLED
You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
150608 12:05:00 mysqld_safe Number of processes running now: 0
150608 12:05:00 mysqld_safe WSREP: not restarting wsrep node automatically
150608 12:05:00 mysqld_safe mysqld from pid file /var/lib/mysql/RR-Cluster-DB2.pid ended
Can any MySQL/Percona expert help me?
Some other information
1. Server version: 5.6.21-70.1-56-log Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150
2. Ram 32G, CPU 24 core, 4 HDD - raid 10
3. datafile: 4G
4. file: /etc/my.conf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
## REPLICATE ##
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so
wsrep_provider_options="gcache.size = 1G; gcache.page_size = 512M; gcs.fc_limit = 512"
wsrep_slave_threads=24
wsrep_restart_slave=1
wsrep_forced_binlog_format=ROW
# Cluster connection URL contains IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://192.168.1.xxx,192.168.1.yyy,192.168.1.zzz
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #2 address
wsrep_node_address=192.168.1.xxx
# Cluster name
wsrep_cluster_name=my_centos_cluster
# SST method
wsrep_sst_method=xtrabackup-v2
#Authentication for SST method
wsrep_sst_auth="aaabbbccc:dddeeefff"
# Maximum number of rows in write set
wsrep_max_ws_rows=262144
# Maximum size of write set
wsrep_max_ws_size=2147483648
#################### TUNING ########################
###### Slow query log
slow_query_log=1
slow_query_log_file =/var/log/mysql/slow_queries.log
long_query_time=4
connect_timeout=300
skip_name_resolve
innodb_flush_log_at_trx_commit=2
innodb_file_per_table=1
max_allowed_packet=1G
max_connect_errors=1000000
innodb_buffer_pool_size=4G
read_buffer_size=4M
read_rnd_buffer_size=4M
join_buffer_size=8M
sort_buffer_size=4M
innodb_log_buffer_size=16M
thread_cache_size=256
innodb_additional_mem_pool_size=32M
innodb_flush_method=O_DIRECT
log_queries_not_using_indexes=1
innodb_thread_concurrency=0
wait_timeout=300
interactive_timeout=300
max_connections=800
innodb_fast_shutdown=0
open_files_limit=10000
table_open_cache=3000
tmp_table_size=32M
max_heap_table_size=32M
##### Set Ramdisk #####
tmpdir = /usr/mysqltmp
#######################
ASKER
Thank you for your answer. I have attached the mysqltuner.pl and mysqlbug results in the file mysqltuner.pl-VISUAL.doc.
Please take a look.
mysqltuner.pl-VISUAL.doc
DB1 is 32bit - is it intentional?
ASKER
Sorry, the DB1 result was wrong; my teammate collected the data from his VM by mistake. Here is the correct data.
mysqltuner.pl-VISUAL-new.doc
ASKER
When I run the database with only 1 node, it runs fine without crashing.
So I think the problem might be related to the Galera software.
The stack contains mysql functions, not galera.
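One way to check this is to group the backtrace frames by the binary they belong to. The following is a minimal Python sketch, assuming the frame lines are pasted from the error log with the stray spaces removed; `FRAMES`, `FRAME_RE` and `parse_frame` are illustrative names, not part of any MySQL or Percona tool, and only a few of the frames are included here.

```python
import re
from collections import Counter

# A few frames copied from the crash log in the question (stray spaces removed).
FRAMES = """\
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6655c4]
/lib64/libpthread.so.0(+0xf710)[0x7fa78c98e710]
/usr/sbin/mysqld(_Z11ull_get_keyPKhPmc+0x14)[0x5fb2b4]
/usr/sbin/mysqld(my_hash_search+0x11)[0x8e2e31]
/usr/sbin/mysqld(_ZN22Item_func_release_lock7val_intEv+0x10f)[0x60068f]
/usr/sbin/mysqld[0x6e836d]
/lib64/libc.so.6(clone+0x6d)[0x7fa78ae8a8fd]
"""

# Frame layout: binary path, optional "(symbol+offset)", then "[address]".
FRAME_RE = re.compile(
    r"^(?P<binary>[^\s(\[]+)"
    r"(?:\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-f]+)\))?"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)

def parse_frame(line):
    """Return the parts of one backtrace line as a dict, or None if it does not match."""
    match = FRAME_RE.match(line.strip())
    return match.groupdict() if match else None

frames = [f for f in map(parse_frame, FRAMES.splitlines()) if f]

# Count frames per binary and pick out anything that lives in a Galera library.
by_binary = Counter(f["binary"] for f in frames)
galera_frames = [f for f in frames if "galera" in f["binary"]]

for binary, count in by_binary.items():
    print(binary, count)
print("frames from a galera library:", len(galera_frames))
```

In the full trace posted above, every frame resolves to mysqld, libpthread or libc, and none to libgalera_smm.so. The frames closest to the crash name `ull_get_key` and `Item_func_release_lock`, reached through `Prepared_statement`, which may be worth mentioning in a bug report.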
Do all nodes crash at times, or just one? In the latter case, run memtest86(+) for a night or so and get rid of the bad RAM.
ASKER
Only 1 node fails each time. It crashes even during periods of low utilisation.
I will try to run memtest86, but I don't think the memory is bad on all 3 nodes :(
Just test 1 node per night, and by Sunday you will have an idea whether it is a bug or bad hardware.
ASKER
I tested it for 2 days and don't see any memory errors.
Does anyone have any other ideas?
So you caught a software bug and need to report it to Percona.
https:#a40819842 - it has to be investigated by Percona.
Can you run mysqltuner.pl after the DB has been running for 24h? To me it seems the 800 thread limit is overkill for 45 user connections, and no thread cache....
32GB can keep the whole DB in RAM, but that does not seem to be the case.
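To put numbers on the thread limit, the formula printed in the crash log can be evaluated directly. This is a rough Python sketch using the values from the log and the posted config; it covers only the buffers named in that formula and is not a full accounting of mysqld memory use.

```python
KIB = 1024

key_buffer_size = 8388608          # from the crash log
read_buffer_size = 4194304         # from the crash log
sort_buffer_size = 4 * 1024 ** 2   # sort_buffer_size=4M in the posted config
max_threads = 802                  # from the crash log
max_used_connections = 45          # from the crash log

def buffer_estimate_kib(threads):
    """key_buffer_size + (read_buffer_size + sort_buffer_size) * threads, in KiB."""
    total = key_buffer_size + (read_buffer_size + sort_buffer_size) * threads
    return total // KIB

estimate_kib = buffer_estimate_kib(max_threads)
observed_kib = buffer_estimate_kib(max_used_connections)

print("at max_threads:         ", estimate_kib, "K")
print("at max_used_connections:", observed_kib, "K")
```

This gives 6578176 K at the configured limit against 376832 K for the 45 connections actually seen. The log itself prints 6590372 K, about 12 MB more; the gap is presumably per-connection overhead that the printed formula does not show, which is an assumption on my part.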
Can you post MySQL build options?
VISUAL=cat mysqlbug | grep ^Co