Percona cluster member randomly crashes

Hi, I have a Percona cluster with 3 nodes. Every 3 or 4 days, one node in the cluster randomly crashes, and the log shows this:

05:05:00 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona XtraDB Cluster better by reporting any
bugs at https://bugs.launchpad.net/percona-xtradb-cluster

key_buffer_size=8388608
read_buffer_size=4194304
max_used_connections=45
max_threads=802
thread_count=27
connection_count=2
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6590372 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0xa09eba0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fa617fedd38 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8f97d5]
/usr/sbin/mysqld(handle_fatal_signal+0x4b4)[0x6655c4]
/lib64/libpthread.so.0(+0xf710)[0x7fa78c98e710]
/usr/sbin/mysqld(_Z11ull_get_keyPKhPmc+0x14)[0x5fb2b4]
/usr/sbin/mysqld(my_hash_first_from_hash_value+0x6b)[0x8e2c4b]
/usr/sbin/mysqld(my_hash_search+0x11)[0x8e2e31]
/usr/sbin/mysqld(_ZN22Item_func_release_lock7val_intEv+0x10f)[0x60068f]
/usr/sbin/mysqld(_ZN4Item4sendEP8ProtocolP6String+0x1c4)[0x5b06d4]
/usr/sbin/mysqld(_ZN8Protocol19send_result_set_rowEP4ListI4ItemE+0xc7)[0x65ef47]
/usr/sbin/mysqld(_ZN11select_send9send_dataER4ListI4ItemE+0x67)[0x6ae287]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x521)[0x6c9e81]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x250)[0x7120e0]
/usr/sbin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x187)[0x712967]
/usr/sbin/mysqld[0x6e836d]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3cdb)[0x6ed50b]
/usr/sbin/mysqld(_ZN18Prepared_statement7executeEP6Stringb+0x40e)[0x7002ae]
/usr/sbin/mysqld(_ZN18Prepared_statement12execute_loopEP6StringbPhS2_+0xde)[0x7044ae]
/usr/sbin/mysqld(_Z22mysql_sql_stmt_executeP3THD+0xbe)[0x70500e]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1324)[0x6eab54]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x658)[0x6f0338]
/usr/sbin/mysqld[0x6f0491]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x19d5)[0x6f2675]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x22b)[0x6f3b5b]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6bc30f]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x6bc4f7]
/usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xaf38ba]
/lib64/libpthread.so.0(+0x79d1)[0x7fa78c9869d1]
/lib64/libc.so.6(clone+0x6d)[0x7fa78ae8a8fd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fa4dc004178): is an invalid pointer
Connection ID (thread ID): 6222895
Status: NOT_KILLED

You may download the Percona XtraDB Cluster operations manual by visiting
http://www.percona.com/software/percona-xtradb-cluster/. You may find information
in the manual which will help you identify the cause of the crash.
150608 12:05:00 mysqld_safe Number of processes running now: 0
150608 12:05:00 mysqld_safe WSREP: not restarting wsrep node automatically
150608 12:05:00 mysqld_safe mysqld from pid file /var/lib/mysql/RR-Cluster-DB2.pid ended


Can any MySQL/Percona expert help me?
Some other information:

1. Server version: 5.6.21-70.1-56-log Percona XtraDB Cluster (GPL), Release rel70.1, Revision 938, WSREP version 25.8, wsrep_25.8.r4150

2. RAM: 32 GB, CPU: 24 cores, 4 HDDs in RAID 10

3. datafile: 4G

4. file: /etc/my.conf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

## REPLICATE ##
# Path to Galera library
wsrep_provider=/usr/lib64/libgalera_smm.so

wsrep_provider_options="gcache.size = 1G; gcache.page_size = 512M; gcs.fc_limit = 512"

wsrep_slave_threads=24

wsrep_restart_slave=1

wsrep_forced_binlog_format=ROW

# Cluster connection URL contains IPs of node#1, node#2 and node#3
wsrep_cluster_address=gcomm://192.168.1.xxx,192.168.1.yyy,192.168.1.zzz
# In order for Galera to work correctly binlog format should be ROW
binlog_format=ROW
# MyISAM storage engine has only experimental support
default_storage_engine=InnoDB
# This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
innodb_autoinc_lock_mode=2
# Node #2 address
wsrep_node_address=192.168.1.xxx
# Cluster name
wsrep_cluster_name=my_centos_cluster
# SST method
wsrep_sst_method=xtrabackup-v2
#Authentication for SST method
wsrep_sst_auth="aaabbbccc:dddeeefff"

# Maximum number of rows in write set
wsrep_max_ws_rows=262144
# Maximum size of write set
wsrep_max_ws_size=2147483648

#################### TUNING ########################

###### Slow query log

slow_query_log=1

slow_query_log_file =/var/log/mysql/slow_queries.log

long_query_time=4

connect_timeout=300

skip_name_resolve

innodb_flush_log_at_trx_commit=2

innodb_file_per_table=1

max_allowed_packet=1G

max_connect_errors=1000000

innodb_buffer_pool_size=4G

read_buffer_size=4M

read_rnd_buffer_size=4M

join_buffer_size=8M

sort_buffer_size=4M

innodb_log_buffer_size=16M

thread_cache_size=256

innodb_additional_mem_pool_size=32M

innodb_flush_method=O_DIRECT

log_queries_not_using_indexes=1

innodb_thread_concurrency=0

wait_timeout=300

interactive_timeout=300

max_connections=800

innodb_fast_shutdown=0

open_files_limit=10000

table_open_cache=3000

tmp_table_size=32M

max_heap_table_size=32M

##### Set Ramdisk #####

tmpdir = /usr/mysqltmp

#######################
Asked by: Rossonero224

gheist Commented:
I'd doubt the system stack, as it looks corrupt, i.e. rolled over.
Can you run mysqltuner.pl after the DB has been running for 24 hours? To me, an 800-thread limit seems like overkill for 45 user connections, and no thread cache...
32 GB can hold the whole DB in RAM, but that does not seem to be the case.

Can you post MySQL build options?
VISUAL=cat mysqlbug | grep ^Co
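For a rough sense of scale on the thread limit, the formula printed in the crash log can be evaluated directly. This is a minimal Python sketch using the values from the log and `sort_buffer_size=4M` from the posted config; the helper name `worst_case_kib` and the 45-thread comparison are made up for illustration. Evaluated literally, the first result comes out slightly below the 6590372 K that the server printed, so the server presumably counts something beyond the three variables it names.

```python
def worst_case_kib(key_buffer_size, read_buffer_size, sort_buffer_size, max_threads):
    """Evaluate key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads, in KiB."""
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    return total_bytes // 1024

# Values taken from the crash log; sort_buffer_size=4M comes from the posted config.
log_limit = worst_case_kib(8388608, 4194304, 4 * 1024 * 1024, 802)

# Same formula with max_threads set to the observed peak of 45 connections
# (a hypothetical setting, only for comparison).
peak_usage = worst_case_kib(8388608, 4194304, 4 * 1024 * 1024, 45)

print(log_limit, peak_usage)
```

This only covers the per-thread buffers named in the formula; the InnoDB buffer pool and other allocations are not included.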
Rossonero224 (Author) Commented:
Thank you for your answer. I have attached the mysqltuner.pl and mysqlbug results in the file mysqltuner.pl-VISUAL.doc.
Please take a look.
mysqltuner.pl-VISUAL.doc
gheist Commented:
DB1 is 32-bit. Is that intentional?
Rossonero224 (Author) Commented:
Sorry, the result for DB1 was wrong: my teammate collected the data from his VM by mistake. Here is the correct data.
mysqltuner.pl-VISUAL-new.doc
gheist Commented:
There is nothing dangerous (i.e. likely to cause crashes) in your configuration.
This has to be investigated by Percona, around this:
my_hash_first_from_hash_value
It could be a hash collision, etc.
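To see which frames sit around `my_hash_first_from_hash_value` before filing a report, the backtrace lines from the error log can be parsed mechanically. The following is a minimal Python sketch, not an official tool; the helper names `parse_frames` and `frames_around` are hypothetical, and the sample is a few frames copied from the log in the question.

```python
import re

# Matches frames such as: /usr/sbin/mysqld(my_hash_search+0x11)[0x8e2e31]
# The "(symbol+offset)" part is optional, and the symbol itself may be empty.
FRAME_RE = re.compile(
    r"^(?P<binary>\S+?)(?:\((?P<symbol>[^+()]*)\+0x[0-9a-f]+\))?\[0x(?P<addr>[0-9a-f]+)\]$"
)

def parse_frames(backtrace):
    """Return (binary, symbol or None, address) for each frame line in the text."""
    frames = []
    for line in backtrace.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group("binary"), m.group("symbol") or None, int(m.group("addr"), 16)))
    return frames

def frames_around(frames, needle, context=1):
    """Return the symbols within `context` frames of the first symbol containing `needle`."""
    symbols = [symbol for _, symbol, _ in frames]
    for i, symbol in enumerate(symbols):
        if symbol and needle in symbol:
            return symbols[max(0, i - context): i + context + 1]
    return []

sample = """\
/lib64/libpthread.so.0(+0xf710)[0x7fa78c98e710]
/usr/sbin/mysqld(_Z11ull_get_keyPKhPmc+0x14)[0x5fb2b4]
/usr/sbin/mysqld(my_hash_first_from_hash_value+0x6b)[0x8e2c4b]
/usr/sbin/mysqld(my_hash_search+0x11)[0x8e2e31]
/usr/sbin/mysqld(_ZN22Item_func_release_lock7val_intEv+0x10f)[0x60068f]
"""

frames = parse_frames(sample)
print(frames_around(frames, "my_hash_first_from_hash_value"))
```

The mangled C++ names can then be passed through `c++filt`; for example, `_ZN22Item_func_release_lock7val_intEv` demangles to `Item_func_release_lock::val_int()`, which suggests the hash lookup was reached through a `RELEASE_LOCK()` call.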
Rossonero224 (Author) Commented:
When I run the database with only 1 node, it runs fine without crashing.
So I think the problem might be related to the Galera software.
gheist Commented:
The stack contains MySQL functions, not Galera ones.
Do all nodes crash at times, or just one? In the latter case, run memtest86(+) for a night or so and get rid of the bad RAM.
Rossonero224 (Author) Commented:
Only 1 node fails each time. It crashes even during periods of low utilisation.
I will try running memtest86, but I don't think the memory is bad on all 3 nodes :(
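To compare crash times across the three nodes, the crash banners can be pulled out of each node's error log. This is a minimal Python sketch that assumes the banner format shown in the log above (`HH:MM:SS UTC - mysqld got signal N`); the function name `crash_events` is made up for illustration.

```python
import re

# Matches the crash banner, e.g. "05:05:00 UTC - mysqld got signal 11 ;"
SIGNAL_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) UTC - mysqld got signal (\d+)")

def crash_events(log_text):
    """Return (utc_time, signal_number) for each crash banner found in an error log."""
    events = []
    for line in log_text.splitlines():
        m = SIGNAL_RE.match(line)
        if m:
            events.append((m.group(1), int(m.group(2))))
    return events

sample = "05:05:00 UTC - mysqld got signal 11 ;\nThis could be because you hit a bug.\n"
print(crash_events(sample))
```

Running it on the error log text of each node would show whether the crashes cluster at a particular time of day, such as a scheduled job.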
gheist Commented:
Just test 1 node per night, and by Sunday you will have an idea whether it is a bug or bad hardware.
Rossonero224 (Author) Commented:
I tested it for 2 days and didn't see any memory errors.
Does anyone have any other ideas?
gheist Commented:
So you have caught a software bug and need to report it to Percona.
gheist Commented:
https:#a40819842 - it has to be investigated by Percona.