roundcube slow response cyrus-imap

I am facing an issue with cyrus-imap and Roundcube. Roundcube takes a long time to log in and loads very slowly. I am using a kolab-community server. Can anyone help me with this?

Thanks

David Favor

1) Start by installing Kolab 16 on Ubuntu Bionic. CentOS kernels, even for 7.x series, are many years old. Use Ubuntu Bionic to ensure you have all Kernel bug fixes + speed enhancements.

2) Make sure your cyrus-imap + RoundCube are both running on the same machine or inside the same container.

3) Ensure your host is set correctly...

$config['default_host'] = 'localhost';

So use localhost, not an IP or host/domain name, as localhost runs at loopback speed; any other setting involves the entire TCP stack.

4) Verify you only have one authentication method. If you have a series of authentication methods, then each failing method must timeout before authentication can complete.

Best to set only one authentication method.

5) Best to remove PAM from the entire authentication equation.

6) If performance problems persist, then you'll need to debug your system.

Swapping is the biggest killer. You can see this easily with top.
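For a check that can be scripted rather than read off top, here is a minimal sketch in Python. It assumes a Linux system that exposes /proc/meminfo with the usual SwapTotal and SwapFree fields; the function names are made up for this example. Inside an OpenVZ container the figures may be virtualized, so treat them as a hint rather than a measurement.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo text into a dict of integer values (mostly KiB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def swap_used_kib(meminfo):
    """Swap in use, in KiB: SwapTotal minus SwapFree."""
    return meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            info = parse_meminfo(fh.read())
    except OSError as exc:
        print(f"cannot read /proc/meminfo: {exc}")
    else:
        used = swap_used_kib(info)
        print(f"swap used: {used} KiB of {info.get('SwapTotal', 0)} KiB")
        if used > 0:
            print("swap is in use; look at memory before anything else")
```

Running it from cron every few minutes and keeping the output gives a rough history of when swap use starts.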

Next check all related logs - cyrus-imap, RoundCube, MariaDB/MySQL (if they're involved).

Most likely you'll find the problem in your logs.

Going over logs can be grueling.

Hang in there!
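To see whether the localhost advice in step 3 makes a measurable difference on a given box, one option is to time the TCP connect to the IMAP port for each candidate host name. This is only a sketch: port 143 is the standard IMAP port but is an assumption about this setup, and connect_time is a name invented for the example.

```python
import socket
import sys
import time


def connect_time(host, port=143, timeout=3.0):
    """Return the seconds taken to open (and close) a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return time.perf_counter() - start


if __name__ == "__main__":
    # Pass the host names to compare, e.g. localhost and the server's FQDN.
    for host in sys.argv[1:] or ["localhost"]:
        try:
            print(f"{host}: {connect_time(host) * 1000:.1f} ms")
        except OSError as exc:
            print(f"{host}: connection failed ({exc})")
```

The measured time includes name resolution, so a slow resolver shows up here as well.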

noci

Use an IMAP proxy, so that a web-based front end's connection to the backend is cached.
(Roundcube does not keep its current connection data once it disconnects, and the IMAP backend also loses interest on disconnect.)

Imapproxy keeps a link to the IMAP backend alive for some time, so the front end can disconnect and reconnect without too much hassle.
http://www.imapproxy.org/

That saves a lot of re-establishing of context on the IMAP backend.

vijay kumar

ASKER

Thanks for the prompt reply, David Favor. I did everything you mentioned. I have been using this server for 3 years. It worked fine until yesterday; the issue started yesterday.
Please let me know how I can enable the cyrus-imapd debug logs.

Thanks

vijay kumar

ASKER

Thanks for the reply, noci. We are already using kolab-guam as a proxy for Roundcube. Please let me know if you want me to share any logs.

Thanks

vijay kumar

ASKER

I am still waiting for a response from the Experts. Requesting the moderator to keep this ticket open.
Thanks

David Favor

Ah... If problem began yesterday, then likely problem relates to some change made.

Mention any site changes that were made. This will likely help.

Log files normally live under /var/log/* unless some custom log location has been input.

Log file names vary. You'll just have to get into /var/log/* + look for correct logs.
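Since log file names vary, one way to narrow the search is to list only the files under /var/log that were written recently. The sketch below assumes Python is available on the server; recent_logs and its 24-hour default are choices made for this example, not part of any Kolab or Cyrus tooling.

```python
import os
import time


def recent_logs(root="/var/log", max_age_hours=24):
    """Return paths under root modified within max_age_hours, newest first."""
    cutoff = time.time() - max_age_hours * 3600
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.path.getmtime(path)
            except OSError:
                continue  # file vanished or is unreadable
            if mtime >= cutoff:
                found.append((mtime, path))
    return [path for _mtime, path in sorted(found, reverse=True)]


if __name__ == "__main__":
    for path in recent_logs():
        print(path)
```

Files that the mail stack writes to will usually be near the top of the list while users are logging in.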

Big Tip: Don't make any other changes. If your system was working + abruptly stopped, likely the problem will be simple to identify + fix will be easy. If you start making many changes... in a hurry... doing problem identification + resolution becomes infinitely more difficult.

vijay kumar

ASKER

No changes were made. Please see the attached logs for your reference.
postfix.txt
cyrus.txt
roundcube.PNG
log1.txt
log2.txt
curruntload.PNG

David Favor

Right off I'd say your system is in serious trouble, as swap is nearly full.

Any swapping == slow down or process death, when the OOM Killer runs (out of memory killer).

First step, 4G memory... wow... super low for any type of load at all.

Bump up to 32G (memory is cheap) + add more if you ever see any swapping.

Look for output like this, where KiB Swap is always zero. If any number ever shows up here, you're likely close to trouble.

top - 19:45:16 up 117 days, 12:03,  1 user,  load average: 3.00, 3.46, 3.29
Tasks: 841 total,   2 running, 745 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.4 us,  3.1 sy,  0.0 ni, 81.0 id,  2.5 wa,  0.0 hi,  1.0 si,  0.0 st
KiB Mem : 65857760 total,  4218692 free, 23153036 used, 38486032 buff/cache
KiB Swap: 12287588+total, 12287588+free,        0 used. 39418756 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
367059 1000033   20   0 1385404  37056  27308 R 100.0  0.1  39541:13 php
864401 1000033   20   0 3959404  61924   8280 S  19.1  0.1  13:01.69 apache2
892681 1000033   20   0 3953900  51620   8272 S  18.2  0.1   3:57.44 apache2
864400 1000033   20   0 3957924  51948   8452 S  15.5  0.1   8:35.58 apache2
 27827 1000033   20   0 1538660 284404 230000 S  11.2  0.4   1:44.71 php-fpm7.2
392759 1000000   20   0  745120  26884  10340 S   1.3  0.0 536:22.79 fail2ban-server
831599 1000033   20   0 1482076  69916  58668 S   1.3  0.1   0:35.71 php-fpm7.3
832587 1000033   20   0 1481928  70160  58924 S   1.3  0.1   0:35.42 php-fpm7.3
832187 1000033   20   0 1481740  67408  56480 S   1.0  0.1   0:35.29 php-fpm7.3
832415 1000033   20   0 1481976  70108  58860 S   1.0  0.1   0:35.65 php-fpm7.3
832813 1000033   20   0 1482176  67892  56480 S   1.0  0.1   0:35.73 php-fpm7.3
  3493 1000000   20   0  302484  22056   7660 S   0.7  0.0  73:43.94 fail2ban-server
 25772 1000033   20   0 4776800  38128   7644 S   0.7  0.1   3:22.17 apache2

David Favor

Also your load average is high, likely because you're swapping.

A system running IMAP only should be running near zero load + zero swap.

Fix these items first, then proceed onto additional debugging, if problem persists.
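On Linux the load average counts processes in uninterruptible sleep (state D, typically waiting on disk or swap I/O) as well as runnable ones, which is how swapping can push load up while CPU use stays low. The sketch below lists such processes next to the load average. It assumes a Linux /proc filesystem; parse_stat and blocked_processes are names invented for this example.

```python
import os


def parse_stat(stat_line):
    """Return (command, state) from one /proc/<pid>/stat line."""
    # The command sits in parentheses and may itself contain spaces or ')',
    # so split on the last closing parenthesis.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    return stat_line[start + 1:end], stat_line[end + 1:].split()[0]


def blocked_processes(proc_root="/proc"):
    """List (pid, command) for processes in uninterruptible sleep (state D)."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                command, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the line had an unexpected shape
        if state == "D":
            blocked.append((int(entry), command))
    return blocked


if __name__ == "__main__":
    try:
        load1, load5, load15 = os.getloadavg()
        print(f"load: {load1:.2f} {load5:.2f} {load15:.2f} on {os.cpu_count()} CPU(s)")
        for pid, command in blocked_processes():
            print(f"blocked on I/O: {pid} {command}")
    except OSError as exc:
        print(f"could not read load information: {exc}")
```

If the load is high but few processes are runnable, the D-state list shows which ones are waiting.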

vijay kumar

ASKER

I do see that swap usage is high, but it fluctuates, and it has been like this for a long time. This is an OpenVZ server, so it is not possible to increase the swap partition.
I think something else is causing the issue.
We are using a 1 TB storage container attached to this server through sshfs, but it has been that way for the past 3 years, and until 2 days ago it worked fine with the same infrastructure and the same settings. I don't know why we are facing the issue now.
Please let me know if I should provide any additional logs to debug this issue.

Thanks

noci

Increasing swap space will NOT help; you need to increase the amount of RAM.

Exchanging data between swap and RAM causes delays for every program that gets swapped out.
Programs can only actually do something when their data is in RAM. Swap is a trick to fake more memory (postponing the moment when you have to kill processes due to lack of memory).
When swapping occurs a lot, your programs run as if RAM were as slow as disk. That also explains the slowness.

With networked systems this may cause timeouts and disconnects once thresholds are passed, throwing more oil on the fire: swapped-out processes that need to be run down must be swapped in before their files can be closed, and new connections cause even more processes to get swapped out.

For virtual systems, swapping that starts regularly is an indication that the system is getting overloaded and should get more memory. (Virtualized I/O is worse than bare-metal I/O, as the hypervisor needs to get involved in handling all I/O commands.)
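To tell active swapping apart from swap that merely holds old pages, the swap-in and swap-out rates matter more than the amount used. A minimal sketch, assuming a Linux /proc/vmstat with the pswpin and pswpout counters (counted in pages); the function names are made up for this example, and inside a container the counters may reflect the host rather than the container.

```python
import time


def parse_vmstat(text):
    """Turn /proc/vmstat text into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_rates(before, after, seconds):
    """Pages swapped in and out per second between two snapshots."""
    return tuple((after[key] - before[key]) / seconds
                 for key in ("pswpin", "pswpout"))


def read_vmstat(path="/proc/vmstat"):
    with open(path) as fh:
        return parse_vmstat(fh.read())


if __name__ == "__main__":
    interval = 1.0
    try:
        first = read_vmstat()
        time.sleep(interval)
        swapped_in, swapped_out = swap_rates(first, read_vmstat(), interval)
        print(f"swap-in: {swapped_in:.0f} pages/s, swap-out: {swapped_out:.0f} pages/s")
    except (OSError, KeyError) as exc:
        print(f"swap counters not available: {exc}")
```

Rates that stay at zero mean the swap contents are idle; sustained non-zero rates are the delays described above.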

vijay kumar

ASKER

Hi David Favor,
Can you please let me know whether we can disconnect idle imapd sessions? I think old inactive sessions are still showing in services.

Please let me know how I can reduce cyrus-imap threads.

Thanks

vijay kumar

ASKER

Attached is the current load on the server.
swap-part.PNG

David Favor

1) Noci is correct. Increasing swap space won't help. Your first step is to increase RAM.

2) You asked, "Can you please let me know can we disconnect imapd idle sessions."

This may be difficult. You'll need to refer to your version's docs to determine if there's some way to do this.

And likely this is a very bad idea. Each connecting client will attempt to hold open some number of connections to speed up access to actual IMAP files. If the server starts killing these off, likely this will cause all connected users to start getting errors in their client.

This means for every message, they may get multiple blocking/modal dialog boxes that they must click through before they can proceed.

This would be a nightmare for people interacting with this mail server.

3) You asked, "Please let me know how can I reduce cyrus-imap threads."

You'll find this in your docs...

And again, this is likely a bad idea + will have no effect on your problem.

As I mentioned above, I run many IMAP servers with many simultaneous users, with no decrease in speed.

4) Your latest data shows 0 swap space used. This suggests you either rebooted or flushed your swap space.

This fails to address the issue which originally created swapping. Every time this condition occurs again, your swap space will fill + the OOM Killer will begin running, which will create even more difficult to debug problems.

5) Be sure you're running latest version of Cyrus. If you're running an old version, you may be hitting an already fixed bug.

6) Same with OpenVZ. Be sure you're running the latest version.

7) Since you're running in an OpenVZ environment, this means other containers may be contributing to your container's problems.

If one container hogs most of physical disk i/o, then other containers' i/o is deferred. When the initial container finishes i/o + releases the physical i/o channel, then one or more containers get an i/o slice.

If this process occurs such that your container begins to back up processes, waiting on i/o, then this may also be the problem.

So, be sure you're running latest OpenVZ.

If this problem persists + you can't solve it, try moving your IMAP container to an LXD container (so no OpenVZ on machine) + see if all problems seem to magically clear up.
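One rough way to see whether processes are backing up behind disk i/o, as described in point 7, is to measure the share of CPU time spent in iowait over a short interval. The sketch assumes a Linux /proc/stat whose aggregate cpu line carries the usual counters (user, nice, system, idle, iowait, ...); the function names are invented here. Inside an OpenVZ container the counters may be virtualized, so running it on the physical host is more telling.

```python
import time


def cpu_times(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_fraction(before, after):
    """Share of CPU time spent in iowait between two snapshots."""
    deltas = [b - a for a, b in zip(before, after)]
    # First eight counters: user, nice, system, idle, iowait, irq, softirq, steal.
    total = sum(deltas[:8])
    return deltas[4] / total if total > 0 else 0.0


def read_cpu_times(path="/proc/stat"):
    with open(path) as fh:
        return cpu_times(fh.read())


if __name__ == "__main__":
    try:
        first = read_cpu_times()
        time.sleep(1.0)
        share = iowait_fraction(first, read_cpu_times())
        print(f"iowait over the last second: {share * 100:.1f}%")
    except (OSError, ValueError, IndexError) as exc:
        print(f"could not read CPU counters: {exc}")
```

A value that stays high while the container itself does little i/o points toward contention from elsewhere on the machine.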

David Favor

Hang in there.

This type of problem tends to be difficult to debug.

Tip: Try running this command at machine level...

iotop -P -a -d 1

Then hit the left arrow till you're tracking write i/o.

Run this command when the problem occurs + you may find the offending process, which might be an IMAP process or, more likely, some other process in another container.
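If iotop is not installed, a similar accumulated view can be approximated from /proc/<pid>/io, which on Linux kernels built with task I/O accounting holds per-process counters such as write_bytes. This is a sketch under that assumption, with function names made up for the example; it usually has to run as root to read other users' processes, and it only sees processes visible from where it runs.

```python
import os


def parse_proc_io(text):
    """Turn /proc/<pid>/io text into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def top_writers(proc_root="/proc", limit=10):
    """Return (write_bytes, pid, command) tuples, largest writers first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "io")) as fh:
                counters = parse_proc_io(fh.read())
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                command = fh.read().strip()
        except OSError:
            continue  # process exited or access was denied
        rows.append((counters.get("write_bytes", 0), int(entry), command))
    return sorted(rows, reverse=True)[:limit]


if __name__ == "__main__":
    try:
        for written, pid, command in top_writers():
            print(f"{written:>15} bytes written  {pid:>7}  {command}")
    except OSError as exc:
        print(f"could not scan /proc: {exc}")
```

The counters are totals since each process started, so comparing two runs a minute apart shows who is writing right now.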

noci

IMAP IDLE works a bit like a long-running "REST-like" request: the client issues a request to the server and expects a response only when mail arrives.
So IDLE tasks "have work to do".
(This way a client doesn't need to poll the server, which only causes more work for the server).

vijay kumar

ASKER

Please see the attached logs from when I run
iotop -P -a -d 1
iotop.txt

noci

iotop doesn't seem to indicate a very busy system.
Other causes of slowness can be DNS resolving, reverse DNS resolving,
or latency between systems (hard to predict / tell).

In case you have a RAID array: are all disks still working OK?
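The DNS suggestion above can be checked by timing a forward lookup and the matching reverse lookup from the mail server itself. A minimal sketch using Python's standard socket module; timed and check_name are names invented for this example, and the hosts to test are whatever names Roundcube and Cyrus are configured with.

```python
import socket
import sys
import time


def timed(func, *args):
    """Call func(*args); return (seconds, result or the OSError raised)."""
    start = time.perf_counter()
    try:
        result = func(*args)
    except OSError as exc:  # gaierror and herror are OSError subclasses
        result = exc
    return time.perf_counter() - start, result


def check_name(name):
    """Time a forward lookup, then a reverse lookup of the first address."""
    forward_s, infos = timed(socket.getaddrinfo, name, None)
    print(f"forward {name}: {forward_s * 1000:.1f} ms")
    if isinstance(infos, OSError) or not infos:
        print(f"  lookup failed: {infos}")
        return
    address = infos[0][4][0]
    reverse_s, answer = timed(socket.gethostbyaddr, address)
    shown = answer if isinstance(answer, OSError) else answer[0]
    print(f"reverse {address}: {reverse_s * 1000:.1f} ms -> {shown}")


if __name__ == "__main__":
    for host in sys.argv[1:] or ["localhost"]:
        check_name(host)
```

Lookups that take seconds rather than milliseconds would account for slow logins even on an otherwise quiet system.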

vijay kumar

ASKER

RDNS is resolving fine, and there are no issues with the RAID arrays.
I am checking iotop periodically, but getting the same output as sent earlier.

vijay kumar

ASKER

Please see the attached logs from when I run
iotop -P -a -d 1

David Favor

As noci said, system seems very quiet.

My guess is sometimes load increases, causing swapping, then swap overflow.

First fix will be to double your memory, since your swap size == RAM + swap was near 100% at some point.

This will handle the case where load rises rapidly.

Note: Memory is dirt cheap. Pack out your machine with a good amount, 128G+ or max of motherboard.

vijay kumar

ASKER

Thanks for reply. We have increased the RAM.
Is there any possibility of getting remote support on a paid basis, so that you can get an idea of what exactly is happening?
Please consider this and let me know if it is possible.

Thanks

David Favor

Sometimes tracking down problems like this can be difficult, especially with OpenVZ Kernels, as OpenVZ uses old Kernels + requires a custom Kernel build, since OpenVZ doesn't exist in the mainline Kernel source. (At least last time I checked, OpenVZ requires a custom Kernel build.)

So long as you have access to the machine where your OpenVZ containers are running + this is an actual physical machine (not a parent OpenVZ container), you'll have a chance.

And debugging this will likely require a very long time + budget, along with giving someone root access to your physical machine + all containers on machine. This means if you hire someone, they must be trustworthy.

Fast/Cheap/Secure fix is to just track your swap space + increase memory any time swapping occurs.

My guess is problem may never recur, if you doubled RAM size.

vijay kumar

ASKER

Thanks for the reply. We already doubled the RAM, but the server load is still going really high (10-60). When I check the processes at that moment, though, all of them are running normally.

Thanks

noci

Keep an eye on swap usage. If it is above 0 (give or take a few MB), then that is your wake-up call.
(Consider it to be a mine worker's canary.)
(Maybe the knee-jerk "double the size" might become a tad too much.)
While the system is still running (not dropping processes), the best guess for the amount of memory to add is the amount of swap space used during peak usage (+ a small margin).
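The sizing rule above is simple arithmetic: take the peak swap use seen while the system was still healthy and add a margin. A small sketch, where the 10% default margin and the sample figures are assumptions chosen for illustration only:

```python
import math


def ram_to_add_mib(swap_used_samples_kib, margin=0.10):
    """Peak swap use across the samples plus a margin, rounded up to MiB."""
    peak_kib = max(swap_used_samples_kib, default=0)
    return math.ceil(peak_kib * (1 + margin) / 1024)


if __name__ == "__main__":
    # Made-up samples of swap used (KiB), e.g. collected every few minutes.
    samples = [0, 524288, 1048576]
    print(f"suggested extra RAM: {ram_to_add_mib(samples, margin=0.25)} MiB")
```

The samples would come from whatever is already recording swap use, such as periodic readings of top or /proc/meminfo.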
