abcd ab01

How load balancing works in Xen

We have 3 Xen servers and 14 Citrix servers. We use the Citrix servers for our EMR system, with almost 400 users. All servers are Windows Server 2008 VMs. EMR system performance is very slow and it crashes. Is there any way to check load balancing and how much memory each VM could have? How can we improve performance?

In the EMR database, the log file has grown to 18+ GB. Support truncates it every time.
Can we find why RAM utilization was 100% on certain servers and not on others? How can we make sure it stays within the acceptable limits?

Any advice?
Dirk Kotte

You can use Task Manager and Resource Monitor on the affected servers.
There you can see memory usage per process and per user, which files are currently being written, and much more.

I also like to use ControlUp to get an overview of the systems and identify bottlenecks.
https://www.controlup.com/solutions/citrix/

Load balancing within Xen simply selects the server with the most free resources when a new VM starts.
It doesn't work if you specify a "Home Server" for the VM.
The additional Workload Balancing from Citrix XenServer checks many parameters all the time and moves a VM to the least loaded server if necessary.
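
As a rough illustration of the start-time placement described above, here is a minimal Python sketch. It is not XenServer's actual algorithm: the `Host` class, the `pick_start_host` function and the use of free memory as the only resource are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str             # hypothetical host name
    free_memory_mb: int   # free memory is the only resource modelled here


def pick_start_host(hosts: list[Host], vm_memory_mb: int,
                    home_server: Optional[str] = None) -> Optional[Host]:
    """Choose a host for a VM that is about to start (simplified model)."""
    if home_server is not None:
        # A VM with a home server is only considered for that host,
        # so no balancing takes place in this model.
        for host in hosts:
            if host.name == home_server:
                return host if host.free_memory_mb >= vm_memory_mb else None
        return None
    # Otherwise pick the host with the most free memory that can fit the VM.
    candidates = [h for h in hosts if h.free_memory_mb >= vm_memory_mb]
    return max(candidates, key=lambda h: h.free_memory_mb, default=None)
```

With three hosts that have 4, 16 and 8 GB free, an unpinned VM lands on the 16 GB host, while a VM pinned to the 4 GB host stays there regardless of how busy it is.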
ASKER

Should we stop users from running many applications in the Citrix desktop? That may be causing too many memory spikes and problems. Any advice?
I think the users start the apps they need for their daily work.
But if you find an app eating all your memory, you should look for other options to run that app, or add more memory to the server.
Load distribution within XenApp takes the user count as the parameter, but you can also use CPU load, disk queue and other parameters for load balancing.
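
To show why balancing on user count alone can mislead, here is a small Python sketch that combines session count, CPU load and disk queue into one load index. The thresholds and the "busiest metric wins" rule are assumptions for illustration, not XenApp's real load evaluator.

```python
from dataclasses import dataclass


@dataclass
class ServerLoad:
    name: str
    sessions: int
    cpu_percent: float
    disk_queue: float


# Hypothetical "full load" thresholds; real values depend on hardware and apps.
MAX_SESSIONS = 40
MAX_CPU_PERCENT = 90.0
MAX_DISK_QUEUE = 8.0


def load_index(s: ServerLoad) -> float:
    """Return 0.0 (idle) to 1.0 (full); the busiest metric decides."""
    ratios = (
        s.sessions / MAX_SESSIONS,
        s.cpu_percent / MAX_CPU_PERCENT,
        s.disk_queue / MAX_DISK_QUEUE,
    )
    return min(1.0, max(ratios))


def least_loaded(servers: list[ServerLoad]) -> ServerLoad:
    """Pick the server that should receive the next session."""
    return min(servers, key=load_index)
```

A server with 10 sessions at 85% CPU scores higher than one with 20 sessions at 30% CPU, so the next user goes to the second server even though it has more users.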
What would be a good practice for citrix desktop users? To alert them?
Why would you want to inform or alert the users?
The servers should just work without burdening the users.
What could users do with this information?
How to block Google Chrome from the Citrix Receiver desktop?
Don't install Chrome.
With shared desktops (server published desktops), users should not be able to install apps.
Another good practice (especially for systems used for web access) is using SRP (Software Restriction Policies) via AD.
Here you whitelist allowed apps and processes. There are special apps to do this work too.
That way nobody is able to execute downloaded programs.
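
The whitelist idea can be sketched in a few lines of Python. This only illustrates the concept of path-based allow rules; it is not how SRP evaluates its rules, and the allowed directories are assumptions for the example.

```python
from pathlib import PureWindowsPath

# Hypothetical allow list: only programs under these folders may run.
ALLOWED_DIRS = [
    PureWindowsPath(r"C:\Windows"),
    PureWindowsPath(r"C:\Program Files"),
]


def is_allowed(exe_path: str) -> bool:
    """Return True if the executable lives under an allowed directory."""
    path = PureWindowsPath(exe_path)
    return any(path.is_relative_to(allowed) for allowed in ALLOWED_DIRS)
```

An installed application under `C:\Program Files` passes the check, while an installer sitting in a user's Downloads folder does not. A real policy would also have to deal with things this sketch ignores, such as `..` segments in paths.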
One of the best tools to determine what is using RAM in that environment is VMMap from Sysinternals.  To use it most effectively you need the MS symbols and, if possible, the vendor symbols.

VMMap
https://docs.microsoft.com/en-us/sysinternals/downloads/vmmap

Then review the Windows Internals documentation
https://docs.microsoft.com/en-us/sysinternals/learn/windows-internals

And a detailed video
https://channel9.msdn.com/shows/defrag-tools/defrag-tools-7-vmmap

To answer one of your other questions, you might need to scale more horizontally: add a fourth XenServer and spread the 400 users across 20 VMs per host, so that you have fewer users per virtual machine, spread across more physical hosts.
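
The arithmetic behind that suggestion, as a quick Python check. It assumes sessions are spread evenly and uses only the numbers from this thread (400 users, 14 Citrix VMs today, 4 hosts with 20 VMs each as proposed); the function name is made up for the example.

```python
import math


def users_per_vm(total_users: int, hosts: int, vms_per_host: int) -> int:
    """Worst-case users per VM when sessions are spread evenly."""
    return math.ceil(total_users / (hosts * vms_per_host))


current = math.ceil(400 / 14)                            # 14 Citrix VMs today
proposed = users_per_vm(400, hosts=4, vms_per_host=20)   # 80 VMs in total
print(current, proposed)  # prints "29 5"
```

So the change is from roughly 29 users per VM to 5 users per VM, provided the physical hosts can actually carry 20 VMs each.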

But this depends on the physical hardware itself.  You might want to check Process Explorer first
https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer

Look at Disk IO first rather than RAM.  Disk latency can cause high RAM usage or the perception of it.

I bring this up because you mentioned large log files.  If every user is logging to log files that is Disk IO, not memory.

The first question is why is every user writing logs that are not "transaction logs"?

Was this done for debug purposes and not turned off when debug was complete?

If logs keep getting truncated by support is there a way to instead automate this and limit the size?
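
If the logs in question are plain application log files, a size cap can be automated with log rotation. The Python sketch below uses the standard library's `RotatingFileHandler` to show the idea. The logger name, size limit and backup count are assumptions, and it does not apply to a database transaction log, which is managed by the database engine's own settings.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_capped_logger(path: str, max_bytes: int = 50 * 1024 * 1024,
                       backups: int = 5,
                       name: str = "emr_app") -> logging.Logger:
    """Logger whose file rolls over near max_bytes, keeping `backups` old files."""
    logger = logging.getLogger(name)     # hypothetical logger name
    logger.setLevel(logging.WARNING)     # debug output stays off by default
    if not logger.handlers:              # avoid adding duplicate handlers
        handler = RotatingFileHandler(path, maxBytes=max_bytes,
                                      backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

With these defaults the log can never occupy much more than about 300 MB on disk (the active file plus five backups), and nobody has to truncate it by hand.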

Point being, you can isolate the process that is causing 100% memory, but if that process is stuck in the disk queue waiting to write the logs, then memory for that process will peg at 100%.

Hope this helps.
Hi Brian,

Looks like you are an expert in Citrix/Xen applications and high-level troubleshooting. We are looking for an expert Citrix professional to provide consultation. Is there any way to contact you? Thanks.
I have three Xen servers and 2 DataCore, plus 14 Citrix servers. How can I view the performance of all devices in one single unified console?
Try ControlUp.
I have a demo with ControlUp.
You're going to need to look at each Linux XenServer, IMO.  I've provided a few links below that will get you started on the performance of the VMs that run on XenServer and of XenServer itself.  Any third-party software you use would need to be able to pull metrics from each XenServer (Linux) host rather than only from the Windows VMs.  Having the Windows metrics might also be useful, but they can skew your results, since disk IO can present as high memory usage in Windows because of disk queues.

Starting with:
How to measure storage performance on XenServer at various layers in the datapath
https://support.citrix.com/article/CTX217628 

This is relative to 6.x versions.
CTX137828 – XenServer 6.2.0 Administrator’s Guide Chapter 9, this details a vast array of metrics available from XenServer both new and old.  Includes details of how to view the metrics via XenCenter and a new rrd2csv tool.
CTX134951 – Configuring dom0 Memory in XenServer 6.1.0
CTX136861 – XenServer 6.1.0 Storage Performance Guide
CTX117960 – How to Configure the Virtual CPU Management
CTX139714 – How to use host-cpu-tune to fine tune XenServer 6.2.0 performance (CPU pinning)

If you are on version 7 or higher, I would take a look at this multi-part series:
XenServer 7.0 Performance Improvements Part 1: Lower Latency Storage Datapath
https://xenserver.org/blog/entry/dundee-tapdisk3-polling.html

And also, virtualized storage performance
https://xenserver.org/blog/entry/karcygwins.html
If we group users on the Citrix servers and allocate them by department, how will load balancing work with heavy users?
The same as it always has....

Load balancing by users matters not because it doesn't sound like LB'ing is the issue.

You can load balance by users, or you can load balance by processor, memory, disk IO, or a combination of these.  And, by LB, I'm not referring to conventional LB'ing as it pertains to F5 or NetScaler.  This is at the software layer of Citrix XenApp: the Director, StoreFront, XML, STA, and so forth.

If you want to load balance by department then you're going to need Provisioning Services and a dedicated virtual machine "image".

You create a XenApp Server Image that only has those users applications.

Then, using Active Directory, you create a Domain Local Group that has those users as members.

Then you create Published Apps - Preferred - and assign that DLG.  The point being, you're only allowing users to access those published applications on that image.

That is the only way to segment application to user to server.  Then, you assign the LB metric.  Then you scale horizontally as needed.
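
The segmentation described above (image, Domain Local Group, published apps, servers) can be pictured as a small data model. This Python sketch is conceptual only: the `Silo` class and `silo_for` function are invented for the illustration and do not correspond to any Citrix API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Silo:
    image: str                  # hypothetical XenApp image name
    ad_group: str               # Domain Local Group allowed to use it
    published_apps: set[str]    # apps installed and published on the image
    servers: list[str] = field(default_factory=list)  # scale out by adding servers


def silo_for(user_groups: set[str], app: str,
             silos: list[Silo]) -> Optional[Silo]:
    """Find the silo a user may launch an app from, or None if not permitted."""
    for silo in silos:
        if silo.ad_group in user_groups and app in silo.published_apps:
            return silo
    return None
```

Once a silo has been found, the load balancing metric only has to choose among that silo's servers, which is what keeps one department's heavy users away from everyone else.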

If you're using Citrix Provisioning Services, in particular with something like Nutanix, then disk IO for the PVS image itself is a non-issue, but for the applications in the image not so much.

Originally you presented a scenario with this EMR application and said it crashes and what not. I think I've already provided a good explanation to this in that the CPU and memory usage relative to that application appears to be tainted by the disk IO usage.

If I were to look at the application the # of users you might get per server might be 5.  

Despite that, some vendors still recommend Citrix, because even one session on a workstation is sub-par compared to 5 users per server for something like an analytics application that crunches heavy numbers and uses a distributed processing model on the back end.

What you seem to have is an application problem.  Vendor issue.  Configuration of the application issue.   Not Citrix.

Citrix is a conduit to the business application.

It cannot fix bad applications any more than it can fix applications logging everything to a file regardless of necessity.

That is simply the way it is?

I've built solutions that hosted 400+ applications for 50,000-plus users on XenApp.  Every application required an Application Lifecycle and approval just to get hosted in Citrix, because Citrix is a conduit for the business application.  It is not the business application.  But it is an Enterprise solution....in the right hands.

I've been able to get 140 users per server down to 4 users per server on heavy analytical applications with the caveat that developers kept their code updated....even at 4 it still made sense and was more cost effective when you properly set your ICA keepalive and other timeouts.

See my article relative to saving 3 million dollars on Citrix licensing:  (hint, it's not about the licensing)
https://www.experts-exchange.com/articles/25022/Maximize-Citrix-Concurrent-Licensing-To-Reduce-Cost-Session-Timeouts-3-Millon-Dollar-Cost-Save.html

One of the most critical factors is how that application performs.  Some companies use Citrix as a 'dumping' ground for applications.  Perhaps it was hosted in XenApp 6.0 or 6.5 years ago and this was kept as an excuse not to update the code?

Again, Citrix is merely a conduit for the application.

The application runs on Microsoft Windows Server...for the most part.

And some person or persons developed it.  Perhaps they are no longer around.  A lot of developers "inherit" things with zero documentation and no code repository...very challenging.  Regardless, code must be updated or replaced, that simple.

You have to fix the application...or replace it.  Not a Citrix issue.  This issue existed way before Citrix.  Citrix cannot address the problem of logging to an oversized log file.  Not their job.

Vendor, or developer or both.

Citrix presents the application as it exists across the 'ether thin'.  It cannot make the application work better than it should.  It excels at moving the application to the DC with the data and presenting the client portion of the application to the end-user.

The end-user gets the result of what the application will provide and in some cases in the oldest iteration of that application.

Perhaps it doesn't make sense to host the application on Citrix?

This isn't a Citrix issue, it is an application issue. Whether or not the application creates a log that is 15 GB or 20 GB has no bearing on Citrix.

This is an application specific issue that must be escalated to the developer or the vendor.  More than likely it is misconfigured or simply outdated...ready for sunset.  

The application was most likely developed for single user instance on Windows 7 or earlier operating system.

It was not properly ported to run in server 2008R2.  Then, it was not updated properly to run in 2012R2.  Now, it most likely won't run in server 2016.

Perhaps it will run in Windows 10?

Citrix is a great platform when used properly.  It is not a dumping ground for bad or misconfigured applications, or for applications that needed to be sunset back in 2005 or earlier.

Citrix as a conduit to the business application runs on Microsoft Operating System.

Server, not workstation.

It is this inability to translate the application properly from workstation to multi user server that causes this confusion.

IMO, every application hosted in Citrix is hosted on a Microsoft OS, which is RDS + Citrix, and must "earn the right" to be hosted.  There is a port that exists between the single-user-instance OS and the server OS, along with regression testing.  What does this single-OS-instance application translate to on the server side when you have more than 5 users running an application that was once designed to run in a distributed model?  How does that further translate when running in a shared DLL methodology, or do some of those DLLs need to be isolated to the instance?  What else is being hosted on that same shared OS?  Multiple Oracle client versions? Multiple versions of Sybase client and Oracle and Java and...and....

Talk to your developers...again, they might have inherited the problem just as much....

Talk to the vendor.

Otherwise, you need a last resort.  Someone who can come in and make everyone else understand that this simply is not a Citrix issue.  It's a fundamental issue with the application, and that is not a slight against the developer or the vendor.  Many companies and IT teams inherit these problems.  Or they lack the funding from leadership to fix the problem properly, and that dates back - to my knowledge - more than 20 years, to when I first implemented Citrix WinFrame on NT 3.51.

I don't see that changing anytime soon for this application until the right person or persons come along to change it.

Hard to say.  Either way, I'm here to help.