VMware performance issue on Windows 2008 R2 SSRS server

I have a virtual server running on VMware.
Server: Windows 2008 R2 x64
It has 4 CPUs and 6 GB of memory.

This server is an SSRS/CRM/SQL server, and when users run their jobs, reports, etc.,
the server crawls to a halt, with all kinds of performance issues.

I looked at the performance on both the VM and the ESX(i) host, and attached print screens to this.

Monitored CPU Usage and Ready

Looking at Usage, it doesn't appear to be too bad.
I am not quite sure how to read CPU Ready. From the chart it appears high, but Usage is in percent on one axis and Ready is in milliseconds on the other,
so I am not sure.
I need a further explanation of Ready: what are good numbers, what are bad ones, and, looking at the chart and figures below, do we have an issue? What needs to happen: a reduction in vCPUs, or additional vCPUs?

If you look at the VM-CPU-REAL-TIME picture, READY is highlighted.

Does the VM show cause for concern? What Ready value in milliseconds do we need to stay under? What can we do to fix the issue? There is no memory ballooning.

REAL-TIME:
(Virtual Machine Server) VM Usage/Ready  (Real-Time)

Measurement            Latest   Maximum   Minimum   Average
Usage (Percent)        1.09     16.41     0.98      1.813
Ready (Milliseconds)   209      840       197       277.289
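
To put the Ready figures on the same scale as Usage, a millisecond summation can be converted to a percentage of the sampling interval. The sketch below assumes the vSphere real-time chart interval of 20 seconds and that the VM-level value is the sum over all 4 vCPUs; the function name and the per-vCPU averaging are illustrative choices, not something read from the charts. A commonly quoted rule of thumb (not a hard limit) is that sustained values above roughly 5% per vCPU deserve a look, and values above roughly 10% usually hurt.

```python
# Assumption: real-time performance charts sample every 20 seconds.
REALTIME_INTERVAL_S = 20


def ready_percent(ready_ms, interval_s=REALTIME_INTERVAL_S, vcpus=1):
    """Convert a CPU Ready summation (ms) into a percentage of the
    sample interval, optionally averaged over the number of vCPUs."""
    return ready_ms / (interval_s * 1000.0) / vcpus * 100.0


# Real-time Ready values quoted in the post (milliseconds).
samples = {"latest": 209, "maximum": 840, "minimum": 197, "average": 277.289}

for name, ms in samples.items():
    total = ready_percent(ms)            # summed over all vCPUs
    per_vcpu = ready_percent(ms, vcpus=4)  # the VM has 4 vCPUs
    print(f"{name:8s} {ms:8.1f} ms -> {total:5.2f}% total, {per_vcpu:5.2f}% per vCPU")
```

Under these assumptions the real-time maximum of 840 ms works out to 4.2% in total, or about 1% per vCPU.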

DATES: 8/31/2013 - 9/12/2013
(Virtual Machine Server) VM Usage/Ready  (dates 8/31/2013 - 9/12/2013)

Measurement            Latest   Maximum   Minimum   Average
Usage (Percent)        3.02     18.32     1.64      3.38
Ready (Milliseconds)   272955   975315    105347    297449.89
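
The historical Ready numbers are much larger than the real-time ones mainly because each data point covers a longer rollup interval. The post does not say which interval the custom date range uses, so the sketch below simply tries the default vCenter chart intervals (past day 300 s, past week 1800 s, past month 7200 s, past year 86400 s). Which one applies here is an assumption to verify in the chart options, and the result is the total over all vCPUs.

```python
# Default rollup intervals of the vCenter historical charts, in seconds.
ROLLUP_INTERVAL_S = {
    "past day": 300,
    "past week": 1800,
    "past month": 7200,
    "past year": 86400,
}


def ready_percent(ready_ms, interval_s):
    """Convert a CPU Ready summation (ms) into a percentage of the interval."""
    return ready_ms / (interval_s * 1000.0) * 100.0


# Average Ready for the VM over 8/31/2013 - 9/12/2013, as quoted in the post.
average_ready_ms = 297449.89

for chart, interval_s in ROLLUP_INTERVAL_S.items():
    pct = ready_percent(average_ready_ms, interval_s)
    print(f"{chart:10s} ({interval_s:5d} s): {pct:7.2f}% ready")
```

If the points are 2-hour rollups, the average comes to about 4.1%; if they are 30-minute rollups, about 16.5%. The interpretation changes a lot with the interval, so it is worth confirming before drawing conclusions.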

DATES: 8/31/2013 - 9/12/2013
(ESX Host VM is running on) Usage/Ready  (dates 8/31/2013 - 9/12/2013)

Measurement            Latest     Maximum    Minimum   Average
Usage (Percent)        32.55      35.78      14.44     23.42
Ready (Milliseconds)   12854700   16548164   4276967   8912704.2
VM-Real-Time.jpg
vm.jpg
esxhost.jpg
Indyrb Asked:

wshty Commented:
Please provide more information.

What is your VMware version?
Is this the only server running on the host?
Do you use a storage subsystem (SAN, etc.) or an integrated RAID controller?

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
If you knock back the vCPUs, does it get better or worse?

Add more vCPUs

Make sure the VM is not running on a snapshot disk, and that the VMXNET3 interface is in use.

Indyrb (Author) Commented:
I haven't scaled back or added CPUs, as this will require downtime, and I needed assistance in determining whether there is indeed an issue with CPU and the CPU Ready state.
Please advise on which Ready values in milliseconds are good, and what to do if the VM falls outside that range.
Hot Add is disabled.


The VMware version is ESXi 5.1.0, build 799733.
There are an estimated 15-20 VMs running on this host, and there are no alarms.

The VM is configured as follows:

CPU: 4
Memory: 6GB
SCSI Controller:  LSI Logic SAS
Network Adapter:  E1000
VM version: 7


SQLSERVER Reporting Services

SQLSERVERNAME\MSSQLSERVER
URL
http://sqlservername:80/Reports


We are having a bit of a hard time getting the G5 detail report to work for the timeframe of 9/1 to 9/11. We had a big DL disbursement on 9/6, so the counts are significantly higher than for a normal timeframe. When we try to run the G5 details, it just spins for a long time (hours) and doesn't return anything.

When we run the detail query in TOAD, it returns in less than 5 minutes.

The report in the web browser brings up several accounts.
The issue, including timeouts, occurs when expanding an account to show all the names.

One error shown in the browser is:
Webpage error details
User Agent: Mozilla/4.0 (compatible MSIE 7.0; Windows NT 5.1; Trident/4.0; FunWebProducts; .Net CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; MS-RTC LM 8; .NET4.0C; ASKTBORJ/5.13.1.18107; FunWebProducts; .Net4.0E)

Timestamp Fri, 13 SEP 2013 13:23:11 UTC

Message: NOT ENOUGH STORAGE AVAILABLE TO COMPLETE THIS OPERATION

Line: 5
Char: 63805
Code: 0
URI http://server/reports/scriptresource.adx?d=...................................................



The actual server has over 10GB of free space.






Scale-Out Deployment shows
SQL SERVER Name SQLCLUV2\CRM2011
DataBase Name ReportServer
Report Server Mode:  Native

SQLSERNAME   INSTANCE MSSQLSERVER  JOINED

Went to the ReportServer database.
Files:
ReportServer: Rows Data, Primary, initial size 102 MB, autogrow by 1 MB, unrestricted growth

ReportServer log: Not applicable, initial size 47 MB, autogrow by 10%, restricted growth to 2097152 MB


On SQLCLU:
All cluster disks have plenty of disk space, and all services are online.

Events in SQLCLU eventlog: Thousands of these:

Service Control Manager: (event id: 7031) The SMS Agent Host service terminated unexpectedly.  It has done this 305 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.

VDS Basic Provider: (event id:1) Unexpected failure. Error code: 490@01010004

DistributedCOM: (event id: 10016) The application-specific permission settings do not grant Local Launch permission for the COM Server application with CLSID
{24FF4FDC-1D9F-4195-8C79-0DA39248FF48}
 and APPID
{B292921D-AF50-400C-9B75-0C57A7F29BA1}
 to the user NT AUTHORITY\SYSTEM SID (S-1-5-18) from address LocalHost (Using LRPC). This security permission can be modified using the Component Services administrative tool.

Application Error:  (event id:1000)Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0xf04
Faulting application start time: 0x01ceb0842595c275
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: e3827f13-1c7f-11e3-b234-d067e5fd2512

SQLAGENT$CRM2011 (event id: 208) SQL Server Scheduled Job 'CRM2011_CHECKDB_All_DBs.CheckDB_Execution' (0xA008FBC1F6345045B0BD21355FB3F537) - Status: Failed - Invoked on: 2013-09-12 01:00:00 - Message: The job failed.  The Job was invoked by Schedule 28 (CRM2011_CHECKDB_All_DBs.CheckDB_Execution).  The last step to run was step 1 (CheckDB_Execution).

PerfNET  (event id:2006)Unable to read Server Queue performance data from the Server service. The first four bytes (DWORD) of the Data section contains the status code, the second four bytes contains the IOSB.Status and the next four bytes contains the IOSB.Information.






Events in Report Server eventlog: Thousands of these:

VMware Tools (event id: 1000)
[ warning] [vmusr:vmusr] Error in the RPC receive loop: RpcIn: Unable to send.

** I created the tools.conf file in c:\programdata\vmware\vmware tools\ with the following contents:
[logging]
vmusr.level = error


Application Error (event id: 1000)
Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0xa1c
Faulting application start time: 0x01ceb081f0cf0cd2
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: a12f919a-1c7d-11e3-814c-005056b00041


DistributedCOM (event id: 10016)

The application-specific permission settings do not grant Local Launch permission for the COM Server application with CLSID
{24FF4FDC-1D9F-4195-8C79-0DA39248FF48}
 and APPID
{B292921D-AF50-400C-9B75-0C57A7F29BA1}

 to the user NT AUTHORITY\SYSTEM SID (S-1-5-18) from address LocalHost (Using LRPC). This security permission can be modified using the Component Services administrative tool.

Service Control Manager (event id: 7031)
The SMS Agent Host service terminated unexpectedly.  It has done this 368 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.

Indyrb (Author) Commented:
Yes, there is a SAN.
There are 4 or 5 nodes in the cluster.
HA/DRS is enabled.

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Any VMware snapshots?

Is this VM being moved by vMotion/DRS?

Indyrb (Author) Commented:
No snapshots.

I looked at the DRS history and there was only one vMotion, at 12:30 am this morning, and it was for a different VM. This issue has been going on for a while.

DRS = Fully Automated
The migration threshold is in the middle.

The VM uses the cluster default (Fully Automated).


Inside the VM, I watched Task Manager while choosing the dates and running the report.

Overall Windows Task Manager:
CPU  3%
Memory 40%

When expanding the accounts after the report summary:
CPU 39%
Memory 46%

Again, this is in Task Manager.


In the VMware vCenter performance chart for the VM, during the time of the job:

Measurement            Latest   Maximum   Minimum   Average
Ready (Milliseconds)   1626     3204      210       416.939
Usage (Percent)        1.89     42.64     1.39      4.943


I attached another screenshot showing Ready.

Also, when running the report from within the VM, I got an error:

Message from Website:
Out of memory at line: 5
CPU1.jpg

Indyrb (Author) Commented:
Webpage error details

User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)
Timestamp: Fri, 13 Sep 2013 15:27:10 UTC

Hundreds of these errors:

Message: 'null' is null or not an object
Line: 3234
Char: 9
Code: 0
URI: http://Server/Reports/Reserved.ReportViewerWebControl.axd?OpType=Resource&Version=10.50.1600.1&Name=ViewerScript

Indyrb (Author) Commented:
Log Name:      Application
Source:        ASP.NET 2.0.50727.0
Date:          9/13/2013 10:56:43 AM
Event ID:      1309
Task Category: Web Event
Event code: 3005
Event message: An unhandled exception has occurred.
 
Application information:
    Application domain: ReportManager_MSSQLSERVER_0-6-130235578004559384
    Trust level: RosettaMgr
    Application Virtual Path: /Reports
    Application Path: C:\Program Files\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\ReportManager\
    Machine name: SQLSERVER
 
Process information:
    Process ID: 4116
    Process name: ReportingServicesService.exe
    Account name: NT AUTHORITY\NETWORK SERVICE
 
Exception information:
    Exception type: COMException
    Exception message: This network connection does not exist. (Exception from HRESULT: 0x800708CA)
 
Request information:
    Request URL: http://server/Reports/Pages/Report.aspx?ItemPath=%2fSMS+Reports%2fAccounting%2fPell_G5_Summary
    Request path: /Reports/Pages/Report.aspx
    User host address: ::1
    User:  
    Is authenticated: False
    Authentication Type:  
    Thread account name: NT AUTHORITY\NETWORK SERVICE
 
Thread information:
    Thread ID: 25
    Thread account name: NT AUTHORITY\NETWORK SERVICE
    Is impersonating: False
    Stack trace:    at Microsoft.ReportingServices.HostingInterfaces.IRsHttpPipeline.GetAuthType()
   at ReportingServicesHttpRuntime.RsWorkerRequest.GetUserToken()


Log Name:      Application
Source:        Application Error
Event ID:      1000

Description:
Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0x1010
Faulting application start time: 0x01ceb08b168e685a
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: d7298c9a-1c86-11e3-814c-005056b00041

Indyrb (Author) Commented:
Attached is the Demand measurement, which reports the amount of CPU resources the VM would use if there were no contention.

Latency: % of time the virtual machine is unable to run because it is contending for access to the physical CPU

Measurement            Latest   Maximum   Minimum   Average
Latency (Percent)      0.64     8.56      0.210     0.909


So is the CPU Ready value (in milliseconds) high? Do we need to reduce the number of vCPUs?
demand.jpg

Indyrb (Author) Commented:
Can anybody go through all my comments above and advise? I tried to include as much detail and information as possible, and posted several comments covering different things.

Indyrb (Author) Commented:
Disk Performance attached.
disk.jpg

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
Do you know how the host BIOS is configured?

It's not in Eco Mode rather than Performance Mode, is it?

Is there a setting in the BIOS to reduce power?

Indyrb (Author) Commented:
I will have to check the BIOS. What are your thoughts on CPU Ready for the VM?

wshty Commented:
Could you please check the following for me: connect via SSH to the host the troubled VM is on.
Type esxtop
Press Enter
Press "d" on your keyboard.
Now start the task which causes performance issues on the server and monitor the DAVG/cmd, KAVG/cmd, GAVG/cmd and QAVG/cmd values.
Please take a screenshot if any of these values come up very high (more than 30 ms).

thanks.
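
The check described above can be sketched in code. This is a minimal example, assuming the four latency counters have already been read off the esxtop screen (in milliseconds per command); the sample values are made up for illustration, and the 30 ms cut-off is the one suggested in the comment. As a reading aid, GAVG is roughly DAVG plus KAVG, so a high DAVG points at the array or fabric and a high KAVG at the host.

```python
# Cut-off suggested in the comment above, in milliseconds per command.
THRESHOLD_MS = 30.0


def flag_high_latency(sample, threshold_ms=THRESHOLD_MS):
    """Return only the counters whose value exceeds threshold_ms."""
    return {name: value for name, value in sample.items() if value > threshold_ms}


# Hypothetical readings, NOT taken from the poster's host.
sample = {
    "DAVG/cmd": 42.7,  # device (array/fabric) latency
    "KAVG/cmd": 0.3,   # VMkernel latency
    "GAVG/cmd": 43.0,  # latency seen by the guest
    "QAVG/cmd": 0.1,   # time spent queued in the VMkernel
}

for name, value in flag_high_latency(sample).items():
    print(f"{name} = {value} ms exceeds {THRESHOLD_MS} ms")
```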

compdigit44 Commented:
Can you please upload a screenshot of the memory Resource Allocation for this VM?

I am interested in seeing the current reserved, ballooned, and swapped memory stats.

1) How is the overall memory usage on the host?
2) Are there other VMs on the host?
3) What type of SAN do you have?
4) Does the VM perform poorly when you move it to another host? I have the same question for the datastore as well.

On a side note, have you checked to make sure your SQL server is not suffering from an expensive report query? I have seen a poorly written query bring down an entire SQL server in the past.
Indyrb (Author) Commented:
Looks like there are three things going on in the infrastructure.

1. A lot of VMs are configured with multiple vCPUs, some with 4 and even 8.
2. A lot of VMs have large memory sizes, but then have limits.
Example: memory 16GB and a limit of 8GB.

3. There also appears to be disk latency, but that could be because of the memory limits and swapping.
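
To make the effect of a limit concrete: when a VM's memory limit is below its configured size, the guest still sees the full amount, but the host backs it with physical RAM only up to the limit, and anything the guest touches beyond that has to come from ballooning or hypervisor swap. A small sketch follows; the inventory and VM names are hypothetical, apart from the 16 GB / 8 GB pair mentioned above.

```python
def memory_above_limit_gb(configured_gb, limit_gb=None):
    """Return how much configured guest memory (GB) sits above the limit.
    A limit of None means no limit is set."""
    if limit_gb is None:
        return 0.0
    return max(0.0, float(configured_gb) - float(limit_gb))


# Hypothetical inventory: (name, configured GB, limit GB or None).
# Only the 16 GB / 8 GB pair comes from the thread.
vms = [
    ("vm-a", 16, 8),
    ("vm-b", 6, None),
    ("vm-c", 4, 8),
]

for name, configured, limit in vms:
    gap = memory_above_limit_gb(configured, limit)
    status = "at risk of ballooning/swapping" if gap > 0 else "ok"
    print(f"{name}: configured {configured} GB, limit {limit}, "
          f"{gap:.0f} GB above limit -> {status}")
```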

Then on some VMs I am seeing CPU utilization errors, but most of them appear to be CPU Ready.

So if a VM on host A has a lot of CPU Ready, will that affect other VMs on the same host?

Why would or should a VM have memory limits, other than to trick applications and installers that require a certain amount of memory?

What's the best way to reconfigure when you have 300 VMs and hot-add CPU and hot-add memory are not enabled?

Andrew Hancock (VMware vExpert / EE MVE^2), VMware and Virtualization Consultant, Commented:
1. That's not good unless really needed.

2. I would not recommend reservations; reduce to the memory required.

You will have to schedule downtime and change the configuration; you cannot change Hot CPU or Hot Memory if it is not currently enabled on the VM.

Indyrb (Author) Commented:
Closing the request for now and awarding points, as it helped. However, the issue is still ongoing. Thanks, EE.