Indyrb

asked on

VMware performance issue on Win 2008 R2 SSRS server

I have a virtual server running on VMware.
Server: Windows 2008 R2 x64
It has 4 CPUs and 6GB memory.

This server is an SSRS/CRM/SQL server, and when they run their jobs, reports, etc., the server crawls to a halt, with all kinds of performance issues.

I looked at the performance on both the VM and the ESX(i) host, and attached print screens to this.

Monitored CPU Usage and Ready

While looking at Usage, it doesn't appear to be too bad.
I am not quite sure how to monitor CPU Ready. From the chart it appears high, but Usage is in percent on one axis and Ready is in milliseconds on the other, so I am not sure.
I need further explanation of Ready: what are good numbers, what are bad numbers, and, looking at the chart and figures below, do we have an issue? What needs to happen: a reduction in CPUs, or additional CPUs?

If you look at the VM-CPU-REAL-TIME picture, READY is highlighted

Does the VM exhibit concerns? What milliseconds do we need to stay under? What can we do to fix the issue? There is no memory ballooning.

REAL-TIME:
(Virtual Machine Server) VM Usage/Ready  (Real-Time)

Usage (Percent)                  
Latest: 1.09   Maximum:  16.41   Minimum: .98     Average: 1.813

Ready (Milliseconds)      
Latest: 209    Maximum:  840     Minimum: 197     Average: 277.289

DATES: 8/31/2013 - 9/12/2013
(Virtual Machine Server) VM Usage/Ready  (dates 8/31/2013 - 9/12/2013)

Usage (Percent)
Latest: 3.02   Maximum:  18.32   Minimum: 1.64    Average: 3.38
Ready (Milliseconds)      
Latest: 272955 Maximum:  975315  Minimum: 105347  Average: 297449.89

DATES: 8/31/2013 - 9/12/2013
(ESX Host VM is running on) Usage/Ready  (dates 8/31/2013 - 9/12/2013)

Usage (Percent)
Latest: 32.55    Maximum:  35.78     Minimum: 14.44    Average: 23.42

Ready (Milliseconds)      
Latest: 12854700 Maximum:  16548164  Minimum: 4276967  Average: 8912704.2
VM-Real-Time.jpg
vm.jpg
esxhost.jpg
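
The Ready counter in the vSphere performance charts is a summation in milliseconds over each sample interval, so it only becomes comparable to Usage once it is divided by the interval length. Below is a minimal sketch of that conversion, assuming the real-time chart's 20-second sample interval. The function name is made up for illustration, and the per-vCPU division assumes the value read from the chart is the VM-level total across all vCPUs. The interval behind the 8/31/2013 - 9/12/2013 charts is not stated in this thread, so those figures are not converted here.

```python
def cpu_ready_percent(ready_ms, interval_s=20, vcpus=1):
    """Convert a CPU Ready summation (ms) into a percentage of the sample interval.

    interval_s: chart sample interval in seconds (20 s for the real-time chart).
    vcpus: divide by the vCPU count for an average per-vCPU figure, assuming the
           chart value is the VM-level total across vCPUs.
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms / (interval_s * 1000.0) / vcpus * 100.0


# Real-time Ready figures quoted above for the 4-vCPU VM
realtime_ready_ms = {"latest": 209, "maximum": 840, "minimum": 197, "average": 277.289}

for name, ms in realtime_ready_ms.items():
    total = cpu_ready_percent(ms)
    per_vcpu = cpu_ready_percent(ms, vcpus=4)
    print(f"{name:8s} {ms:>8} ms -> {total:5.2f}% total, {per_vcpu:5.2f}% per vCPU")
```

On these numbers the real-time maximum of 840 ms works out to about 4.2% of a 20-second interval for the whole VM. Charts covering longer date ranges are rolled up over larger intervals, which is why their millisecond values look so much bigger; pass the matching `interval_s` before comparing them.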
S Z

Please provide more information.

What is your VMware version?
Is this the only server running on the host?
Do you use a storage subsystem (SAN, etc.) or an integrated RAID controller?
SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)
This solution is only available to members.

ASKER

I haven't scaled back or added CPUs, as this would require downtime, and I needed assistance in determining whether there was indeed an issue with CPU and the CPU Ready state.
Please advise on milliseconds (good numbers) and what to do if it falls outside that range.
Hot Add is disabled.


VMware version is ESXi 5.1.0, build 799733.
There are an estimated 15-20 VMs running on this host, and there are no alarms.

The VM is using as follows:

CPU: 4
Memory: 6GB
SCSI Controller:  LSI Logic SAS
Network Adapter:  E1000
VM version 7


SQLSERVER Reporting Services

SQLSERVERNAME\MSSQLSERVER
URL
http://sqlservername:80/Reports


We are having a bit of a hard time getting the G5 detail report to work for the timeframe of 9/1 to 9/11. We had a big DL disbursement on 9/6, so the counts are significantly higher than for a normal timeframe. When we try to run the G5 details, it just spins for a long time (hours) and doesn't return anything.

When we run the detail query in TOAD, it returns in less than 5min.

The report via the web brings up several accounts.
The issue, and even timeouts, occur when expanding an account to show all the names.

One error shown in the browser is:
Webpage error details
User Agent: Mozilla/4.0 (compatible MSIE 7.0; Windows NT 5.1; Trident/4.0; FunWebProducts; .Net CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; MS-RTC LM 8; .NET4.0C; ASKTBORJ/5.13.1.18107; FunWebProducts; .Net4.0E)

Timestamp Fri, 13 SEP 2013 13:23:11 UTC

""MESSAGE: NOT ENOUGH STORAGE AVAILABLE TO COMPLETE THIS OPERATION""

Line: 5
Char: 63805
Code: 0
URI http://server/reports/scriptresource.adx?d=...................................................



The actual server has over 10GB of free space.






Scale-Out Deployment shows
SQL SERVER Name SQLCLUV2\CRM2011
DataBase Name ReportServer
Report Server Mode:  Native

SQLSERNAME   INSTANCE MSSQLSERVER  JOINED

Went to DB ReportServer
Files:
ReportServer Rows Data Primary Initial size 102MB Autogrow by 1MB, unrestricted growth

ReportServer LOG Not applicable Initial size 47MB Autogrow by 10%, restricted growth to 2097152 MB


On SQLCLU:
All cluster disks have plenty of disk space, and all services are online.

Events in SQLCLU eventlog: Thousands of these:

Service Control Manager: (event id:7031)The SMS Agent Host service terminated unexpectedly.  It has done this 305 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.

VDS Basic Provider: (event id:1) Unexpected failure. Error code: 490@01010004

DistributedCOM: (event id:10016)The application-specific permission settings do not grant Local Launch permission for the COM Server application with CLSID
{24FF4FDC-1D9F-4195-8C79-0DA39248FF48}
 and APPID
{B292921D-AF50-400C-9B75-0C57A7F29BA1}
 to the user NT AUTHORITY\SYSTEM SID (S-1-5-18) from address LocalHost (Using LRPC). This security permission can be modified using the Component Services administrative tool.

Application Error:  (event id:1000)Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0xf04
Faulting application start time: 0x01ceb0842595c275
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: e3827f13-1c7f-11e3-b234-d067e5fd2512

SQLAGENT$CRM2011 (event id: 208) SQL Server Scheduled Job 'CRM2011_CHECKDB_All_DBs.CheckDB_Execution' (0xA008FBC1F6345045B0BD21355FB3F537) - Status: Failed - Invoked on: 2013-09-12 01:00:00 - Message: The job failed.  The Job was invoked by Schedule 28 (CRM2011_CHECKDB_All_DBs.CheckDB_Execution).  The last step to run was step 1 (CheckDB_Execution).

PerfNET  (event id:2006)Unable to read Server Queue performance data from the Server service. The first four bytes (DWORD) of the Data section contains the status code, the second four bytes contains the IOSB.Status and the next four bytes contains the IOSB.Information.






Events in Report Server eventlog: Thousands of these:

VMware Tools (event id: 1000)
[ warning] [vmusr:vmusr] Error in the RPC receive loop: RpcIn: Unable to send.

** I made the tools.conf file in c:\programdata\vmware\vmware tools\
[logging]
vmusr.level = error


Application Error (event id: 1000)
Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0xa1c
Faulting application start time: 0x01ceb081f0cf0cd2
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: a12f919a-1c7d-11e3-814c-005056b00041


DistributedCOM (event id: 10016)

The application-specific permission settings do not grant Local Launch permission for the COM Server application with CLSID
{24FF4FDC-1D9F-4195-8C79-0DA39248FF48}
 and APPID
{B292921D-AF50-400C-9B75-0C57A7F29BA1}

 to the user NT AUTHORITY\SYSTEM SID (S-1-5-18) from address LocalHost (Using LRPC). This security permission can be modified using the Component Services administrative tool.

Service Control Manager (event id: 7031)
The SMS Agent Host service terminated unexpectedly.  It has done this 368 time(s).  The following corrective action will be taken in 300000 milliseconds: Restart the service.

ASKER

Yes there is a SAN.
There are 4 or 5 nodes in cluster
HA/DRS enabled.
Any VMware snapshots?

Is this VM being moved by vMotion (DRS)?

ASKER

No Snapshots.

I looked at the DRS history and there was only one vMotion, at 12:30am this morning, and it was for a different VM. This issue has been going on for a while.

DRS = Fully Automated
Migration Threshold is in the middle.

The VM uses the default (fully automated).


Inside the VM, Task Manager readings when choosing the dates and running the report:

Overall Windows TaskManager:
CPU  3%
Memory 40%

When expanding the accounts after report summary
CPU 39%
Memory 46%

Again this is in taskmanager.


In VMware vCenter Performance on the VM
During the time of the job

Measurement:                        Latest:                  Maximum:       Minimum:         Average:
Ready: (Milliseconds)              1626                     3204                 210                    416.939
Usage: (Percent)                      1.89                      42.64                1.39                   4.943


I attached another print screen with Ready.

Also, when running the report within the VM, I got an error.

Message from Website:
Out of memory at line: 5
CPU1.jpg

ASKER

Webpage error details

User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)
Timestamp: Fri, 13 Sep 2013 15:27:10 UTC

Hundreds of these errors:

Message: 'null' is null or not an object
Line: 3234
Char: 9
Code: 0
URI: http://Server/Reports/Reserved.ReportViewerWebControl.axd?OpType=Resource&Version=10.50.1600.1&Name=ViewerScript

ASKER

Log Name:      Application
Source:        ASP.NET 2.0.50727.0
Date:          9/13/2013 10:56:43 AM
Event ID:      1309
Task Category: Web Event
Event code: 3005
Event message: An unhandled exception has occurred.
 
Application information:
    Application domain: ReportManager_MSSQLSERVER_0-6-130235578004559384
    Trust level: RosettaMgr
    Application Virtual Path: /Reports
    Application Path: C:\Program Files\Microsoft SQL Server\MSRS10_50.MSSQLSERVER\Reporting Services\ReportManager\
    Machine name: SQLSERVER
 
Process information:
    Process ID: 4116
    Process name: ReportingServicesService.exe
    Account name: NT AUTHORITY\NETWORK SERVICE
 
Exception information:
    Exception type: COMException
    Exception message: This network connection does not exist. (Exception from HRESULT: 0x800708CA)
 
Request information:
    Request URL: http://server/Reports/Pages/Report.aspx?ItemPath=%2fSMS+Reports%2fAccounting%2fPell_G5_Summary
    Request path: /Reports/Pages/Report.aspx
    User host address: ::1
    User:  
    Is authenticated: False
    Authentication Type:  
    Thread account name: NT AUTHORITY\NETWORK SERVICE
 
Thread information:
    Thread ID: 25
    Thread account name: NT AUTHORITY\NETWORK SERVICE
    Is impersonating: False
    Stack trace:    at Microsoft.ReportingServices.HostingInterfaces.IRsHttpPipeline.GetAuthType()
   at ReportingServicesHttpRuntime.RsWorkerRequest.GetUserToken()
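
The HRESULT in the exception above can be unpacked to see which subsystem raised it. Here is a small sketch using the standard HRESULT bit layout (the helper name is hypothetical, for illustration only). Facility 7 is FACILITY_WIN32, which means the low 16 bits are a plain Win32 error number that can be looked up in the system error code list.

```python
def decode_hresult(hresult):
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    value = hresult & 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),        # bit 31: 1 = failure
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: originating subsystem
        "code": value & 0xFFFF,              # bits 0-15: error code
    }


FACILITY_WIN32 = 7

fields = decode_hresult(0x800708CA)
print(fields)
if fields["facility"] == FACILITY_WIN32:
    print("Win32 error code:", fields["code"])
```

For 0x800708CA this gives Win32 error 2250, which matches the "This network connection does not exist" text in the event.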


Log Name:      Application
Source:        Application Error
Event ID:      1000

Description:
Faulting application name: CcmExec.exe, version: 4.0.6487.2000, time stamp: 0x4ab33e4d
Faulting module name: ntdll.dll, version: 6.1.7601.17514, time stamp: 0x4ce7ba58
Exception code: 0xc0000005
Fault offset: 0x0009ce04
Faulting process id: 0x1010
Faulting application start time: 0x01ceb08b168e685a
Faulting application path: C:\Windows\SysWOW64\CCM\CcmExec.exe
Faulting module path: C:\Windows\SysWOW64\ntdll.dll
Report Id: d7298c9a-1c86-11e3-814c-005056b00041

ASKER

Attached is the Demand measurement, which reports the amount of CPU resources the VM would use if there were no contention.

Latency: % of time the virtual machine is unable to run because it is contending for access to the physical CPU

Measurement:                        Latest:                  Maximum:       Minimum:         Average:
Latency: (Percent)                     .64                        8.56                  .210                    .909


So is the CPU Ready state (milliseconds) high? Do we need to reduce CPUs?
demand.jpg

ASKER

Can anybody go through all my comments above and advise? I tried to include as much detail and information as possible, and posted several comments covering different things.

ASKER

Disk Performance attached.
disk.jpg
Do you know how the host BIOS is configured?

Is it in Eco Mode rather than Performance Mode?

Is there a setting in the BIOS to reduce power?

ASKER

I will have to check on the BIOS. What are your thoughts on CPU Ready for the VM?

ASKER

It looks like there are three things going on in the infrastructure.

1. A lot of VMs are configured with multiple vCPUs, some with 4 and even 8.
A lot of VMs have larger memory but then have limits.
Example: memory of 16GB and a limit of 8GB.

There also appears to be disk latency, but that could be because of the memory limits and swapping.

Then on some VMs I am seeing CPU utilization errors, but most of them appear to be CPU Ready.

So if a VM on host A has a lot of CPU Ready, will that affect other VMs on the same host?

Why would or should a VM have memory limits, other than to trick applications and installers that require a certain amount of memory?

What's the best way to reconfigure when you have 300 VMs and hot-add CPU and hot-add memory are not enabled?
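
With this many VMs, one way to find candidates for reconfiguration is to export the inventory and flag VMs whose memory limit is below their configured memory, or that have a high vCPU count. This is a rough sketch only: the CSV layout, column names, sample VM names and the review threshold are all hypothetical, not taken from any VMware tool, and a limit of -1 is assumed to mean "unlimited".

```python
import csv
import io

# Hypothetical inventory export; replace with a real export using the same columns.
SAMPLE = """name,vcpus,memory_mb,memory_limit_mb
vm-a,4,16384,8192
vm-b,2,4096,-1
vm-c,8,8192,8192
"""


def review_vms(rows, vcpu_review_threshold=4):
    """Return (name, reasons) for VMs worth a closer look."""
    findings = []
    for row in rows:
        memory = int(row["memory_mb"])
        limit = int(row["memory_limit_mb"])
        vcpus = int(row["vcpus"])
        reasons = []
        # Assumption: -1 means no limit is set
        if limit != -1 and limit < memory:
            reasons.append(f"memory limit {limit} MB is below configured {memory} MB")
        if vcpus >= vcpu_review_threshold:
            reasons.append(f"{vcpus} vCPUs, check whether the workload uses them")
        if reasons:
            findings.append((row["name"], reasons))
    return findings


findings = review_vms(csv.DictReader(io.StringIO(SAMPLE)))
for name, reasons in findings:
    print(name + ": " + "; ".join(reasons))
```

Since hot-add is not enabled, the flagged list could then be used to batch the changes into planned maintenance windows.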

ASKER

Closing the request for now and awarding points, as it helped. However, the issue is still ongoing. Thanks, EE.