Here’s the design in bullet form:
• The existing core switching is 1-Gbps (a couple of stacked 48-port Cisco Catalyst 3650 switches, almost at full port density), and the new 10-Gbps switches are Cisco Catalyst 3850 SFP+ models.
• The 3850s have UCS servers (as well as Veeam backup, etc.) connected at 10-Gbps, but are connected to the core switching at 1-Gbps.
• The 3850 SFP+ interfaces connecting to the core 3650s are configured for 1-Gbps operation.
• The core 3650s are connected to ASA firewalls at 1-Gbps, which provide a DMZ for externally facing applications.
It turns out that a majority of server-to-server traffic is between internal SQL instances and public resources (web & application tiers) in the DMZ, so the traffic goes at 10-Gbps from ESXi to the 3850s, then has to be sent over 1-Gbps links to the core 3650s towards the DMZ.
When we first tried to cut over to this deployment, nearly all server access stopped. Troubleshooting revealed an extremely high number of drops/discards on the outgoing interfaces of the 3850s. Since then, the customer has only been sending very limited backup traffic (a couple of small applications) over these connections, and the interface discards are still outrageously high. (Not sure if it's related, but the 3850 switches are also running at an unexpectedly high CPU utilization of 70%, even though they aren't handling most of the server traffic yet.)
As you can see in the design diagram below, the ESXi environment still has multiple 1-Gbps connections bypassing the 3850s, so the servers are still up and running on the old design until we solve these problems and can start sending all ESXi server traffic through the 3850s.
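For a sense of scale on the 10-Gbps to 1-Gbps step-down described above, here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the buffer size is a made-up placeholder (not an actual 3850 figure), the function and constant names are hypothetical, and the burst is assumed to arrive at full 10-Gbps line rate.

```python
def time_to_fill_buffer(ingress_bps, egress_bps, buffer_bytes):
    """Return seconds until an egress buffer overflows during a sustained burst.

    Returns float('inf') when the egress link can keep up with the ingress rate.
    """
    net_fill_bps = ingress_bps - egress_bps  # rate at which the queue grows
    if net_fill_bps <= 0:
        return float("inf")
    return buffer_bytes * 8 / net_fill_bps


INGRESS_BPS = 10_000_000_000      # server-facing 10-Gbps links on the 3850s
EGRESS_BPS = 1_000_000_000        # 1-Gbps uplinks toward the core 3650s
ASSUMED_BUFFER_BYTES = 1_000_000  # ASSUMPTION: placeholder, not a real 3850 value

seconds = time_to_fill_buffer(INGRESS_BPS, EGRESS_BPS, ASSUMED_BUFFER_BYTES)
print(f"Assumed buffer fills in about {seconds * 1e6:.0f} microseconds")
```

With that placeholder buffer, a line-rate burst fills the egress queue in under a millisecond, which would explain why even light traffic could produce discards if it arrives in short 10-Gbps bursts. The real numbers depend on the platform's actual buffer allocation.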
The images below are a VERY simplified design of the components (in reality, all of the connections and devices are redundant), a “show interface” snapshot from one of the 3850 interfaces leading to the 3650 core (again, configured as 1-Gbps), and a SolarWinds screenshot showing interface discards.
I’m looking for input on how to attack this problem. Can we easily solve it with a shaping policy on the 3850 interfaces?