andyalder
 asked on

Non-uniform PCIe Access - are any OSs "NUPA" aware?

Wondering which PCIe slots to use on dual Intel E5 CPU based servers, and whether any operating systems know how to load balance so as to keep PCIe access local to the CPU.

It's well known that modern OSs are NUMA aware and will try to allocate RAM/cores so that memory access is local rather than across the HyperTransport/QuickPath buses, because modern CPUs have built-in memory controllers.
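For anyone not familiar, this is roughly what "NUMA aware" looks like at the API level. A minimal sketch on Linux using libnuma; the node number 0 and the 64 MiB size are only illustrative:

/* Build with: gcc numa_local.c -lnuma
 * Sketch only: allocate a buffer on a specific NUMA node and run on
 * that node's cores, so every access to the buffer is local rather
 * than a hop over the inter-CPU link.
 */
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not supported on this system\n");
        return 1;
    }

    int node = 0;                       /* illustrative: target node 0 */
    size_t len = 64UL * 1024 * 1024;    /* illustrative: 64 MiB */

    /* Buffer physically backed by node 0's DIMMs */
    void *buf = numa_alloc_onnode(len, node);
    if (!buf) {
        fprintf(stderr, "numa_alloc_onnode failed\n");
        return 1;
    }

    /* Keep this thread on node 0's cores so accesses to buf stay local */
    numa_run_on_node(node);

    /* ... use buf here ... */

    numa_free(buf, len);
    return 0;
}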

What about PCIe cards, though, now that the PCIe controllers are built into the CPUs rather than into the southbridge? Say, for example, I have an HP DL380 Gen8 or Dell R720 with two CPUs and I put dual-port NICs in slots 1 and 4 and team them together for 4×1Gb. Would the OS be clever enough to know that slot 1 is on CPU1 and slot 4 is on CPU2 and direct the outbound packets to the local NIC, or would half of the traffic go over the inter-CPU link at random? Is there any benefit in spreading NICs and FC HBAs over both CPUs' built-in PCIe controllers?
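On Linux at least the kernel does export the locality information any such teaming/driver logic would need, so you can see which CPU a slot hangs off. A rough sketch, assuming the standard sysfs attribute; "eth0" is only a placeholder interface name, and -1 means the platform didn't report a node:

/* Sketch: print the NUMA node a NIC's PCIe function sits on by
 * reading /sys/class/net/<iface>/device/numa_node.
 */
#include <stdio.h>

int main(int argc, char **argv)
{
    const char *iface = (argc > 1) ? argv[1] : "eth0";
    char path[256];

    snprintf(path, sizeof(path), "/sys/class/net/%s/device/numa_node", iface);

    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }

    int node = -1;
    if (fscanf(f, "%d", &node) == 1)
        printf("%s is attached to NUMA node %d\n", iface, node);
    fclose(f);
    return 0;
}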

I'm going to sleep on it and see if anyone knows for sure.
Server Hardware, Windows Server 2012, VMware, Dell

ASKER CERTIFIED SOLUTION
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

andyalder

ASKER
Did Dell come back with anything? It's really a question for the likes of VMware, MS and the NIC/HBA manufacturers to address rather than the server makers. I think we have to assume there's nothing that uses locality in load balancing at the moment, so we might as well put the cards on either CPU and accept an average of 50% local traffic.
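The nearest manual workaround I can see is to pin the worker threads (and ideally the IRQs) for each NIC to the package it hangs off. A hedged sketch of the thread side of that on Linux with libnuma, reusing the sysfs lookup above; the interface name and the whole approach are my own assumption of how you'd do it, not anything the vendors document:

/* Sketch: bind the calling thread and its allocations to the NUMA
 * node that a given NIC is attached to, so the heavy transmit work
 * for that NIC stays off the inter-CPU link.
 * Build with: gcc pin_local.c -lnuma
 */
#include <numa.h>
#include <stdio.h>

/* Return the NIC's NUMA node from sysfs, or -1 if not reported. */
static int nic_numa_node(const char *iface)
{
    char path[256];
    int node = -1;

    snprintf(path, sizeof(path), "/sys/class/net/%s/device/numa_node", iface);
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%d", &node) != 1)
        node = -1;
    fclose(f);
    return node;
}

int main(int argc, char **argv)
{
    const char *iface = (argc > 1) ? argv[1] : "eth0";  /* placeholder name */

    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not available\n");
        return 1;
    }

    int node = nic_numa_node(iface);
    if (node < 0) {
        fprintf(stderr, "%s: no NUMA node reported, leaving affinity alone\n", iface);
        return 1;
    }

    numa_run_on_node(node);     /* run only on that package's cores   */
    numa_set_preferred(node);   /* prefer that package's local memory */

    printf("pinned to node %d (local to %s)\n", node, iface);
    /* ... do the NIC-heavy work here ... */
    return 0;
}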

Not sure if there is a heat/turbo gain from spreading the cards over both CPUs. If the processors and airflow were identical, then maybe the additional heat from the PCIe controller part of the chip would come into play and slow the CPU on that die down, but that's delving into the 1-2% speed improvements you can get by hand-picking the fastest example of a particular chip part number.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Dell still in sleepy mode over Chrimbo!
andyalder

ASKER
EE is chasing this as abandoned so I'll close it. I guess it doesn't matter much which CPU the peripherals are on at the moment; otherwise the manufacturers would be shouting about the slight speed improvements from clever drivers.
Andrew Hancock (VMware vExpert PRO / EE Fellow/British Beekeeper)

Still with the Dell Escalation Team; I've bookmarked the question so I'll post back here, or catch up with me through my profile!