Please excuse my ignorance in this area; I come from a risk management background rather than a technical infrastructure/Windows networking background, but I am researching capacity monitoring of servers. From my reading, four metrics generally seen as useful to monitor for each of your production servers in any IT environment are:
CPU capacity / utilisation
Memory capacity / utilisation
Network capacity / utilisation
Storage capacity / utilisation
I am familiar with the basic concepts of CPU, memory, and storage capacity monitoring: check CPU and memory utilisation to ensure it is not near 100% capacity, or you may face performance issues and even system unavailability; similarly with storage, where 90% of capacity used could again lead to performance issues, potential data loss, system unavailability, etc.
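To make my current understanding concrete, here is a toy sketch of the threshold logic described above. The threshold values are the rules of thumb from my reading, not authoritative limits, and the function and metric names are my own invention:

```python
# Rule-of-thumb warning thresholds (percent utilised), per my reading;
# these are illustrative values, not authoritative limits.
WARN_THRESHOLDS = {
    "cpu_pct": 85.0,      # sustained CPU near 100% risks performance issues
    "memory_pct": 90.0,   # exhausted memory can cause paging/unavailability
    "storage_pct": 90.0,  # near-full disks risk data loss and outages
}

def capacity_warnings(utilisation):
    """Return the metrics whose current utilisation % breaches its threshold."""
    return [metric for metric, pct in utilisation.items()
            if pct >= WARN_THRESHOLDS.get(metric, 100.0)]

print(capacity_warnings({"cpu_pct": 97.0, "memory_pct": 60.0, "storage_pct": 92.0}))
# -> ['cpu_pct', 'storage_pct']
```

My question below is essentially what the equivalent metric and threshold would look like for the network.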
However, I am not entirely sure what a server's "network utilisation" metric is actually measuring, since the server itself doesn't appear to have a "network capacity limit" as such. My questions are (sorry if they are at a basic level):
1) What, in terms of MB per second, is this network utilisation metric actually measuring?
2) What would be defined as a dangerous utilisation percentage?
3) What can be done to improve things if current network utilisation is at a dangerous threshold? You can add memory, CPU, or storage capacity to a server, but what can you do or add to the server for network capacity problems? If nothing on the server itself, which other areas of the infrastructure should you look at when a server's network utilisation in MB/s is excessive?
4) What causes a server to use excessive network bandwidth in MB/s? Can specific events cause it, or is it linked to hardware that is insufficient for the server's requirements?
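For question 1, my current working assumption is that utilisation is the observed throughput expressed as a percentage of the network interface's rated speed. A rough sketch of that arithmetic, with made-up numbers and a function name of my own invention:

```python
def link_utilisation_pct(bytes_t0, bytes_t1, interval_s, link_speed_bps):
    """Utilisation of a network link over a sampling interval.

    bytes_t0 / bytes_t1: interface byte counters at start/end of the interval
    interval_s:          length of the interval in seconds
    link_speed_bps:      rated NIC speed in bits per second (1 Gb/s = 1e9)
    """
    bits_transferred = (bytes_t1 - bytes_t0) * 8   # bytes -> bits
    throughput_bps = bits_transferred / interval_s
    return 100.0 * throughput_bps / link_speed_bps

# Example: 600 MB transferred in 10 s on a 1 Gb/s NIC
print(link_utilisation_pct(0, 600_000_000, 10, 1_000_000_000))
# -> 48.0 (i.e. 48% of the link's rated capacity)
```

Please correct me if utilisation is normally measured against something other than the NIC's rated speed (e.g. a switch port or WAN link).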
Please keep your answers both tech-friendly and management-friendly if possible.