Tuesday, April 10, 2012

Network I/O Control (NetIOC): Architecture, Performance and Best Practices

NetIOC is supported only with the vNetwork Distributed Switch (vDS); it is a new feature in vSphere 4.1 (vSphere 5 pushes this control down to the VM level).

NetIOC Feature Set:
  • Traffic isolation
  • Shares: allow flexible partitioning of networking capacity
  • Limits: enforce a traffic bandwidth limit on the overall vDS set of dvUplinks
  • Load-Based Teaming (LBT): makes effective use of a vDS set of dvUplinks for networking capacity. (LBT is not based on limits and shares, and it is not the default teaming policy for a DV port group; it reshuffles port-to-uplink bindings, and it only starts to act when an uplink's utilization stays above 75% for more than 30 seconds; see the short sketch after this list.)
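
Below is a minimal, illustrative Python sketch of the LBT trigger condition described above (utilization above 75% sustained for more than 30 seconds). It is not VMware's implementation; the sampling interval and data structures are assumptions made for the example.

    # Illustrative sketch only: an uplink becomes a rebalancing candidate after
    # its utilization stays above 75% for more than 30 seconds (per the text).
    # The 5-second sampling interval is an assumption for this example.

    UTILIZATION_THRESHOLD = 0.75   # 75% of uplink capacity
    SUSTAINED_SECONDS = 30         # must stay above the threshold this long

    def needs_rebalance(samples, interval_s=5):
        """samples: utilization ratios (0.0-1.0) for one dvUplink, newest last,
        taken every `interval_s` seconds."""
        sustained = 0
        for util in reversed(samples):          # walk back from the newest sample
            if util <= UTILIZATION_THRESHOLD:
                break
            sustained += interval_s
        return sustained > SUSTAINED_SECONDS

    # Seven consecutive 5-second samples above 75% -> 35 seconds sustained -> True
    print(needs_rebalance([0.60, 0.80, 0.82, 0.90, 0.85, 0.88, 0.91, 0.93]))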

NetIOC defines how the different network traffic flows are scheduled through each congested network adapter on the vDS.

A physical network adapter can be excluded from NetIOC through the host configuration, under the advanced settings in the Software section.

Network resource pools represent the different traffic types on the vDS (FT, VM, Management, iSCSI, NFS, vMotion).

Shares take effect only during periods of contention.
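
To make this concrete, here is a small illustrative Python calculation (not VMware's actual scheduler): when aggregate demand fits within the uplink, every flow gets what it asks for and shares are irrelevant; only when demand exceeds capacity is bandwidth divided in proportion to shares. The capacities, demands and share values are made-up examples.

    # Illustrative only: proportional split by shares under contention.
    # Simplified: it does not redistribute allocation a flow cannot use.

    def allocate(capacity_gbps, flows):
        """flows: dict of name -> (demand_gbps, shares)."""
        total_demand = sum(d for d, _ in flows.values())
        if total_demand <= capacity_gbps:
            # No contention: shares do not matter, each flow gets its demand.
            return {name: d for name, (d, _) in flows.items()}
        total_shares = sum(s for _, s in flows.values())
        # Contention: split the uplink capacity in proportion to shares.
        return {name: capacity_gbps * s / total_shares
                for name, (_, s) in flows.items()}

    flows = {"vMotion": (8.0, 50), "VM": (6.0, 100), "NFS": (4.0, 50)}
    print(allocate(10.0, flows))  # contention: VM 5.0, vMotion 2.5, NFS 2.5 Gbps
    print(allocate(25.0, flows))  # no contention: each flow gets its full demand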

NetIOC Best Practices

NetIOC is a very powerful feature that will make your vSphere deployment even more suitable for your I/O-consolidated datacenter. Follow these best practices to get the most out of it:
Best practice 1: When using bandwidth allocation, use “shares” instead of “limits,” as the former has greater flexibility for unused capacity redistribution. Partitioning the available network bandwidth among different types of network traffic flows using limits has shortcomings. For instance, allocating 2Gbps bandwidth by using a limit for the virtual machine resource pool provides a maximum of 2Gbps bandwidth for all the virtual machine traffic even if the team is not saturated. In other words, limits impose hard limits on the amount of the bandwidth usage by a traffic flow even when there is network bandwidth available.
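
The 2Gbps example can be shown numerically. The sketch below uses illustrative numbers (a 10GbE uplink with all other flows idle) and is not the actual scheduler; it simply contrasts a hard limit with shares when the team is unsaturated.

    # Illustrative: a hard limit wastes idle capacity, shares do not.
    # Assume a 10GbE uplink where VM traffic wants 6 Gbps and everything
    # else is idle (made-up numbers).

    UPLINK_GBPS = 10.0
    VM_DEMAND_GBPS = 6.0

    # With a 2Gbps limit on the VM resource pool, VM traffic is capped at
    # 2 Gbps even though the uplink is otherwise free.
    with_limit = min(VM_DEMAND_GBPS, 2.0)

    # With shares, the constraint appears only under contention; on an idle
    # team the VM pool can consume its whole demand.
    with_shares = min(VM_DEMAND_GBPS, UPLINK_GBPS)

    print(with_limit, with_shares)   # 2.0 vs 6.0 Gbps
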
Best practice 2: If you are concerned about physical switch and/or physical network capacity, consider imposing limits on a given resource pool. For instance, you might want to put a limit on vMotion traffic flow to help in situations where multiple vMotion traffic flows initiated on different ESX hosts at the same time could possibly oversubscribe the physical network. By limiting the vMotion traffic bandwidth usage at the ESX host level, we can prevent the possibility of jeopardizing performance for other flows going through the same points of contention.
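
A quick back-of-the-envelope illustration of why a per-host vMotion limit helps here (all numbers are made up, not from the article):

    # Several hosts starting vMotions at once can oversubscribe a shared
    # physical link; a per-host NetIOC limit bounds the aggregate.
    HOSTS = 4                    # ESX hosts doing vMotion simultaneously
    PER_HOST_VMOTION_GBPS = 8.0  # unmanaged vMotion can nearly fill a 10GbE NIC
    PHYSICAL_LINK_GBPS = 10.0    # shared switch uplink they all traverse
    PER_HOST_LIMIT_GBPS = 2.0    # example limit on each host's vMotion pool

    unmanaged = HOSTS * PER_HOST_VMOTION_GBPS  # 32 Gbps offered to a 10G link
    limited = HOSTS * PER_HOST_LIMIT_GBPS      #  8 Gbps, fits within the link
    print(unmanaged, limited)
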
Best practice 3: Fault tolerance is a latency-sensitive traffic flow, so it is recommended to always set the corresponding resource-pool shares to a reasonably high relative value in the case of custom shares. However, in the case where you are using the predefined default shares value for VMware FT, leaving it set to high is recommended.
Best practice 4: We recommend that you use LBT as your vDS teaming policy while using NetIOC in order to maximize the networking capacity utilization.
NOTE: As LBT moves flows among uplinks it may occasionally cause reordering of packets at the receiver.
Best practice 5: Use the DV Port Group and Traffic Shaper features offered by the vDS to maximum effect when configuring the vDS. Configure each of the traffic flow types with a dedicated DV Port Group. Use DV Port Groups as a means to apply configuration policies to different traffic flow types, and more important, to provide additional Rx bandwidth controls through the use of Traffic Shaper. For instance, you might want to enable Traffic Shaper for the egress traffic on the DV Port Group used for vMotion. This can help in situations when multiple vMotions initiated on different vSphere hosts converge to the same destination vSphere server.
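
For readers who script their vDS configuration, the Python sketch below shows roughly how egress traffic shaping could be enabled on a vMotion DV port group with the pyVmomi SDK. The port group name, bandwidth values, and the type/method names (vim.dvs.DistributedVirtualPortgroup.ConfigSpec, VmwarePortConfigPolicy, TrafficShapingPolicy, ReconfigureDVPortgroup_Task) are my assumptions based on the SDK, not taken from this article; verify them against your SDK version before relying on this.

    # Hedged sketch: enable egress (out) traffic shaping on a vMotion DV port
    # group via pyVmomi. Names, units and values are assumptions to verify
    # against your vSphere SDK version; this is not from the article.
    from pyVim.connect import SmartConnect
    from pyVmomi import vim

    # Hypothetical vCenter and credentials; newer pyVmomi versions may need
    # explicit SSL certificate handling here.
    si = SmartConnect(host="vcenter.example.com", user="administrator",
                      pwd="secret")
    content = si.RetrieveContent()

    # Locate the DV port group by name (hypothetical name "dvPG-vMotion").
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    pg = next(p for p in view.view if p.name == "dvPG-vMotion")
    view.DestroyView()

    # Shape egress traffic; averageBandwidth/peakBandwidth are in bits per
    # second and burstSize in bytes (check the SDK doc for your version).
    shaping = vim.dvs.DistributedVirtualPort.TrafficShapingPolicy(
        inherited=False,
        enabled=vim.BoolPolicy(value=True),
        averageBandwidth=vim.LongPolicy(value=2 * 10**9),   # ~2 Gbps, example
        peakBandwidth=vim.LongPolicy(value=2 * 10**9),
        burstSize=vim.LongPolicy(value=100 * 1024 * 1024))  # 100 MB, example

    port_cfg = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        outShapingPolicy=shaping)

    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        configVersion=pg.config.configVersion,
        defaultPortConfig=port_cfg)

    pg.ReconfigureDVPortgroup_Task(spec=spec)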

Conclusions
Consolidating the legacy GbE networks in a virtualized datacenter environment with 10GbE offers many benefits: ease of management, lower capital costs and better utilization of network resources. However, during peak periods of contention, the lack of control mechanisms to share the network I/O resources among the traffic flows can result in a significant performance drop for critical traffic flows. Such performance loss is unpredictable and uncontrollable if access to the network I/O resources is unmanaged. NetIOC, available in vSphere 4.1, provides a mechanism to manage access to the network I/O resources when multiple traffic flows compete. The experiments conducted in VMware performance labs using industry-standard workloads show that:
• Lack of NetIOC can result in unpredictable loss in performance of critical traffic flows during periods of contention.
• NetIOC can effectively provide service level guarantees to the critical traffic flows. Our test results showed that NetIOC eliminated a performance drop of as much as 67 percent observed in an unmanaged scenario.
• NetIOC in combination with Traffic Shaper provides a comprehensive network convergence solution, enabling features that are not available with any of the hardware solutions in the market today.

