Tuning Linux networking

On 2011-09-16, in tips, by netoearth

The Linux kernel is good at self-tuning, but you will need to raise the limits on the sizes of a number of kernel resources. Some of these are obvious (file descriptors and ephemeral ports, for example), but many are subtle and can manifest themselves in strange ways. We suggest the following tunables as a starting point, but do monitor the kernel logs (use ‘dmesg’) for unexpected problems.

The following tunings assume a machine with 2 GB or more of memory.

  • Ephemeral port range:
    echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range
  • Listen queue (SYN cookies are recommended):
    echo 1 > /proc/sys/net/ipv4/tcp_syncookies
    echo 8192 > /proc/sys/net/ipv4/tcp_max_syn_backlog
    # restart Zeus software after changing somaxconn
    echo 1024 > /proc/sys/net/core/somaxconn

    Without SYN cookies, a much larger value for tcp_max_syn_backlog is required, but this consumes additional kernel memory and scales poorly (the hash table that stores the SYN records is of a fixed size).
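Each echo into /proc/sys above has an equivalent sysctl -w form, which is the form used in the later sections of this article. As a minimal sketch of how the two relate, the helper below turns a /proc/sys path into the dotted key that sysctl expects. The name proc_to_sysctl is hypothetical, not a standard tool, and the sketch assumes no path component contains a dot of its own.

```shell
#!/bin/sh
# proc_to_sysctl is a hypothetical helper, not part of any standard package.
# It assumes that no path component contains a literal dot.
proc_to_sysctl() {
    # Strip the /proc/sys/ prefix, then turn the remaining slashes into dots.
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

# Prints the key to use with `sysctl -w KEY=VALUE`.
proc_to_sysctl /proc/sys/net/ipv4/tcp_syncookies
```

So "echo 1 > /proc/sys/net/ipv4/tcp_syncookies" and "sysctl -w net.ipv4.tcp_syncookies=1" set the same value. Both need root, and neither survives a reboot.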

You may also change the following settings, although the default kernel values on recent Linux kernels are generally well sized:

  • TIME_WAIT:
    echo 1800000 > /proc/sys/net/ipv4/tcp_max_tw_buckets
  • Network buffer size (1Gb):
    echo "128000 200000 262144" > /proc/sys/net/ipv4/tcp_mem
  • File descriptors:
    echo 2097152 > /proc/sys/fs/file-max
  • Netfilter conntrack table size: If you’re getting the error message “ip_conntrack: table full, dropping packet.” in dmesg, and you need to use the conntrack module or something that depends on it, then increase the table size by adding the lines
    options ip_conntrack hashsize=1310719
    options nf_conntrack hashsize=1310719

    to /etc/modprobe.conf or /etc/modules.conf (which file depends on your Linux distribution), and reboot.

  • Spirent workaround: If you are using older Spirent test kit, you absolutely must set the following tunables to work around ‘optimizations’ in their TCP stack:
    echo 0 > /proc/sys/net/ipv4/tcp_timestamps
    echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
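A note on the tcp_mem values in the list above: the kernel counts them in memory pages, not bytes. The sketch below converts the three suggested thresholds to MiB. The helper name pages_to_mib is hypothetical, and the 4096-byte default page size is an assumption (common on x86; check with "getconf PAGESIZE" on your machine). Under that assumption the highest threshold works out to 1 GiB, which appears to match the "1Gb" label.

```shell
#!/bin/sh
# pages_to_mib is a hypothetical helper. The second argument is the page
# size in bytes; the 4096 default is an assumption, not a measured value.
pages_to_mib() {
    pages=$1
    page_size=${2:-4096}
    # Integer arithmetic, so fractional MiB are truncated.
    echo $(( pages * page_size / 1024 / 1024 ))
}

# The three tcp_mem thresholds suggested above: low, pressure, high.
for pages in 128000 200000 262144; do
    echo "$pages pages = $(pages_to_mib "$pages") MiB"
done
```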

Here are some suggestions when tweaking Linux machines for maximum network performance. For full tuning advice, including non-OS tunings, please see our main article, Tuning ZXTM for Maximum Performance.

Interrupts

Interrupts (IRQs) are wake-up calls to the CPU when new network traffic arrives. The CPU stops what it is doing and is diverted to handle the new network data. Most NICs will tune their interrupts to be as efficient as possible; for the full details, you will need to consult the documentation for the drivers (for instance, the documentation for the e1000 Intel Gigabit NIC). In general, the defaults are quite sensible.

If you are on a machine with multiple CPUs/cores, the interrupt work should be spread out over as many CPUs as possible. Otherwise, one CPU can become the bottleneck under high network traffic. In Linux, you should install the irqbalance program, which will dynamically adjust how interrupts are handled by each CPU. Irqbalance is available as an installable package with most Linux distributions.

Sometimes the IRQ balancing doesn’t work out well. If under high load you see one or more ‘ksoftirqd’ processes using lots of CPU (run top to check), then something is wrong. Tracking down the problem can be difficult, but changing Linux kernel versions or NIC drivers, or installing a different version of irqbalance, can help.
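One way to see whether interrupt work really is spread across CPUs is to total the per-CPU counts in /proc/interrupts for the rows belonging to your NIC. The following is a rough sketch only: irq_per_cpu is a hypothetical helper name, and it assumes the NIC's interrupt rows contain the interface name (such as eth0), which depends on the driver.

```shell
#!/bin/sh
# irq_per_cpu PATTERN [FILE]
# Hypothetical helper: sums the per-CPU interrupt counts of every row in
# FILE (default /proc/interrupts) that matches PATTERN.
irq_per_cpu() {
    pattern=$1
    file=${2:-/proc/interrupts}
    awk -v pat="$pattern" '
        NR == 1 {
            # Header row: one column name per online CPU.
            ncpu = NF
            for (i = 1; i <= ncpu; i++) name[i] = $i
            next
        }
        $0 ~ pat {
            # Field 1 is the IRQ number; the counts start at field 2.
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], total[i]
        }
    ' "$file"
}

# Usage (the interface name is an assumption):
#   irq_per_cpu eth0
```

If one CPU's total dwarfs the others under load, the interrupts are probably not being balanced.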

Network Card Features

Check that any supported NIC features are enabled, using ethtool. Network offload features can be shown with ethtool -k eth0. See the manpage for ethtool to see how to enable supported features.

ethtool -S eth0 will show network statistics. Check that you aren’t getting any packet errors, overruns, collisions, etc., as these are all signs of bad NICs or cabling. (Check that all the cables are plugged in fully; it really does help!)

Finally, check that the NICs are all running at full speed. If the cabling isn’t good, then gigabit cards may fall back to 100 Mbit/s or less when there is lots of traffic. Always check NIC speeds before and after testing to ensure that the network is reliable. If your NIC is running at half-duplex, chances are that something is terribly wrong!
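The statistics check can be partly automated by filtering the ethtool -S output for counters that look like errors and are above zero. This is a sketch under assumptions: nonzero_error_counters is a hypothetical helper, and counter names vary between drivers, so the name pattern used here is a guess that you may need to adjust.

```shell
#!/bin/sh
# nonzero_error_counters: hypothetical filter for `ethtool -S` output.
# Reads "name: value" lines on stdin and prints the counters whose name
# suggests a problem and whose value is greater than zero.
# The name pattern is an assumption; counter names are driver-specific.
nonzero_error_counters() {
    awk -F': *' '
        tolower($1) ~ /err|drop|overrun|collision/ && $2 + 0 > 0 {
            sub(/^[ \t]+/, "", $1)   # trim the leading indentation
            print $1 "=" $2
        }
    '
}

# Usage (the interface name is an assumption):
#   ethtool -S eth0 | nonzero_error_counters
```

Running it before and after a test makes it easier to see which counters moved.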

iptables

iptables performs IP filtering / firewalling for Linux. If you aren’t using such features, then be sure to:

  1. Turn it off
  2. REMOVE THE MODULES

The last step is very important. The mere presence of iptables modules can cause up to 30% performance loss, even when their features are not in use! Run lsmod and check for the following modules, then use rmmod modulename to remove them:

  • ip_conntrack
  • iptable_filter
  • ip_tables
  • Anything else with iptable in the name
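The lsmod check can be sketched as a small filter. find_iptables_modules is a hypothetical helper name. It looks for the modules listed above, plus nf_conntrack, which is included here on the assumption that your kernel uses the newer name for the conntrack module.

```shell
#!/bin/sh
# find_iptables_modules: hypothetical filter for `lsmod` output on stdin.
# Prints the names of loaded modules that match the list above.
# nf_conntrack is an addition, assuming the newer conntrack module name.
find_iptables_modules() {
    awk '
        NR == 1 { next }    # skip the "Module  Size  Used by" header
        $1 ~ /iptable/ || $1 == "ip_tables" ||
        $1 == "ip_conntrack" || $1 == "nf_conntrack" { print $1 }
    '
}

# Usage:
#   lsmod | find_iptables_modules
```

Note that rmmod refuses to remove a module that is still in use, so modules that depend on others have to be removed first.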

SYN cookies

SYN cookies are a form of protection against a low-level network denial-of-service attack. If SYN cookies are enabled, Linux will start using them when the SYN backlog queue fills up. Otherwise, new connections will be dropped when the queue is full. In a production environment, SYN cookies should be enabled. In a benchmark environment, they may be disabled and the tcp_max_syn_backlog parameter increased significantly. If you are 100% certain that you will not suffer a SYN flood attack, you may wish to disable SYN cookies and increase the SYN backlog:

  sysctl -w net.ipv4.tcp_syncookies=0
  sysctl -w net.ipv4.tcp_max_syn_backlog=32768
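To judge whether the backlog is actually filling up, it can help to read the kernel's own TCP counters from /proc/net/netstat, where the TcpExt section comes as a line of names followed by a line of values. The sketch below looks up one counter by name. tcpext_counter is a hypothetical helper, and which counters are worth watching (ListenOverflows, ListenDrops and SyncookiesSent are suggested here) is our assumption rather than part of the tuning advice above.

```shell
#!/bin/sh
# tcpext_counter NAME [FILE]
# Hypothetical helper: prints the value of one TcpExt counter from FILE
# (default /proc/net/netstat). Prints nothing if the counter is absent.
tcpext_counter() {
    name=$1
    file=${2:-/proc/net/netstat}
    awk -v want="$name" '
        # First TcpExt line: remember the column of each counter name.
        $1 == "TcpExt:" && !seen {
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        # Second TcpExt line: the values, in the same column order.
        $1 == "TcpExt:" && seen {
            if (want in col) print $(col[want])
            exit
        }
    ' "$file"
}

# Usage:
#   tcpext_counter ListenOverflows
#   tcpext_counter ListenDrops
#   tcpext_counter SyncookiesSent
```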

Other network tunings

These are some suggested tunings. Please see the full list of options, but don’t go overboard: most options have sensible defaults.

  # Widen the range of local ports (needed when
  # making lots of outgoing connections)
  sysctl -w "net.ipv4.ip_local_port_range=1024 65535"
  # Bigger backlog of SYN packets
  sysctl -w net.ipv4.tcp_max_syn_backlog=10240
  # Increase maximum backlog for accepting new connections
  sysctl -w net.core.somaxconn=1024
  # More efficient handling of lots of old connections
  # in the TIME_WAIT state
  sysctl -w net.ipv4.tcp_max_tw_buckets=1800000
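
Settings made with sysctl -w are lost on reboot. A minimal sketch of one way to keep them is to write the same keys in sysctl.conf syntax and load that file at boot. The helper name print_sysctl_conf is hypothetical, and where the file should live (/etc/sysctl.conf, or a file under /etc/sysctl.d on some distributions) is an assumption to check against your distribution.

```shell
#!/bin/sh
# print_sysctl_conf: hypothetical helper that prints the tunings above
# in sysctl.conf syntax ("key = value"). It only writes to stdout.
print_sysctl_conf() {
    cat <<'EOF'
# Widen the range of local ports
net.ipv4.ip_local_port_range = 1024 65535
# Bigger backlog of SYN packets
net.ipv4.tcp_max_syn_backlog = 10240
# Increase maximum backlog for accepting new connections
net.core.somaxconn = 1024
# More efficient handling of connections in the TIME_WAIT state
net.ipv4.tcp_max_tw_buckets = 1800000
EOF
}

print_sysctl_conf
```

Redirect the output into the file of your choice as root, then apply it without rebooting by running sysctl -p followed by that file's path.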