Hi

We're designing a SOHO router based on a MIPS processor, wired up to a 24-port switch. The CPU runs NAT (configured with iptables), iptables rules, DHCP, etc., and has no hardware acceleration for these functions. When testing NAT in full-mesh mode (i.e. one WAN port, with all the other ports as LAN ports), we observe a significant system slowdown: the console in particular responds very slowly, and there is also packet loss.

'top' shows that ksoftirqd consumes over 80% of the CPU.

What could be the reason for this behaviour? Does Linux NAT run in userland?

+1  A: 

ksoftirqd is the IRQ handler. You may check /proc/interrupts to see which IRQ is under load.

The CPU is overloaded: use a more powerful model, or use simpler iptables rules. Linux NAT runs in kernel space, and so does ksoftirqd.
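As a rough illustration of the /proc/interrupts check suggested above, here is a minimal Python sketch. It assumes the usual layout of that file (a header row of CPU names, then one row per IRQ with per-CPU counters followed by a description). The function names `parse_interrupts` and `busiest` are made up for this example.

```python
def parse_interrupts(text):
    """Map each IRQ label to (total count over all CPUs, description)."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break  # rows such as ERR may carry fewer counters
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def busiest(before, after, top=5):
    """Return (delta, irq, description) tuples, largest increase first."""
    deltas = [
        (after[irq][0] - before.get(irq, (0, ""))[0], irq, after[irq][1])
        for irq in after
    ]
    return sorted(deltas, reverse=True)[:top]
```

On the board, one would read /proc/interrupts twice, about a second apart, pass both texts through `parse_interrupts`, and hand the results to `busiest` to see which IRQ line fires most often.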

J-16 SDiZ
Your first sentence is incorrect :)
Nikolai N Fetissov
@Fetissov: it is the bottom half of the IRQ handler. The details don't matter here.
J-16 SDiZ
Hmm, hardware IRQ handlers are generally called "top-halves", and details do matter - if your system spends 80% of time in hardware interrupt handling then something is really wrong. 80% in softirqs means you are asking the kernel for too much.
Nikolai N Fetissov
+4  A: 

The ksoftirqd threads are kernel threads that drive soft IRQs: things like TIMER_SOFTIRQ, SCSI_SOFTIRQ, TASKLET_SOFTIRQ and, most relevant to your case, NET_TX_SOFTIRQ and NET_RX_SOFTIRQ. These are implemented in the bottom halves of the kernel, as work deferred from the top halves, i.e. the actual interrupt handlers in the device drivers, where latency is critical.

The actual interrupt handler, or hardware IRQ, for a network card is concerned with getting data to/from the device as quickly as possible. It doesn't know anything about NAT or other TCP/IP processing. It knows about its bus handling (say PCI), its card specifics (ring buffers, control/config registers), DMA, and a bit about Ethernet. It exchanges packets (sk_buffs, to be exact) with the bottom half through queues.
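To see which soft IRQ classes are actually keeping ksoftirqd busy, one option is to compare two snapshots of /proc/softirqs. The sketch below assumes the kernel is recent enough to expose that file and that it has the usual layout (a CPU header row, then one `NAME: counters...` row per soft IRQ). The function names are hypothetical.

```python
def parse_softirqs(text):
    """Map each soft IRQ name (NET_RX, TIMER, ...) to its count summed over CPUs."""
    totals = {}
    for line in text.splitlines()[1:]:  # skip the CPU header row
        name, sep, rest = line.partition(":")
        if sep:
            totals[name.strip()] = sum(int(x) for x in rest.split())
    return totals


def softirq_shares(before, after):
    """Return each soft IRQ's fraction of all soft IRQs raised between two snapshots."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    total = sum(deltas.values())
    return {name: (d / total if total else 0.0) for name, d in deltas.items()}
```

If NET_RX dominates the result while traffic is flowing, the load comes from receive-side packet processing rather than from timers or tasklets.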

Take a look at ethtool(8) if you haven't yet. See if you can tune the hardware/drivers to do checksum/segmentation offloading, etc. I don't have any suggestions on the NAT front, as I don't use it.
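For a quick look at what the driver currently offloads, the output of `ethtool -k <iface>` can be parsed as in the sketch below. It assumes the common `feature-name: on|off` line format, which may vary between ethtool versions. The helper names are invented for this example, and the interface name is whatever the board actually uses.

```python
import subprocess


def parse_offloads(text):
    """Map each feature name from `ethtool -k` output to True (on) or False (off)."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip the header line and anything that is not an on/off setting.
        if sep and words and words[0] in ("on", "off"):
            features[name.strip()] = (words[0] == "on")
    return features


def read_offloads(iface):
    """Run `ethtool -k` on one interface; requires ethtool to be installed."""
    result = subprocess.run(
        ["ethtool", "-k", iface], capture_output=True, text=True, check=True
    )
    return parse_offloads(result.stdout)
```

Features reported as off can then be switched on with `ethtool -K`, provided the hardware and driver support them.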

Hope this helps a bit.

Edit:

As mentioned in the comments, check the NIC hardware for interrupt mitigation and the supporting driver for NAPI support.

Nikolai N Fetissov
The reference to bottom half vs top half is a little archaic. However, the point remains that by splitting the workload the kernel doesn't try to do everything in the actual hardware IRQ handler while other IRQs are potentially masked. If the kernel is newer than 2.4.20 you should have NAPI available, and one of the things a driver can support is interrupt mitigation. If the packet rate becomes too high, a NAPI driver can turn off interrupts (which are fairly expensive at one per packet) and go into a polling mode. However, the driver needs to support this.
stsquad
@stsquad, only the BH name is outdated as Linux in-kernel terminology; the split is still there, be it tasklets, timeouts, or work-queues. Good point about NAPI, I will put that into the answer. Thanks.
Nikolai N Fetissov
Hi guys, thank you very much for your comments, I learned a lot. The driver does indeed support NAPI, but it was disabled by default; I enabled it, recompiled the driver and installed it back on the board. Performance now seems better and 'top' no longer shows ksoftirqd eating all the CPU, but it reports: CPU: 0.1% usr 0.7% sys 0.0% nice 47.2% idle 0.0% io 1.6% irq 50.1% softirq. What does this mean now?
Mark
+1 for checking on NAPI
Noah Watkins