There are some diagrams at http://vger.kernel.org/~davem/tcp_output.html (found by googling tcp_transmit_skb(), which is a key part of the TCP datapath). There are more interesting things on his site: http://vger.kernel.org/~davem/
In the user -> TCP transmit part of the datapath there is 1 copy from user space to the skb with skb_copy_to_page (when sending via tcp_sendmsg()) and 0 copies with do_tcp_sendpages (called by tcp_sendpage()). The copy is needed to keep a backup of the data in case a segment is not delivered and has to be retransmitted. skb buffers in the kernel can be cloned, but their data stays in the first (original) skb. Sendpage can take a page from another part of the kernel and keep it as the backup (I think there is something like COW).
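From user space, the two transmit paths above correspond roughly to send() on a user buffer versus sendfile() on a file descriptor. Below is a minimal sketch, assuming Linux; the helper names (send_with_copy, send_from_file, demo_both_paths) are made up for illustration, and the demo uses an AF_UNIX socket pair only so that it is self-contained. With a connected TCP socket the calls are the same, but this sketch does not prove which kernel functions are hit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Copying path: the payload sits in a user buffer, so the kernel has to
 * copy it into its own socket buffers (the tcp_sendmsg() side). */
static ssize_t send_with_copy(int sock, const void *buf, size_t len)
{
    return send(sock, buf, len, 0);
}

/* Page-based path: the payload already lives in kernel pages (page cache),
 * so sendfile() can pass it on without a user-space buffer
 * (the tcp_sendpage() side). */
static ssize_t send_from_file(int sock, int file_fd, size_t len)
{
    off_t offset = 0;
    return sendfile(sock, file_fd, &offset, len);
}

/* Hypothetical demo: push the same message through both paths over a local
 * socket pair and check that the receiver sees identical bytes.
 * Returns 0 on success, -1 on any failure. */
static int demo_both_paths(void)
{
    static const char msg[] = "hello, datapath";
    const size_t len = sizeof(msg) - 1;
    char got[64];
    int sv[2] = { -1, -1 };
    int rc = -1;
    FILE *tmp = tmpfile();

    if (tmp == NULL)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        goto out;
    if (fwrite(msg, 1, len, tmp) != len || fflush(tmp) != 0)
        goto out;

    /* 1 copy: user buffer -> kernel socket buffer. */
    if (send_with_copy(sv[0], msg, len) != (ssize_t)len)
        goto out;
    if (recv(sv[1], got, len, MSG_WAITALL) != (ssize_t)len ||
        memcmp(got, msg, len) != 0)
        goto out;

    /* No user-space buffer involved on the sending side. */
    if (send_from_file(sv[0], fileno(tmp), len) != (ssize_t)len)
        goto out;
    if (recv(sv[1], got, len, MSG_WAITALL) != (ssize_t)len ||
        memcmp(got, msg, len) != 0)
        goto out;

    rc = 0;
out:
    if (sv[0] >= 0)
        close(sv[0]);
    if (sv[1] >= 0)
        close(sv[1]);
    fclose(tmp);
    return rc;
}
```

Calling demo_both_paths() from main() should return 0 if both paths delivered the same bytes. Whether the second path is really zero-copy end to end depends on the socket type, the kernel version and the NIC.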
Call paths (traced manually in lxr).
Sending (tcp_push_one / __tcp_push_pending_frames):
tcp_sendmsg() <- sock_sendmsg <- sock_readv_writev <- sock_writev <- do_readv_writev
tcp_sendpage() <- file_send_actor <- do_sendfile
Receiving (tcp_recv_skb()):
tcp_recvmsg() <- sock_recvmsg <- sock_readv_writev <- sock_readv <- do_readv_writev
tcp_read_sock() <- ... splice read on newer kernels, something sendfile-related on older ones
On receive there can be 1 copy from kernel to user space: skb_copy_datagram_iovec (called from tcp_recvmsg). For tcp_read_sock() there may also be a copy. It calls the sk_read_actor callback function. If that callback corresponds to a file or memory, it may (or may not) copy the data out of the DMA zone. If it corresponds to another network, it gets the skb of the received packet and can reuse its data in place.
For UDP: receive = 1 copy, skb_copy_datagram_iovec called from udp_recvmsg. Transmit = 1 copy, udp_sendmsg -> ip_append_data -> getfrag (this seems to be ip_generic_getfrag with 1 copy from user space, but it may be something sendpage/splice-like without page copying).
Generally speaking, there must be at least 1 copy when sending from or receiving to user space, and 0 copies when using zero-copy (surprise!) with kernel-space source/target buffers for the data. All headers are added without moving the packet; a DMA-enabled network card (all modern ones) will take data from anywhere in the DMA-enabled address space. Ancient cards need PIO, so there will be one more copy, from kernel space to PCI/ISA/whatever I/O registers or memory.
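Headers can be added without moving the packet because an skb keeps spare room (headroom) in front of the payload, and each layer prepends its header into that room. Here is a toy user-space sketch of the idea; struct toy_skb and the toy_* helpers are made up for illustration and only loosely mirror the kernel's skb_reserve(), skb_put() and skb_push(). It is not kernel code, and the 20-byte header sizes are just the minimum IPv4 and TCP header lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an skb: one flat buffer with a movable data window. */
struct toy_skb {
    unsigned char buf[256];
    unsigned char *data;   /* start of the valid bytes */
    unsigned char *tail;   /* one past the end of the valid bytes */
};

/* Leave `headroom` free bytes at the front, in the spirit of skb_reserve(). */
static void toy_init(struct toy_skb *skb, size_t headroom)
{
    assert(headroom <= sizeof(skb->buf));
    skb->data = skb->tail = skb->buf + headroom;
}

/* Append payload at the tail, in the spirit of skb_put().
 * Returns where the payload was stored. */
static unsigned char *toy_put(struct toy_skb *skb, const void *src, size_t len)
{
    unsigned char *p = skb->tail;

    assert((size_t)(skb->buf + sizeof(skb->buf) - skb->tail) >= len);
    memcpy(p, src, len);
    skb->tail += len;
    return p;
}

/* Prepend a header into the headroom, in the spirit of skb_push().
 * Only the data pointer moves; the payload bytes stay where they are. */
static unsigned char *toy_push(struct toy_skb *skb, const void *hdr, size_t len)
{
    assert((size_t)(skb->data - skb->buf) >= len);
    skb->data -= len;
    memcpy(skb->data, hdr, len);
    return skb->data;
}

/* Number of valid bytes currently in the window. */
static size_t toy_len(const struct toy_skb *skb)
{
    return (size_t)(skb->tail - skb->data);
}
```

A typical use would be toy_init(&skb, 40), then toy_put() for the payload, then toy_push() once for the TCP header and once for the IP header. The pointer returned by toy_put() still points at the same payload bytes afterwards.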
UPD: On the path from the NIC to the TCP stack (this is NIC-dependent; I checked 8139too) there is one more copy, from rx_ring to the skb, and the same on transmit: from the skb to the tx buffer, +1 copy. You have to fill in the IP and TCP headers, but does the skb contain them, or room for them?