The NIC and the host-side driver must act in concert to implement checksum offloading. The LANai-5 and Alteon NICs support checksum offloading in the host-PCI DMA engine, which computes the raw 16-bit ones-complement checksum of each DMA transfer as it moves data to and from host memory. Using this checksum need not demand any significant change to the IP stack: simply setting an M_HWCKSUM flag in the header of an mbuf chain bypasses the software checksum computation in in_cksum. However, using hardware checksumming for the IP protocol family is complicated by three factors:
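To make the raw checksum concrete, the following is a minimal sketch in C of a 16-bit ones-complement sum and of how per-transfer sums can be combined into a sum over the whole buffer. The function names (cksum_add, cksum_fold, cksum_combine, cksum_by_transfers) are hypothetical and are not taken from the Trapeze firmware or the BSD stack. The sketch assumes big-endian 16-bit words and an even transfer size, so that no byte swapping is needed when partial sums are combined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running 32-bit accumulator, reading the data as
 * big-endian 16-bit words. A trailing odd byte is padded with zero.
 * The accumulator cannot overflow for buffers up to 64 KB. */
static uint32_t cksum_add(uint32_t sum, const uint8_t *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (i < len)
        sum += (uint32_t)buf[i] << 8;
    return sum;
}

/* Fold the carries back in to obtain the 16-bit ones-complement sum. */
static uint16_t cksum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Combine two partial sums with end-around carry. */
static uint16_t cksum_combine(uint16_t a, uint16_t b)
{
    return cksum_fold((uint32_t)a + b);
}

/* Checksum a buffer as a sequence of simulated DMA transfers of xfer
 * bytes each (assumed even), combining the per-transfer sums. */
static uint16_t cksum_by_transfers(const uint8_t *buf, size_t len, size_t xfer)
{
    uint16_t total = 0;

    assert(xfer > 0 && xfer % 2 == 0);
    for (size_t off = 0; off < len; off += xfer) {
        size_t n = (len - off < xfer) ? len - off : xfer;
        total = cksum_combine(total, cksum_fold(cksum_add(0, buf + off, n)));
    }
    return total;
}
```

For the byte sequence 00 01 f2 03 f4 f5 f6 f7, summing the buffer in one pass and summing it as two 4-byte transfers both yield 0xddf2. This order independence is what allows a DMA engine to checksum each transfer separately.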
Trapeze currently supports TCP checksum offloading only on LANai-5 receivers. Checksum offloading is not supported on the sending side, in part because Trapeze uses message pipelining to minimize the latency of large packets. With message pipelining, the front of a packet may be transmitted on the link before the tail of the packet arrives on the NIC, and therefore before the checksum can be determined. One solution is to append the end-to-end checksum to the tail of the outgoing packet; while this would depart from the standard IP packet format, it is transparent to the end hosts because the Trapeze firmware and driver can reconstruct the packet at the receiving side. Of course, this approach would compromise interoperability in a standards-based network containing some end stations that do not support checksum offloading. The alternative, apparently implemented in Alteon's NICs, is to use store-and-forward packet transmission at the sender, which increases large-packet latencies (see Section 3.4).
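The trailer approach can be illustrated with a small sketch in C. It is a hypothetical model, not Trapeze code: the names tx_with_trailer and rx_restore, the even chunk size, and the generic packet with a single 16-bit checksum field (sent as zero) are all assumptions, and the TCP pseudo-header is ignored for brevity. The sender accumulates the sum as each chunk leaves for the link and appends the result only after the last chunk; the receiver moves the trailer back into the checksum field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Accumulate big-endian 16-bit words; an odd trailing byte is zero-padded. */
static uint32_t sum16(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;
    return sum;
}

/* Fold carries to a 16-bit ones-complement sum. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Sender: emit the packet chunk by chunk, as pipelined transmission
 * would, summing each chunk as it leaves. The checksum field inside
 * pkt is assumed to be zero. The wire buffer must hold len + 2 bytes.
 * Returns the number of bytes placed on the wire. */
static size_t tx_with_trailer(const uint8_t *pkt, size_t len, size_t chunk,
                              uint8_t *wire)
{
    uint32_t sum = 0;
    uint16_t ck;

    assert(chunk > 0 && chunk % 2 == 0);
    for (size_t off = 0; off < len; off += chunk) {
        size_t n = (len - off < chunk) ? len - off : chunk;
        memcpy(wire + off, pkt + off, n);   /* chunk is already on the link */
        sum = sum16(sum, pkt + off, n);     /* sum is complete only at the end */
    }
    ck = (uint16_t)~fold(sum);
    wire[len] = (uint8_t)(ck >> 8);         /* 2-byte trailer */
    wire[len + 1] = (uint8_t)(ck & 0xff);
    return len + 2;
}

/* Receiver: move the trailer into the checksum field at ck_off and
 * return the length of the restored, standard-format packet. */
static size_t rx_restore(uint8_t *wire, size_t wire_len, size_t ck_off)
{
    assert(wire_len >= 2 && ck_off + 2 <= wire_len - 2);
    wire[ck_off] = wire[wire_len - 2];
    wire[ck_off + 1] = wire[wire_len - 1];
    return wire_len - 2;
}
```

After rx_restore, the ones-complement sum over the restored packet, including its checksum field, folds to 0xffff, which is the usual verification condition. A store-and-forward sender would instead compute the sum before transmitting the first byte and write it directly into the header.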
Trapeze uses the NIC DMA engines to checksum packet data, but header checksums are computed by special-case code for TCP/IP in the Trapeze network driver. The Trapeze firmware combines partial checksums for all DMA operations on the payload portion of the message, then passes the partial checksum to the host-side driver through a logical control register. The driver then computes an IP header checksum, computes the layer-4 header checksum using a scratch copy of the IP header, combines the layer-4 header checksum with the payload checksum to determine the complete end-to-end checksum, and compares the computed checksums with those transmitted in the packet. Our philosophy is that any instructions that manipulate the IP header should be executed on the fast host CPU rather than on the NIC. In contrast, the Alteon NICs perform both header and data checksums in the NIC firmware.
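The receive-side division of labor described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the Trapeze driver: the function verify_rx and its helpers are hypothetical, the packet is assumed to be IPv4 carrying TCP, and payload_sum stands for the combined payload checksum that the firmware would pass up through the control register. The driver checks the IP header, sums the TCP pseudo-header and TCP header on the host, and adds the payload sum supplied by the NIC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Accumulate big-endian 16-bit words; an odd trailing byte is zero-padded. */
static uint32_t sum16(uint32_t sum, const uint8_t *p, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < len)
        sum += (uint32_t)p[i] << 8;
    return sum;
}

/* Fold carries to a 16-bit ones-complement sum. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Return 1 if both the IP header checksum and the TCP checksum verify.
 * pkt points at the IPv4 header; payload_sum is the ones-complement
 * sum of the TCP payload as computed by the NIC (an assumption here). */
static int verify_rx(const uint8_t *pkt, size_t len, uint16_t payload_sum)
{
    size_t ihl, thl, total, tcp_len;
    uint32_t sum;

    assert(pkt != NULL);
    if (len < 40)
        return 0;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;          /* IP header length */
    total = ((size_t)pkt[2] << 8) | pkt[3];     /* IP total length */
    if (ihl < 20 || len < ihl + 20 || total > len)
        return 0;
    thl = (size_t)(pkt[ihl + 12] >> 4) * 4;     /* TCP header length */
    if (thl < 20 || total < ihl + thl)
        return 0;

    /* IP header: the sum including the stored checksum must be 0xffff. */
    if (fold(sum16(0, pkt, ihl)) != 0xffff)
        return 0;

    /* Pseudo-header: source and destination addresses, protocol, TCP length. */
    tcp_len = total - ihl;
    sum = sum16(0, pkt + 12, 8);
    sum += pkt[9];
    sum += (uint32_t)tcp_len;

    /* TCP header (including the transmitted checksum), then NIC payload sum. */
    sum = sum16(sum, pkt + ihl, thl);
    sum += payload_sum;
    return fold(sum) == 0xffff;
}
```

Because IP and TCP header lengths are multiples of four bytes, the payload begins on a 16-bit boundary relative to the start of the TCP segment, so the NIC's payload sum can be added without byte swapping. Only a few dozen header bytes are summed on the host; the bulk of the data is never touched by the CPU.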