  ===========================================================================
                      The UDP-Lite protocol (RFC 3828)
  ===========================================================================


  UDP-Lite is a Standards-Track IETF transport protocol whose defining
  characteristic is a variable-length checksum. This has advantages for
  transport of multimedia (video, VoIP) over wireless networks, as partly
  damaged packets can still be fed into the codec instead of being discarded
  due to a failed checksum test.

  This file briefly describes the existing kernel support and the socket API.
  For in-depth information, you can consult:

   o The UDP-Lite Homepage: http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/
       From here you can also download some example application source code.

   o The UDP-Lite HOWTO on
       http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/UDP-Lite-HOWTO.txt

   o The Wireshark UDP-Lite wiki (with capture files):
       http://wiki.wireshark.org/Lightweight_User_Datagram_Protocol

   o The Protocol Spec, RFC 3828, http://www.ietf.org/rfc/rfc3828.txt


  I) APPLICATIONS

  Several applications have been ported successfully to UDP-Lite. Wireshark
  (formerly Ethereal) has UDP-Lite v4/v6 support by default. The tarball on

   http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/files/udplite_linux.tar.gz

  has source code for several v4/v6 client-server and network testing examples.

  Porting applications to UDP-Lite is straightforward: only socket level and
  IPPROTO need to be changed; senders additionally set the checksum coverage
  length (default = header length = 8). Details are in the next section.


  II) PROGRAMMING API

  UDP-Lite provides a connectionless, unreliable datagram service and hence
  uses the same socket type as UDP. In fact, porting from UDP to UDP-Lite is
  very easy: simply add `IPPROTO_UDPLITE' as the last argument of the socket(2)
  call so that the statement looks like:

      s = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE);

                      or, respectively,

      s = socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDPLITE);

  With just the above change you are able to run UDP-Lite services or connect
  to UDP-Lite servers. The kernel will assume that you are not interested in
  using partial checksum coverage and so emulate UDP mode (full coverage).
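
  As a minimal sketch of the socket(2) call above with basic error
  reporting: the helper name open_udplite_socket() is hypothetical, and the
  fallback #define is only an assumption for systems whose headers do not
  yet provide IPPROTO_UDPLITE.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE 136   /* fallback for headers lacking the constant */
#endif

/*
 * Hypothetical helper: open a UDP-Lite socket for AF_INET or AF_INET6.
 * Returns the descriptor, or -1 with errno set (for instance when the
 * running kernel was built without UDP-Lite support).
 */
int open_udplite_socket(int family)
{
    int s = socket(family, SOCK_DGRAM, IPPROTO_UDPLITE);

    if (s < 0)
        perror("socket(IPPROTO_UDPLITE)");
    return s;
}
```

  Without further socket options, a socket opened this way sends fully
  covered packets, as described above.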

  Making use of the partial checksum coverage facilities requires setting a
  single socket option, which takes an integer specifying the coverage length:

    * Sender checksum coverage: UDPLITE_SEND_CSCOV

      For example,

        int val = 20;
        setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(int));

      sets the checksum coverage length to 20 bytes (12 bytes of data plus
      the 8-byte header). Of each packet only the first 20 bytes (plus the
      pseudo-header) will be checksummed. This is useful for RTP applications
      which have a 12-byte base header.


    * Receiver checksum coverage: UDPLITE_RECV_CSCOV

      This option is the receiver-side analogue. It is truly optional, i.e. not
      required to enable traffic with partial checksum coverage. Its function is
      that of a traffic filter: when enabled, it instructs the kernel to drop
      all packets which have a coverage _less_ than this value. For example, if
      RTP and UDP headers are to be protected, a receiver can enforce that only
      packets with a minimum coverage of 20 are admitted:

        int min = 20;
        setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &min, sizeof(int));

  The calls to getsockopt(2) are analogous. Since UDP-Lite is an extension and
  not a stand-alone protocol, all socket options known from UDP can be used in
  exactly the same manner as before, e.g. UDP_CORK or UDP_ENCAP.
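
  The set/get pairing can be sketched as follows. The helper name
  udplite_set_and_get_cscov() is hypothetical, and the fallback #defines
  simply repeat the constants listed in section III in case the system
  headers lack them.

```c
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef SOL_UDPLITE
#define SOL_UDPLITE        136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV 10
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV 11
#endif

/*
 * Hypothetical helper: set a coverage option (UDPLITE_SEND_CSCOV or
 * UDPLITE_RECV_CSCOV) on socket s, then read back the value the kernel
 * actually stored. Returns 0 on success, -1 on error with errno set.
 */
int udplite_set_and_get_cscov(int s, int optname, int requested, int *stored)
{
    socklen_t len = sizeof(*stored);

    if (setsockopt(s, SOL_UDPLITE, optname, &requested, sizeof(requested)) < 0)
        return -1;
    if (getsockopt(s, SOL_UDPLITE, optname, stored, &len) < 0)
        return -1;
    return 0;
}
```

  Reading the value back is a convenient way to see whether the kernel
  adjusted the requested coverage (see section IV).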

  A detailed discussion of UDP-Lite checksum coverage options is in section IV.


  III) HEADER FILES

  The socket API requires support through header files in /usr/include:

    * /usr/include/netinet/in.h
        to define IPPROTO_UDPLITE

    * /usr/include/netinet/udplite.h
        for UDP-Lite header fields and protocol constants

  For testing purposes, the following can serve as a `mini' header file:

    #define IPPROTO_UDPLITE       136
    #define SOL_UDPLITE           136
    #define UDPLITE_SEND_CSCOV     10
    #define UDPLITE_RECV_CSCOV     11

  Ready-made header files for various distros are in the UDP-Lite tarball.
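
  When it is not known whether the system headers already provide these
  constants, the mini header can be wrapped in guards so that it never
  clashes with existing definitions. The file name udplite_compat.h and the
  guard macro are hypothetical; the values are the ones listed above.

```c
/* udplite_compat.h (hypothetical file name) */
#ifndef UDPLITE_COMPAT_H
#define UDPLITE_COMPAT_H

#include <netinet/in.h>   /* may already define IPPROTO_UDPLITE */

#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE       136
#endif
#ifndef SOL_UDPLITE
#define SOL_UDPLITE           136
#endif
#ifndef UDPLITE_SEND_CSCOV
#define UDPLITE_SEND_CSCOV     10
#endif
#ifndef UDPLITE_RECV_CSCOV
#define UDPLITE_RECV_CSCOV     11
#endif

#endif /* UDPLITE_COMPAT_H */
```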


  IV) KERNEL BEHAVIOUR WITH REGARD TO THE VARIOUS SOCKET OPTIONS

  To enable debugging messages, the log level needs to be set to 8, as most
  messages use the KERN_DEBUG level (7).

  1) Sender Socket Options

  If the sender specifies a value of 0 as coverage length, the module
  assumes full coverage and transmits a packet with a coverage length of 0
  and the corresponding checksum.  If the sender specifies a coverage < 8
  and different from 0, the kernel assumes 8 as default value.  Finally,
  if the specified coverage length exceeds the packet length, the packet
  length is used instead as coverage length.
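
  The three sender rules can be restated as a small pure function. This is
  only a model of the description above, not the kernel code; the names
  UDPLITE_HDR_LEN and sender_effective_coverage() are hypothetical, and
  packet_len is assumed to include the 8-byte UDP-Lite header.

```c
#include <stddef.h>

#define UDPLITE_HDR_LEN 8   /* hypothetical name for the 8-byte header size */

/*
 * Model of the sender rules: returns how many bytes of a packet of
 * packet_len bytes (header included) end up covered by the checksum
 * when the application requested the given coverage value.
 */
size_t sender_effective_coverage(int requested, size_t packet_len)
{
    size_t cov;

    if (requested == 0)                 /* 0 means full coverage */
        return packet_len;

    /* Non-zero values below the header size are raised to 8. */
    cov = requested < UDPLITE_HDR_LEN ? UDPLITE_HDR_LEN : (size_t)requested;

    if (cov > packet_len)               /* never cover more than was sent */
        cov = packet_len;
    return cov;
}
```

  For a 100-byte packet, a request of 0 yields 100 covered bytes, a request
  of 5 yields 8, a request of 20 yields 20, and a request of 200 yields 100.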

  2) Receiver Socket Options

  The receiver specifies the minimum value of the coverage length it
  is willing to accept.  A value of 0 here indicates that the receiver
  always wants the whole of the packet covered. In this case, all
  partially covered packets are dropped and an error is logged.

  It is not possible to specify illegal values (negative, or between 1
  and 7); in these cases the default of 8 is assumed.

  All packets arriving with a coverage value less than the specified
  threshold are discarded; these events are also logged.
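
  The receiver-side filter can be modelled in the same spirit. The function
  name receiver_accepts() is hypothetical, and the sketch assumes that the
  option has been set on the socket and that fully covered packets always
  pass the filter, which the text above does not state explicitly.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Model of the receiver rules, not the kernel code.
 *   min_cov : value passed to UDPLITE_RECV_CSCOV
 *   pkt_cov : number of bytes of the packet covered by its checksum
 *   pkt_len : total UDP-Lite packet length (header included)
 */
bool receiver_accepts(int min_cov, size_t pkt_cov, size_t pkt_len)
{
    size_t threshold;

    if (pkt_cov >= pkt_len)     /* assumption: full coverage always passes */
        return true;
    if (min_cov == 0)           /* 0 means only full coverage is acceptable */
        return false;

    /* Illegal non-zero values (below 8) fall back to the default of 8. */
    threshold = min_cov < 8 ? 8 : (size_t)min_cov;
    return pkt_cov >= threshold;
}
```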

  3) Disabling the Checksum Computation

  On both sender and receiver, checksumming will always be performed
  and cannot be disabled using SO_NO_CHECK. Thus

        setsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK,  ... );

  will always be ignored, while the value of

        getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...);

  is meaningless (as in TCP). Packets with a zero checksum field are
  illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded.

  4) Fragmentation

  The checksum computation respects both buffer size and MTU. The size
  of UDP-Lite packets is determined by the size of the send buffer. The
  minimum size of the send buffer is 2048 (defined as SOCK_MIN_SNDBUF
  in include/net/sock.h); the default value is configurable as
  net.core.wmem_default or via setting the SO_SNDBUF socket(7)
  option. The upper bound for the send buffer is determined
  by net.core.wmem_max.

  Given a payload size larger than the send buffer size, UDP-Lite will
  split the payload into several individual packets, filling up the
  send buffer size in each case.

  The precise value also depends on the interface MTU. The interface MTU,
  in turn, may trigger IP fragmentation. In this case, the generated
  UDP-Lite packet is split into several IP packets, of which only the
  first one contains the L4 header.

  The send buffer size has implications on the checksum coverage length.
  Consider the following example:

  Payload: 1536 bytes          Send Buffer:     1024 bytes
  MTU:     1500 bytes          Coverage Length:  856 bytes

  UDP-Lite will ship the 1536 bytes in two separate packets:

  Packet 1: 1024 payload + 8 byte header + 20 byte IP header = 1052 bytes
  Packet 2:  512 payload + 8 byte header + 20 byte IP header =  540 bytes

  The checksum covers the UDP-Lite header and 848 bytes of the payload
  in the first packet; the second packet is fully covered. Note
  that for the second packet, the coverage length exceeds the packet
  length. The kernel always re-adjusts the coverage length to the packet
  length in such cases.
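
  The arithmetic of this example can be reproduced with a few lines of C.
  This is a sketch of the description in the text, not kernel code; struct
  pkt_plan and plan_packets() are hypothetical names.

```c
#include <stddef.h>

#define UDPLITE_HDR_LEN 8   /* hypothetical name for the 8-byte header size */

struct pkt_plan {
    size_t payload;      /* payload bytes carried by this packet */
    size_t udplite_len;  /* payload plus UDP-Lite header */
    size_t covered;      /* bytes covered by the checksum */
};

/*
 * Split `payload` bytes into packets of at most `sndbuf` payload bytes
 * and compute the coverage of each, clamping the requested coverage to
 * the packet length. A coverage of 0 means full coverage.
 * Returns the number of entries written to out[] (at most max_out).
 */
size_t plan_packets(size_t payload, size_t sndbuf, size_t coverage,
                    struct pkt_plan *out, size_t max_out)
{
    size_t n = 0;

    if (sndbuf == 0)
        return 0;
    while (payload > 0 && n < max_out) {
        size_t chunk = payload < sndbuf ? payload : sndbuf;
        size_t len = chunk + UDPLITE_HDR_LEN;

        out[n].payload = chunk;
        out[n].udplite_len = len;
        out[n].covered = (coverage == 0 || coverage > len) ? len : coverage;
        payload -= chunk;
        n++;
    }
    return n;
}
```

  With payload = 1536, sndbuf = 1024 and coverage = 856 this yields two
  packets: the first is 1032 bytes long with 856 bytes covered, the second
  is 520 bytes long and fully covered.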

  As an example of what happens when one UDP-Lite packet is split into
  several tiny fragments, consider the following scenario.

  Payload: 1024 bytes            Send buffer size: 1024 bytes
  MTU:      300 bytes            Coverage length:   575 bytes

  +-+-----------+--------------+--------------+--------------+
  |8|    272    |      280     |     280      |     192      |
  +-+-----------+--------------+--------------+--------------+
               280            560            840           1032
                                    ^
  *****checksum coverage*************

  The UDP-Lite module generates one 1032 byte packet (1024 + 8 byte
  header). According to the interface MTU, this packet is split into 4 IP
  packets (up to 280 bytes of IP payload + 20 byte IP header each; the
  last one carries the remaining 192 bytes). The kernel module sums the
  contents of the entire first two packets, plus 15 bytes of the third
  packet, before releasing the fragments to the IP module.

  To see the analogous case for IPv6 fragmentation, consider a link
  MTU of 1280 bytes and a write buffer of 3356 bytes. If the checksum
  coverage is less than 1232 bytes (MTU minus IPv6/fragment header
  lengths), only the first fragment needs to be considered. When using
  larger checksum coverage lengths, each eligible fragment needs to be
  checksummed. Suppose we have a checksum coverage of 3062. The buffer
  of 3356 bytes will be split into the following fragments:

    Fragment 1: 1280 bytes carrying  1232 bytes of UDP-Lite data
    Fragment 2: 1280 bytes carrying  1232 bytes of UDP-Lite data
    Fragment 3:  948 bytes carrying   900 bytes of UDP-Lite data

  The first two fragments have to be checksummed in full, of the last
  fragment only 598 (= 3062 - 2*1232) bytes are checksummed.
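
  Both fragmentation examples follow the same arithmetic, which can be
  sketched as below. The function name fragment_coverage() is hypothetical,
  and the code only models the bookkeeping described in the text: dgram_len
  is the UDP-Lite packet length including the 8-byte header, frag_payload
  the number of UDP-Lite bytes each IP fragment can carry.

```c
#include <stddef.h>

/*
 * For each IP fragment of a UDP-Lite packet, compute how many of its
 * bytes fall inside the checksum coverage. A coverage of 0, or one
 * larger than the packet, is treated as full coverage.
 * Returns the number of fragments written to covered[] (at most
 * max_frags).
 */
size_t fragment_coverage(size_t dgram_len, size_t frag_payload,
                         size_t coverage, size_t *covered, size_t max_frags)
{
    size_t n = 0, off = 0;

    if (frag_payload == 0)
        return 0;
    if (coverage == 0 || coverage > dgram_len)
        coverage = dgram_len;

    while (off < dgram_len && n < max_frags) {
        size_t left = dgram_len - off;
        size_t len = left < frag_payload ? left : frag_payload;
        size_t c = 0;

        /* Only the part of the coverage beyond this offset counts. */
        if (coverage > off)
            c = (coverage - off) < len ? (coverage - off) : len;

        covered[n++] = c;
        off += len;
    }
    return n;
}
```

  For the IPv4 case (dgram_len = 1032, frag_payload = 280, coverage = 575)
  the result is four fragments with 280, 280, 15 and 0 covered bytes; for
  the IPv6 case (dgram_len = 3364, frag_payload = 1232, coverage = 3062) it
  is three fragments with 1232, 1232 and 598 covered bytes.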

  While it is important that such cases are dealt with correctly, they
  are (annoyingly) rare: UDP-Lite is designed for optimising multimedia
  performance over wireless (or generally noisy) links and thus smaller
  coverage lengths are to be expected.


  V) UDP-LITE RUNTIME STATISTICS AND THEIR MEANING

  Exceptional and error conditions are logged to syslog at the KERN_DEBUG
  level.  Live statistics about UDP-Lite are available in /proc/net/snmp
  and can (with newer versions of netstat) be viewed using

                            netstat -svu

  This displays UDP-Lite statistics variables, whose meaning is as follows.

   InDatagrams:     Total number of received datagrams.

   NoPorts:         Number of packets received for an unknown port.
                    These cases are counted separately (not as InErrors).

   InErrors:        Number of erroneous UDP-Lite packets. Errors include:
                      * internal socket queue receive errors
                      * packet too short (less than 8 bytes or stated
                        coverage length exceeds received length)
                      * xfrm4_policy_check() returned with error
                      * application has specified larger min. coverage
                        length than that of incoming packet
                      * checksum coverage violated
                      * bad checksum

   OutDatagrams:    Total number of sent datagrams.

   These statistics derive from the UDP MIB (RFC 2013).
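
  In /proc/net/snmp each protocol appears as a pair of lines, one listing
  the counter names and one listing the values, both starting with the
  protocol tag (here `UdpLite:'). A minimal sketch for picking one counter
  out of such a pair follows; the function name snmp_lookup() is
  hypothetical, and reading the two lines from the file is left to the
  caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up the counter `want` in a names/values line pair such as
 *   "UdpLite: InDatagrams NoPorts InErrors OutDatagrams"
 *   "UdpLite: <v1> <v2> <v3> <v4>"
 * Returns 0 and stores the value in *out, or -1 if it is not found.
 */
int snmp_lookup(const char *names, const char *values,
                const char *want, unsigned long *out)
{
    char name[64], val[64];
    int nn, vn;

    /* Skip the leading protocol tag on both lines. */
    if (sscanf(names, "%63s%n", name, &nn) != 1 ||
        sscanf(values, "%63s%n", val, &vn) != 1)
        return -1;
    names += nn;
    values += vn;

    /* Walk both lines in lockstep, one token at a time. */
    while (sscanf(names, "%63s%n", name, &nn) == 1 &&
           sscanf(values, "%63s%n", val, &vn) == 1) {
        if (strcmp(name, want) == 0) {
            *out = strtoul(val, NULL, 10);
            return 0;
        }
        names += nn;
        values += vn;
    }
    return -1;
}
```

  A monitoring tool could call this with the two `UdpLite:' lines obtained
  via fgets(3) and, for example, watch InErrors grow while tuning the
  receiver coverage threshold.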


  VI) IPTABLES

  There is packet match support for UDP-Lite as well as support for the LOG
  target. If you copy and paste the following line into /etc/protocols,

  udplite 136     UDP-Lite        # UDP-Lite [RFC 3828]

  then
              iptables -A INPUT -p udplite -j LOG

  will produce logging output to syslog. Dropping and rejecting packets also
  works.


  VII) MAINTAINER ADDRESS

  The UDP-Lite patch was developed at
                    University of Aberdeen
                    Electronics Research Group
                    Department of Engineering
                    Fraser Noble Building
                    Aberdeen AB24 3UE, UK
  The current maintainer is Gerrit Renker, <gerrit@erg.abdn.ac.uk>. Initial
  code was developed by William Stanislaus, <william@erg.abdn.ac.uk>.