As per Relevance of the word computing, we have this rfc below:
Network Working Group R.
Request for Comments: 1071
D.
Cray
C.
BBN
September 1988
Computing the Internet
Status of This
This memo summarizes techniques and algorithms for
computing the Internet checksum. It is not a standard, but a set
useful implementation techniques. Distribution of this memo
unlimited
1.
This memo discusses methods for efficiently computing the
checksum that is used by the standard Internet protocols IP, UDP,
TCP
An efficient checksum implementation is critical to good performance
As advances in implementation techniques streamline the rest of
protocol processing, the checksum computation becomes one of
limiting factors on TCP performance, for example. It is
appropriate to carefully hand-craft the checksum routine,
every machine-dependent trick possible; a fraction of a
per TCP data byte can add up to a significant CPU time
overall
In outline, the Internet checksum algorithm is very simple
(1) Adjacent octets to be checksummed are paired to form 16-
integers, and the 1's complement sum of these 16-bit integers
formed
(2) To generate a checksum, the checksum field itself is cleared
the 16-bit 1's complement sum is computed over the
concerned, and the 1's complement of this sum is placed in
checksum field
(3) To check a checksum, the 1's complement sum is computed over
same set of octets, including the checksum field. If the
is all 1 bits (-0 in 1's complement arithmetic), the
succeeds
Suppose a checksum is to be computed over the sequence of
Braden, Borman, & Partridge [Page 1]
RFC 1071 Computing the Internet Checksum September 1988
A, B, C, D, ... , Y, Z. Using the notation [a,b] for the 16-
integer a*256+b, where a and b are bytes, then the 16-bit 1'
complement sum of these bytes is given by one of the following
[A,B] +' [C,D] +' ... +' [Y,Z] [1]
[A,B] +' [C,D] +' ... +' [Z,0] [2]
where +' indicates 1's complement addition. These
correspond to an even or odd count of bytes, respectively
On a 2's complement machine, the 1's complement sum must
computed by means of an "end around carry", i.e., any
from the most significant bits are added into the
significant bits. See the examples below
Section 2 explores the properties of this checksum that may
exploited to speed its calculation. Section 3 contains
numerical examples of the most important
techniques. Finally, Section 4 includes examples of
algorithms for a variety of common CPU types. We are
to Van Jacobson and Charley Kline for their contribution
algorithms to this section
The properties of the Internet checksum were
discussed by Bill Plummer in IEN-45, entitled "Checksum
Design". Since IEN-45 has not been widely available, we
it as an extended appendix to this RFC
2. Calculating the
This simple checksum has a number of wonderful
properties that may be exploited to speed its calculation, as
will now discuss
(A) Commutative and
As long as the even/odd assignment of bytes is respected,
sum can be done in any order, and it can be arbitrarily
into groups
For example, the sum [1] could be split into
( [A,B] +' [C,D] +' ... +' [J,0] )
+' ( [0,K] +' ... +' [Y,Z] ) [3]
Braden, Borman, & Partridge [Page 2]
RFC 1071 Computing the Internet Checksum September 1988
(B) Byte Order
The sum of 16-bit integers can be computed in either byte order
Thus, if we calculate the swapped sum
[B,A] +' [D,C] +' ... +' [Z,Y] [4]
the result is the same as [1], except the bytes are swapped
the sum! To see why this is so, observe that in both orders
carries are the same: from bit 15 to bit 0 and from bit 7 to
8. In other words, consistently swapping bytes simply
the bits within the sum, but does not affect their
ordering
Therefore, the sum may be calculated in exactly the same
regardless of the byte order ("big-endian" or "little-endian")
of the underlaying hardware. For example, assume a "little
endian" machine summing data that is stored in memory in
("big-endian") order. Fetching each 16-bit word will
bytes, resulting in the sum [4]; however, storing the
back into memory will swap the sum back into network byte order
Byte swapping may also be used explicitly to handle
alignment problems. For example, the second group in [3] can
calculated without concern to its odd/even origin, as
[K,L] +' ... +' [Z,0]
if this sum is byte-swapped before it is added to the
group. See the example below
(C) Parallel
On machines that have word-sizes that are multiples of 16 bits
it is possible to develop even more efficient implementations
Because addition is associative, we do not have to sum
integers in the order they appear in the message. Instead
can add them in "parallel" by exploiting the larger word size
To compute the checksum in parallel, simply do a 1's
addition of the message using the native word size of
machine. For example, on a 32-bit machine we can add 4 bytes
a time: [A,B,C,D]+'... When the sum has been computed, we "fold
the long sum into 16 bits by adding the 16-bit segments.
16-bit addition may produce new end-around carries that must
added
Furthermore, again the byte order does not matter; we
instead sum 32-bit words: [D,C,B,A]+'... or [B,A,D,C]+'...
then swap the bytes of the final 16-bit sum as necessary.
the examples below. Any permutation is allowed that
Braden, Borman, & Partridge [Page 3]
RFC 1071 Computing the Internet Checksum September 1988
all the even-numbered data bytes into one sum byte and the odd
numbered data bytes into the other sum byte
There are further coding techniques that can be exploited to speed
the checksum calculation
(1) Deferred
Depending upon the machine, it may be more efficient to
adding end-around carries until the main summation loop
finished
One approach is to sum 16-bit words in a 32-bit accumulator,
the overflows build up in the high-order 16 bits. This
typically avoids a carry-sensing instruction but requires
as many additions as would adding 32-bit segments; which
faster depends upon the detailed hardware architecture
(2) Unwinding
To reduce the loop overhead, it is often useful to "unwind"
inner sum loop, replicating a series of addition commands
one loop traversal. This technique often provides
savings, although it may complicate the logic of the
considerably
(3) Combine with Data
Like checksumming, copying data from one memory location
another involves per-byte overhead. In both cases,
bottleneck is essentially the memory bus, i.e., how fast
data can be fetched. On some machines (especially
slow and simple micro-computers), overhead can be
reduced by combining memory-to-memory copy and the checksumming
fetching the data only once for both
(4) Incremental
Finally, one can sometimes avoid recomputing the entire
when one header field is updated. The best-known example is
gateway changing the TTL field in the IP header, but there
other examples (for example, when updating a source route).
these cases it is possible to update the checksum
scanning the message or datagram
To update the checksum, simply add the differences of
sixteen bit integers that have been changed. To see why
works, observe that every 16-bit integer has an additive
and that addition is associative. From this it follows
given the original value m, the new value m', and the
Braden, Borman, & Partridge [Page 4]
RFC 1071 Computing the Internet Checksum September 1988
checksum C, the new checksum C' is
C' = C + (-m) + m' = C + (m' - m
3. Numerical
We now present explicit examples of calculating a simple 1'
complement sum on a 2's complement machine. The examples show
same sum calculated byte by bye, by 16-bits words in normal
swapped order, and 32 bits at a time in 3 different orders.
numbers are in hex
Byte-by-byte "Normal"
Order
Byte 0/1: 00 01 0001 0100
Byte 2/3: f2 03 f203 03f
Byte 4/5: f4 f5 f4f5 f5f
Byte 6/7: f6 f7 f6f7 f7f
--- --- ----- -----
Sum1: 2dc 1f0 2ddf0 1f2
dc f0 ddf0 f2
Carrys: 1 2 2 1
-- -- ---- ----
Sum2: dd f2 ddf2 f2
Final Swap: dd f2 ddf2 ddf
Byte 0/1/2/3: 0001f203 010003f2 03f20100
Byte 4/5/6/7: f4f5f6f7 f5f4f7f6 f7f6f5f
-------- -------- --------
Sum1: 0f4f7e8fa 0f6f4fbe8 0fbe8f6f
Carries: 0 0 0
Top half: f4f7 f6f4 fbe
Bottom half: e8fa fbe8 f6f
----- ----- -----
Sum2: 1ddf1 1f2dc 1f2
ddf1 f2dc f2
Carrys: 1 1 1
---- ---- ----
Sum3: ddf2 f2dd f2
Final Swap: ddf2 ddf2 ddf
Braden, Borman, & Partridge [Page 5]
RFC 1071 Computing the Internet Checksum September 1988
Finally, here an example of breaking the sum into two groups,
the second group starting on a odd boundary
Byte-by-byte
Byte 0/1: 00 01 0001
Byte 2/ : f2 (00) f200
--- --- -----
Sum1: f2 01 f201
Byte 4/5: 03 f4 03f
Byte 6/7: f5 f6 f5f
Byte 8/: f7 (00) f700
--- --- -----
Sum2: 1f0
Sum2: f0
Carry: 1
-----
Sum3: f0
Sum1: f201
Sum3 byte swapped: ebf
-----
Sum4: 1ddf
Sum4: ddf
Carry: 1
-----
Sum5: ddf
Braden, Borman, & Partridge [Page 6]
RFC 1071 Computing the Internet Checksum September 1988
4. Implementation
In this section we show examples of Internet checksum
algorithms that have been found to be efficient on a variety
CPU's. In each case, we show the core of the algorithm,
including environmental code (e.g., subroutine linkages) or special
case code
4.1 "C
The following "C" code algorithm computes the checksum with an
loop that sums 16-bits at a time in a 32-bit accumulator
in 6
{
/* Compute Internet Checksum for "count"
* beginning at location "addr".
*/
register long sum = 0;
while( count > 1 ) {
/* This is the inner loop */
sum += * (unsigned short) addr++;
count -= 2;
}
/* Add left-over byte, if any */
if( count > 0 )
sum += * (unsigned char *) addr
/* Fold 32-bit sum to 16 bits */
while (sum>>16)
sum = (sum & 0xffff) + (sum >> 16);
checksum = ~sum
}
Braden, Borman, & Partridge [Page 7]
RFC 1071 Computing the Internet Checksum September 1988
4.2 Motorola 68020
The following algorithm is given in assembler language for a
68020 chip. This algorithm performs the sum 32 bits at a time,
unrolls the loop with 16 replications. For clarity, we have
the logic to add the last fullword when the length is not a
of 4. The result is left in register d0.
With a 20MHz clock, this routine was measured at 134 usec/kB
random data. This algorithm was developed by Van Jacobson
movl d1,d
lsrl #6,d1 | count/64 = # loop
andl #0x3c,d2 | Then find fractions of a
negl d
andb #0xf,cc | Clear X (extended carry flag
jmp pc@(2$-.-2:b,d2) | Jump into
1$: | Begin inner loop...
movl a0@+,d2 | Fetch 32-bit
addxl d2,d0 | Add word + previous
movl a0@+,d2 | Fetch 32-bit
addxl d2,d0 | Add word + previous
| ... 14 more
2$:
dbra d1,1$ | (NB- dbra doesn't affect X
movl d0,d1 | Fold 32 bit sum to 16
swap d1 | (NB- swap doesn't affect X
addxw d1,d
jcc 3$
addw #1,d
3$:
andl #0xffff,d
Braden, Borman, & Partridge [Page 8]
RFC 1071 Computing the Internet Checksum September 1988
4.3
The following example, in assembler language for a Cray CPU,
contributed by Charley Kline. It implements the checksum
as a vector operation, summing up to 512 bytes at a time with a
summation unit of 32 bits. This example omits many details having
do with short blocks, for clarity
Register A1 holds the address of a 512-byte block of memory
checksum. First two copies of the data are loaded into two
registers. One is vector-shifted right 32 bits, while the other
vector-ANDed with a 32 bit mask. Then the two vectors are
together. Since all these operations chain, it produces one
per clock cycle. Then it collapses the result vector in a loop
adds each element to a scalar register. Finally, the end-
carry is performed and the result is folded to 16-bits
A0 A
VL 64 use full
S1 <32 form 32-bit mask from the right
A2 32
V1 ,A0,1 load packet into V
V2 S1&V1 Form right-hand 32-bits in V2.
V3 V1>A2 Form left-hand 32-bits in V3.
V1 V2+V3 Add the two together
A2 63 Prepare to collapse into a scalar
S1 0
S4 <16 Form 16-bit mask from the right
A4 16
CK$LOOP S2 V1,A
A2 A2-1
A0 A
S1 S1+S
JAN CK$
S2 S1&S4 Form right-hand 16-bits in S
S1 S1>A4 Form left-hand 16-bits in S
S1 S1+S
S2 S1&S4 Form right-hand 16-bits in S
S1 S1>A4 Form left-hand 16-bits in S
S1 S1+S
S1 #S1 Take one's
CMR At this point, S1 contains the checksum
Braden, Borman, & Partridge [Page 9]
RFC 1071 Computing the Internet Checksum September 1988
4.4 IBM 370
The following example, in assembler language for an IBM 370 CPU,
the data 4 bytes at a time. For clarity, we have omitted the
to add the last fullword when the length is not a multiple of 4,
to reverse the bytes when necessary. The result is left in
RCARRY
This code has been timed on an IBM 3090 CPU at 27 usec/KB
summing all one bits. This time is reduced to 24.3 usec/KB if
trouble is taken to word-align the addends (requiring special
at both the beginning and the end, and byte-swapping when
to compensate for starting on an odd byte).
* Registers RADDR and RCOUNT contain the address and length
* the block to be checksummed
*
* (RCARRY, RSUM) must be an even/odd register pair
* (RCOUNT, RMOD) must be an even/odd register pair
*
CHECKSUM SR RSUM,RSUM Clear working registers
SR RCARRY,
LA RONE,1 Set up constant 1.
*
SRDA RCOUNT,6 Count/64 to RCOUNT
AR RCOUNT,RONE +1 = # times in loop
SRL RMOD,26 Size of partial chunk to RMOD
AR RADDR,R3 Adjust addr to compensate
S RADDR,=F(64) jumping into the loop
SRL RMOD,1 (RMOD/4)*2 is halfword index
LH RMOD,DOPEVEC9(RMOD) Use magic dope-vector for offset
B LOOP(RMOD) and jump into the loop...
*
* Inner loop
*
LOOP AL RSUM,0(,RADDR) Add Logical
BC 12,*+6 Branch if no
AR RCARRY,RONE Add 1 end-
AL RSUM,4(,RADDR) Add Logical
BC 12,*+6 Branch if no
AR RCARRY,RONE Add 1 end-
*
* ... 14 more replications ...
*
A RADDR,=F'64' Increment address
BCT RCOUNT,LOOP Branch on
*
* Add Carries into sum, and fold to 16
*
ALR RCARRY,RSUM Add SUM and CARRY
BC 12,*+6 and take care of
Braden, Borman, & Partridge [Page 10]
RFC 1071 Computing the Internet Checksum September 1988
AR RCARRY,
SRDL RCARRY,16 Fold 32-bit sum
SRL RSUM,16 16-
ALR RCARRY,
C RCARRY,=X'0000FFFF' and take care of
BNH DONE last
S RCARRY,=X'0000FFFF
DONE X RCARRY,=X'0000FFFF' 1's
Braden, Borman, & Partridge [Page 11]
RFC 1071 Computing the Internet Checksum September 1988
IEN 45
Section 2.4.4.5
TCP Checksum Function
William W.
Bolt Beranek and Newman, Inc
50 Moulton
Cambridge MA 02138
5 June 1978
Braden, Borman, & Partridge [Page 12]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
1.
Checksums are included in packets in order that
encountered during transmission may be detected. For
protocols such as TCP [1,9] this is especially important
packets may have to cross wireless networks such as the
Radio Network [2] and Atlantic Satellite Network [3]
packets may be corrupted. Internet protocols (e.g., those
real time speech transmission) can tolerate a certain level
transmission errors and forward error correction techniques
possibly no checksum at all might be better. The focus in
paper is on checksum functions for protocols such as TCP
the required reliable delivery is achieved by retransmission
Even if the checksum appears good on a message which has
received, the message may still contain an undetected error.
probability of this is bounded by 2**(-C) where C is the
of checksum bits. Errors can arise from hardware (and software
malfunctions as well as transmission errors. Hardware
errors are usually manifested in certain well known ways and
is desirable to account for this in the design of the
function. Ideally no error of the "common hardware failure"
would go undetected
An example of a failure that the current checksum
handles successfully is picking up a bit in the network
(or I/O buss, memory channel, etc.). This will always render
checksum bad. For an example of how the current function
inadequate, assume that a control signal stops functioning in
network interface and the interface stores zeros in place of
real data. These "all zero" messages appear to have
checksums. Noise on the "There's Your Bit" line of the
Interface [4] may go undetected because the extra bits input
cause the checksum to be perturbed (i.e., shifted) in the
way as the data was
Although messages containing undetected errors will
be passed to higher levels of protocol, it is likely that
will not make sense at that level. In the case of TCP most
messages will be ignored, but some could cause a connection to
aborted. Garbled data could be viewed as a problem for a
of protocol above TCP which itself may have a checksuming scheme
This paper is the first step in design of a new checksum
for TCP and some other Internet protocols. Several
properties of the current function are identified. If
- 1 -
Braden, Borman, & Partridge [Page 13]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
these should be retained in any new function. A number
plausible checksum schemes are investigated. Of these only
"product code" seems to be simple enough for consideration
2. The Current TCP Checksum
The current function is oriented towards sixteen-bit
such as the PDP-11 but can be computed easily on other
(e.g., PDP-10). A packet is thought of as a string of 16-
bytes and the checksum function is the one's complement sum (
with end-around carry) of those bytes. It is the one'
complement of this sum which is stored in the checksum field
the TCP header. Before computing the checksum value, the
places a zero in the checksum field of the packet. If
checksum value computed by a receiver of the packet is zero,
packet is assumed to be valid. This is a consequence of
"negative" number in the checksum field exactly cancelling
contribution of the rest of the packet
Ignoring the difficulty of actually evaluating the
function for a given packet, the way of using the
described above is quite simple, but it assumes some
of the checksum operator (one's complement addition, "+" in
follows):
(P1) + is commutative. Thus, the order in
the 16-bit bytes are "added" together
unimportant
(P2) + has at least one identity element (
current function has two: +0 and -0).
allows the sender to compute the
function by placing a zero in the packet
field before computing the value
(P3) + has an inverse. Thus, the receiver
evaluate the checksum function and expect a zero
(P4) + is associative, allowing the checksum
to be anywhere in the packet and the 16-bit
to be scanned sequentially
Mathematically, these properties of the binary operation "+"
the set of 16-bit numbers forms an Abelian group [5]. Of course
there are many Abelian groups but not all would be
for use as checksum operators. (Another operator
- 2 -
Braden, Borman, & Partridge [Page 14]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
available in the PDP-11 instruction set that has all of
properties is exclusive-OR, but XOR is unsatisfactory for
reasons.)
Albeit imprecise, another property which must be preserved in
future checksum scheme is
(P5) + is fast to compute on a variety of
with limited storage requirements
The current function is quite good in this respect. On
PDP-11 the inner loop looks like
LOOP: ADD (R1)+,R0 ; Add the next 16-bit
ADC R0 ; Make carry be end-
SOB R2,LOOP ; Loop over entire packet
( 4 memory cycles per 16-bit byte )
On the PDP-10 properties P1-4 are exploited further and
16-bit bytes per loop are processed
LOOP: ILDB THIS,PTR ; Get 2 16-bit
ADD SUM,THIS ; Add into current
JUMPGE SUM,CHKSU2 ; Jump if fewer than 8
LDB THIS,[POINT 20,SUM,19] ; Get left 16 and
ANDI SUM,177777 ; Save just low 16
ADD SUM,THIS ; Fold in
CHKSU2: SOJG COUNT,LOOP ; Loop over entire
( 3.1 memory cycles per 16-bit byte )
The "extra" instruction in the loops above are required
convert the two's complement ADD instruction(s) into a one'
complement add by making the carries be end-around. One'
complement arithmetic is better than two's complement because
is equally sensitive to errors in all bit positions. If two'
complement addition were used, an even number of 1's could
dropped (or picked up) in the most significant bit
without affecting the value of the checksum. It is just
property that makes some sort of addition preferable to a
exclusive-OR which is frequently used but permits an even
of drops (pick ups) in any bit channel. RIM10B paper tape
used on PDP-10s [10] uses two's complement add because space
the loader program is extremely limited
- 3 -
Braden, Borman, & Partridge [Page 15]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
Another property of the current checksum scheme is
(P6) Adding the checksum to a packet does not
the information bytes. Peterson [6] calls this
"systematic" code
This property allows intermediate computers such as
machines to act on fields (i.e., the Internet
Address) without having to first decode the packet.
Redundancy Checks used for error correction are not
either. However, most applications of CRCs tend to
error detection rather than correction and consequently can
the message unchanged, with the CRC check bits being appended
the end. The 24-bit CRC used by ARPANET IMPs and Very
Host Interfaces [4] and the ANSI standards for 800 and 6250
per inch magnetic tapes (described in [11]) use this mode
Note that the operation of higher level protocols are not (
design) affected by anything that may be done by a gateway
on possibly invalid packets. It is permissible for gateways
validate the checksum on incoming packets, but in
gateways will not know how to do this if the checksum is
protocol-specific feature
A final property of the current checksum scheme which is
a consequence of P1 and P4 is
(P7) The checksum may be incrementally modified
This property permits an intermediate gateway to add
to a packet, for instance a timestamp, and "add" an
change to the checksum field of the packet. Note that
checksum will still be end-to-end since it was not
recomputed
3. Product
Certain "product codes" are potentially useful for
purposes. The following is a brief description of product
in the context of TCP. More general treatment can be found
Avizienis [7] and probably other more recent works
The basic concept of this coding is that the message (packet)
be sent is formed by transforming the original source message
adding some "check" bits. By reading this and applying
(possibly different) transformation, a receiver can
- 4 -
Braden, Borman, & Partridge [Page 16]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
the original message and determine if it has been
during transmission
Mo Ms
----- ----- -----
| A | code | 7 | decode | A |
| B | ==> | 1 | ==> | B |
| C | | 4 | | C |
----- |...| -----
| 2 | check plus "valid"
-----
Original Sent
With product codes the transformation is Ms = K * Mo . That is
the message sent is simply the product of the original
Mo and some well known constant K . To decode, the
Ms is divided by K which will yield Mr as the quotient
0 as the remainder if Mr is to be considered the same as Mo .
The first problem is selecting a "good" value for K, the "
factor". K must be relatively prime to the base chosen
express the message. (Example: Binary messages with
incorrectly chosen to be 8. This means that Ms looks
like Mo except that three zeros have been appended. The
way the message could look bad to a receiver dividing by 8 is
the error occurred in one of those three bits.)
For TCP the base R will be chosen to be 2**16. That is,
16-bit byte (word on the PDP-11) will be considered as a digit
a big number and that number is the message. Thus
Mo = SIGMA [ Bi * (R**i)] , Bi is i-th
i=0 to
Ms = K *
Corrupting a single digit of Ms will yield Ms' = Ms +or
C*(R**j) for some radix position j . The receiver will
Ms'/K = Mo +or- C(R**j)/K. Since R and K are relatively prime
C*(R**j) cannot be any exact multiple of K. Therefore,
division will result in a non-zero remainder which indicates
Ms' is a corrupted version of Ms. As will be seen, a
choice for K is (R**b - 1), for some b which is the "
length" which controls the degree of detection to be had
- 5 -
Braden, Borman, & Partridge [Page 17]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
burst errors which affect a string of digits (i.e., 16-bit bytes
in the message. In fact b will be chosen to be 1, so K
be 2**16 - 1 so that arithmetic operations will be simple.
means that all bursts of 15 or fewer bits will be detected
According to [7] this choice for b results in the
expression for the fraction of undetected weight 2 errors
f = 16(k-1)/[32(16k-3) + (6/k)] where k is the message length
For large messages f approaches 3.125 per cent as k goes
infinity
Multiple precision multiplication and division are normally
complex operations, especially on small machines which
lack even single precision multiply and divide operations.
exception to this is exactly the case being dealt with here --
the factor is 2**16 - 1 on machines with a word length of 16
bits. The reason for this is due to the following identity
Q*(R**j) = Q, mod (R-1) 0 <= Q <
That is, any digit Q in the selected radix (0, 1, ... R-1)
multiplied by any power of the radix will have a remainder of
when divided by the radix minus 1.
Example: In decimal R = 10. Pick Q = 6.
6 = 0 * 9 + 6 = 6, mod 9
60 = 6 * 9 + 6 = 6, mod 9
600 = 66 * 9 + 6 = 6, mod 9 etc
More to the point, rem(31415/9) = rem((30000+1000+400+10+5)/9)
= (3 mod 9) + (1 mod 9) + (4 mod 9) + (1 mod 9) + (5 mod 9)
= (3+1+4+1+5) mod 9
= 14 mod 9
= 5
So, the remainder of a number divided by the radix minus one
be found by simply summing the digits of the number. Since
radix in the TCP case has been chosen to be 2**16 and the
factor is 2**16 - 1, a message can quickly be checked by
all of the 16-bit words (on a PDP-11), with carries
end-around. If zero is the result, the message can be
valid. Thus, checking a product coded message is exactly
same complexity as with the current TCP checksum
- 6 -
Braden, Borman, & Partridge [Page 18]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
In order to form Ms, the sender must multiply the
precision "number" Mo by 2**16 - 1. Or, Ms = (2**16)Mo - Mo
This is performed by shifting Mo one whole word's worth
precision and subtracting Mo. Since carries must
between digits, but it is only the current digit which is
interest, one's complement arithmetic is used
(2**16)Mo = Mo0 + Mo1 + Mo2 + ... + MoX + 0
- Mo = - ( Mo0 + Mo1 + ......... + MoX
--------- ----------------------------------
Ms = Ms0 + Ms1 + ... -
A loop which implements this function on a PDP-11 might
like
LOOP: MOV -2(R2),R0 ; Next byte of (2**16)
SBC R0 ; Propagate carries from last
SUB (R2)+,R0 ; Subtract byte of
MOV R0,(R3)+ ; Store in
SOB R1,LOOP ; Loop over entire
; 8 memory cycles per 16-bit
Note that the coding procedure is not done in-place since it
not systematic. In general the original copy, Mo, will have
be retained by the sender for retransmission purposes
therefore must remain readable. Thus the MOV R0,(R3)+
required which accounts for 2 of the 8 memory cycles per loop
The coding procedure will add exactly one 16-bit word to
message since Ms < (2**16)Mo . This additional 16 bits will
at the tail of the message, but may be moved into the
location in the TCP header immediately before transmission.
receiver will have to undo this to put Ms back into
format before decoding the message
The code in the receiver for fully decoding the message may
inferred by observing that any word in Ms contains
difference between two successive words of Mo minus the
from the previous word, and the low order word contains minus
low word of Mo. So the low order (i.e., rightmost) word of Mr
just the negative of the low order byte of Ms. The next word
Mr is the next word of Ms plus the just computed word of
plus the carry from that previous computation
A slight refinement of the procedure is required in order
protect against an all-zero message passing to the destination
This will appear to have a valid checksum because Ms'/K = 0/
- 7 -
Braden, Borman, & Partridge [Page 19]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
= 0 with 0 remainder. The refinement is to make the coding
Ms = K*Mo + C where C is some arbitrary, well-known constant
Adding this constant requires a second pass over the message,
this will typically be very short since it can stop as soon
carries stop propagating. Chosing C = 1 is sufficient in
cases
The product code checksum must be evaluated in terms of
desired properties P1 - P7. It has been shown that a factor
two more machine cycles are consumed in computing or verifying
product code checksum (P5 satisfied?).
Although the code is not systematic, the checksum can be
quickly without decoding the message. If the
Destination Address is located at the least significant end
the packet (where the product code computation begins) then it
possible for a gateway to decode only enough of the message
see this field without having to decode the entire message
Thus, P6 is at least partially satisfied. The
properties P1 through P4 are not satisfied, but only a
amount of computation is needed to account for this --
message needs to be reformatted as previously mentioned
P7 is satisfied since the product code checksum can
incrementally updated to account for an added word, although
procedure is somewhat involved. Imagine that the
message has two halves, H1 and H2. Thus, Mo = H1*(R**j) + H2.
The timestamp word is to be inserted between these halves to
a modified Mo' = H1*(R**(j+1)) + T*(R**j) + H2. Since K
been chosen to be R-1, the transmitted message Ms' = Mo'(R-1).
Then
Ms' = Ms*R + T(R-1)(R**j) + P2((R-1)**2)
= Ms*R + T*(R**(j+1)) + T*(R**j) + P2*(R**2) - 2*P2*R - P
Recalling that R is 2**16, the word size on the PDP-11,
multiplying by R means copying down one word in memory. So
the first term of Ms' is simply the unmodified message
down one word. The next term is the new data T added into
Ms' being formed beginning at the (j+1)th word. The addition
fairly easy here since after adding in T all that is left
propagating the carry, and that can stop as soon as no carry
produced. The other terms can be handle similarly
- 8 -
Braden, Borman, & Partridge [Page 20]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
4. More Complicated
There exists a wealth of theory on error detecting and
codes. Peterson [6] is an excellent reference. Most of
"CRC" schemes are designed to be implemented using a
register with a feedback network composed of exclusive-ORs
Simulating such a logic circuit with a program would be too
to be useful unless some programming trick is discovered
One such trick has been proposed by Kirstein [8]. Basically,
few bits (four or eight) of the current shift register state
combined with bits from the input stream (from Mo) and the
is used as an index to a table which yields the new
register state and, if the code is not systematic, bits for
output stream (Ms). A trial coding of an especially "good"
function using four-bit bytes showed showed this technique to
about four times as slow as the current checksum function.
was true for both the PDP-10 and PDP-11 machines. Of
desirable properties listed above, CRC schemes satisfy only P
(It has an inverse.), and P6 (It is systematic.). Placement
the checksum field in the packet is critical and the CRC
be incrementally modified
Although the bulk of coding theory deals with binary codes,
of the theory works if the alphabet contains q symbols,
q is a power of a prime number. For instance q taken as 2**16
should make a great deal of the theory useful on a word-by-
basis
5. Outboard
When a function such as computing an involved checksum
extensive processing, one solution is to put that processing
an outboard processor. In this way "encode message" and "
message" become single instructions which do not tax the
host processor. The Digital Equipment Corporation VAX/780
computer is equipped with special hardware for generating
checking CRCs [13]. In general this is not a very good
since such a processor must be constructed for every
host machine which uses TCP messages
It is conceivable that the gateway functions for a large host
be performed entirely in an "Internet Frontend Machine".
machine would be responsible for forwarding packets
- 9 -
Braden, Borman, & Partridge [Page 21]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
either from the network(s) or from the Internet protocol
in the connected host, and for reassembling Internet
into segments and passing these to the host. Another
of this machine would be to check the checksum so that
segments given to the host are known to be valid at the time
leave the frontend. Since computer cycles are assumed to be
inexpensive and available in the frontend, this seems reasonable
The problem with attempting to validate checksums in the
is that it destroys the end-to-end character of the checksum.
anything, this is the most powerful feature of the TCP checksum
There is a way to make the host-to-frontend link be covered
the end-to-end checksum. A separate, small protocol must
developed to cover this link. After having validated an
packet from the network, the frontend would pass it to the
saying "here is an Internet segment for you. Call it #123".
host would save this segment, and send a copy back to
frontend saying, "Here is what you gave me as #123. Is it OK?".
The frontend would then do a word-by-word comparison with
first transmission, and tell the host either "Here is #123
again", or "You did indeed receive #123 properly. Release it
the appropriate module for further processing."
The headers on the messages crossing the host-frontend link
most likely be covered by a fairly strong checksum so
information like which function is being performed and
message reference numbers are reliable. These headers would
quite short, maybe only sixteen bits, so the checksum could
quite strong. The bulk of the message would not be checksumed
course
The reason this scheme reduces the computing burden on the
is that all that is required in order to validate the
using the end-to-end checksum is to send it back to the
machine. In the case of the PDP-10, this requires only 0.5
memory cycles per 16-bit byte of Internet message, and only a
processor cycles to setup the required transfers
6.
There is an ordering of checksum functions: first and simplest
none at all which provides no error detection or correction
Second, is sending a constant which is checked by the receiver
This also is extremely weak. Third, the exclusive-OR of the
may be sent. XOR takes the minimal amount of computer time
generate and check, but is not a good checksum. A two'
complement sum of the data is somewhat better and takes no
- 10 -
Braden, Borman, & Partridge [Page 22]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
computer time to compute. Fifth, is the one's complement
which is what is currently used by TCP. It is slightly
expensive in terms of computer time. The next step is a
code. The product code is strongly related to one's
sum, takes still more computer time to use, provides a bit
protection against common hardware failures, but has
objectionable properties. Next is a genuine CRC polynomial code
used for checking purposes only. This is very expensive for
program to implement. Finally, a full CRC error correcting
detecting scheme may be used
For TCP and Internet applications the product code scheme
viable. It suffers mainly in that messages must be (at
partially) decoded by intermediate gateways in order that
can be forwarded. Should product codes not be chosen as
improved checksum, some slight modification to the
scheme might be possible. For instance the "add and rotate
function used for paper tape by the PDP-6/10 group at
Artificial Intelligence Laboratory at M.I.T. Project MAC [12]
could be useful if it can be proved that it is better than
current scheme and that it can be computed efficiently on
variety of machines
- 11 -
Braden, Borman, & Partridge [Page 23]
RFC 1071 Computing the Internet Checksum September 1988
Internet Experiment Note 45 5 June 1978
TCP Checksum Function Design William W.
[1] Cerf, V.G. and Kahn, Robert E., "A Protocol for Packet
Communications," IEEE Transactions on Communications, vol
COM-22, No. 5, May 1974.
[2] Kahn, Robert E., "The Organization of Computer Resources
a Packet Radio Network", IEEE Transactions on Communications
vol. COM-25, no. 1, pp. 169-178, January 1977.
[3] Jacobs, Irwin, et al., "CPODA - A Demand Assignment
for SatNet", Fifth Data Communications Symposium,
27-9, 1977, Snowbird,
[4] Bolt Beranek and Newman, Inc. "Specifications for
Interconnection of a Host and an IMP", Report 1822,
1976 edition
[5] Dean, Richard A., "Elements of Abstract Algebra", John
and Sons, Inc., 1966
[6] Peterson, W. Wesley, "Error Correcting Codes", M.I.T.
Cambridge MA, 4th edition, 1968.
[7] Avizienis, Algirdas, "A Study of the Effectiveness of Fault
Detecting Codes for Binary Arithmetic", Jet
Laboratory Technical Report No. 32-711, September 1, 1965.
[8] Kirstein, Peter, private
[9] Cerf, V. G. and Postel, Jonathan B., "Specification
Internetwork Transmission Control Program Version 3",
University of Southern California Information
Institute, January 1978.
[10] Digital Equipment Corporation, "PDP-10 Reference Handbook",
1970, pp. 114-5.
[11] Swanson, Robert, "Understanding Cyclic Redundancy Codes",
Computer Design, November, 1975, pp. 93-99.
[12] Clements, Robert C., private communication
[13] Conklin, Peter F., and Rodgers, David P., "
Minicomputer Designed by Team Evaluation of Hardware/
Tradeoffs", Computer Design, April 1978, pp. 136-7.
- 12 -
Braden, Borman, & Partridge [Page 24]
if you see any problems within the linking, don't worry be happy,
this is version 0.1 of the Relevance System and you gotta expect some crappy subroutines sometimes,
just be content we did not write this in Java, which would have made this "bigger and better" HAHAHHA.
RFC documents can be found at I.E.T.F.
Relevance System Copyright © 2002 Spectrum WorldResearch
other technical nosh by ServerMasters Corporation
collaboration of BobX