As per Relevance of the word duplicate, we have this rfc below:











Network Working Group                                        V. Jacobson
Request for Comments: 1185                                           LBL
                                                               R. Braden
                                                                     ISI
                                                                L. Zhang
                                                                    PARC
                                                            October 1990


TCP Extension for High-Speed Paths

Status of This Memo

This memo describes an Experimental Protocol extension to TCP for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "IAB
Official Protocol Standards" for the standardization state and status
of this protocol. Distribution of this memo is unlimited.



Summary

This memo describes a small extension to TCP to support reliable
operation over very high-speed paths, using sender timestamps
transmitted using the TCP Echo option proposed in RFC-1072.

1. INTRODUCTION

TCP uses positive acknowledgments and retransmissions to provide
reliable end-to-end delivery over a full-duplex virtual circuit
called a connection [Postel81]. A connection is defined by its two
end points; each end point is a "socket", i.e., a (host,port) pair.
To protect against data corruption, TCP uses an end-to-end checksum.
Duplication and reordering are handled using a fine-grained sequence
number space, with each octet receiving a distinct sequence number.

The TCP protocol [Postel81] was designed to operate reliably over
almost any transmission medium regardless of transmission rate,
delay, corruption, duplication, or reordering of segments. In
practice, proper TCP implementations have demonstrated remarkable
robustness in adapting to a wide range of network characteristics.
For example, TCP implementations currently adapt to transfer rates in
the range of 100 bps to 10**7 bps and round-trip delays in the range
1 ms to 100 seconds.

However, the introduction of fiber optics is resulting in ever-higher
transmission speeds, and the fastest paths are moving out of the
domain for which TCP was originally engineered. This memo and
RFC-1072 [Jacobson88] propose modest extensions to TCP to extend the
domain of its application to higher speeds.

Jacobson, Braden & Zhang [Page 1]

RFC 1185 TCP over High-Speed Paths October 1990

There is no one-line answer to the question: "How fast can TCP go?".
The issues are reliability and performance, and these depend upon the
round-trip delay and the maximum time that segments may be queued in
the Internet, as well as upon the transmission speed. We must think
through these relationships very carefully if we are to successfully
extend TCP's domain.

TCP performance depends not upon the transfer rate itself, but rather
upon the product of the transfer rate and the round-trip delay. This
"bandwidth*delay product" measures the amount of data that would
"fill the pipe"; it is the buffer space required at sender and
receiver to obtain maximum throughput on the TCP connection over the
path. RFC-1072 proposed a set of TCP extensions to improve TCP
efficiency for "LFNs" (long fat networks), i.e., networks with large
bandwidth*delay products.

On the other hand, high transfer rate can threaten TCP reliability by
violating the assumptions behind the TCP mechanism for duplicate
detection and sequencing. The present memo specifies a solution for
this problem, extending TCP reliability to transfer rates well beyond
the foreseeable upper limit of bandwidth.

An especially serious kind of error may result from an accidental
reuse of TCP sequence numbers in data segments. Suppose that an "old
duplicate segment", e.g., a duplicate data segment that was delayed
in Internet queues, was delivered to the receiver at the wrong moment
so that its sequence numbers fell somewhere within the current
window. There would be no checksum failure to warn of the error, and
the result could be an undetected corruption of the data. Reception
of an old duplicate ACK segment at the transmitter could be only
slightly less serious: it is likely to lock up the connection so that
no further progress can be made and a RST is required to
resynchronize the two ends.

Duplication of sequence numbers might happen in either of two ways:

(1) Sequence number wrap-around on the current connection

A TCP sequence number contains 32 bits. At a high enough
transfer rate, the 32-bit sequence space may be "wrapped"
(cycled) within the time that a segment may be delayed in
queues. Section 2 discusses this case and proposes a mechanism
to reject old duplicates on the current connection.

(2) Segment from an earlier connection incarnation

Suppose a connection terminates, either by a proper close
sequence or due to a host crash, and the same connection (i.e.,
using the same pair of sockets) is immediately reopened. A
delayed segment from the terminated connection could fall within
the current window for the new incarnation and be accepted as
valid. This case is discussed in Section 3.

TCP reliability depends upon the existence of a bound on the lifetime
of a segment: the "Maximum Segment Lifetime" or MSL. An MSL is
generally required by any reliable transport protocol, since every
sequence number field must be finite, and therefore any sequence
number may eventually be reused. In the Internet protocol suite, the
MSL bound is enforced by an IP-layer mechanism, the "Time-to-Live" or
TTL field.

Watson's Delta-T protocol [Watson81] includes network-layer
mechanisms for precise enforcement of an MSL. In contrast, the IP
mechanism for MSL enforcement is loosely defined and even more
loosely implemented in the Internet. Therefore, it is unwise to
depend upon active enforcement of MSL for TCP connections, and it is
unrealistic to imagine setting MSL's smaller than the current values
(e.g., 120 seconds specified for TCP). The timestamp algorithm
described in the following section gives a way out of this dilemma
for high-speed networks.


2. SEQUENCE NUMBER WRAP-AROUND

2.1 Background

Avoiding reuse of sequence numbers within the same connection is
simple in principle: enforce a segment lifetime shorter than the
time it takes to cycle the sequence space, whose size is
effectively 2**31.

More specifically, if the maximum effective bandwidth at which TCP
is able to transmit over a particular path is B bytes per second,
then the following constraint must be satisfied for error-free
operation:

2**31 / B > MSL (secs)                                    [1]

The following table shows the value for Twrap = 2**31/B in
seconds, for some important values of the bandwidth B:

Network       B*8          B         Twrap
           bits/sec    bytes/sec      secs
_______    _______      ______       ______

ARPANET      56kbps        7KBps    3*10**5 (~3.6 days)

DS1         1.5Mbps      190KBps    10**4 (~3 hours)

Ethernet     10Mbps     1.25MBps    1700 (~30 mins)

DS3          45Mbps      5.6MBps    380

FDDI        100Mbps     12.5MBps    170

Gigabit       1Gbps      125MBps    17


It is clear why wrap-around of the sequence space was not a
problem for 56kbps packet switching or even 10Mbps Ethernets. On
the other hand, at DS3 and FDDI speeds, Twrap is comparable to the
2 minute MSL assumed by the TCP specification [Postel81]. Moving
towards gigabit speeds, Twrap becomes too small for reliable
enforcement by the Internet TTL mechanism.
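
As a rough check of the table above, the following Python sketch
recomputes Twrap = 2**31/B. The function name twrap_seconds and the
RATES dictionary are illustrative only and are not part of the memo;
the byte rates are the rounded values listed in the table.

```python
# Sketch: recompute Twrap = 2**31 / B for the bandwidths in the table.
# The names below are illustrative and do not appear in the memo.

SEQ_HALF_SPACE = 2 ** 31  # effective size of the sequence space, in bytes


def twrap_seconds(bytes_per_sec):
    """Time in seconds to cycle 2**31 bytes at the given rate."""
    return SEQ_HALF_SPACE / bytes_per_sec


# Byte rates as listed in the table (rounded values).
RATES = {
    "ARPANET": 7e3,
    "DS1": 190e3,
    "Ethernet": 1.25e6,
    "DS3": 5.6e6,
    "FDDI": 12.5e6,
    "Gigabit": 125e6,
}

for name, rate in RATES.items():
    print(f"{name:9s} {twrap_seconds(rate):10.0f} secs")
```

Comparing each result with an MSL of 120 seconds shows which rows
still satisfy constraint [1] with a comfortable margin.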

The 16-bit window field of TCP limits the effective bandwidth B to
2**16/RTT, where RTT is the round-trip time in seconds
[McKenzie89]. If the RTT is large enough, this limits B to a
value that meets the constraint [1] for a large MSL value. For
example, consider a transcontinental backbone with an RTT of 60ms
(set by the laws of physics). With the bandwidth*delay product
limited to 64KB by the TCP window size, B is then limited to
1.1MBps, no matter how high the theoretical transfer rate of the
path. This corresponds to cycling the sequence number space in
Twrap= 2000 secs, which is safe in today's Internet.
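
The arithmetic behind this example can be sketched in Python as
follows. The names are hypothetical; the 60 ms RTT is the example
value used in the text.

```python
# Sketch: bandwidth ceiling imposed by the 16-bit TCP window field.
# Names are illustrative; the 60 ms RTT is the memo's example value.

MAX_WINDOW = 2 ** 16      # bytes that can be outstanding per round trip
SEQ_HALF_SPACE = 2 ** 31  # effective size of the sequence space, in bytes


def window_limited_rate(rtt_secs):
    """Upper bound on B in bytes/sec: one full window per RTT."""
    return MAX_WINDOW / rtt_secs


rate = window_limited_rate(0.060)   # about 1.1 * 10**6 bytes/sec
twrap = SEQ_HALF_SPACE / rate       # about 2000 secs
print(f"B <= {rate:.0f} bytes/sec, Twrap = {twrap:.0f} secs")
```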

Based on this reasoning, an earlier RFC [McKenzie89] has cautioned
that expanding the TCP window space as proposed in RFC-1072 will
lead to sequence wrap-around and hence to possible data
corruption. We believe that this is mis-identifying the culprit,
which is not the larger window but rather the high bandwidth.

For example, consider a (very large) FDDI LAN with a diameter
of 10km. Using the speed of light, we can compute the RTT
across the ring as (2*10**4)/(3*10**8) = 67 microseconds, and
the delay*bandwidth product is then 833 bytes. A TCP
connection across this LAN using a window of only 833 bytes
will run at the full 100mbps and can wrap the sequence space
in about 3 minutes, very close to the MSL of TCP. Thus, high
speed alone can cause a reliability problem with sequence
number wrap-around, even without extended windows.
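
The figures in this example can be reproduced with a few lines of
Python. This is only a sketch using the round numbers from the text;
the constant names are illustrative.

```python
# Sketch: the FDDI ring example, using the memo's round numbers.

LIGHT_SPEED = 3e8         # metres/sec, as used in the text
RING_DIAMETER = 1e4       # metres (10 km)
FDDI_RATE = 12.5e6        # bytes/sec (100 Mbps)

rtt = 2 * RING_DIAMETER / LIGHT_SPEED      # about 67 microseconds
window = FDDI_RATE * rtt                   # about 833 bytes
twrap = 2 ** 31 / FDDI_RATE                # about 172 secs, roughly 3 minutes

print(f"RTT = {rtt * 1e6:.0f} us, window = {window:.0f} bytes, "
      f"Twrap = {twrap:.0f} secs")
```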

An "obvious" fix for the problem of cycling the sequence space is
to increase the size of the TCP sequence number field. For
example, the sequence number field (and also the acknowledgment
field) could be expanded to 64 bits. However, the proposals for
making such a change while maintaining compatibility with current
TCP have tended towards complexity and ugliness.

This memo proposes a simple solution to the problem, using the TCP
echo options defined in RFC-1072. Section 2.2, which follows,
describes the original use of these options to carry timestamps in
order to measure RTT accurately. Section 2.3 proposes a method of
using these same timestamps to reject old duplicate segments that
could corrupt an open TCP connection. Section 3 discusses the
application of this mechanism to avoiding old duplicates from
previous incarnations.

2.2 TCP Timestamps

RFC-1072 defined two TCP options, Echo and Echo Reply. Echo
carries a 32-bit number, and the receiver of the option must
return this same value to the source host in an Echo Reply option.

RFC-1072 furthermore describes the use of these options to contain
32-bit timestamps, for measuring the RTT. A TCP sending data
would include Echo options containing the current clock value.
The receiver would echo these timestamps in returning segments
(generally, ACK segments). The difference between a timestamp
from an Echo Reply option and the current time would then measure
the RTT at the sender.
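
A minimal sketch of this subtraction is shown below. The 1 ms tick
length and the function names are assumptions made for illustration;
the memo does not fix either. The difference is taken modulo 2**32 so
that it remains correct if the sender's 32-bit clock wraps between
sending the Echo and receiving the Echo Reply.

```python
# Sketch: RTT measurement from an echoed timestamp.
# The tick length and function names are assumptions for illustration.

TICK_SECS = 0.001          # assumed timestamp clock period (1 ms)
TS_MODULUS = 2 ** 32       # timestamps are 32-bit values


def rtt_ticks(now_ticks, echoed_ticks):
    """Elapsed ticks between sending a timestamp and seeing it echoed.

    The subtraction is done modulo 2**32 so that it stays correct when
    the sender's clock wraps between the two events.
    """
    return (now_ticks - echoed_ticks) % TS_MODULUS


def rtt_seconds(now_ticks, echoed_ticks):
    """The same interval converted to seconds."""
    return rtt_ticks(now_ticks, echoed_ticks) * TICK_SECS
```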

This mechanism was designed to solve the following problem: almost
all TCP implementations base their RTT measurements on a sample of
only one packet per window. If we look at RTT estimation as a
signal processing problem (which it is), a data signal at some
frequency (the packet rate) is being sampled at a lower frequency
(the window rate). Unfortunately, this lower sampling frequency
violates Nyquist's criteria and may introduce "aliasing" artifacts
into the estimated RTT [Hamming77].

A good RTT estimator with a conservative retransmission timeout
calculation can tolerate the aliasing when the sampling frequency
is "close" to the data frequency. For example, with a window of
8 packets, the sample rate is 1/8 the data frequency -- less than
an order of magnitude different. However, when the window is tens
or hundreds of packets, the RTT estimator may be seriously in
error, resulting in spurious retransmissions.

A solution to the aliasing problem that actually simplifies the
sender substantially (since the RTT code is typically the single
biggest protocol cost for TCP) is as follows: the sender will
place a timestamp in each segment and the receiver will reflect
these timestamps back in ACK segments. Then a single subtract
gives the sender an accurate RTT measurement for every ACK segment
(which will correspond to every other data segment, with a
sensible receiver). RFC-1072 defined a timestamp echo option for
this purpose.

It is vitally important to use the timestamp echo option with big
windows; otherwise, the door is opened to some dangerous
instabilities due to aliasing. Furthermore, the option is
probably useful for all TCP's, since it simplifies the sender.

2.3 Avoiding Old Duplicate Segments

Timestamps carried from sender to receiver in TCP Echo options can
also be used to prevent data corruption caused by sequence number
wrap-around, as this section describes.

2.3.1 Basic Algorithm

Assume that every received TCP segment contains a timestamp.
The basic idea is that a segment received with a timestamp that
is earlier than the timestamp of the most recently accepted
segment can be discarded as an old duplicate. More
specifically, the following processing is to be performed on
normal incoming segments:

R1) If the timestamp in the arriving segment is less
than the timestamp of the most recently received in-
sequence segment, treat the arriving segment as not
acceptable:

If SEG.LEN > 0, send an acknowledgement in reply as
specified in RFC-793 page 69, and drop the segment;
otherwise, just silently drop the segment.*

R2) If the segment is outside the window, reject it (normal
TCP processing).

R3) If an arriving segment is in-sequence (i.e., at the left
window edge), accept it normally and record its timestamp.

R4) Otherwise, treat the segment as a normal in-window, out-
of-sequence TCP segment (e.g., queue it for later delivery
to the user).

_________________________
*Sending an ACK segment in reply is not strictly necessary, since the
case can only arise when a later in-order segment has already been
received. However, for consistency and simplicity, we suggest
treating a timestamp failure the same way TCP treats any other
unacceptable segment.


Steps R2-R4 are the normal TCP processing steps specified by
RFC-793, except that in R3 the latest timestamp is set from
each in-sequence segment that is accepted. Thus, the latest
timestamp recorded at the receiver corresponds to the left edge
of the window and only advances when the left edge moves
[Jacobson88].

It is important to note that the timestamp is checked only when
a segment first arrives at the receiver, regardless of whether
it is in-sequence or is queued. Consider the following
example.

Suppose the segment sequence: A.1, B.1, C.1, ..., Z.1 has
been sent, where the letter indicates the sequence number
and the digit represents the timestamp. Suppose also that
segment B.1 has been lost. The highest in-sequence
timestamp is 1 (from A.1), so C.1, ..., Z.1 are considered
acceptable and are queued. When B is retransmitted as
segment B.2 (using the latest timestamp), it fills the
hole and causes all the segments through Z to be
acknowledged and passed to the user. The timestamps of
the queued segments are *not* inspected again at this
time, since they have already been accepted. When B.2 is
accepted, the receiver's current timestamp is set to 2.
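
Rules R1-R4 and the example above can be sketched in Python as shown
below. This is a simplified illustration, not the memo's
implementation: sequence numbers are reduced to small integers with
one unit of data per segment (A = 0, B = 1, ..., Z = 25), the
acknowledgment that R1 sends for SEG.LEN > 0 is omitted, and the class
and method names are hypothetical. Timestamps are compared modulo
2**32.

```python
# Sketch of receiver rules R1-R4. Sequence numbers are simplified to
# small integers with one unit of data per segment; the class and method
# names are hypothetical and sequence-number wrap is not modelled.

TS_MODULUS = 2 ** 32


def ts_less(a, b):
    """True if 32-bit timestamp a is earlier than b (modular compare)."""
    return 0 < (b - a) % TS_MODULUS < 2 ** 31


class Receiver:
    def __init__(self, rcv_nxt=0, window=64, ts_recent=0):
        self.rcv_nxt = rcv_nxt      # left window edge
        self.window = window        # window size, in segments
        self.ts_recent = ts_recent  # timestamp of last in-sequence segment
        self.queued = set()         # accepted out-of-sequence segments
        self.delivered = []         # segments passed to the user

    def receive(self, seq, ts):
        # R1: timestamp older than the latest in-sequence one.
        if ts_less(ts, self.ts_recent):
            return "drop (old timestamp)"
        # R2: outside the window.
        if not (self.rcv_nxt <= seq < self.rcv_nxt + self.window):
            return "reject (outside window)"
        # R3: in-sequence; accept and record the timestamp.
        if seq == self.rcv_nxt:
            self.ts_recent = ts
            self.delivered.append(seq)
            self.rcv_nxt += 1
            # Queued segments are delivered without re-checking timestamps.
            while self.rcv_nxt in self.queued:
                self.queued.remove(self.rcv_nxt)
                self.delivered.append(self.rcv_nxt)
                self.rcv_nxt += 1
            return "accept"
        # R4: in-window but out of sequence.
        self.queued.add(seq)
        return "queue"


rx = Receiver()
rx.receive(0, 1)                     # A.1 accepted
for seq in range(2, 26):             # C.1 ... Z.1 queued (B.1 lost)
    rx.receive(seq, 1)
rx.receive(1, 2)                     # B.2 fills the hole
print(rx.rcv_nxt, rx.ts_recent)      # prints "26 2"
```

After B.2 arrives, all 26 segments have been delivered and the
recorded timestamp is 2, even though the queued segments carried
timestamp 1.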

This rule is vital to allow reasonable performance under loss.
A full window of data is in transit at all times, and after a
loss a full window less one packet will show up out-of-sequence
to be queued at the receiver (e.g., up to ~2**30 bytes of
data); the timestamp option must not result in discarding this
data.

In certain unlikely circumstances, the algorithm of rules R1-R4
could lead to discarding some segments unnecessarily, as shown
in the following example:

Suppose again that segments: A.1, B.1, C.1, ..., Z.1 have
been sent in sequence and that segment B.1 has been lost.
Furthermore, suppose delivery of some of C.1, ... Z.1 is
delayed until AFTER the retransmission B.2 arrives at the
receiver. These delayed segments will be discarded
unnecessarily when they do arrive, since their timestamps
are now out of date.

This case is very unlikely to occur. If the retransmission was
triggered by a timeout, some of the segments C.1, ... Z.1 must
have been delayed longer than the RTO time. This is presumably
an unlikely event, or there would be many spurious timeouts and
retransmissions. If B's retransmission was triggered by the
"fast retransmit" algorithm, i.e., by duplicate ACK's, then the
queued segments that caused these ACK's must have been received
already.

Even if a segment was delayed past the RTO, the selective
acknowledgment (SACK) facility of RFC-1072 will cause the
delayed packets to be retransmitted at the same time as B.2,
avoiding an extra RTT and therefore causing a very small
performance penalty.

We know of no case with a significant probability of occurrence
in which timestamps will cause performance degradation by
unnecessarily discarding segments.

2.3.2 Header Prediction

"Header prediction" [Jacobson90] is a high-performance
transport protocol implementation technique that is most
important for high-speed links. This technique optimizes the
code for the most common case: receiving a segment correctly
and in order. Using header prediction, the receiver asks the
question, "Is this segment the next in sequence?" This
question can be answered in fewer machine instructions than the
question, "Is this segment within the window?"

Adding header prediction to our timestamp procedure leads to
the following sequence for processing an arriving TCP segment:

H1) Check timestamp (same as step R1 above).

H2) Do header prediction: if segment is next in sequence and
if there are no special conditions requiring additional
processing, accept the segment, record its timestamp, and
skip H3.

H3) Process the segment normally, as specified in RFC-793.
This includes dropping segments that are outside the
window and possibly sending acknowledgments, and queueing
in-window, out-of-sequence segments.

However, the timestamp check in step H1 is very unlikely to
fail, and it is a relatively expensive operation since it
requires interval arithmetic on a finite field. To perform
this check on every single segment seems like poor
implementation engineering, defeating the purpose of header
prediction. Therefore, we suggest that an implementor
interchange H1 and H2, i.e., perform header prediction FIRST,
performing H1 and H3 only if header prediction fails. We
believe that this change might gain 5-10% in performance on
high-speed networks.
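
The suggested ordering can be sketched as follows. The function and
parameter names are hypothetical, and the "special conditions" of H2
are reduced to a single boolean flag for illustration.

```python
# Sketch of the suggested ordering: header prediction (H2) first, then
# the timestamp check (H1) and normal processing (H3) only on a miss.
# Function and parameter names are hypothetical.

TS_MODULUS = 2 ** 32


def ts_less(a, b):
    """True if 32-bit timestamp a is earlier than b (modular compare)."""
    return 0 < (b - a) % TS_MODULUS < 2 ** 31


def classify_segment(seq, ts, rcv_nxt, ts_recent, special=False):
    """Return which path an arriving segment takes."""
    # H2: the cheap test for the common case comes first.
    if seq == rcv_nxt and not special:
        return "fast path: accept and record timestamp"
    # H1: reached only when header prediction fails.
    if ts_less(ts, ts_recent):
        return "drop: old duplicate"
    # H3: normal RFC-793 processing (window check, queueing, ACKs).
    return "slow path: normal processing"
```

Note that with this ordering a segment whose sequence number equals
the left window edge takes the fast path even if its timestamp is old;
this is the theoretical hazard that the ordering introduces.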

This reordering does raise a theoretical hazard: a segment from
2**32 bytes in the past may arrive at exactly the wrong time
and be accepted mistakenly by the header-prediction step. We
make the following argument to show that the probability of
this failure is negligible.

If all segments are equally likely to show up as old
duplicates, then the probability of an old duplicate
exactly matching the left window edge is the maximum
segment size (MSS) divided by the size of the sequence
space. This ratio must be less than 2**-16, since MSS
must be < 2**16; for example, it will be (2**12)/(2**32) =
2**-20 for an FDDI link. However, the older a segment is,
the less likely it is to be retained in the Internet, and
under any reasonable model of segment lifetime the
probability of an old duplicate exactly at the left window
edge must be much smaller than 2**-16.

The 16 bit TCP checksum also allows a basic unreliability
of one part in 2**16. A protocol mechanism whose
reliability exceeds the reliability of the TCP checksum
should be considered "good enough", i.e., it won't
contribute significantly to the overall error rate. We
therefore believe we can ignore the problem of an old
duplicate being accepted by doing header prediction before
checking the timestamp.
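
The ratio used in this argument can be checked exactly with rational
arithmetic. The sketch below assumes, as the text does, that all
segments are equally likely to reappear; the function name is
illustrative.

```python
# Sketch: bound on the chance that an old duplicate lands exactly on the
# left window edge, assuming (as the text does) that all segments are
# equally likely to reappear.
from fractions import Fraction

SEQ_SPACE = 2 ** 32


def edge_match_probability(mss):
    """Probability MSS / 2**32 for a given maximum segment size."""
    return Fraction(mss, SEQ_SPACE)


fddi = edge_match_probability(2 ** 12)        # equals 2**-20
assert fddi == Fraction(1, 2 ** 20)
assert edge_match_probability(2 ** 16 - 1) < Fraction(1, 2 ** 16)
```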

2.3.3 Timestamp Frequency

It is important to understand that the receiver algorithm for
timestamps does not involve clock synchronization with the
sender. The sender's clock is used to stamp the segments, and
the sender uses this fact to measure RTT's. However, the
receiver treats the timestamp as simply a monotone-increasing
serial number, without any necessary connection to its clock.
From the receiver's viewpoint, the timestamp is acting as a
logical extension of the high-order bits of the sequence
number.

However, the receiver algorithm does place some requirements on
the frequency of the timestamp "clock":

(a) Timestamp clock must not be "too slow".

It must tick at least once for each 2**31 bytes sent. In
fact, in order to be useful to the sender for round trip
timing, the clock should tick at least once per window's
worth of data, and even with the RFC-1072 window
extension, 2**31 bytes must be at least two windows.

To make this more quantitative, any clock faster than 1
tick/sec will reject old duplicate segments for link
speeds of ~2 Gbps; a 1ms clock will work up to link
speeds of 2 Tbps (10**12 bps!).

(b) Timestamp clock must not be "too fast".

Its cycling time must be greater than MSL seconds. Since
the clock (timestamp) is 32 bits and the worst-case MSL is
255 seconds, the maximum acceptable clock frequency is one
tick every 59 ns.

However, since the sender is using the timestamp for RTT
calculations, the timestamp doesn't need to have much more
resolution than the granularity of the retransmit timer,
e.g., tens or hundreds of milliseconds.

Thus, both limits are easily satisfied with a reasonable clock
rate in the range 1-100ms per tick.
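
The upper limit on clock frequency in (b) follows from dividing the
worst-case MSL by the number of distinct 32-bit timestamp values. A
short sketch, with illustrative constant names:

```python
# Sketch: fastest acceptable timestamp clock, from rule (b).
# A 32-bit timestamp must not cycle within the worst-case MSL.

WORST_CASE_MSL = 255        # seconds, as assumed in the text
TS_VALUES = 2 ** 32         # distinct values of a 32-bit timestamp

min_tick_secs = WORST_CASE_MSL / TS_VALUES
print(f"minimum tick period = {min_tick_secs * 1e9:.1f} ns")
```

The result is a little over 59 ns, which is several orders of
magnitude finer than a tick of 1-100 ms.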

Using the timestamp option relaxes the requirements on MSL for
avoiding sequence number wrap-around. For example, with a 1 ms
timestamp clock, the 32-bit timestamp will wrap its sign bit in
25 days. Thus, it will reject old duplicates on the same
connection as long as MSL is 25 days or less. This appears to
be a very safe figure. If the timestamp has 10 ms resolution,
the MSL requirement is boosted to 250 days. An MSL of 25 days
or longer can probably be assumed by the gateway system without
requiring precise MSL enforcement by the TTL value in the IP
layer.
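
The 25 day and 250 day figures correspond to 2**31 ticks of the
timestamp clock. The following sketch reproduces them; the function
name is illustrative.

```python
# Sketch: time for a 32-bit timestamp to wrap its sign bit (2**31 ticks).

SECS_PER_DAY = 86400


def sign_wrap_days(tick_secs):
    """Days until 2**31 ticks of the given length have elapsed."""
    return (2 ** 31) * tick_secs / SECS_PER_DAY


print(f"1 ms tick:  {sign_wrap_days(0.001):.1f} days")   # about 25 days
print(f"10 ms tick: {sign_wrap_days(0.010):.1f} days")   # about 250 days
```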




3. DUPLICATES FROM EARLIER INCARNATIONS OF CONNECTION

We turn now to the second potential cause of old duplicate packet
errors: packets from an earlier incarnation of the same connection.
The appendix contains a review of the mechanisms currently included in
TCP to handle this problem. These mechanisms depend upon the
enforcement of a maximum segment lifetime (MSL) by the Internet
layer.

The MSL required to prevent failures due to an earlier connection
incarnation does not depend (directly) upon the transfer rate.
However, the timestamp option used as described in Section 2 can
provide additional security against old duplicates from earlier
connections. Furthermore, we will see that with the universal use of
the timestamp option, enforcement of a maximum segment lifetime would
no longer be required for reliable TCP operation.

There are two cases to be considered (see the appendix for more
explanation): (1) a system crashing (and losing connection state)
and restarting, and (2) the same connection being closed and reopened
without a loss of host state. These will be described in the
following two sections.

3.1 System Crash with Loss of State

TCP's quiet time of one MSL upon system startup handles the loss
of connection state in a system crash/restart. For an
explanation, see for example "When to Keep Quiet" in the TCP
protocol specification [Postel81]. The MSL that is required here
does not depend upon the transfer speed. The current TCP MSL of 2
minutes seems acceptable as an operational compromise, as many
host systems take this long to boot after a crash.

However, the timestamp option may be used to ease the MSL
requirements (or to provide additional security against data
corruption). If timestamps are being used and if the timestamp
clock can be guaranteed to be monotonic over a system
crash/restart, i.e., if the first value of the sender's timestamp
clock after a crash/restart can be guaranteed to be greater than
the last value before the restart, then a quiet time will be
unnecessary.

To dispense totally with the quiet time would seem to require that
the host clock be synchronized to a time source that is stable
over the crash/restart period, with an accuracy of one timestamp
clock tick or better. Fortunately, we can back off from this
strict requirement. Suppose that the clock is always
re-synchronized to within N timestamp clock ticks and that booting
(extended with a quiet time, if necessary) takes more than N
ticks. This will guarantee monotonicity of the timestamps, which
can then be used to reject old duplicates even without an enforced
MSL.

3.2 Closing and Reopening a Connection

When a TCP connection is closed, a delay of 2*MSL in TIME-WAIT
state ties up the socket pair for 4 minutes (see Section 3.5 of
[Postel81]). Applications built upon TCP that close one connection
and open a new one (e.g., an FTP data transfer connection using
Stream mode) must choose a new socket pair each time. This delay
serves two different purposes:

(a) Implement the full-duplex reliable close handshake of TCP.

The proper time to delay the final close step is not really
related to the MSL; it depends instead upon the RTO for the
FIN segments and therefore upon the RTT of the path.*
Although there is no formal upper-bound on RTT, common
network engineering practice makes an RTT greater than 1
minute very unlikely. Thus, the 4 minute delay in TIME-WAIT
state works satisfactorily to provide a reliable full-duplex
TCP close. Note again that this is independent of MSL
enforcement and network speed.

The TIME-WAIT state could cause an indirect performance
problem if an application needed to repeatedly close one
connection and open another at a very high frequency, since
the number of available TCP ports on a host is less than
2**16. However, high network speeds are not the major
contributor to this problem; the RTT is the limiting factor
in how quickly connections can be opened and closed.
Therefore, this problem will be no worse at high transfer
speeds.

_________________________
*Note: It could be argued that the side that is sending a FIN knows
what degree of reliability it needs, and therefore it should be able
to determine the length of the TIME-WAIT delay for the FIN's
recipient. This could be accomplished with an appropriate TCP option
in FIN segments.

(b) Allow old duplicate segments to expire.

Suppose that a host keeps a cache of the last timestamp
received from each remote host. This can be used to reject
old duplicate segments from earlier incarnations of the
connection, if the timestamp clock can be guaranteed to have
ticked at least once since the old connection was open.
This requires that the TIME-WAIT delay plus the RTT together
must be at least one tick of the sender's timestamp clock.

Note that this is a variant on the mechanism proposed by
Garlick, Rom, and Postel (see the appendix), which required
each host to maintain connection records containing the
highest sequence numbers on every connection. Using
timestamps instead, it is only necessary to keep one quantity
per remote host, regardless of the number of simultaneous
connections to that host.
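
One way to picture the per-host cache described in (b) is the sketch
below. The class name, the use of host strings as keys, and the split
into a query and an update method are assumptions made for
illustration; the memo does not specify a data structure. Because the
sender's clock has ticked at least once since the old connection was
open, a segment from the new incarnation advances the cached value,
after which segments from the earlier incarnation compare as older.

```python
# Sketch: one cached timestamp per remote host, used to reject segments
# left over from an earlier incarnation of a connection. The class name
# and the use of host strings as keys are assumptions for illustration.

TS_MODULUS = 2 ** 32


def ts_less(a, b):
    """True if 32-bit timestamp a is earlier than b (modular compare)."""
    return 0 < (b - a) % TS_MODULUS < 2 ** 31


class HostTimestampCache:
    def __init__(self):
        self._last = {}   # remote host -> last timestamp received

    def is_old_duplicate(self, host, ts):
        """True if ts is earlier than the last value seen from host."""
        last = self._last.get(host)
        return last is not None and ts_less(ts, last)

    def update(self, host, ts):
        """Record ts if it is newer than the cached value."""
        last = self._last.get(host)
        if last is None or ts_less(last, ts):
            self._last[host] = ts
```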

We conclude that if all hosts used the TCP timestamp algorithm
described in Section 2, enforcement of a maximum segment lifetime
would be unnecessary and the quiet time at system startup could be
shortened or removed. In any case, the timestamp mechanism can
provide additional security against old duplicates from earlier
connection incarnations. However, a 4 minute TIME-WAIT delay
(unrelated to MSL enforcement or network speed) must be retained
to provide the reliable close handshake of TCP.

4. CONCLUSIONS

We have presented a mechanism, based upon the TCP timestamp echo
option of RFC-1072, that will allow very high TCP transfer rates
without reliability problems due to old duplicate segments on the
same connection. This mechanism also provides additional security
against intrusion of old duplicates from earlier incarnations of the
same connection. If the timestamp mechanism were used by all hosts,
the quiet time at system startup could be eliminated and enforcement
of a maximum segment lifetime (MSL) would no longer be necessary.



REFERENCES

[Cerf76] Cerf, V., "TCP Resynchronization", Tech Note #79, Digital
Systems Lab, Stanford, January 1976.

[Dalal74] Dalal, Y., "More on Selecting Sequence Numbers", INWG
Protocol Note #4, October 1974.

[Garlick77] Garlick, L., R. Rom, and J. Postel, "Issues in Reliable
Host-to-Host Protocols", Proc. Second Berkeley Workshop on
Distributed Data Management and Computer Networks, May 1977.

[Hamming77] Hamming, R., "Digital Filters", ISBN 0-13-212571-4,
Prentice Hall, Englewood Cliffs, N.J., 1977.

[Jacobson88] Jacobson, V., and R. Braden, "TCP Extensions for
Long-Delay Paths", RFC 1072, LBL and USC/Information Sciences
Institute, October 1988.

[Jacobson90] Jacobson, V., "4BSD Header Prediction", ACM Computer
Communication Review, April 1990.

[McKenzie89] McKenzie, A., "A Problem with the TCP Big Window
Option", RFC 1110, BBN STC, August 1989.

[Postel81] Postel, J., "Transmission Control Protocol", RFC 793,
DARPA, September 1981.

[Tomlinson74] Tomlinson, R., "Selecting Sequence Numbers", INWG
Protocol Note #2, September 1974.

[Watson81] Watson, R., "Timer-based Mechanisms in Reliable
Transport Protocol Connection Management", Computer Networks,
Vol. 5, 1981.


APPENDIX -- Protection against Old Duplicates in TCP

During the development of TCP, a great deal of effort was devoted to
the problem of protecting a TCP connection from segments left from
earlier incarnations of the same connection. Several different
mechanisms were proposed for this purpose [Tomlinson74] [Dalal74]
[Cerf76] [Garlick77].

The connection parameters that are required in this discussion are:

Tc = Connection duration in seconds.

Nc = Total number of bytes sent on connection.

B = Effective bandwidth of connection = Nc/Tc.

Tomlinson proposed a scheme with two parts: a clock-driven selection
of ISN (Initial Sequence Number) for a connection, and a
resynchronization procedure [Tomlinson74]. The clock-driven scheme
chooses:

ISN = (integer(R*t)) mod 2**32                                 [2]

where t is the current time relative to an arbitrary origin, and R is
a constant. R was intended to be chosen so that ISN will advance
faster than sequence numbers will be used up on the connection.
However, at high speeds this will not be true; the consequences of
this will be discussed below.
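
Formula [2] can be written out as a short Python function. The value
R = 250000 per second is an assumption in this sketch, chosen to match
the 4 microsecond step that the appendix cites for TCP; the function
name is illustrative.

```python
# Sketch of formula [2]. R = 250000 per second is an assumption here,
# chosen to match the 4 microsecond step the appendix cites for TCP.

SEQ_SPACE = 2 ** 32
R = 250_000               # ISN increments per second (assumed)


def clock_driven_isn(t_secs, r=R):
    """ISN = (integer(R*t)) mod 2**32 for time t since the clock origin."""
    return int(r * t_secs) % SEQ_SPACE
```

For example, clock_driven_isn(1) is 250000, and the value wraps back
through zero once R*t reaches 2**32.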

The clock-driven choice of ISN in formula [2] guarantees freedom from
old duplicates matching a reopened connection if the original
connection was "short-lived" and "slow". By "short-lived", we mean a
connection that stayed open for a time Tc less than the time to cycle
the ISN, i.e., Tc < 2**32/R seconds. By "slow", we mean that the
effective transfer rate B is less than R.

This is illustrated in Figure 1, where sequence numbers are plotted
against time. The asterisks show the ISN lines from formula [2],
while the circles represent the trajectories of several short-lived
incarnations of the same connection, each terminating at the "x".

Note: allowing rapid reuse of connections was believed to be an
important goal during the early TCP development. This
requirement was driven by the hope that TCP would serve as a
basis for user-level transaction protocols as well as
connection-oriented protocols. The paradigm discussed was the
"Christmas Tree" or "Kamikazee" segment that contained SYN and
FIN bits as well as data. Enthusiasm for this was somewhat
dampened when it was observed that the 3-way SYN handshake and
the FIN handshake mean that 5 packets are required for a minimum
exchange. Furthermore, the TIME-WAIT state delay implies that
the same connection really cannot be reopened immediately. No
further work has been done in this area, although existing
applications (especially SMTP) often generate very short TCP
sessions. The reuse problem is generally avoided by using a
different port pair for each connection.


[Figure: sequence number (0 to 2**32) plotted against time, with the
time axis marked at 4.55 hrs. Asterisks trace the clock-driven ISN
lines; circles trace short connection trajectories that start on an
ISN line and end at an "x".]

Figure 1. Clock-Driven ISN avoiding duplication on
short-lived, slow connections


However, clock-driven ISN selection does not protect against old
duplicate packets for a long-lived or fast connection: the
connection may close (or crash) just as the ISN has cycled around and
reached the same value again. If the connection is then reopened, a
datagram still in transit from the old connection may fall into the
current window. This is illustrated by Figure 2 for a slow, long-
lived connection, and by Figures 3 and 4 for fast connections. In
each case, the point "x" marks the place at which the original
connection closes or crashes. The arrow in Figure 2 illustrates an
old duplicate segment. Figure 3 shows a connection whose total byte
count Nc < 2**32, while Figure 4 concerns Nc >= 2**32.

To prevent the duplication illustrated in Figure 2, Tomlinson
proposed to "resynchronize" the connection sequence numbers if they
came within an MSL of the ISN. Resynchronization might take the form
of a delay (point "y") or the choice of a new sequence number (point
"z").

[Figure: sequence number (0 to 2**32) plotted against time, with the
time axis marked at 4.55 hrs. Asterisks trace the clock-driven ISN
lines; circles trace a slow, long-lived connection that ends at "x"
close to the next ISN line.]

Figure 2. Resynchronization to Avoid Duplication
on Slow, Long-Lived Connection



[Figure: sequence number (0 to 2**32) plotted against time, with the
time axis marked at 4.55 hrs. Asterisks trace the clock-driven ISN
lines; circles trace a fast connection that rises more steeply than
the ISN line and ends at "x"; an arrow joins two of the circles.]

Figure 3. Duplication on Fast Connection: Nc < 2**32

[Figure: sequence number (0 to 2**32) plotted against time, with the
time axis marked at 4.55 hrs. Asterisks trace the clock-driven ISN
lines; circles trace a fast connection that wraps the sequence space
before ending at "x".]

Figure 4. Duplication on Fast Connection: Nc > 2**32

In summary, Figures 1-4 illustrate four possible failure modes for
old duplicate packets from an earlier incarnation. We will call
these four modes F1, F2, F3, and F4:


F1: B < R, Tc < 4.55 hrs. (Figure 1)

F2: B < R, Tc >= 4.55 hrs. (Figure 2)

F3: B >= R, Nc < 2**32 (Figure 3)

F4: B >= R, Nc >= 2**32 (Figure 4)
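
The four modes can be told apart mechanically from the quantities
used above. The following Python sketch is illustrative only: the
function name, argument names, and units are assumptions made for
this example, with B taken as the connection's byte rate, Tc as its
duration, and Nc as its total byte count.

```python
# Illustrative sketch only: names and units are assumptions, not part of the memo.
R = 250_000          # ISN clock rate in bytes per second (R = 250KBps in the text)
CYCLE_HOURS = 4.55   # ISN wraparound period quoted in the text
SEQ_SPACE = 2**32    # size of the TCP sequence number space


def failure_mode(b, tc_hours, nc):
    """Return 'F1'..'F4' for a connection with byte rate b (bytes/s),
    duration tc_hours (hours) and total byte count nc."""
    if b < R:
        # Slow connection: the mode depends on how long it lived.
        return "F1" if tc_hours < CYCLE_HOURS else "F2"
    # Fast connection: the mode depends on how many bytes it carried.
    return "F3" if nc < SEQ_SPACE else "F4"


print(failure_mode(10_000, 0.5, 18_000_000))     # slow, short-lived
print(failure_mode(10_000, 6.0, 216_000_000))    # slow, long-lived
print(failure_mode(10**7, 0.01, 360_000_000))    # fast, Nc < 2**32
print(failure_mode(10**7, 1.0, 36_000_000_000))  # fast, Nc >= 2**32
```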


Another limitation of clock-driven ISN selection should be mentioned.
Tomlinson assumed that the current time t in formula [2] is obtained
from a clock that is persistent over a system crash. For his scheme
to work correctly, the clock must be restarted with an accuracy of
1/R seconds (e.g., 4 microseconds in the case of TCP). While this may
be possible for some hosts and some crashes, in most cases there will
be an uncertainty in the clock after a crash that ranges from a
second to several minutes.
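
To make the required accuracy concrete, the sketch below shows how
far a clock-driven ISN moves for a given clock error. It assumes that
formula [2] has the form ISN = (R*t) mod 2**32 and takes R = 250,000
per second from the text; the function names and the use of integer
microseconds are choices made for this example only.

```python
R = 250_000        # ISN clock rate per second, as given in the text
SEQ_SPACE = 2**32  # size of the sequence number space


def clock_isn(t_us):
    """Clock-driven ISN for a time given in whole microseconds.

    Assumes the form ISN = (R * t) mod 2**32 for formula [2].
    """
    return (t_us * R // 1_000_000) % SEQ_SPACE


def isn_shift(t_us, error_us):
    """How far the ISN moves if the clock restarts error_us too far ahead."""
    return (clock_isn(t_us + error_us) - clock_isn(t_us)) % SEQ_SPACE


one_hour = 3_600 * 1_000_000
print(isn_shift(one_hour, 4))            # error of 4 microseconds
print(isn_shift(one_hour, 1_000_000))    # error of 1 second
print(isn_shift(one_hour, 120_000_000))  # error of 2 minutes
```

Under these assumptions an error of 4 microseconds moves the ISN by a
single sequence number, while an error of one second moves it by
250,000 and an error of two minutes by 30,000,000.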

As a result of this random clock offset after system
reinitialization, there is a possibility that old segments sent
before the crash may fall into the window of a new connection
incarnation. The solution to this problem that was adopted in the
final TCP spec is a "quiet time" of MSL seconds when the system is
initialized [Postel81, p. 28]. No TCP connection can be opened until
the expiration of this quiet time.

A different approach was suggested by Garlick, Rom, and Postel
[Garlick77]. Rather than using clock-driven ISN selection, they
proposed to maintain connection records containing the last ISN used
on every connection. To immediately open a new incarnation of a
connection, the ISN is taken to be greater than the last sequence
number of the previous incarnation, so that the new incarnation will
have unique sequence numbers. To handle a system crash, they
proposed a quiet time, i.e., a delay at system startup time to allow
old duplicates to expire. Note that the connection records need be
kept only for MSL seconds; after that, no collision is possible, and
a new connection can start with sequence number zero.
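
The record-keeping idea can be sketched as follows. This is a minimal
illustration, not the scheme as specified in [Garlick77]: the class
and method names are hypothetical, the MSL value is an arbitrary
assumption, "greater than the last sequence number" is taken to mean
the next number, and the quiet time after a crash is not modeled.

```python
MSL = 120.0        # assumed maximum segment lifetime in seconds (illustrative)
SEQ_SPACE = 2**32  # size of the sequence number space


class ConnectionRecords:
    """Remembers the last sequence number used on each closed connection."""

    def __init__(self):
        self._records = {}  # socket pair -> (last sequence number, close time)

    def close(self, socket_pair, last_seq, now):
        """Record the last sequence number used when a connection closes."""
        self._records[socket_pair] = (last_seq, now)

    def choose_isn(self, socket_pair, now):
        """Pick an ISN for a new incarnation of socket_pair at time now."""
        record = self._records.get(socket_pair)
        if record is not None:
            last_seq, closed_at = record
            if now - closed_at < MSL:
                # Old duplicates may still be in flight: start above them.
                return (last_seq + 1) % SEQ_SPACE
            # The record is older than MSL, so it can be discarded.
            del self._records[socket_pair]
        return 0


records = ConnectionRecords()
pair = (("10.0.0.1", 1025), ("10.0.0.2", 23))
records.close(pair, last_seq=5000, now=10.0)
print(records.choose_isn(pair, now=20.0))        # within MSL of the close
print(records.choose_isn(pair, now=10.0 + MSL))  # record has expired
```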

The scheme finally adopted for TCP combines features of both these
proposals. TCP uses three mechanisms:

(A) ISN selection is clock-driven to handle short-lived connections.
The parameter R = 250KBps, so that the ISN value cycles in
2**32/R = 4.55 hours.

(B) (One end of) a closed connection is left in a "busy" state,
known as "TIME-WAIT" state, for a time of 2*MSL. TIME-WAIT
state handles the proper close of a long-lived connection
without resynchronization. It also allows reliable completion
of the full-duplex close handshake.

(C) There is a quiet time of one MSL at system startup. This
handles a crash of a long-lived connection and avoids time
resynchronization problems in (A).
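
The two delay-based mechanisms, (B) and (C), amount to a pair of
timing checks made before a connection is opened. The Python sketch
below is one hedged way to express them; the function name, its
arguments, and the MSL value are assumptions for illustration, and
mechanism (A) is deliberately left out.

```python
MSL = 120.0  # assumed maximum segment lifetime in seconds (illustrative)


def may_open(now, boot_time, time_wait_entered=None):
    """Return True if a new incarnation of a connection may be opened at now.

    boot_time:          when the system was (re)initialized
    time_wait_entered:  when the previous incarnation entered TIME-WAIT,
                        or None if there is no previous incarnation
    """
    # (C) quiet time of one MSL after system startup
    if now - boot_time < MSL:
        return False
    # (B) a closed connection stays busy in TIME-WAIT for 2*MSL
    if time_wait_entered is not None and now - time_wait_entered < 2 * MSL:
        return False
    return True


print(may_open(now=60.0, boot_time=0.0))                            # in quiet time
print(may_open(now=300.0, boot_time=0.0, time_wait_entered=100.0))  # in TIME-WAIT
print(may_open(now=340.0, boot_time=0.0, time_wait_entered=100.0))  # both expired
```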

Notice that (B) and (C) together are logically sufficient to prevent
accidental reuse of sequence numbers from a different incarnation,
for any of the failure modes F1-F4. (A) is not logically necessary
since the close delay (B) makes it impossible to reopen the same TCP
connection immediately. However, the use of (A) does give additional
assurance in a common case, perhaps compensating for a host that has
set its TIME-WAIT state delay too short.

Some TCP implementations have permitted a connection in the TIME-WAIT
state to be reopened immediately by the other side, thus
short-circuiting mechanism (B). Specifically, a new SYN for the same
socket pair is accepted when the earlier incarnation is still in
TIME-WAIT state. Old duplicates in one direction can be avoided by
choosing the ISN to be the next unused sequence number from the
preceding connection (i.e., FIN+1); this is essentially an
application of the scheme of Garlick, Rom, and Postel, using the
connection block in TIME-WAIT state as the connection record.

However, the connection is still vulnerable to old duplicates in the
other direction. Mechanism (A) prevents trouble in mode F1, but
failures can arise in F2, F3, or F4; of these, F2, on short, fast
connections, is the most dangerous.

Finally, we note TCP will operate reliably without any MSL-based
mechanisms in the following restricted domain:

* Total data sent is less than 2**32 octets, and

* Effective sustained rate less than 250KBps, and

* Connection duration less than 4.55 hours.
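
The three conditions can be checked together. The sketch below is
illustrative only; the function name, argument names, and units
(octets, bytes per second, hours) are assumptions for this example.

```python
SEQ_SPACE = 2**32    # size of the sequence number space
R = 250_000          # 250KBps, in bytes per second
CYCLE_HOURS = 4.55   # ISN cycle time quoted in the text


def in_restricted_domain(total_octets, rate_bytes_per_sec, duration_hours):
    """True if a connection satisfies all three conditions listed above."""
    return (total_octets < SEQ_SPACE
            and rate_bytes_per_sec < R
            and duration_hours < CYCLE_HOURS)


print(in_restricted_domain(10**6, 9_600, 0.2))   # small, slow, short
print(in_restricted_domain(10**6, 9_600, 12.0))  # violates only the duration
```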

At the present time, the great majority of current TCP usage falls
into this restricted domain. The third component, connection
duration, is the most commonly violated.

Security Considerations

Security issues are not discussed in this memo.

Authors' Addresses

Van Jacobson
University of
Lawrence Berkeley
Mail Stop 46
Berkeley, CA 94720

Phone: (415) 486-6411
EMail: van@CSAM.LBL.


Bob Braden
University of Southern
Information Sciences
4676 Admiralty
Marina del Rey, CA 90292

Phone: (213) 822-1511
EMail: Braden@ISI.


Lixia Zhang
XEROX Palo Alto Research
3333 Coyote Hill
Palo Alto, CA 94304

Phone: (415) 494-4415
EMail: lixia@PARC.XEROX.