










Network Working Group M. Allman
Request for Comments: 2581 NASA Glenn/Sterling
Obsoletes: 2001 V. Paxson
Category: Standards Track ACIRI /
W. Stevens

April 1999


TCP Congestion Control

Status of this Memo

This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (1999). All Rights Reserved.



Abstract

This document defines TCP's four intertwined congestion control
algorithms: slow start, congestion avoidance, fast retransmit, and
fast recovery. In addition, the document specifies how TCP should
begin transmission after a relatively long idle period, as well as
discussing various acknowledgment generation methods.

1. Introduction

This document specifies four TCP [Pos81] congestion control
algorithms: slow start, congestion avoidance, fast retransmit and
fast recovery. These algorithms were devised in [Jac88] and [Jac90].
Their use with TCP is standardized in [Bra89].

This document is an update of [Ste97]. In addition to specifying the
congestion control algorithms, this document specifies what TCP
connections should do after a relatively long idle period, as well as
specifying and clarifying some of the issues pertaining to TCP ACK
generation.

Note that [Ste94] provides examples of these algorithms in action and
[WS95] provides an explanation of the source code for the BSD
implementation of these algorithms.




Allman, et. al. Standards Track [Page 1]

RFC 2581 TCP Congestion Control April 1999


This document is organized as follows. Section 2 provides various
definitions which will be used throughout the document. Section 3
provides a specification of the congestion control algorithms.
Section 4 outlines concerns related to the congestion control
algorithms and finally, section 5 outlines security considerations.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [Bra97].

2. Definitions

This section provides the definition of several terms that will be
used throughout the remainder of this document.

SEGMENT:
A segment is ANY TCP/IP data or acknowledgment packet (or both).

SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
largest segment that the sender can transmit. This value can be
based on the maximum transmission unit of the network, the path
MTU discovery [MD90] algorithm, RMSS (see next item), or other
factors. The size does not include the TCP/IP headers and
options.

RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
largest segment the receiver is willing to accept. This is the
value specified in the MSS option sent by the receiver during
connection startup or, if the MSS option is not used, 536 bytes
[Bra89]. The size does not include the TCP/IP headers and
options.

FULL-SIZED SEGMENT: A segment that contains the maximum number of
data bytes permitted (i.e., a segment containing SMSS bytes of
data).

RECEIVER WINDOW (rwnd): The most recently advertised receiver window.

CONGESTION WINDOW (cwnd): A TCP state variable that limits the
amount of data a TCP can send. At any given time, a TCP MUST NOT
send data with a sequence number higher than the sum of the
highest acknowledged sequence number and the minimum of cwnd and
rwnd.

INITIAL WINDOW (IW): The initial window is the size of the sender's
congestion window after the three-way handshake is completed.





LOSS WINDOW (LW): The loss window is the size of the congestion
window after a TCP sender detects loss using its retransmission
timer.

RESTART WINDOW (RW): The restart window is the size of the
congestion window after a TCP restarts transmission after an idle
period (if the slow start algorithm is used; see section 4.1 for
more discussion).

FLIGHT SIZE: The amount of data that has been sent but not yet
acknowledged.
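
As an illustration of how the cwnd, rwnd and FLIGHT SIZE definitions
fit together, the following minimal sketch computes the send limit.
The function names are hypothetical and, as a simplifying assumption,
sequence numbers are treated as plain integers with no wraparound.

```python
def send_limit(highest_acked: int, cwnd: int, rwnd: int) -> int:
    """Highest sequence number the sender may transmit (cwnd definition)."""
    return highest_acked + min(cwnd, rwnd)


def flight_size(highest_sent: int, highest_acked: int) -> int:
    """Bytes sent but not yet acknowledged (FLIGHT SIZE)."""
    return highest_sent - highest_acked


def may_send(segment_end: int, highest_acked: int, cwnd: int, rwnd: int) -> bool:
    """True if a segment ending at segment_end stays within the limit."""
    return segment_end <= send_limit(highest_acked, cwnd, rwnd)
```

With highest_acked = 1000, cwnd = 4000 and rwnd = 3000, the receiver
window is the tighter bound, so the sender may transmit up to sequence
number 4000 and no further.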

3. Congestion Control Algorithms

This section defines the four congestion control algorithms: slow
start, congestion avoidance, fast retransmit and fast recovery,
developed in [Jac88] and [Jac90]. In some situations it may be
beneficial for a TCP sender to be more conservative than the
algorithms allow; however, a TCP MUST NOT be more aggressive than the
following algorithms allow (that is, MUST NOT send data when the
value of cwnd computed by the following algorithms would not allow
the data to be sent).

3.1 Slow Start and Congestion Avoidance

The slow start and congestion avoidance algorithms MUST be used by a
TCP sender to control the amount of outstanding data being injected
into the network. To implement these algorithms, two variables are
added to the TCP per-connection state. The congestion window (cwnd)
is a sender-side limit on the amount of data the sender can transmit
into the network before receiving an acknowledgment (ACK), while the
receiver's advertised window (rwnd) is a receiver-side limit on the
amount of outstanding data. The minimum of cwnd and rwnd governs
data transmission.

Another state variable, the slow start threshold (ssthresh), is used
to determine whether the slow start or congestion avoidance algorithm
is used to control data transmission, as discussed below.

Beginning transmission into a network with unknown conditions
requires TCP to slowly probe the network to determine the available
capacity, in order to avoid congesting the network with an
inappropriately large burst of data. The slow start algorithm is
used for this purpose at the beginning of a transfer, or after
repairing loss detected by the retransmission timer.






IW, the initial value of cwnd, MUST be less than or equal to 2*SMSS
bytes and MUST NOT be more than 2 segments.

We note that a non-standard, experimental TCP extension allows that a
TCP MAY use a larger initial window (IW), as defined in equation 1
[AFP98]:

IW = min (4*SMSS, max (2*SMSS, 4380 bytes)) (1)

With this extension, a TCP sender MAY use a 3 or 4 segment initial
window, provided the combined size of the segments does not exceed
4380 bytes. We do NOT allow this change as part of the standard
defined by this document. However, we include discussion of (1) in
the remainder of this document as a guideline for those experimenting
with the change, rather than conforming to the present standards for
TCP congestion control.
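
The two initial window bounds can be compared with a short sketch.
The function names are hypothetical; the first returns the byte bound
required by this document and the second evaluates equation (1) from
the experimental extension.

```python
def standard_initial_window(smss: int) -> int:
    """Upper bound on IW in bytes under this document: 2*SMSS."""
    return 2 * smss


def experimental_initial_window(smss: int) -> int:
    """Equation (1) from the non-standard extension in [AFP98]."""
    return min(4 * smss, max(2 * smss, 4380))
```

For an SMSS of 1460 bytes, equation (1) yields 4380 bytes (3
segments); for 536 bytes it yields 2144 bytes (4 segments); for 4000
bytes it yields 8000 bytes, which is the same as the standard bound
of 2 segments.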

The initial value of ssthresh MAY be arbitrarily high (for example,
some implementations use the size of the advertised window), but it
may be reduced in response to congestion. The slow start algorithm
is used when cwnd < ssthresh, while the congestion avoidance
algorithm is used when cwnd > ssthresh. When cwnd and ssthresh are
equal the sender may use either slow start or congestion avoidance.

During slow start, a TCP increments cwnd by at most SMSS bytes for
each ACK received that acknowledges new data. Slow start ends when
cwnd exceeds ssthresh (or, optionally, when it reaches it, as noted
above) or when congestion is observed.
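
The growth of cwnd during slow start can be sketched as follows. This
is an illustration under stated assumptions, not a definitive
implementation: the function name is hypothetical, every full-sized
segment in flight is assumed to be acknowledged individually (no
delayed ACKs), no loss occurs, and slow start is taken to end as soon
as cwnd reaches ssthresh.

```python
def slow_start_trace(iw: int, ssthresh: int, smss: int) -> list:
    """Return cwnd (in bytes) at the start of each round trip."""
    cwnd = iw
    trace = [cwnd]
    while cwnd < ssthresh:
        # Assumption: one ACK arrives per full-sized segment in flight.
        acks = max(1, cwnd // smss)
        for _ in range(acks):
            if cwnd >= ssthresh:
                break  # slow start ends once cwnd reaches ssthresh
            cwnd += smss  # at most SMSS bytes per ACK of new data
        trace.append(cwnd)
    return trace
```

Under these assumptions cwnd roughly doubles each round trip, for
example 2000, 4000, 8000, 16000 bytes with SMSS = 1000, IW = 2000 and
ssthresh = 16000.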

During congestion avoidance, cwnd is incremented by 1 full-sized
segment per round-trip time (RTT). Congestion avoidance continues
until congestion is detected. One formula commonly used to update
cwnd during congestion avoidance is given in equation 2:

cwnd += SMSS*SMSS/cwnd (2)

This adjustment is executed on every incoming non-duplicate ACK.
Equation (2) provides an acceptable approximation to the underlying
principle of increasing cwnd by 1 full-sized segment per RTT. (Note
that for a connection in which the receiver acknowledges every data
segment, (2) proves slightly more aggressive than 1 segment per RTT,
and for a receiver acknowledging every-other packet, (2) is less
aggressive.)








Implementation Note: Since integer arithmetic is usually used in TCP
implementations, the formula given in equation 2 can fail to increase
cwnd when the congestion window is very large (larger than
SMSS*SMSS). If the above formula yields 0, the result SHOULD be
rounded up to 1 byte.
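
A minimal sketch of equation (2) with integer arithmetic and the
round-up rule from the note above is shown here. The function name is
hypothetical and cwnd is assumed to be maintained in bytes.

```python
def congestion_avoidance_on_ack(cwnd: int, smss: int) -> int:
    """Apply equation (2) once, for one incoming non-duplicate ACK."""
    increment = (smss * smss) // cwnd  # integer division, as in most stacks
    if increment == 0:
        increment = 1  # round a zero result up to 1 byte
    return cwnd + increment
```

With SMSS = 1000 and cwnd = 10000, a single ACK adds 100 bytes. Once
cwnd exceeds SMSS*SMSS (here 1000000 bytes) the division yields 0 and
the increment falls back to 1 byte.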

Implementation Note: older implementations have an additional
additive constant on the right-hand side of equation (2). This is
incorrect and can actually lead to diminished performance [PAD+98].

Another acceptable way to increase cwnd during congestion avoidance
is to count the number of bytes that have been acknowledged by ACKs
for new data. (A drawback of this implementation is that it requires
maintaining an additional state variable.) When the number of bytes
acknowledged reaches cwnd, then cwnd can be incremented by up to SMSS
bytes. Note that during congestion avoidance, cwnd MUST NOT be
increased by more than the larger of either 1 full-sized segment per
RTT, or the value computed using equation 2.
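
The byte counting approach can be sketched as below. The class and
attribute names are hypothetical. As an assumption not stated in this
document, the counter is reduced by cwnd rather than reset to zero
when the threshold is reached, so that surplus acknowledged bytes
carry over.

```python
class ByteCountingAvoidance:
    """Congestion avoidance by counting acknowledged bytes."""

    def __init__(self, cwnd: int, smss: int) -> None:
        self.cwnd = cwnd
        self.smss = smss
        self.bytes_acked = 0  # the additional state variable

    def on_new_ack(self, acked_bytes: int) -> int:
        """Process an ACK covering acked_bytes of new data; return cwnd."""
        self.bytes_acked += acked_bytes
        if self.bytes_acked >= self.cwnd:
            # Assumption: carry over the surplus instead of zeroing it.
            self.bytes_acked -= self.cwnd
            self.cwnd += self.smss  # at most SMSS bytes per window
        return self.cwnd
```

With cwnd = 4000 and SMSS = 1000, cwnd stays at 4000 for the first
three 1000-byte ACKs and becomes 5000 on the fourth.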

Implementation Note: some implementations maintain cwnd in units of
bytes, while others in units of full-sized segments. The latter will
find equation (2) difficult to use, and may prefer to use the
counting approach discussed in the previous paragraph.

When a TCP sender detects segment loss using the retransmission
timer, the value of ssthresh MUST be set to no more than the value
given in equation 3:

ssthresh = max (FlightSize / 2, 2*SMSS) (3)

As discussed above, FlightSize is the amount of outstanding data in
the network.

Implementation Note: an easy mistake to make is to simply use cwnd,
rather than FlightSize, which in some implementations may
incidentally increase well beyond rwnd.

Furthermore, upon a timeout cwnd MUST be set to no more than the loss
window, LW, which equals 1 full-sized segment (regardless of the
value of IW). Therefore, after retransmitting the dropped segment
the TCP sender uses the slow start algorithm to increase the window
from 1 full-sized segment to the new value of ssthresh, at which
point congestion avoidance again takes over.
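
The reaction to a retransmission timeout can be sketched as follows.
The function name is hypothetical; the sketch picks the largest
values the text permits, using equation (3) for ssthresh and one
full-sized segment (LW) for cwnd, with integer division as an
implementation assumption.

```python
def on_retransmission_timeout(flight_size: int, smss: int) -> tuple:
    """Return (cwnd, ssthresh) in bytes after a retransmission timeout."""
    ssthresh = max(flight_size // 2, 2 * smss)  # equation (3)
    cwnd = smss  # loss window LW: 1 full-sized segment
    return cwnd, ssthresh
```

With 20000 bytes in flight and SMSS = 1000, the result is cwnd = 1000
and ssthresh = 10000. With only 3000 bytes in flight the 2*SMSS floor
applies and ssthresh becomes 2000.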








3.2 Fast Retransmit/Fast Recovery

A TCP receiver SHOULD send an immediate duplicate ACK when an
out-of-order segment arrives. The purpose of this ACK is to inform the
sender that a segment was received out-of-order and which sequence
number is expected. From the sender's perspective, duplicate ACKs
can be caused by a number of network problems. First, they can be
caused by dropped segments. In this case, all segments after the
dropped segment will trigger duplicate ACKs. Second, duplicate ACKs
can be caused by the re-ordering of data segments by the network (not
a rare event along some network paths [Pax97]). Finally, duplicate
ACKs can be caused by replication of ACK or data segments by the
network. In addition, a TCP receiver SHOULD send an immediate ACK
when the incoming segment fills in all or part of a gap in the
sequence space. This will generate more timely information for a
sender recovering from a loss through a retransmission timeout, a
fast retransmit, or an experimental loss recovery algorithm, such as
NewReno [FH98].

The TCP sender SHOULD use the "fast retransmit" algorithm to detect
and repair loss, based on incoming duplicate ACKs. The fast
retransmit algorithm uses the arrival of 3 duplicate ACKs (4
identical ACKs without the arrival of any other intervening packets)
as an indication that a segment has been lost. After receiving 3
duplicate ACKs, TCP performs a retransmission of what appears to be
the missing segment, without waiting for the retransmission timer to
expire.

After the fast retransmit algorithm sends what appears to be the
missing segment, the "fast recovery" algorithm governs the
transmission of new data until a non-duplicate ACK arrives. The
reason for not performing slow start is that the receipt of the
duplicate ACKs not only indicates that a segment has been lost, but
also that segments are most likely leaving the network (although a
massive segment duplication by the network can invalidate this
conclusion). In other words, since the receiver can only generate a
duplicate ACK when a segment has arrived, that segment has left the
network and is in the receiver's buffer, so we know it is no longer
consuming network resources. Furthermore, since the ACK "clock"
[Jac88] is preserved, the TCP sender can continue to transmit new
segments (although transmission must continue using a reduced cwnd).

The fast retransmit and fast recovery algorithms are usually
implemented together as follows.

1. When the third duplicate ACK is received, set ssthresh to no more
than the value given in equation 3.




2. Retransmit the lost segment and set cwnd to ssthresh plus 3*SMSS.
This artificially "inflates" the congestion window by the number
of segments (three) that have left the network and which the
receiver has buffered.

3. For each additional duplicate ACK received, increment cwnd by
SMSS. This artificially inflates the congestion window in order
to reflect the additional segment that has left the network.

4. Transmit a segment, if allowed by the new value of cwnd and the
receiver's advertised window.

5. When the next ACK arrives that acknowledges new data, set cwnd to
ssthresh (the value set in step 1). This is termed "deflating"
the window.

This ACK should be the acknowledgment elicited by the
retransmission from step 2, one RTT after the retransmission
(though it may arrive sooner in the presence of significant
out-of-order delivery of data segments at the receiver).
Additionally, this ACK should acknowledge all the intermediate
segments sent between the lost segment and the receipt of the
third duplicate ACK, if none of these were lost.
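
The five steps above can be sketched as a small sender-side state
machine. This is an illustration under stated assumptions, not a
definitive implementation: the class, attribute and method names are
hypothetical, the actual retransmission is represented by a counter,
and step 4 (transmitting new data when cwnd and rwnd allow) is left
to the caller.

```python
from dataclasses import dataclass


@dataclass
class RenoSender:
    smss: int
    cwnd: int
    ssthresh: int
    dup_acks: int = 0
    in_fast_recovery: bool = False
    retransmits: int = 0  # stand-in for actually resending a segment

    def on_duplicate_ack(self, flight_size: int) -> None:
        self.dup_acks += 1
        if self.dup_acks == 3 and not self.in_fast_recovery:
            # Step 1: ssthresh set to the value of equation (3).
            self.ssthresh = max(flight_size // 2, 2 * self.smss)
            # Step 2: retransmit and inflate cwnd by three segments.
            self.retransmits += 1
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_fast_recovery = True
        elif self.in_fast_recovery:
            # Step 3: one more segment has left the network.
            self.cwnd += self.smss

    def on_new_ack(self) -> None:
        self.dup_acks = 0
        if self.in_fast_recovery:
            # Step 5: deflate the window.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
```

Starting from cwnd = 10000 with 10000 bytes in flight and SMSS =
1000, the third duplicate ACK gives ssthresh = 5000 and cwnd = 8000,
a fourth raises cwnd to 9000, and the next ACK for new data deflates
cwnd to 5000.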

Note: This algorithm is known to generally not recover very
efficiently from multiple losses in a single flight of packets
[FF96]. One proposed set of modifications to address this problem
can be found in [FH98].

4. Additional Considerations

4.1 Re-starting Idle Connections

A known problem with the TCP congestion control algorithms described
above is that they allow a potentially inappropriate burst of traffic
to be transmitted after TCP has been idle for a relatively long
period of time. After an idle period, TCP cannot use the ACK clock
to strobe new segments into the network, as all the ACKs have drained
from the network. Therefore, as specified above, TCP can potentially
send a cwnd-size line-rate burst into the network after an idle
period.

[Jac88] recommends that a TCP use slow start to restart transmission
after a relatively long idle period. Slow start serves to restart
the ACK clock, just as it does at the beginning of a transfer. This
mechanism has been widely deployed in the following manner. When TCP
has not received a segment for more than one retransmission timeout,
cwnd is reduced to the value of the restart window (RW) before
transmission begins.

For the purposes of this standard, we define RW = IW.

We note that the non-standard experimental extension to TCP defined
in [AFP98] defines RW = min(IW, cwnd), with the definition of IW
adjusted per equation (1) above.

Using the last time a segment was received to determine whether or
not to decrease cwnd fails to deflate cwnd in the common case of
persistent HTTP connections [HTH98]. In this case, a WWW server
receives a request before transmitting data to the WWW browser. The
reception of the request makes the test for an idle connection fail,
and allows the TCP to begin transmission with a possibly
inappropriately large cwnd.

Therefore, a TCP SHOULD set cwnd to no more than RW before beginning
transmission if the TCP has not sent data in an interval exceeding
the retransmission timeout.
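
The restart rule can be sketched as a check made just before sending.
The function and parameter names are hypothetical; idle_time is
assumed to be the time since data was last sent, in the same units as
the retransmission timeout (rto), and RW is taken to be IW as defined
above.

```python
def cwnd_before_sending(cwnd: int, iw: int, idle_time: float, rto: float) -> int:
    """Return the cwnd to use when resuming transmission."""
    if idle_time > rto:
        rw = iw  # restart window, RW = IW in this document
        return min(cwnd, rw)  # cwnd is set to no more than RW
    return cwnd
```

A connection with cwnd = 20000 and IW = 2000 that has been idle for 3
seconds with a 1 second timeout resumes with cwnd = 2000. Taking the
minimum means a cwnd already below RW is not raised.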

4.2 Generating Acknowledgments

The delayed ACK algorithm specified in [Bra89] SHOULD be used by a
TCP receiver. When used, a TCP receiver MUST NOT excessively delay
acknowledgments. Specifically, an ACK SHOULD be generated for at
least every second full-sized segment, and MUST be generated within
500 ms of the arrival of the first unacknowledged packet.

The requirement that an ACK "SHOULD" be generated for at least every
second full-sized segment is listed in [Bra89] in one place as a
SHOULD and another as a MUST. Here we unambiguously state it is a
SHOULD. We also emphasize that this is a SHOULD, meaning that an
implementor should indeed only deviate from this requirement after
careful consideration of the implications. See the discussion of
"Stretch ACK violation" in [PAD+98] and the references therein for a
discussion of the possible performance problems with generating ACKs
less frequently than every second full-sized segment.

In some cases, the sender and receiver may not agree on what
constitutes a full-sized segment. An implementation is deemed to
comply with this requirement if it sends at least one acknowledgment
every time it receives 2*RMSS bytes of new data from the sender,
where RMSS is the Maximum Segment Size specified by the receiver to
the sender (or the default value of 536 bytes, per [Bra89], if the
receiver does not specify an MSS option during connection
establishment). The sender may be forced to use a segment size less
than RMSS due to the maximum transmission unit (MTU), the path MTU
discovery algorithm or other factors. For instance, consider the
case when the receiver announces an RMSS of X bytes but the sender
ends up using a segment size of Y bytes (Y < X) due to path MTU
discovery (or the sender's MTU size). The receiver will generate
stretch ACKs if it waits for 2*X bytes to arrive before an ACK is
sent. Clearly this will take more than 2 segments of size Y bytes.
Therefore, while a specific algorithm is not defined, it is desirable
for receivers to attempt to prevent this situation, for example by
acknowledging at least every second segment, regardless of size.
Finally, we repeat that an ACK MUST NOT be delayed for more than 500
ms waiting on a second full-sized segment to arrive.
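
The stretch ACK effect in the X and Y example can be worked through
numerically. The function name and the sample values below are
illustrative assumptions; the sketch counts how many Y-byte segments
must arrive before a receiver that waits for 2*X bytes sends an ACK.

```python
def segments_before_ack(x: int, y: int) -> int:
    """Segments of y bytes needed to accumulate 2*x bytes (ceiling)."""
    return -(-(2 * x) // y)  # integer ceiling division
```

For X = 1460 and Y = 536, the receiver would wait for 6 segments
rather than 2, since 5 segments carry only 2680 of the 2920 bytes
required. When X equals Y the count is exactly 2.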

Out-of-order data segments SHOULD be acknowledged immediately, in
order to accelerate loss recovery. To trigger the fast retransmit
algorithm, the receiver SHOULD send an immediate duplicate ACK when
it receives a data segment above a gap in the sequence space. To
provide feedback to senders recovering from losses, the receiver
SHOULD send an immediate ACK when it receives a data segment that
fills in all or part of a gap in the sequence space.
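
The receiver-side rules of this section can be collected into one
decision function. This is a sketch under stated assumptions: the
names are hypothetical, segments are counted regardless of size (the
approach suggested above for avoiding stretch ACKs), and the caller
is assumed to evaluate the function on each arrival and on a timer.

```python
MAX_ACK_DELAY_MS = 500  # upper bound on delaying an ACK


def should_ack_now(unacked_segments: int,
                   ms_since_first_unacked: float,
                   above_gap: bool = False,
                   fills_gap: bool = False) -> bool:
    """Decide whether the receiver should send an ACK immediately."""
    if above_gap or fills_gap:
        return True  # out-of-order or gap-filling data: ACK at once
    if unacked_segments >= 2:
        return True  # at least every second segment
    if unacked_segments >= 1 and ms_since_first_unacked >= MAX_ACK_DELAY_MS:
        # A real timer would need to fire before the 500 ms bound.
        return True
    return False
```

A single in-order segment that arrived 100 ms ago may remain
unacknowledged, while a second segment, an out-of-order segment, or
an expired delay each trigger an ACK.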

A TCP receiver MUST NOT generate more than one ACK for every incoming
segment, other than to update the offered window as the receiving
application consumes new data [page 42, Pos81][Cla82].

4.3 Loss Recovery Mechanisms

A number of loss recovery algorithms that augment fast retransmit and
fast recovery have been suggested by TCP researchers. While some of
these algorithms are based on the TCP selective acknowledgment (SACK)
option [MMFR96], such as [FF96,MM96a,MM96b], others do not require
SACKs [Hoe96,FF96,FH98]. The non-SACK algorithms use "partial
acknowledgments" (ACKs which cover new data, but not all the data
outstanding when loss was detected) to trigger retransmissions.
While this document does not standardize any of the specific
algorithms that may improve fast retransmit/fast recovery, these
enhanced algorithms are implicitly allowed, as long as they follow
the general principles of the basic four algorithms outlined above.

Therefore, when the first loss in a window of data is detected,
ssthresh MUST be set to no more than the value given by equation (3).
Second, until all lost segments in the window of data in question are
repaired, the number of segments transmitted in each RTT MUST be no
more than half the number of outstanding segments when the loss was
detected. Finally, after all loss in the given window of segments
has been successfully retransmitted, cwnd MUST be set to no more than
ssthresh and congestion avoidance MUST be used to further increase
cwnd. Loss in two successive windows of data, or the loss of a
retransmission, should be taken as two indications of congestion and,
therefore, cwnd (and ssthresh) MUST be lowered twice in this case.



The algorithms outlined in [Hoe96,FF96,MM96a,MM96b] follow the
principles of the basic four congestion control algorithms outlined
in this document.

5. Security Considerations

This document requires a TCP to diminish its sending rate in the
presence of retransmission timeouts and the arrival of duplicate
acknowledgments. An attacker can therefore impair the performance of
a TCP connection by either causing data packets or their
acknowledgments to be lost, or by forging excessive duplicate
acknowledgments. Causing two congestion control events back-to-back
will often cut ssthresh to its minimum value of 2*SMSS, causing the
connection to immediately enter the slower-performing congestion
avoidance phase.

The Internet to a considerable degree relies on the correct
implementation of these algorithms in order to preserve network
stability and avoid congestion collapse. An attacker could cause TCP
endpoints to respond more aggressively in the face of congestion by
forging excessive duplicate acknowledgments or excessive
acknowledgments for new data. Conceivably, such an attack could
drive a portion of the network into congestion collapse.

6. Changes Relative to RFC 2001

This document has been extensively rewritten editorially and it is
not feasible to itemize the list of changes between the two
documents. The intention of this document is not to change any of the
recommendations given in RFC 2001, but to further clarify cases that
were not discussed in detail in RFC 2001. Specifically, this document
suggests what TCP connections should do after a relatively long idle
period, as well as specifying and clarifying some of the issues
pertaining to TCP ACK generation. Finally, the allowable upper bound
for the initial congestion window has also been raised from one to
two segments.



Acknowledgments

The four algorithms that are described were developed by Van
Jacobson.

Some of the text from this document is taken from "TCP/IP
Illustrated, Volume 1: The Protocols" by W. Richard Stevens
(Addison-Wesley, 1994) and "TCP/IP Illustrated, Volume 2: The
Implementation" by Gary R. Wright and W. Richard Stevens
(Addison-Wesley, 1995). This material is used with the permission of
Addison-Wesley.



Neal Cardwell, Sally Floyd, Craig Partridge and Joe Touch contributed
a number of helpful suggestions.



References

[AFP98] Allman, M., Floyd, S. and C. Partridge, "Increasing TCP's
Initial Window Size", RFC 2414, September 1998.

[Bra89] Braden, R., "Requirements for Internet Hosts --
Communication Layers", STD 3, RFC 1122, October 1989.

[Bra97] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.

[Cla82] Clark, D., "Window and Acknowledgment Strategy in TCP", RFC
813, July 1982.

[FF96] Fall, K. and S. Floyd, "Simulation-based Comparisons of
Tahoe, Reno and SACK TCP", Computer Communication Review,
July 1996. ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z

[FH98] Floyd, S. and T. Henderson, "The NewReno Modification to
TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

[Flo94] Floyd, S., "TCP and Successive Fast Retransmits", Technical
report, October 1994.
ftp://ftp.ee.lbl.gov/papers/fastretrans.ps

[Hoe96] Hoe, J., "Improving the Start-up Behavior of a Congestion
Control Scheme for TCP", In ACM SIGCOMM, August 1996.

[HTH98] Hughes, A., Touch, J. and J. Heidemann, "Issues in TCP
Slow-Start Restart After Idle", Work in Progress.

[Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer
Communication Review, vol. 18, no. 4, pp. 314-329, Aug.
1988. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z

[Jac90] Jacobson, V., "Modified TCP Congestion Avoidance Algorithm",
end2end-interest mailing list, April 30, 1990.
ftp://ftp.isi.edu/end2end/end2end-interest-1990.mail

[MD90] Mogul, J. and S. Deering, "Path MTU Discovery", RFC 1191,
November 1990.

[MM96a] Mathis, M. and J. Mahdavi, "Forward Acknowledgment: Refining
TCP Congestion Control", Proceedings of SIGCOMM'96, August
1996, Stanford, CA. Available
from http://www.psc.edu/networking/papers/papers.

[MM96b] Mathis, M. and J. Mahdavi, "TCP Rate-Halving with Bounding
Parameters", Technical report. Available from
http://www.psc.edu/networking/papers/FACKnotes/current

[MMFR96] Mathis, M., Mahdavi, J., Floyd, S. and A. Romanow, "TCP
Selective Acknowledgement Options", RFC 2018, October 1996.

[PAD+98] Paxson, V., Allman, M., Dawson, S., Fenner, W., Griner, J.,
Heavens, I., Lahey, K., Semke, J. and B. Volz, "Known TCP
Implementation Problems", RFC 2525, March 1999.

[Pax97] Paxson, V., "End-to-End Internet Packet Dynamics",
Proceedings of SIGCOMM '97, Cannes, France, Sep. 1997.

[Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
September 1981.

[Ste94] Stevens, W., "TCP/IP Illustrated, Volume 1: The Protocols",
Addison-Wesley, 1994.

[Ste97] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC 2001, January
1997.

[WS95] Wright, G. and W. Stevens, "TCP/IP Illustrated, Volume 2:
The Implementation", Addison-Wesley, 1995.




















Authors' Addresses

Mark Allman
NASA Glenn Research Center/Sterling
Lewis
21000 Brookpark Rd. MS 54-2
Cleveland, OH 44135
216-433-6586

EMail: mallman@grc.nasa.
http://roland.grc.nasa.gov/~


Vern Paxson
ACIRI /
1947 Center
Suite 600
Berkeley, CA 94704-1198

Phone: +1 510/642-4274 x302
EMail: vern@aciri.


W. Richard Stevens
1202 E. Paseo del
Tucson, AZ 85718
520-297-9416

EMail: rstevens@kohala.
http://www.kohala.com/~





















Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.























