Network Working Group M. Handley
Request for Comments: 2861 J. Padhye
Category: Experimental S. Floyd
June 2000
TCP Congestion Window Validation
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (2000). All Rights Reserved.
Abstract
TCP's congestion window controls the number of packets a TCP flow may
have in the network at any time. However, long periods when the
sender is idle or application-limited can lead to the invalidation of
the congestion window, in that the congestion window no longer
reflects current information about the state of the network. This
document describes a simple modification to TCP's congestion control
algorithms to decay the congestion window cwnd after the transition
from a sufficiently-long application-limited period, while using the
slow-start threshold ssthresh to save information about the previous
value of the congestion window.
An invalid congestion window also results when the congestion window
is increased (i.e., in TCP's slow-start or congestion avoidance
phases) during application-limited periods, when the previous value
of the congestion window might never have been fully utilized. We
propose that the TCP sender should not increase the congestion window
when the TCP sender has been application-limited (and therefore has
not fully used the current congestion window). We have explored
these algorithms both with simulations and with experiments from an
implementation in FreeBSD.
1. Conventions and Acronyms
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
document, are to be interpreted as described in [B97].
Handley, et al. Experimental [Page 1]
RFC 2861 TCP Congestion Window Validation June 2000
2. Introduction
TCP's congestion window controls the number of packets a TCP flow may
have in the network at any time. The congestion window is set using
an Additive-Increase, Multiplicative-Decrease (AIMD) mechanism that
probes for available bandwidth, dynamically adapting to changing
network conditions. This AIMD mechanism works well when the sender
continually has data to send, as is typically the case for TCP used
for bulk-data transfer. In contrast, for TCP used with telnet
applications, the data sender often has little or no data to send,
and the sending rate is often determined by the rate at which data is
generated by the user. With the advent of the web, including
developments such as TCP senders with dynamically-created data and
HTTP 1.1 with persistent-connection TCP, the interaction between
application-limited periods (when the sender sends less than is
allowed by the congestion or receiver windows) and network-limited
periods (when the sender is limited by the TCP window) becomes
increasingly important. More precisely, we define a network-limited
period as any period when the sender is sending a full window of
data.
Long periods when the sender is application-limited can lead to the
invalidation of the congestion window. During periods when the TCP
sender is network-limited, the value of the congestion window is
repeatedly "revalidated" by the successful transmission of a window
of data without loss. When the TCP sender is network-limited, there
is an incoming stream of acknowledgements that "clocks out" new data,
giving concrete evidence of recent available bandwidth in the
network. In contrast, during periods when the TCP sender is
application-limited, the estimate of available capacity represented
by the congestion window may become steadily less accurate over time.
In particular, capacity that had once been used by the network-limited
connection might now be used by other traffic.
Current TCP implementations have a range of behaviors for starting up
after an idle period. Some current TCP implementations slow-start
after an idle period longer than the RTO estimate, as suggested in
[RFC2581] and in the appendix of [J88], while other implementations
don't reduce their congestion window after an idle period. RFC 2581
[RFC2581] recommends the following: "a TCP SHOULD set cwnd to no more
than RW [the initial window] before beginning transmission if the TCP
has not sent data in an interval exceeding the retransmission
timeout." A proposal for TCP's slow-start after idle has also been
discussed in [HTH98]. The issue of validation of congestion
information during idle periods has also been addressed in contexts
other than TCP and IP, for example in "Use-it or Lose-it" mechanisms
for ATM networks [JKBFL96,JKGFL95].
To address the revalidation of the congestion window after an
application-limited period, we propose a simple modification to TCP's
congestion control algorithms to decay the congestion window cwnd
after the transition from a sufficiently-long application-limited
period (i.e., at least one roundtrip time) to a network-limited
period. In particular, we propose that after an idle period, the TCP
sender should reduce its congestion window by half for every RTT that
the flow has remained idle.
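The halving rule above can be sketched in a few lines of Python. This is only an illustration, not code from this memo: the function name and its parameters are hypothetical, windows are taken to be in bytes, and the one-MSS floor is borrowed from the pseudo-code in Section 3.2.

```python
def decay_after_idle(cwnd: int, idle_time: float, rtt: float, mss: int) -> int:
    """Halve cwnd once for every full RTT of idle time.

    Hypothetical helper: the name and signature are assumptions made for
    illustration. The floor of one MSS follows the Section 3.2 pseudo-code.
    """
    halvings = int(idle_time // rtt)  # number of complete RTTs spent idle
    for _ in range(halvings):
        cwnd = max(cwnd // 2, mss)
    return cwnd


# Three full RTTs idle: 16000 is halved three times.
print(decay_after_idle(16000, idle_time=3.5, rtt=1.0, mss=1000))
```

An idle time shorter than one RTT leaves the window unchanged, which matches the "at least one roundtrip time" condition in the text.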
When the congestion window is reduced, the slow-start threshold
ssthresh remains as "memory" of the recent congestion window.
Specifically, ssthresh is never decreased when cwnd is reduced after
an application-limited period; before cwnd is reduced, ssthresh is
set to the maximum of its current value, and half-way between the old
and the new values of cwnd. This use of ssthresh allows a TCP sender
increasing its sending rate after an application-limited period to
quickly slow-start to recover most of the previous value of the
congestion window. To be more precise, if ssthresh is less than 3/4
cwnd when the congestion window is reduced after an application-limited
period, then ssthresh is increased to 3/4 cwnd before the
reduction of the congestion window.
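As a small worked example of the ssthresh rule, the sketch below (with a hypothetical function name, and byte-valued windows assumed) shows that for a single halving, "half-way between the old and the new values of cwnd" is the same quantity as 3/4 cwnd, and that ssthresh is never lowered.

```python
def remember_in_ssthresh(ssthresh: int, cwnd: int) -> int:
    """Return the ssthresh to use just before cwnd is reduced.

    Hypothetical helper for illustration: ssthresh only ever grows here,
    to at least 3/4 of the cwnd that is about to be reduced.
    """
    return max(ssthresh, 3 * cwnd // 4)


old_cwnd = 16000
new_cwnd = old_cwnd // 2                 # one halving
halfway = (old_cwnd + new_cwnd) // 2     # half-way between old and new
print(halfway, remember_in_ssthresh(8000, old_cwnd))
```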
An invalid congestion window also results when the congestion window
is increased (i.e., in TCP's slow-start or congestion avoidance
phases) during application-limited periods, when the previous value
of the congestion window might never have been fully utilized. As
far as we know, all current TCP implementations increase the
congestion window when an acknowledgement arrives, if allowed by the
receiver's advertised window and the slow-start or congestion
avoidance window increase algorithm, without checking to see if the
previous value of the congestion window has in fact been used. This
document proposes that the window increase algorithm not be invoked
during application-limited periods [MSML99]. In particular, the TCP
sender should not increase the congestion window when the TCP sender
has been application-limited (and therefore has not fully used the
current congestion window). This restriction prevents the congestion
window from growing arbitrarily large, in the absence of evidence
that the congestion window can be supported by the network. From
[MSML99, Section 5.2]: "This restriction assures that [cwnd] only
grows as long as TCP actually succeeds in injecting enough data into
the network to test the path."
A somewhat-orthogonal problem associated with maintaining a large
congestion window after an application-limited period is that the
sender, with a sudden large amount of data to send after a quiescent
period, might immediately send a full congestion window of back-to-back
packets. This problem of sending large bursts of packets back-to-back
can be effectively handled using rate-based pacing (RBP,
[VH97]), or using a maximum burst size control [FF96]. We would
contend that, even with mechanisms for limiting the sending of
back-to-back packets or pacing packets out over the period of a roundtrip
time, an old congestion window that has not been fully used for some
time cannot be trusted as an indication of the bandwidth currently
available for that flow. We would contend that the mechanisms to
pace out packets allowed by the congestion window are largely
orthogonal to the algorithms used to determine the appropriate size
of the congestion window.
3. Description
When a TCP sender has sufficient data available to fill the available
network capacity for that flow, cwnd and ssthresh get set to
appropriate values for the network conditions. When a TCP sender
stops sending, the flow stops sampling the network conditions, and so
the value of the congestion window may become inaccurate. We believe
the correct conservative behavior under these circumstances is to
decay the congestion window by half for every RTT that the flow
remains inactive. The value of half is a very conservative figure
based on how quickly multiplicative decrease would have decayed the
window in the presence of loss.
Another possibility is that the sender may not stop sending, but may
become application-limited rather than network-limited, and offer
less data to the network than the congestion window allows to be
sent. In this case the TCP flow is still sampling network
conditions, but is not offering sufficient traffic to be sure that
there is still sufficient capacity in the network for that flow to
send a full congestion window. Under these circumstances we believe
the correct conservative behavior is for the sender to keep track of
the maximum amount of the congestion window used during each RTT, and
to decay the congestion window each RTT to midway between the current
cwnd value and the maximum value used.
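To make the "midway" decay concrete, here is a minimal sketch. The function names are hypothetical, windows are assumed to be in bytes, and the usage figure of 4000 bytes per RTT is an arbitrary illustrative value.

```python
def decay_app_limited(cwnd: int, max_used: int) -> int:
    """Move cwnd to midway between its current value and the maximum
    amount of the window actually used in the last RTT (hypothetical helper)."""
    return (cwnd + max_used) // 2


def decay_over_rtts(cwnd: int, max_used: int, rtts: int) -> list:
    """Apply the midway decay once per RTT, assuming the same usage each RTT."""
    history = []
    for _ in range(rtts):
        cwnd = decay_app_limited(cwnd, max_used)
        history.append(cwnd)
    return history


# A sender with cwnd = 16000 that only ever has 4000 bytes outstanding.
print(decay_over_rtts(16000, 4000, 3))
```

The window converges toward the amount actually used rather than collapsing to a single segment, which is the gentler behavior intended for a flow that is still sampling the network.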
Before the congestion window is reduced, ssthresh is set to the
maximum of its current value and 3/4 cwnd. If the sender then has
more data to send than the decayed cwnd allows, the TCP will slow-start
(perform exponential increase) at least half-way back up to the
old value of cwnd.
The justification for this value of "3/4 cwnd" is that 3/4 cwnd is a
conservative estimate of the recent average value of the congestion
window, and the TCP should safely be able to slow-start at least up
to this point. For a TCP in steady-state that has been reducing its
congestion window each time the congestion window reached some
maximum value `maxwin', the average congestion window has been 3/4
maxwin. On average, when the connection becomes application-limited,
cwnd will be 3/4 maxwin, and in this case cwnd itself represents the
average value of the congestion window. However, if the connection
happens to become application-limited when cwnd equals maxwin, then
the average value of the congestion window is given by 3/4 cwnd.
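The 3/4 figure can be checked with a one-line calculation, under the idealized assumption (ours, for illustration) that in steady state the window grows linearly from maxwin/2 back up to maxwin between successive reductions. The time-average of such a sawtooth is the midpoint of its two extremes:

```latex
\[
\overline{W}
  \;=\; \frac{1}{2}\left(\frac{\mathit{maxwin}}{2} + \mathit{maxwin}\right)
  \;=\; \frac{3}{4}\,\mathit{maxwin}
\]
```

In the worst case, where the connection becomes application-limited exactly when cwnd = maxwin, setting ssthresh to 3/4 cwnd therefore recovers this average.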
An alternate possibility would be to set ssthresh to the maximum of
the current value of ssthresh, and the old value of cwnd, allowing
TCP to slow-start all of the way back up to the old value of cwnd.
Further experimentation can be used to evaluate these two options for
setting ssthresh.
For the separate issue of the increase of the congestion window in
response to an acknowledgement, we believe the correct behavior is
for the sender to increase the congestion window only if the window
was full when the acknowledgment arrived.
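The increase rule can be sketched as a guard in front of the usual window increase. The function below is a hypothetical illustration: the per-ACK increments (one MSS in slow-start, roughly MSS*MSS/cwnd in congestion avoidance) follow RFC 2581, and how an implementation decides that the window "was full" is left as a boolean input here.

```python
def cwnd_after_ack(cwnd: int, ssthresh: int, mss: int, window_was_full: bool) -> int:
    """Return the new cwnd (in bytes) after one ACK arrives.

    Hypothetical sketch: the window grows only if it was full when the
    ACK arrived; otherwise the sender was application-limited and cwnd
    is left alone.
    """
    if not window_was_full:
        return cwnd                           # application-limited: no increase
    if cwnd < ssthresh:
        return cwnd + mss                     # slow-start
    return cwnd + max(1, mss * mss // cwnd)   # congestion avoidance


print(cwnd_after_ack(4000, 8000, 1000, window_was_full=False))
print(cwnd_after_ack(4000, 8000, 1000, window_was_full=True))
```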
We term this set of modifications to TCP Congestion Window Validation
(CWV) because they are related to ensuring the congestion window is
always a valid reflection of the current network state as probed by
the connection.
3.1. The basic algorithm for reducing the congestion window
A key issue in the CWV algorithm is to determine how to apply the
guideline of reducing the congestion window once for every roundtrip
time that the flow is application-limited. We use TCP's
retransmission timer (RTO) as a reasonable upper bound on the
roundtrip time, and reduce the congestion window roughly once per
RTO.
This basic algorithm could be implemented in TCP as follows: When TCP
sends a new packet it checks to see if more than RTO seconds have
elapsed since the previous packet was sent. If RTO has elapsed,
ssthresh is set to the maximum of 3/4 cwnd and the current value of
ssthresh, and then the congestion window is halved for every RTO that
elapsed since the previous packet was sent. In addition, T_prev is
set to the current time, and W_used is reset to zero. T_prev will be
used to determine the elapsed time since the sender last was
network-limited or had reduced cwnd after an idle period. When the sender is
application-limited, W_used holds the maximum congestion window
actually used since the sender was last network-limited.
The mechanism for determining the number of RTOs in the most recent
idle period could also be implemented by using a timer that expires
every RTO after the last packet was sent instead of a check per
packet; efficiency constraints on different operating systems may
dictate which is more efficient to implement.
After TCP sends a packet, it also checks to see if that packet filled
the congestion window. If so, the sender is network-limited, and
sets the variable T_prev to the current TCP clock time, and the
variable W_used to zero.
When TCP sends a packet that does not fill the congestion window, and
the TCP send queue is empty, then the sender is application-limited.
The sender checks to see if the amount of unacknowledged data is
greater than W_used; if so, W_used is set to the amount of
unacknowledged data. In addition TCP checks to see if the elapsed
time since T_prev is greater than RTO. If so, then the TCP has not
just reduced its congestion window following an idle period. The TCP
has been application-limited rather than network-limited for at least
an entire RTO interval, but for less than two RTO intervals. In this
case, TCP sets ssthresh to the maximum of 3/4 cwnd and the current
value of ssthresh, and reduces its congestion window to
(cwnd+W_used)/2. W_used is then set to zero, and T_prev is set to
the current time, so a further reduction will not take place until at
least another RTO period has elapsed. Thus, during an application-limited
period the CWV algorithm reduces the congestion window once
per RTO.
3.2. Pseudo-code for reducing the congestion window
Initially:
    T_last = tcpnow, T_prev = tcpnow, W_used = 0
After sending a data segment:
    If tcpnow - T_last >= RTO
        (The sender has been idle.)
        ssthresh = max(ssthresh, 3*cwnd/4)
        For i=1 To (tcpnow - T_last)/RTO
            win = min(cwnd, receiver's declared max window)
            cwnd = max(win/2, MSS)
        T_prev = tcpnow
        W_used = 0
    T_last = tcpnow
    If window is full
        T_prev = tcpnow
        W_used = 0
    Else
        If no more data is available to send
            W_used = max(W_used, amount of unacknowledged data)
            If tcpnow - T_prev >= RTO
                (The sender has been application-limited.)
                ssthresh = max(ssthresh, 3*cwnd/4)
                win = min(cwnd, receiver's declared max window)
                cwnd = (win + W_used)/2
                T_prev = tcpnow
                W_used = 0
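For readers who want to experiment with the pseudo-code, the following is one possible runnable transcription in Python. It is a sketch under stated assumptions, not the FreeBSD or NS implementation: the class name, field names, and the `after_send` arguments are hypothetical, windows are in bytes, time is in seconds, and the caller is assumed to know whether the window is full and whether more data is queued.

```python
from dataclasses import dataclass


@dataclass
class CwvState:
    """Per-connection state for the Section 3.2 pseudo-code (hypothetical names)."""
    cwnd: int
    ssthresh: int
    mss: int
    rto: float
    rcv_max_window: int      # "receiver's declared max window"
    t_last: float = 0.0      # time the previous segment was sent
    t_prev: float = 0.0      # time we were last network-limited or reduced cwnd
    w_used: int = 0          # maximum window actually used since t_prev

    def after_send(self, now: float, window_full: bool,
                   more_data: bool, unacked: int) -> None:
        """Run the CWV checks after sending a data segment at time `now`."""
        idle = now - self.t_last
        if idle >= self.rto:
            # The sender has been idle: halve cwnd once per elapsed RTO.
            self.ssthresh = max(self.ssthresh, 3 * self.cwnd // 4)
            for _ in range(int(idle // self.rto)):
                win = min(self.cwnd, self.rcv_max_window)
                self.cwnd = max(win // 2, self.mss)
            self.t_prev = now
            self.w_used = 0
        self.t_last = now

        if window_full:
            # Network-limited: the current window has just been validated.
            self.t_prev = now
            self.w_used = 0
        elif not more_data:
            self.w_used = max(self.w_used, unacked)
            if now - self.t_prev >= self.rto:
                # Application-limited for at least one RTO.
                self.ssthresh = max(self.ssthresh, 3 * self.cwnd // 4)
                win = min(self.cwnd, self.rcv_max_window)
                self.cwnd = (win + self.w_used) // 2
                self.t_prev = now
                self.w_used = 0


# Idle case: 3.5 s of silence with RTO = 1 s gives three halvings.
idle_conn = CwvState(cwnd=16000, ssthresh=8000, mss=1000, rto=1.0,
                     rcv_max_window=64000)
idle_conn.after_send(now=3.5, window_full=False, more_data=True, unacked=1000)

# Application-limited case: at most 4000 bytes outstanding for over one RTO.
app_conn = CwvState(cwnd=16000, ssthresh=8000, mss=1000, rto=1.0,
                    rcv_max_window=64000)
app_conn.after_send(now=0.5, window_full=False, more_data=False, unacked=4000)
app_conn.after_send(now=1.2, window_full=False, more_data=False, unacked=3000)
print(idle_conn.cwnd, idle_conn.ssthresh, app_conn.cwnd, app_conn.ssthresh)
```

Integer division is used throughout so that the window stays a whole number of bytes; an implementation that counts in segments would round differently.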
4. Simulations
The CWV proposal has been implemented as an option in the network
simulator NS [NS]. The simulations in the validation test suite for
CWV can be run with the command "./test-all-tcp" in the directory
"tcl/test". The simulations show the use of CWV to reduce the
congestion window after a period when the TCP connection was
application-limited, and to limit the increase in the congestion
window when a transfer is application-limited. As the simulations
illustrate, the use of ssthresh to maintain connection history is a
critical part of the Congestion Window Validation algorithm. [HPF99]
discusses these simulations in more detail.
5. Experiments
We have implemented the CWV mechanism in the TCP implementation in
FreeBSD 3.2. [HPF99] discusses these experiments in more detail.
The first experiment examines the effects of the Congestion Window
Validation mechanisms for limiting cwnd increases during
application-limited periods. The experiment used a real ssh
connection through a modem link emulated using Dummynet [Dummynet].
The link speed is 30Kb/s and the link has five packet buffers
available. Today most modem banks have more buffering available than
this, but the more buffer-limited situation sometimes occurs with
older modems. In the first half of the transfer, the user is typing
away over the connection. About half way through the time, the user
lists a moderately large file, which causes a large burst of traffic
to be transmitted.
For the unmodified TCP, every returning ACK during the first part of
the transfer results in an increase in cwnd. As a result, the large
burst of data arriving from the application to the transport layer is
sent as many back-to-back packets, most of which get lost and
subsequently retransmitted.
For the modified TCP with Congestion Window Validation, the
congestion window is not increased when the window is not full, and
has been decreased during application-limited periods closer to what
the user actually used. The burst of traffic is now constrained by
the congestion window, resulting in a better-behaved flow with
minimal loss. The end result is that the transfer happens
approximately 30% faster than the transfer without CWV, due to
avoiding retransmission timeouts.
The second experiment uses a real ssh connection over a real dialup
ppp connection, where the modem bank has much more buffering. For
the unmodified TCP, the initial burst from the large file does not
cause loss, but does cause the RTT to increase to approximately 5
seconds, where the connection becomes bounded by the receiver's
window.
For the modified TCP with Congestion Window Validation, the flow is
much better behaved, and produces no large burst of traffic. In this
case the linear increase for cwnd results in a slow increase in the
RTT as the buffer slowly fills.
For the second experiment, both the modified and the unmodified TCP
finish delivering the data at precisely the same time. This is
because the link has been fully utilized in both cases due to the
modem buffer being larger than the receiver window. Clearly a modem
buffer of this size is undesirable due to its effect on the RTT of
competing flows, but it is necessary with current TCP implementations
that produce bursts similar to those shown in the top graph.
6. Conclusions
This document has presented several TCP algorithms for Congestion
Window Validation, to be employed after an idle period or a period in
which the sender was application-limited, and before an increase of
the congestion window. The goal of these algorithms is for TCP's
congestion window to reflect recent knowledge of the TCP connection
about the state of the network path, while at the same time keeping
some memory (i.e., in ssthresh) about the earlier state of the path.
We believe that these modifications will be of benefit to both the
network and to the TCP flows themselves, by preventing unnecessary
packet drops due to the TCP sender's failure to update its
information (or lack of information) about current network
conditions. Future work will document and investigate the benefit
provided by these algorithms, using both simulations and experiments.
Additional future work will describe a more complex version of the
CWV algorithm for TCP implementations where the sender does not have
an accurate estimate of the TCP roundtrip time.
7. References
[FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of
Tahoe, Reno, and SACK TCP, Computer Communication Review,
V. 26 N. 3, July 1996, pp. 5-21. URL
"http://www.aciri.org/floyd/papers.html".
[HPF99] Mark Handley, Jitendra Padhye, Sally Floyd, TCP Congestion
Window Validation, UMass CMPSCI Technical Report 99-77,
September 1999. URL
"ftp://www-net.cs.umass.edu/pub/Handley99-tcpq-tr-99-77.ps.gz".
[HTH98] Amy Hughes, Joe Touch, John Heidemann, "Issues in TCP
Slow-Start Restart After Idle", Work in Progress.
[J88] Jacobson, V., Congestion Avoidance and Control, Originally
from Proceedings of SIGCOMM '88 (Palo Alto, CA, Aug.
1988), and revised in 1992. URL
"http://www-nrg.ee.lbl.gov/nrg-papers.html".
[JKBFL96] Raj Jain, Shiv Kalyanaraman, Rohit Goyal, Sonia Fahmy, and
Fang Lu, Comments on "Use-it or Lose-it", ATM Forum
Document Number: ATM Forum/96-0178, URL
"http://www.netlab.ohio-state.edu/~jain/atmf/af_rl5b2.htm".
[JKGFL95] R. Jain, S. Kalyanaraman, R. Goyal, S. Fahmy, and F. Lu, A
Fix for Source End System Rule 5, AF-TM 95-1660, December
1995, URL
"http://www.netlab.ohio-state.edu/~jain/atmf/af_rl52.htm".
[MSML99] Matt Mathis, Jeff Semke, Jamshid Mahdavi, and Kevin Lahey,
The Rate-Halving Algorithm for TCP Congestion Control,
June 1999. URL
"http://www.psc.edu/networking/ftp/papers/draft-ratehalving.txt".
[NS] NS, the UCB/LBNL/VINT Network Simulator. URL
"http://www-mash.cs.berkeley.edu/ns/".
[RFC2581] Allman, M., Paxson, V. and W. Stevens, TCP Congestion
Control, RFC 2581, April 1999.
[VH97] Vikram Visweswaraiah and John Heidemann. Improving Restart
of Idle TCP Connections, Technical Report 97-661,
University of Southern California, November, 1997.
[Dummynet] Luigi Rizzo, "Dummynet and Forward Error Correction",
Freenix 98, June 1998, New Orleans. URL
"http://info.iet.unipi.it/~luigi/ip_dummynet/".
8. Security Considerations
General security considerations concerning TCP congestion control are
discussed in RFC 2581. This document describes an algorithm for one
aspect of those congestion control procedures, and so the
considerations described in RFC 2581 apply to this algorithm also.
There are no known additional security concerns for this specific
algorithm.
9. Authors' Addresses
Mark Handley
AT&T Center for Internet Research at ICSI (ACIRI)
Phone: +1 510 666 2946
EMail: mjh@aciri.
URL: http://www.aciri.org/mjh
Jitendra Padhye
AT&T Center for Internet Research at ICSI (ACIRI)
Phone: +1 510 666 2887
EMail: padhye@aciri.
URL: http://www-net.cs.umass.edu/~jitu
Sally Floyd
AT&T Center for Internet Research at ICSI (ACIRI)
Phone: +1 510 666 2989
EMail: floyd@aciri.
URL: http://www.aciri.org/floyd
10. Full Copyright Statement
Copyright (C) The Internet Society (2000). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.