As per Relevance of the word datagram, we have this rfc below:
Network Working Group W.
Request for Comments: 1016 J.
July 1987
Something a Host Could Do with Source Quench
The Source Quench Introduced Delay (SQuID
Status of this
This memo is intended to explore the issue of what a host could
with a source quench. The proposal is for each source host IP
to introduce some delay between datagrams sent to the
destination host. This is an "crazy idea paper" and discussion
essential. Distribution of this memo is unlimited
A gateway may discard Internet datagrams if it does not have
buffer space needed to queue the datagrams for output to the
network on the route to the destination network. If a
discards a datagram, it may send a source quench message to
Internet source host of the datagram. A destination host may
send a source quench message if datagrams arrive too fast to
processed. The source quench message is a request to the host to
back the rate at which it is sending traffic to the
destination. The gateway may send a source quench message for
message that it discards. On receipt of a source quench message,
source host should cut back the rate at which it is sending
to the specified destination until it no longer receives
quench messages from the gateway. The source host can then
increase the rate at which it sends traffic to the destination
it again receives source quench messages [1,2].
The gateway or host may send the source quench message when
approaches its capacity limit rather than waiting until the
is exceeded. This means that the data datagram which triggered
source quench message may be delivered
The SQuID
Suppose the IP module at the datagram source has a queue of
to send, and the IP module has a parameter "D". D is the
delay between sending datagrams from the queue to the network.
is, when the IP module discovers a datagram waiting to be sent to
network, it sends it to the network then waits time D before
looking at the datagram queue again. Normally, the value of D
Prue & Postel [Page 1]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
zero
Imagine that when a source quench is received (or any other signal
received that the host should slow down its transmissions to
network), the value of D is increased. As time goes by, the value
D is decreased
The SQuID
on increase event
D <-- maximum (D + K, I
(where K = .020 second
I = .075 second
on decrease event
D <-- maximum (D - J, 0)
(where J = .001 second
An increase event is receipt of one or more source quenches in
event period E, (where E is 2.000 seconds).
A decrease event is when S time has passed since D was decreased
there is a datagram to send (where S is 1.000 seconds).
A cache of D's is kept for the last M hosts communicated with
Note that when no datagrams are sent to a destination for some
the D for that destination is not decreased, but, if a destination
not used for a long time that D for that destination may fall out
the cache
Possible
Keep a separate outgoing queue of datagrams for each
host, local subnet, or network
Keep the cache of D's per network or local subnet, instead of
host
"I" could be based upon the basic speed of the slowest
network (see Appendix A).
"D" could be limited to never go below "I" if the above
were implemented
"S" could be based upon the round trip time
Prue & Postel [Page 2]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
"D" could be adjusted datagram by datagram based upon the length
the datagrams. Wait longer after a long datagram
The delay algorithm could be implemented such that if a
doesn't send a datagram when it is next allowed (the introduced
interval) or for N such intervals that the source gets a credit
one and only one free (no delay) datagram
Implementation
Since IP does not normally keep much state information about things
we want the default or idle IP to have no state about these D values
Since the default D value is zero, let us propose that the IP
keep a list of only those destinations with non zero D's
When the IP wants to send a datagram, it searches the D-list to
if the destination is noted. If it is not, the D value is zero,
the IP sends the datagram at once. If the destination is listed,
IP must wait D time indicated before sending that
datagram. It could look at a datagram addressed to a
destination, and possibly send it in the mean time
When the IP receives a source quench, it checks to see if
destination in the datagram that caused the source quench is on
list. If so, it adds K to the D value. If not, it appends
destination to the list with the D value set to "I".
A Closer Look At the
Some implementations of IP send one SQ for every N datagrams
discard (for example, N=20) so the SQ messages will not make
congestion problem much worse [3]. In such situations any of
sources of the 20 datagrams may get the SQ not necessarily the
causing the most traffic. However if a host continues to
datagrams at a high rate it has a high probability of receiving a
message sooner or later. It is much like a speeder on a highway
Not all speeders get speeding tickets but the ones who speed
often or most excessively are most likely to be ticketed. In
case they will get a ticket and their car may be destroyed
With memory becoming so inexpensive many IP nodes put an
low limit on the size of their queues so that through node delay
not be excessive [4]. For example, if one megabyte of data
buffered to be sent over a 56 kb/s line the last datagram will
over 2 minutes before being sent
One problem with SQ is that the IP or ICMP specification does
have a well defined event to indicate receipt of SQ to higher
Prue & Postel [Page 3]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
protocols. Therefore many TCP implementations do not get
about SQ events and thus do not react to SQ. TCP is not the
source of IP datagrams either. Other protocols should also
to SQ events in some appropriate way. TCP and other protocols
that level should do something about a source quench, however
discussion of their behavior is beyond the scope of this memo.
that implementation of SQ processing at one level of protocol
not interfere with the behavior of higher level protocols.
however, is difficult to do
For protocols using IP which are trying to transfer large amounts
data the data flow is most typically very bursty. TCP for example
might send 5-10 segments into a window of 5-10 K bytes then wait
the acknowledgment of the data which opens the window again.
as defined by RFC-998 is a rate based protocol which has
for burst size and burst rate
One purpose of the bursts is to allow the source computer to
several datagrams at once to provide more efficient scheduling.
other reason is to keep the network busy accepting data to
effective throughput in spite of a potentially large network
trip delay. To send a datagram then wait for an acknowledgment is
simple but not efficient protocol on a large wide area network
The reasons for efficiencies obtained at the source node
generating many datagrams at once are not as applicable in
intermediate IP node. Since each datagram is potentially from
different node they must all be treated individually.
received in a burst may also overload the queue of an
node losing datagrams and causing SQs to be generated. If the
is near a threshold and a burst comes, possibly all of the
will be lost. When datagrams arrive evenly spaced, less
are likely to be lost because the inter-arrival time allows the
a little time to empty out. Therefore datagrams spaced with
delay between them may be better for intermediate IP nodes
Congestion is most likely to occur at IP nodes which are
between a slower network and a faster one. The congestion will be
the send queue from the slow network to the fast network. An
being returned to the sender will return on the faster network. (
diagram below.)
A Gateway Source Quench
In order for the SQuID algorithm to work we rely upon the gateways
send SQs to us to tell us how we are doing. Because the loss of
single datagram affects data flow so much (see lost
discussion in Observed Results below) it would be much better for
Prue & Postel [Page 4]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
source IP node if it got a warning before datagrams were discarded
We propose gateway IP nodes start SQing before the node is flooded
a level we call SQ Keep (SQK) but forward the datagram. If the
level reaches a critical level, SQ Toss level (SQT), the
should toss datagrams to resolve the problem unless the datagram
an ICMP message. Even ICMP messages will be tossed if the MaxQ
is reached. Once the gateway starts sending SQs it should
to do so until the queue level goes below a low water mark
(SQLW) as shown below. This is analogous to methods some
systems use to handle memory space management
The gateway should try to send SQ to as many of the contributors
the congestion as possible but only once per contributor per
or two
Source Quench Queue
+--------------+ MaxQ
| |> datagrams tossed & SQed (but not ICMP msgs.)
+--------------+ SQT level (95%)
| |\
| | > datagrams SQed but
| |/
+--------------+ SQK level (70%)
| |\
| | \ datagrams SQed but forwarded if SQK
| | / exceeded & SQLW or lower not yet
| |/
+--------------+ SQLW level (50%)
| |\
| | \
| | \
| | \ datagrams
| | /
| | /
| | /
| |/
+--------------+
Description of the Test
We needed some way of testing our algorithm and its
parameters. It was important to check the interaction between
with the SQuID algorithm and TCP. We also wanted to try
combinations of retransmission strategy and source quench
which required control of the entire test network. We
decided to build an Internet model
Prue & Postel [Page 5]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
Using this example configuration for illustration
_______ LAN _______ WAN _______ LAN _______
| 1 | | 2 | | 3 | | 4 |
|TCP/IP |---10 Mb/s--| IP |---56 kb/s--| IP |---10 Mb/s--|TCP/IP |
|_______| |_______| |_______| |_______|
A program was written in C which created queues and structures to
on the queues representing datagrams carrying data,
and SQs. The program moved datagrams from one queue to the
based upon rules defined
A client fed the TCP in node 1 data at the rate it would accept.
TCP function in node 1 would chop the data up into fixed 512
datagrams for transmission to the IP in node 1. When the
were given to IP for transmission, a timestamp was put on it and
copy of it was put on a TCP ack-wait queue (data sent but not
acknowledged). In particular TCP assumed that once it handed data
IP, the data was sent immediately for purposes of
timeouts even though our algorithm has IP add delay
transmission
Each IP node had one queue in each direction (left and right).
each IP in the model IP would forward datagrams at the rate of
communications line going to the next node. Thus the fifth
on IP 2's queue going right would take 5 X 73 msec or 365 msec
it would appear at the end of IP 3's queue. The time to process
datagram was considered to be less than the time it took for the
to be sent over the 56 kb/s lines and therefore done during
transmission times and not included in the model. For the
communications this is not the case but since they were not at
bottleneck of the path this processing time was ignored.
because LAN communications are typically shared band width, the
band width available to each IP instance was considered to be 1 Mb/s
a crude approximation
When the data arrived at node 4 the data was immediately given to
TCP receive function which validated the sequence number. If
datagram was in sequence the datagram was turned into an ack
and sent back to the source. An ack datagram carries no data
will move the right edge of the window, the window size past the
acked data sequence number. The ack datagram is assumed to be 1/8
the length of a data datagram and thus can be transmitted from
node to the next 8 times faster. If the sequence number is less
expected (a retransmission due to a missed ack) then it too is
into an ack. A larger sequence number datagram is
indefinitely until the missing datagrams are received
Prue & Postel [Page 6]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
We also modeled the gateway source quench algorithm. When a
was put on an IP queue the number on the queue was compared to an
keep level (SQK). If it was greater, an SQ was generated
returned to the sender. If it was larger than the SQ toss (SQT)
it was also discarded. Once SQs were generated they would
to be sent until the queue level went below SQ Low Water (SQLW)
which was below the original SQK level. These percentages
modifiable as were many parameters. An SQ could be lost if
exceeded the maximum queue size (MaxQ), but a source quench was
sent about tossing a source quench
Upon each transition from one node to the next, the datagram
vulnerable to datagram loss due to errors. The loss rate could
set as M losses out of N datagrams sent, thus the model allowed
multi-datagram loss bursts or single datagram losses. We used
single datagram loss rate of 1 lost datagram per 300 datagrams
for much of our testing. While this may seem low for
simulation, remember it does not include losses due to congestion
Some network parameters we used were a maximum queue length of 15
datagrams per IP direction left and right. We started sending SQ
the queue was 70% full, SQK level, tossed data datagrams, but not
datagrams, if 95% of the queue was reached, SQT level, and
SQing when a 50% SQLW level was reached (see above).
We ignored additional SQs for 2 seconds after receipt of one SQ
This was done because some Internet nodes only send one SQ for
20 datagrams they discard even though our model sent SQs for
datagram discarded. Other IP node may send one SQ per
packet. The SQuID algorithm needed a way to handle both types of
generation. We therefore treated one or a burst of SQs as a
event and incremented our D by a larger amount than would
appropriate for responding individually to the multiple SQs of
verbose nodes
The simulation did not do any fragmenting of datagrams. Silly
syndrome was avoided. The model did not implement nor simulate
TTL (time-to-live) function
The model allowed for a flexible topology definition with many
source/destination pairs on host IP nodes or gateway IP nodes
various windows allowed. An IP node could have any number of
assigned to it. Each line could have an individually set speed.
TCP could send to any other TCP. The routing from one location
another was fixed. Therefore datagrams did not arrive out
sequence. However, datagrams arrived in ascending order, but
consecutively, on a regular basis because of datagram losses
Datagrams going "left" through a node did not affect the queue size
Prue & Postel [Page 7]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
or SQ chances, of data going "right" through the node
The TCP retransmission timer algorithm used an Alpha of .15 and
Beta of 1.5. The test was run without the benefit of the
sophisticated retransmission timer algorithm proposed by Van
[5].
The program would display either the queue sizes of the various
nodes and the TCP under test as time passed or do a crude plot
various parameters of interest including SRTT, perceived round
time, throughput, and the critical queue size
As we observed the effects of various algorithms for responding to
we adapted our model to better react to SQ. Initial tests showed
we incremented slowly and decremented quickly we
oscillations around the correct value but more of the time was
over driving the network, thus losing datagrams, than at a
which helped the congestion situation
A significant problem is the delay between when some
node starts dropping datagrams and sending source quenches to
time when the source quenches arrive at the source host and can
to effect the behavior at the data source. Because of this and
possibility that a IP might send only one SQ for each 20
lost, we decided that the increase in D per source quench should
substantial (for example, D should increase by 20 msec for
source quench), and the decrease with time should be very slow (
example, D should decrease 1 msec every second). Note that this
the opposite behavior than suggested in an early draft by one of
authors
However, when many source quenches are received (for example, when
source quench is received for every datagram dropped) in a short
period the D value is increased excessively. To prevent D
growing too large, we decided to ignore subsequent source
for a time (for example, 2 seconds) once we had increased D
Tests were run with only one TCP sending data to learn as much
possible how an unperturbed session might run. Other test runs
introduce and eliminate competing traffic dynamically between
TCP instances on the various nodes to see how the algorithms
to changes in network load. A potential flaw in the model is
the defined TCPs with open windows always tried to forward data
Their clients feeding them data never paused to think what they
going to type nor got swapped out in favor of other applications
turned the session around logically to listen to the other end
more user commands. In other words all of the simulated TCP
were doing file transfers
Prue & Postel [Page 8]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
The model was defined to allow many mixes of competing algorithms
responding to SQ. It allowed comparing effective throughput
TCPs with small windows and large windows and those whose IP
introduce inter-datagram delays and those who totally ignored SQ.
also allowed comparisons with various inter-datagram
amounts and decrement amounts. Because of the number of
configurations and parameter combinations only a few combinations
parameters were tested. It is hoped they were the most
ones upon which to concentrate
Observed
All of our algorithms oscillate, some worse than others
If we put in just the right amount of introduced delay we seem to
the best throughput. But finding the right amount is not easy
Throughput is adversely affected, heavily, by a single lost
at least for the short time. Examine what happens when a window
35 datagrams wide with an average round trip delay of 2500 msec
512 byte datagrams when a single datagram is lost at the beginning
Thirty five datagrams are given by TCP to IP and a timer is
on the first datagram. Since the first datagram is missing,
receiving TCP will not sent an acknowledgment but will buffer all 34
of the out-of-sequence datagrams. After 1.5 X 2500 msec, or 3750
msec, have elapsed the datagram times out and is resent. It
and is acked, along with the other 34, 2500 msec later. Before
lost datagram we might have been sending at the average rate a 56
kb/s line could accept, about one every 75 msec. After loss of
datagram we send at the rate of one in 6250 msec over 83
slower
If the lost datagram in the above example is other than the
datagram the situation becomes the same when all of the
before the lost datagram are acknowledged. The example holds
then for any single lost datagram in the window
When SQ doesn't always cause datagram loss the sender continues
send too fast (queue size oscillates a lot). It is important for
SQ to cause feed-back into the sending system as soon as possible
therefore when the source host IP receives an SQ it must
adjustments to the send rate for the datagrams still on the
queue not just datagrams IP is requested to send after the SQ
Through network delay goes up as the network queue lengths go up
Window size affect the chance of getting SQed. Look at our
above using a queue level of 15 for node 2 before SQs are
Prue & Postel [Page 9]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
and a window size of 20 datagrams. We assumed that we could
data over the LAN at a sustained average rate of 1 Mb/s or about 18
times as fast as over the WAN. When TCP sends a burst of 20
datagrams to node 1 they make it to node 2 in 81 msec.
transition time from node 2 to node 3 is 73 msec, therefore, in 81
msec, only one datagram is forwarded to node 3. Thus the 17th, 18th
19th, and 20th datagram are lost every time we send a whole window
More are lost when the queue is not empty. If a sequence of
come back in response to the sent data, the acks tend to return
the rate at which data can traverse the net thus pacing new send
by opening the window at the rate which the network can accept it
However as soon as one datagram is lost all of the subsequent
are deferred and batched until receipt of the missing data
which acks all of the datagrams and opens the window to 20 again
This causes the max queue size to be exceeded again
If we assume a window smaller than the max queue size in
bottleneck node, any time we send a window's worth of data, it
done independently of the current size of the queue. The larger
send window, the larger a percentage of the stressed queue we send
If we send 50% of the stressed queue size any time that queue is
than 50% we threaten to overflow the queue. Evenly spaced
datagram bursts have the least chance of overflowing the queue
they represent the minimum percentage of the max queue one may send
When a big window opens up (that is, a missing datagram at the
of a 40 datagram send queue gets retransmitted and acked),
perceived round trip time for datagrams subsequently sent hits
minimum value then goes up linearly. The SRTT goes down then back
in a nice smooth curve. This is caused by the fact IP will not
delay if the queue is empty and IP has not sent any datagrams to
destination for our introduced delay time. But as many datagrams
added to the IP pre-staged send queue in bursts all have the
send time as far as TCP is concerned. IP will delay each datagram
the head of the queue by the introduced delay amount. The first
be undelayed as just described but all of the others are delayed
their ordinal number on the queue times the introduced delay amount
It seems as though in a race between a TCP session which
sending to IP and one who does not, the delayer will get
throughput because less datagrams are lost. The send window may
be increased to keep the pipeline full. If however the non
uses windowing to reduce the chance of SQ datagram loss
throughput may possibly be better because no fair queuing
is in place
If gateways send SQs early and don't toss data until its critical
keep sending SQs until a low water mark is hit, effective
Prue & Postel [Page 10]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
seems to go up
At the startup of our tests throughput was very high, then
off quickly as the last of the window got clobbered. Our
should have used a slow start up algorithm to minimize the
shock. However the learning curve to estimate the proper value for
was probably quicker
A large part of the perceived RTT is due to the delay getting off
TCP2IP (TCP transitional) queue when we used large windows. If
would invoke some back-pressure to TCP in a real implementation
can be significantly reduced. Reducing the window would do this
us at the expense of throughput
After an SQ burst which tosses datagrams the sender gets in a
where TCP may only send one or two datagrams per RTT until the
but not acked segments fall into sequence and are acked.
assumes only the head of the retransmission queue is retransmitted
a timeout. We can send one datagram upon timeout. When the ack
the retransmission is received the window opens allowing sending
second. We then wait for the next lost datagram to time out
If we stop sending data for a while but allow D to be decreased,
algorithm causes the introduced delay to dwindle away. We would
go through a new startup learning curve and network
sequence
One thing not observed often was TCP timing out a segment before
source IP even sent the datagram the first time. As discussed
the first datagram on the queue of a large burst is delayed
and succeeding datagrams have linearly increasing delays.
smoothed round trip delay algorithm has a chance to adapt to
perceived increasing round trip times
Unstructured Thoughts and
The further down a route a datagram traverses before being
the greater the waste of network resources. SQs which do not
the datagram referred to are better than ones that do if return
resources are available
Any fix must be implementable piecemeal. A fix can not be
in all or most nodes at one time. The SQuID algorithm fulfills
requirement. It could be implemented, installed in one location,
used effectively
If it can be shown that by using the new algorithm
throughput can be increased over implementations which do
Prue & Postel [Page 11]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
implement it that may well be effective impetus to get vendors
implement it
Once a source host has an established average minimum inter-
delay to a destination (see Appendix A), this information should
stored across system restarts. This value might be used each
data is sent to the given host as a minimum inter-datagram
value
Window closing algorithms reduce the average inter-datagram delay
the burst size but do not affect the minimum inter-datagram
by TCP
Currently an IP gateway node can know if it is in a critical
because its queues stay high or keep building up. Its optimum
size is one because it always has something to do and the
node delay is at a minimum. It is very important that the gateway
the critical path not so discourage data flow that its queue
drops to zero. If the gateway tosses datagrams this stops data
for TCP for a while (as described in Observed Results above).
argues for the gateway algorithm described above which SQs but
not toss datagrams unless necessary. Optimally we should try to
a queue size somewhat larger than 1 but less than say 50% of the
queue size. Large queues lead to large delay
TCP's SRTT is made artificially large by introducing delay at IP
the perceived round trip time variance is probably smaller allowing
smaller Beta value for the timeout value
So that a decrease timer is not needed for the "D" decrease function
upon the next sent datagram to a delayed destination just
the delay by the amount of time since we last did this divided by
decrease timer interval. An alternate algorithm would be to
it by only one decrease unit amount if more than the timer
has gone by. This eliminates the problem caused by the delay, "D",
dwindling away if we stop sending for a while. The longer we
using this "D", the more likely it is that it is too large a
and the more we should decrease it
It is better for the network and the sender for our introduced
to be a little on the high side. It minimizes the chances of
a datagram clobbered by sending it into a congested gateway. A
datagram scenario described above showed that one lost datagram
reduce our effective delay by one to two orders of
temporarily. Also if the delay is a little high, the net is
stressed and the queues get smaller, reducing through network delay
The RTT experienced at a given time verses the minimum RTT
Prue & Postel [Page 12]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
for the given route does give a good measure of congestion. If
ever get congestion control working RTT may have little to do
the amount of congestion. Effective throughput when compared
the possible throughput (or some other measure) is the only
measure of congestion
Slow startup of TCP is a good thing and should be encouraged as
additional mechanism for alleviating network overload
The network dynamics tends to bunch datagrams. If we properly
data instead of bunching it like windowing techniques to
overflow of queues then greater throughput is accomplished
the absolute rate we can send is pacing our sending not the RTT.
eliminate "stochastic bunching" [6].
The longer the RTT the more network resources the data takes
traverse the net
Should "fair queuing" say that a longer route data transfer
get less band width than a shorter one (since it consumes more of
net)? Being fair locally on each node may be unfair overall
datagrams traversing many nodes
If we solve congestion problems today, we will start loading up
net with more data tomorrow. When this causes congestion in a
will that type of congestion be harder to solve than todays or is
not our problem? John Nagle suggests "In a large net, we may
try to force congestion out to the fringes and keep the interior
the net uncongested by controlling entry to the net. The IMP-
systems work that way, or at least used to. This has the effect
concentrating congestion at the entrance to the long-haul system
That's where we want it; the Source Quench / congestion window /
queuing set of strategies are able to handle congestion at the LAN
WAN bottleneck [7]. Our algorithm should try to push the
congestion out to the extremities and keep the interior
congestion free
Use of the algorithm is aesthetically appealing because the data
sitting in our local queue instead of consuming resources inside
net. We give data to the network only when it is ready to accept it
An averaged minimum inter-datagram arrival value will give a
of the network bottleneck speed at the receiver. If the
does not defer or batch together acks the same would be learned
the inter-datagram arrival time of the acks. A problem is that
doesn't have knowledge of the datagram contents. However IP
know from which host a datagram comes
Prue & Postel [Page 13]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
If SQuID limits the size of its pre-net buffering properly (
back-pressure to TCP) then artificially high RTT measurements
not occur
TCP might, in the future, get a way to query IP for the
introduced delay, D, for a given destination and if the value is
excessive abort or not start a session
With the new algorithm TCP could have an arbitrarily large window
send into without fear of over running queue sizes in
nodes (not that any TCP ever considered having this fear before).
Thus it could have a window size which would allow it to always
sending; keeping the pipe full and seldom getting in the stop-and
wait mode of sending. This presupposes that the local IP is able
cause some sort of back pressure so that the local IPs queues are
over run. TCP would still be operating in the burst mode of
but the local IP would be sending a datagram for the TCP as often
the network could accept it keeping the data flow continuous
potentially slow
Experience implementing protocols suggests avoiding timers
protocols whenever possible. IP, as currently defined, does not
timers. The SQuID algorithm uses two at the IP level. A way
eliminate the introduced delay decrease timer is to decrease the
value when we check the send queue for data to send. We
decrease "D" by one "J" unit if "S" time, or more, has elapsed.
other timer is not so easily eliminated. If the IP implementation
periodically awakened to check for work to do, it could check
time stamps of the head of the queues to see if any datagrams
waited long enough. This would mean we would necessarily wait
long before sending, but it may not be too large of a variance
our desired rates. The additional delay would help congestion
reduce our chances of SQ. Another option is setting a real
which is say 25-50% too large and hope that our periodic work in
will allow us to notice a datagram is delayed enough, and send it
Upon sending the datagram we would cancel the timer, avoiding
timer interrupt and environment swap. In other implementations
communications interface or the link level protocol may be able
provide the timing needed without interrupts to the main processor
Background tasks like some file transfers could query IP for
latest delay characteristics for a given destination to determine
it is appropriate to consume network resources now or if it would
better to wait until later
We should consider what would happen if IP, using the
algorithm, and TCP both introduced delay between the datagrams.
TCPs delay was greater than IP's, then when IP got the datagrams
Prue & Postel [Page 14]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
would forward them immediately as described in the algorithm above
This is because when the IP send queue is empty and it has been
least as long as IP wants the higher level protocol, TCP, gets
free (no delay) send. Note also that IP will be decreasing
amount of delay it wants introduced because of the
transmissions without SQs. This would affect other protocols
might also send to the same destination. If TCP sends too
then IP will protect the network from its indiscretion as
in the basic algorithm however TCPs round trip time estimates will
much closer to reality. A lost datagram will thus be detected
quickly. If TCP also uses windowing to limit its sending rate,
might, because of its success with a smaller window, increase
window size increasing TCPs efficiency
As this algorithm is implemented everywhere, the SQ Keep (SQK) and
Low Water (SQLW) queue level percentages should be dropped to
queue sizes and thus the through net delay. The percentage we
SQK and SQLW to should be adjusted based upon the percentage of
the queue is empty. The more the queue is empty the more likely
is that the node is discouraging data flow too much. The more
the queue is not empty but not too full, the more likely it is
node is not excessively discouraging data flow. How uniform
queue size is, is a measure of how well the network citizens
behaved
As the congestion is pushed to the sources, gateways which
bottlenecks can more easily detect someone not playing by the
(sending datagrams in bursts) and deal with the offender
Prue & Postel [Page 15]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
Appendix A -- Determination of the Value for the Parameter "I
To get to the correct value for the delay needed quickly, when
event occurred and the currently used delay was minimal,
transmission time for an average sized datagram across the
communications link was used for a first value. How a real IP
is to guess this value is discussed below. In our example
transition between node 2 and node 3 is the bottleneck. Using the 56
kb/s line, a 512 byte datagram would take 73 msec with no queuing
processing time is considered. This value is defined to be
minimum inter-datagram arrival time (MIAT). Assuming a
network, ignoring factors other than transmission speed, this is
minimum time one could expect between receipt of datagrams at
destination, because of the slowed data rate through the bottleneck
This won't hold true if the datagrams do not follow the same path
The MIAT, minimum inter-datagram arrival time, may be guessed at
comparing the arrival timestamps of consecutive datagrams from
source of data. Each value to be considered needs to be adjusted
or down based upon the size between the second datagram received
the typical datagram size. More simply stated, a datagram which
half the size of the average datagram can be transmitted across
line in half the time. Therefore, double its IAT before
it for an MIAT. If the timestamp of a datagrams is taken based
an event caused by the start of the datagram arriving, not
completion of the datagram arriving, then the first datagram's
is the limiting length and should be used to adjust its IAT.
order to keep the algorithm simple, arrival times for short
could be ignored as could IATs which were orders of magnitude
large (see below).
Once a minimal value is found based upon looking for small
over a minute or more, the value might be time averaged with a
kept like TCP's SRTT in order to reduce the effects of a false MIAT
We could assume the limiting facility would be a 9.6 kb/s line,
56-64 kb/s line, or a 1.5 Mb/s line. These facilities would
a MIAT of 427 msec, 73-64 msec, or 3 msec respectively, for
datagram 512 bytes long. These are almost orders of magnitude
differences. If the MIAT a node measures is not in this range
close, the value it is closest to may be used for its MIAT from
source
One of the good things about this measurement is that it is
entirely passive measurement. No additional traffic is needed
measure it. If a source is not sending data continuously then
longer measured values won't be validated as minimal values.
assumption is that at least sometimes the source will send
datagrams at a time
Prue & Postel [Page 16]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
The MIAT measurement is totally unaffected by satellite
characteristics of any intervening facilities. The chance of
a valid minimal reading is affected by the number of nodes traversed
but the value measured if it is valid will not be affected by
number of nodes traversed. Stated another way, when a pair
datagrams traverse from one node to the next the datagrams
susceptible to being separated by a datagram from another source
Both of these factors affect SRTT. The value obtained requires
topological knowledge of the route
A potential problem with the measurement is, it will not be
proper value for some forms of alternate routes. If a T1 link is
bottleneck route some times and other times it is a 56 kb/s link
first guess for inter-datagram delay would be too small for the 56
kb/s line route. Another problem is that if one datagram goes
one route and the next goes via another, their inter-datagram
difference could lead to a small false measurement. If
networks fragment datagrams then the measured IAT between
could be misleading. A solution to this problem is to
fragmented datagram IATs
This number represents the minimum inter-datagram delay the
IP should use to send to us, the measuring site, for the
datagram size. If we assume that the return path will be through
same facilities or the same type, then as described above this
can be the first guess for inter-datagram introduced delay, "D" (
the algorithm). It represents the "I" parameter
These MIATs may be cached on a host, subnet, or network basis.
last "n" hosts MIATs could be kept. For infrequent destinations
entry per destination network would be applicable to
destinations. If the local net is in fact a subnet, then the
local subnet MIATs could be kept
If a good algorithm is found for MIAT, comparing it to the
IAT (during data transfer) would give a good measure of the amount
network traffic load. Since IP has no idea when the source of
is sending as fast as possible, to get a valid average, upper
protocols would have to figure this out for themselves. IP
however provide an interface to get the current MIAT for a
destination
Prue & Postel [Page 17]
RFC 1016 Source Quench Introduced Delay -- SQuID July 1987
[1] Postel, Jon, "Internet Protocol - DARPA Internet
Protocol Specification", RFC 791, ISI, September 1981.
[2] Postel, Jon, "Internet Control Message Protocol - DARPA
Program Protocol Specification", RFC 792, ISI, September 1981.
[3] Karels, M., "Re: Source Quench", electronic mail message to J
Postel and INENG-INTEREST, Tue, 24 Feb 87.
[4] Nagle, John B., "On Packet Switches With Infinite Storage",
970, FACC Palo Alto, December 1985.
[5] Jacobson, Van, "Re: interpacket arrival variance and mean",
electronic mail message to TCP-IP, Mon, 15 Jun 87 06:08:01
[6] Jacobson, Van, "Re: Appropriate measures of gateway performance
electronic mail message to J. Noel Chiappa and INENG-INTEREST, Sun
22 Mar 87 15:04:44 PST
[7] Nagle, John B., "Source quench, and congestion generally",
electronic mail message to B. Braden and INENG-INTEREST, Thu, 5
87 11:08:49 PST
[8] Nagle, John B., "Congestion Control in IP/TCP Internetworks",
896, FACC Palo Alto, 6 January 1984.
Prue & Postel [Page 18]
if you see any problems within the linking, don't worry be happy,
this is version 0.1 of the Relevance System and you gotta expect some crappy subroutines sometimes,
just be content we did not write this in Java, which would have made this "bigger and better" HAHAHHA.
RFC documents can be found at I.E.T.F.
Relevance System Copyright © 2002 Spectrum WorldResearch
other technical nosh by ServerMasters Corporation
collaboration of BobX