Network Working Group J. Mogul
Request For Comments: 1063 C. Kent
C. Partridge
K. McCloghrie
July 1988
IP MTU Discovery Options
STATUS OF THIS MEMO
A pair of IP options that can be used to learn the minimum MTU of a
path through an internet is described, along with its possible uses.
This is a proposal for an Experimental protocol. Distribution of
this memo is unlimited.
INTRODUCTION
Although the Internet Protocol allows gateways to fragment packets
that are too large to forward, fragmentation is not always desirable.
It can lead to poor performance or even total communication failure
in circumstances that are surprisingly common. (For a thorough
discussion of this issue, see [1]).
A datagram will be fragmented if it is larger than the Maximum
Transmission Unit (MTU) of some network along the path it follows.
In order to avoid fragmentation, a host sending an IP datagram must
ensure that the datagram is no larger than the Minimum MTU (MINMTU)
over the entire path.
It has long been recognized that the methods for discovering the
MINMTU of an IP internetwork path are inadequate. The methods
currently available fall into two categories: (1) choosing small MTUs
to avoid fragmentation or (2) using additional probe packets to
discover when fragmentation will occur. Both methods have problems.
Choosing MTUs requires a balance between network utilization (which
requires the use of the largest possible datagram) and fragmentation
avoidance (which in the absence of knowledge about the network path
encourages the use of small, and thus too many, datagrams). Any
choice for the MTU size, without information from the network, is
likely to either fail to properly utilize the network or fail to
avoid fragmentation.
Probe packets have the problem of burdening the network with
Mogul, Kent, Partridge, & McCloghrie [Page 1]
RFC 1063 IP MTU Discovery Options July 1988
unnecessary packets. And because network paths often change during
the lifetime of a TCP connection, probe packets will have to be sent
on a regular basis to detect any changes in the effective MINMTU.
Implementors sometimes mistake the TCP MSS option as a mechanism for
learning the network MINMTU. In fact, the MSS option is only a
mechanism for learning about buffering capabilities at the two TCP
peers. Separate provisions must be made to learn the IP MINMTU.
In this memo, we propose two new IP options that, when used in
conjunction, will permit two peers to determine the MINMTU of the
paths between them. In this scheme, one option is used to determine
the lowest MTU in a path; the second option is used to convey this
MTU back to the sender (possibly in the IP datagram containing the
transport acknowledgement to the datagram which contained the MTU
discovery option).
OPTION FORMATS
Probe MTU Option (Number 11)
+--------+--------+--------+--------+
|00001011|00000100| 2 octet value |
+--------+--------+--------+--------+
This option always contains the lowest MTU of all the networks
that have been traversed so far by the datagram.
A host that sends this option must initialize the value field to
be the MTU of the directly-connected network. If the host is
multi-homed, this should be for the first-hop network.
Each gateway that receives a datagram containing this option must
compare the MTU field with the MTUs of the inbound and outbound
links for the datagram. If either MTU is lower than the value in
the MTU field of the option, the option value should be set to the
lower MTU. (Note that gateways conforming to RFC-1009 may not
know either the inbound interface or the outbound interface at the
time that IP options are processed. Accordingly, support for this
option may require major gateway software changes).
Any host receiving a datagram containing this option should
confirm that the value of the MTU field of the option is less than or
equal to that of the inbound link, and if necessary, reduce the
MTU field value, before processing the option.
If the receiving host is not able to accept datagrams as large as
specified by the value of the MTU field of the option, then it
should reduce the MTU field to the size of the largest datagram it
can accept.
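The per-hop handling of the Probe MTU value described above can be sketched as follows. This is a minimal illustration under stated assumptions, not part of the memo: the function names and the `max_acceptable` parameter are hypothetical, and MTUs are treated as plain integers in octets.

```python
def gateway_update(probe_mtu: int, inbound_mtu: int, outbound_mtu: int) -> int:
    """Value a gateway would forward: the lowest MTU seen so far,
    including its own inbound and outbound links."""
    return min(probe_mtu, inbound_mtu, outbound_mtu)


def host_receive(probe_mtu: int, inbound_mtu: int, max_acceptable: int) -> int:
    """Value a receiving host would keep: never larger than its inbound
    link MTU or the largest datagram it can accept (hypothetical parameter)."""
    return min(probe_mtu, inbound_mtu, max_acceptable)


# A probe starts at 1500, crosses a gateway with a 1006-octet outbound
# link, and arrives at a host that accepts datagrams of up to 1500 octets.
value = gateway_update(1500, inbound_mtu=1500, outbound_mtu=1006)
value = host_receive(value, inbound_mtu=1006, max_acceptable=1500)
```

In this sketch the value can only decrease along the path, which is the property the option relies on.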
Reply MTU Option (Number 12)
+--------+--------+--------+--------+
|00001100|00000100| 2 octet value |
+--------+--------+--------+--------+
This option is used to return the value learned from a Probe MTU
option to the sender of the Probe MTU option.
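The two option layouts above (one type octet, one length octet of 4, and a 2-octet value) can be encoded and decoded as in the sketch below. The helper names are hypothetical, and the sketch assumes the 2-octet value is carried in network byte order like other IP header fields.

```python
import struct

PROBE_MTU = 11   # option type octet 00001011
REPLY_MTU = 12   # option type octet 00001100
OPTION_LEN = 4   # type octet + length octet + 2-octet value


def encode_mtu_option(kind: int, mtu: int) -> bytes:
    """Build the 4-octet Probe MTU or Reply MTU option."""
    if kind not in (PROBE_MTU, REPLY_MTU):
        raise ValueError("not an MTU option type")
    if not 0 <= mtu <= 0xFFFF:
        raise ValueError("MTU must fit in 2 octets")
    # "!" selects network byte order (an assumption of this sketch).
    return struct.pack("!BBH", kind, OPTION_LEN, mtu)


def decode_mtu_option(data: bytes) -> tuple[int, int]:
    """Return (option type, MTU value) from a 4-octet option."""
    if len(data) < OPTION_LEN:
        raise ValueError("option is too short")
    kind, length, mtu = struct.unpack("!BBH", data[:OPTION_LEN])
    if kind not in (PROBE_MTU, REPLY_MTU) or length != OPTION_LEN:
        raise ValueError("not a well-formed MTU option")
    return kind, mtu
```

For example, `encode_mtu_option(PROBE_MTU, 1500)` produces the octets `0b 04 05 dc`.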
RELATION TO TCP MSS
Note that there are two superficially similar problems in choosing
the size of a datagram. First, there is the restriction [2] that a
host not send a datagram larger than 576 octets unless it has
assurance that the destination is prepared to accept a larger
datagram. Second, the sending host should not send a datagram larger
than MINMTU, in order to avoid fragmentation. The datagram size
should normally be the minimum of these two limits.
In the past, the TCP MSS option [3] has been used to avoid sending
packets larger than the destination can accept. Unfortunately, this
is not the most general mechanism; it is not available to other
transport layers, and it cannot determine the MINMTU (because
gateways do not parse TCP options).
Because the MINMTU returned by a probe cannot be larger than the
maximum datagram size that the destination can accept, this IP option
could, in theory, supplant the use of the TCP MSS option, providing
an economy of mechanism. (Note however, that some researchers
believe that the value of the TCP MSS is distinct from the path's
MINMTU. The MSS is the upper limit of the data size that the peer
will accept, while the MINMTU represents a statement about the data
size supported by the path).
Note that a failure to observe the MINMTU restriction is not normally
fatal; fragmentation will occur, but this is supposed to work. A
failure to observe the TCP MSS option, however, could be fatal
because it might lead to datagrams that can never be accepted by the
destination. Therefore, unless and until the Probe MTU option is
universally implemented, at least by hosts, the TCP MSS option must
be used as well.
IMPLEMENTATION APPROACHES
Who Sends the Option
There are at least two ways to implement the MTU discovery scheme.
One method makes the transport layer responsible for MTU
discovery; the other method makes the IP layer responsible for MTU
discovery. A host system should support one of the two schemes.
Transport Discovery
In the transport case, the transport layer can include the Probe
MTU option in an outbound datagram. When a datagram containing
the Probe MTU option is received, the option must be passed up to
the receiving transport layer, which should then acknowledge the
Probe with a Reply MTU option in the next return datagram. Note
that because the options are placed on unreliable datagrams, the
original sender will have to resend Probes (possibly once per
window of data) until it receives a Reply option. Also note that
the Reply MTU option may be returned on an IP datagram for a
different transport protocol from which it was sent (e.g., TCP
generated the probe but the Reply was received on a UDP datagram).
IP Discovery
A better scheme is to put MTU discovery into the IP layer, using
control mechanisms in the routing cache. Whenever an IP datagram
is sent, the IP layer checks in the routing cache to see if a
Probe or Reply MTU option needs to be inserted in the datagram.
Whenever a datagram containing either option is received, the
information in those options is placed in the routing cache.
The basic working of the protocol is somewhat complex. We trace
it here through one round-trip. Implementors should realize that
there may be cases where both options are contained in one
datagram. For the purposes of this exposition, the sender of the
probe is called the Probe-Sender and the receiver, Probe-Receiver.
When the IP layer is asked to send a Probe MTU option (see the
section below on when to probe), it makes some record in the
routing cache that indicates the next IP datagram to
Probe-Receiver should contain the Probe MTU option.
When the next IP datagram to Probe-Receiver is sent, the Probe MTU
option is inserted. The IP layer in Probe-Sender should continue
to send an occasional Probe MTU in subsequent datagrams until a
Reply MTU option is received. It is strongly recommended that the
Probe MTU not be sent in all datagrams but only at such a rate
that, on average, one Probe MTU will be sent per round-trip
interval. (Another way of saying this is that we would hope that
only one datagram in a transport protocol window worth of data has
the Probe MTU option set). This mechanism might be implemented by
sending every Nth packet, or, in those implementations where the
round-trip time estimate to the destination is cached with the
route, once every estimated RTT.
When a Probe MTU option is received by Probe-Receiver, the
receiving IP should place the value of this option in the next
datagram it sends back to Probe-Sender. The value is then
discarded. In other words, each Probe MTU option causes the Reply
MTU option to be placed in one return datagram.
When Probe-Sender receives the Reply MTU option, it should check
the value of the option against the current MINMTU estimate in the
routing cache. If the option value is lower, it becomes the new
MINMTU estimate. If the option value is higher, Probe-Sender
should be more conservative about changing the MINMTU estimate.
If a route is flapping, the MINMTU may change frequently. In such
situations, keeping the smallest MINMTU of various routes in use
is preferred. As a result, a higher MINMTU estimate should only
be accepted after a lower estimate has been permitted to "age" a
bit. In other words, if the probe value is higher than the
estimated MINMTU, only update the estimate if the estimate is
several seconds old or more. Finally, whenever the Probe-Sender
receives a Reply MTU option, it should stop retransmitting probes
to Probe-Receiver.
A few additional issues complicate this discussion.
One problem is setting the default MINMTU when no Reply MTU
options have been received. We recommend the use of the minimum
of the supported IP datagram size (576 octets) and the connected
network MTU for destinations not on the local connected network,
and the connected network MTU for hosts on the connected network.
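The default recommended above can be written as a short function. This is only a sketch; the function and parameter names are hypothetical.

```python
SUPPORTED_DATAGRAM_SIZE = 576  # octets


def default_minmtu(connected_mtu: int, on_connected_network: bool) -> int:
    """Default MINMTU to assume before any Reply MTU option has arrived."""
    if on_connected_network:
        # Destination is on the connected network: use that network's MTU.
        return connected_mtu
    # Otherwise take the smaller of 576 octets and the connected network MTU.
    return min(SUPPORTED_DATAGRAM_SIZE, connected_mtu)
```

On a 1500-octet network this gives 1500 for a neighbor and 576 for a remote destination; on a 296-octet link it gives 296 in both cases.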
The MINMTU information, while kept by the Internet layer, is in
fact, only of interest to the transport and higher layers.
Accordingly, the Internet layer must keep the transport layer
informed of the current value of the estimated MINMTU.
Furthermore, minimal transport protocols, such as UDP, must be
prepared to pass this information up to the transport protocol
user.
It is expected that there will be a transition period during which
some hosts support this option and some do not. As a result,
a host should stop sending Probe MTU options and refuse to send any
further options if it does not receive either a Probe MTU option
or Reply MTU option from the remote system after a certain number
of Probe MTU options have been sent. In short, if Probe-Sender
has sent several probes but has gotten no indication that
Probe-Receiver supports MTU probing, then Probe-Sender should assume
that Probe-Receiver does not support probes. (Obviously, if
Probe-Sender later receives a probe option from Probe-Receiver, it
should revise its opinion.)
Implementations should not assume that routes to the same
destination that have a different TOS have the same estimated
MINMTU. We recommend that the MTU be probed separately for each
TOS.
Respecting the TCP MSS
One issue concerning TCP MSS is that it is usually negotiated
assuming an IP header that contains no options. If the transport
layer is sending maximum size segments, it may not leave space for
IP to fit the options into the datagram. Thus, insertion of the
Probe MTU or Reply MTU option may violate the MSS restriction.
Because, unlike other IP options, the MTU options can be inserted
without the knowledge of the transport layer, the implementor must
carefully consider the implications of adding options to an IP
datagram.
One approach is to reserve 4 bytes from the MINMTU reported to the
transport layer; this will allow the IP layer to insert at least
one MTU option in every datagram (it can compare the size of the
outgoing datagram with the MINMTU stored in the route cache to see
how much room there actually is). This is simple to implement,
but does waste a little bandwidth in the normal case.
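The 4-byte reservation approach can be illustrated with the sketch below. The helper names are hypothetical; the only facts taken from the text are that each MTU option occupies 4 octets and that the IP layer compares the outgoing datagram size with the cached MINMTU.

```python
MTU_OPTION_LEN = 4  # octets occupied by one Probe MTU or Reply MTU option


def minmtu_reported_to_transport(minmtu: int) -> int:
    """MINMTU as seen by the transport layer, with room held back for one option."""
    return minmtu - MTU_OPTION_LEN


def options_that_fit(datagram_len: int, minmtu: int) -> int:
    """How many MTU options the IP layer could add without exceeding MINMTU."""
    spare = minmtu - datagram_len
    return max(0, spare // MTU_OPTION_LEN)
```

A transport layer that fills the reported size exactly leaves room for one option; a shorter datagram may leave room for both a Probe MTU and a Reply MTU option.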
Another approach is to provide a means for the IP layer to notify
the transport layer that space must be reserved for sending an
option; the transport layer would then make a forthcoming segment
somewhat smaller than usual.
When a Probe Can Be Sent
A system that receives a Probe MTU option should always respond
with a Reply MTU option, unless the probe was sent to an IP or LAN
broadcast address.
A Probe MTU option should be sent in any of the following
situations:
(1) The MINMTU for the path is not yet known;
(2) A received datagram suffers a fragmentation re-assembly
timeout. (This is a strong hint the path has changed;
send a probe to the datagram's source);
(3) An ICMP Time Exceeded/Fragmentation Reassembly Timeout is
received (this is the only message we will get that
indicates fragmentation occurred along the network path);
(4) The transport layer requests it.
Implementations may also wish to periodically probe a path, even
if there is no indication that fragmentation is occurring. This
practice is perfectly reasonable; if fragmentation and reassembly
is working perfectly, the sender may never get any indication that
the path MINMTU has changed unless a probe is sent. We recommend,
however, that implementations send such periodic probes sparingly.
Once every few minutes, or once every few hundred datagrams is
probably sufficient.
There are also some scenarios in which the Probe MTU should not be
sent, even though there may be some indication of a MINMTU
change:
(1) Probes should not be sent in response to the receipt of
a probe option. Although the fact that the remote peer
is probing indicates that the MINMTU may have changed,
sending a probe in response to a probe causes a continuous
exchange of probe options.
(2) Probes must not be sent in response to fragmented
datagrams except when the fragmentation reassembly
of the datagram fails. The problem in this case is
that the receiver has no mechanism for informing the remote
peer that fragmentation has occurred, unless fragmentation
reassembly fails (in which case an ICMP message is sent).
Thus, a peer may use the wrong MTU for some time before
discovering a problem. If we probe on fragmented
datagrams, we may probe, unnecessarily, for some time
until the remote peer corrects its MTU.
(3) For compatibility with hosts that do not implement the
option, no Probe MTU Option should be sent more than
ten times without receiving a Reply MTU Option or a
Probe MTU Option from the remote peer. Peers which
ignore probes and do not send probes must be treated
as not supporting probes.
(4) Probes should not be sent to an IP or LAN broadcast
address.
(5) We recommend that Probe MTUs not be sent to other hosts
on the directly-connected network, but that this feature
be configurable. There are situations (for example, when
Proxy ARP is in use) where it may be difficult to determine
which systems are on the directly-connected network. In
this case, probing may make sense.
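Restrictions (3) through (5) above lend themselves to a simple guard that the IP layer could consult before inserting a probe. The sketch below is an illustration only: the function, its parameters, and the `probe_local_hosts` switch are hypothetical names, and restrictions (1) and (2) are left to the caller because they depend on what triggered the probe.

```python
MAX_UNANSWERED_PROBES = 10  # "no more than ten times" from restriction (3)


def may_send_probe(dest_is_broadcast: bool,
                   dest_on_connected_network: bool,
                   unanswered_probes: int,
                   probe_local_hosts: bool = False) -> bool:
    """Return True if none of restrictions (3)-(5) forbids a probe."""
    if dest_is_broadcast:
        return False  # (4) never probe an IP or LAN broadcast address
    if unanswered_probes >= MAX_UNANSWERED_PROBES:
        return False  # (3) peer is treated as not supporting probes
    if dest_on_connected_network and not probe_local_hosts:
        return False  # (5) skip local hosts unless configured otherwise
    return True
```

Setting `probe_local_hosts=True` corresponds to the configurable behavior suggested for cases such as Proxy ARP.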
SAMPLE IMPLEMENTATION SKETCH
We present here a somewhat more concrete description of how an IP
layer implementation of MTU probing might be designed.
First, the routing cache entries are enhanced to store seven
additional values:
MINMTU: The current MINMTU of the path.
ProbeRetry: A timestamp indicating when the next probe
should be sent.
LastDecreased: A timestamp showing when the MTU was
last decreased.
ProbeReply: A bit indicating a Reply MTU option should be
sent.
ReplyMTU: The value to go in the Reply MTU option.
SupportsProbes: A bit indicating that the remote peer
can deal with probes (always defaults to
1=true).
ConsecutiveProbes: The number of probes sent without
the receipt of a Probe MTU or Reply
MTU option.
There are also several configuration parameters; these should be
configurable by appropriate network management software; the values
we suggest are "reasonable":
Default_MINMTU: The default value for the MINMTU field of the
routing cache entry, to be used when the real
MINMTU is unknown. Recommended value: 576.
Max_ConsecutiveProbes: The maximum number of probes to send
before assuming that the destination does
not support the probe option.
Recommended value: 10.
ProbeRetryTime: The time (in seconds) to wait before retrying
an unanswered probe. Recommended value:
60 seconds, or 2*RTT if the RTT is available
to the IP layer.
ReprobeInterval: The time to wait before sending a probe after
receiving a successful Reply MTU, in order to
detect increases in the route's MINMTU.
Recommended value: 5 times the ProbeRetryTime.
IncreaseInterval: The time to wait before increasing the MINMTU
after the value has been decreased, to prevent
flapping. Recommended value: same as
ProbeRetryTime.
When a new route is entered into the routing cache, the initial
values should be set as follows:
MINMTU = Default_MINMTU
ProbeRetry = Current Time
LastDecreased = Current Time - IncreaseInterval
ProbeReply = false
SupportsProbes = true
ConsecutiveProbes = 0
This initialization is done before attempting to send the first
packet along this route, so that the first packet will contain a
Probe MTU option.
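The routing cache fields and their initial values can be modeled as in the sketch below. It is not the memo's implementation: the class and function names are hypothetical, timestamps are floating-point seconds, and the sketch assumes that ProbeReply starts clear, that SupportsProbes starts set, and that LastDecreased starts one IncreaseInterval in the past so that an early increase is not held back.

```python
from dataclasses import dataclass

DEFAULT_MINMTU = 576                    # octets
PROBE_RETRY_TIME = 60.0                 # seconds
INCREASE_INTERVAL = PROBE_RETRY_TIME    # recommended: same as ProbeRetryTime


@dataclass
class RouteEntry:
    minmtu: int                  # MINMTU
    probe_retry: float           # ProbeRetry: when the next probe is due
    last_decreased: float        # LastDecreased
    probe_reply: bool = False    # ProbeReply: a Reply MTU option is pending
    reply_mtu: int = 0           # ReplyMTU
    supports_probes: bool = True # SupportsProbes
    consecutive_probes: int = 0  # ConsecutiveProbes


def new_route(now: float) -> RouteEntry:
    """Create a cache entry whose first datagram will carry a probe."""
    return RouteEntry(
        minmtu=DEFAULT_MINMTU,
        probe_retry=now,  # already due, so the first packet probes
        last_decreased=now - INCREASE_INTERVAL,
    )
```

Because `probe_retry` equals the creation time, the first send on the route finds the probe already due.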
Whenever the IP layer sends a datagram on this route it checks the
SupportsProbes bit to see if the remote system supports probing. If
the SupportsProbes bit is set, and the timestamp in ProbeRetry is
less than or equal to the current time, a Probe option should be sent
in the datagram, and the ProbeRetry field incremented by
ProbeRetryTime.
Whether or not the Probe MTU option is sent in a datagram, if the
ProbeReply bit is set, then a Reply MTU option with the value of the
ReplyMTU field is placed in the outbound datagram. The ProbeReply
bit is then cleared.
Every time a Probe option is sent, the ConsecutiveProbes value should
be incremented. If this value reaches Max_ConsecutiveProbes, the
SupportsProbes bit should be cleared.
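The send-side rules above can be combined into one function that decides which MTU options accompany an outbound datagram. This is a hedged sketch: the names are hypothetical, options are represented as `(name, value)` tuples rather than encoded octets, and `local_mtu` stands for the MTU of the first-hop network used to initialize the probe.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_PROBES = 10
PROBE_RETRY_TIME = 60.0  # seconds


@dataclass
class RouteEntry:
    probe_retry: float = 0.0
    probe_reply: bool = False
    reply_mtu: int = 0
    supports_probes: bool = True
    consecutive_probes: int = 0


def options_for_outbound(route: RouteEntry, now: float,
                         local_mtu: int) -> list[tuple[str, int]]:
    """Return the MTU options to place in the next datagram on this route."""
    options = []
    if route.supports_probes and route.probe_retry <= now:
        options.append(("probe", local_mtu))
        # The text says the field is incremented by ProbeRetryTime.
        route.probe_retry += PROBE_RETRY_TIME
        route.consecutive_probes += 1
        if route.consecutive_probes >= MAX_CONSECUTIVE_PROBES:
            route.supports_probes = False
    if route.probe_reply:
        # A pending reply goes out whether or not a probe was added.
        options.append(("reply", route.reply_mtu))
        route.probe_reply = False
    return options
```

A single datagram can therefore carry both options, which matches the earlier remark that implementors should expect that case.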
When an IP datagram containing the Probe MTU option is received, the
receiving IP sets the ReplyMTU to the Probe MTU option value and sets
the ProbeReply bit in its outbound route to the source of the
datagram. The SupportsProbes bit is set, and the ConsecutiveProbes
value is reset to 0.
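Handling a received Probe MTU option only touches four fields of the route back to the probe's source. A minimal sketch, with hypothetical names:

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    reply_mtu: int = 0
    probe_reply: bool = False
    supports_probes: bool = True
    consecutive_probes: int = 0


def on_probe_received(route: RouteEntry, probe_value: int) -> None:
    """Record a received Probe MTU value on the route to its source."""
    route.reply_mtu = probe_value   # value to echo in the Reply MTU option
    route.probe_reply = True        # send it with the next outbound datagram
    route.supports_probes = True    # the peer evidently implements the option
    route.consecutive_probes = 0
```

Note that a received probe also re-enables probing toward a peer that had previously been marked as not supporting it.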
If an IP datagram containing the Reply MTU option is received, the IP
layer must locate the routing cache entry corresponding to the source
of the Reply MTU option; if no such entry exists, a new one (with
default values) should be created. The SupportsProbes bit is set, and
the ConsecutiveProbes value is reset to 0. The ProbeRetry field is
set to the current time plus ReprobeInterval.
Four cases are possible when a Reply MTU option is received:
(1) The Reply MTU option value is less than the current
MINMTU: the MINMTU field is set to the new value, and
the LastDecreased field is set to the current time.
(2) The Reply MTU option value is greater than the
current MINMTU and the LastDecreased field plus
IncreaseInterval is greater than the current time: set the
ProbeRetry field to LastDecreased plus IncreaseInterval,
but do not change MINMTU.
(3) The Reply MTU option value is greater than the
current MINMTU and the LastDecreased field plus
IncreaseInterval is less than the current time: set
the MINMTU field to the new value.
(4) The Reply MTU option value is equal to the current
MINMTU: do nothing more.
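The handling of a received Reply MTU option can be sketched as below. The names are hypothetical and timestamps are seconds. As an explicit assumption, the sketch accepts a higher value only once LastDecreased plus IncreaseInterval has passed, which is the aging rule given in the IP Discovery discussion; before that time it defers the increase and schedules another probe.

```python
from dataclasses import dataclass

PROBE_RETRY_TIME = 60.0
REPROBE_INTERVAL = 5 * PROBE_RETRY_TIME
INCREASE_INTERVAL = PROBE_RETRY_TIME


@dataclass
class RouteEntry:
    minmtu: int = 576
    last_decreased: float = 0.0
    probe_retry: float = 0.0
    supports_probes: bool = True
    consecutive_probes: int = 0


def on_reply_received(route: RouteEntry, reply_value: int, now: float) -> bool:
    """Apply a Reply MTU value; return True if MINMTU changed."""
    route.supports_probes = True
    route.consecutive_probes = 0
    route.probe_retry = now + REPROBE_INTERVAL
    if reply_value < route.minmtu:
        # Decreases are accepted immediately.
        route.minmtu = reply_value
        route.last_decreased = now
        return True
    if reply_value > route.minmtu:
        if route.last_decreased + INCREASE_INTERVAL > now:
            # The lower estimate has not aged enough: probe again later.
            route.probe_retry = route.last_decreased + INCREASE_INTERVAL
            return False
        route.minmtu = reply_value
        return True
    return False  # equal: nothing more to do
```

The boolean result is a convenience of this sketch; it gives the caller a hook for notifying the transport layer when the estimate changes.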
Whenever the MINMTU field is changed, the transport layer should be
notified, either by an upcall or by a change in a shared variable
(which may be accessed from the transport layer by a downcall).
If a fragmentation reassembly timeout occurs, if an ICMP Time
Exceeded/Fragmentation Reassembly Timeout is received, or if the IP
layer is asked to send a probe by a higher layer, the ProbeRetry
field for the appropriate routing cache entry is set to the current
time. This will cause a Probe option to be sent with the next
datagram (unless the SupportsProbes bit is turned off).
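Forcing a probe therefore amounts to making the probe due immediately. A minimal sketch, with hypothetical names, that pairs the trigger with the check performed on the next send:

```python
from dataclasses import dataclass


@dataclass
class RouteEntry:
    probe_retry: float = 0.0
    supports_probes: bool = True


def request_probe(route: RouteEntry, now: float) -> None:
    """Called on a reassembly timeout, on an ICMP Time Exceeded/Fragmentation
    Reassembly Timeout, or on request from a higher layer."""
    route.probe_retry = now


def probe_due(route: RouteEntry, now: float) -> bool:
    """True if the next datagram on this route should carry a Probe option."""
    return route.supports_probes and route.probe_retry <= now
```

If the peer has been marked as not supporting probes, the request is recorded but no probe is sent.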
MANAGEMENT PARAMETERS
We suggest that the following parameters be made available to local
applications and remote network management systems:
(1) The number of probe retries to be made before determining
a system is down. The value of 10 is certain to be wrong
in some situations.
(2) The frequency with which probes are sent. Systems may
find that more or less frequent probing is more cost
effective.
(3) The default MINMTU used to initialize routes.
(4) Applications should have the ability to force a probe
on a particular route. There are cases where a probe
needs to be sent but the sender doesn't know it. An
operator must be able to cause a probe in such situations.
Furthermore, it may be useful for applications to "ping"
for the MTU.
REFERENCES
[1] Kent, C. and J. Mogul, "Fragmentation Considered
Harmful", Proc. ACM SIGCOMM '87, Stowe, VT, August 1987.
[2] Postel, J., Ed., "Internet Protocol", RFC-791,
USC/Information Sciences Institute, Marina del Rey, CA,
September 1981.
[3] Postel, J., Ed., "Transmission Control Protocol", RFC-793,
USC/Information Sciences Institute, Marina del Rey, CA,
September 1981.
[4] Postel, J., "The TCP Maximum Segment Size and Related Topics",
RFC-879, USC/Information Sciences Institute, Marina del Rey,
CA, November 1983.