As per "Relevance of the word transport", the RFC in question is reproduced below:











Network Working Group                                  Speakman, et al.
Request for Comments: 3208
Category: Experimental                                    December 2001


PGM Reliable Transport Protocol

Status of this Memo

This memo defines an Experimental Protocol for the Internet
community. It does not specify an Internet standard of any kind.
Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.

Copyright Notice

Copyright (C) The Internet Society (2001). All Rights Reserved.



Abstract

Pragmatic General Multicast (PGM) is a reliable multicast transport
protocol for applications that require ordered or unordered,
duplicate-free, multicast data delivery from multiple sources to
multiple receivers. PGM guarantees that a receiver in the group
either receives all data packets from transmissions and repairs, or
is able to detect unrecoverable data packet loss. PGM is
specifically intended as a workable solution for multicast
applications with basic reliability requirements. Its central design
goal is simplicity of operation with due regard for scalability and
network efficiency.

Speakman, et. al. Experimental [Page 1]

RFC 3208 PGM Reliable Transport Protocol December 2001

Table of Contents

1. Introduction and Overview
2. Architectural Description
3. Terms and Concepts
4. Procedures - General
5. Procedures - Sources
6. Procedures - Receivers
7. Procedures - Network Elements
8. Packet Formats
9. Options
10. Security Considerations
11. Appendix A - Forward Error Correction
12. Appendix B - Support for Congestion Control
13. Appendix C - SPM Requests
14. Appendix D - Poll Mechanism
15. Appendix E - Implosion Prevention
16. Appendix F - Transmit Window Example
17. Appendix G - Applicability Statement
18. Abbreviations
19. Acknowledgments
20. References
21. Authors' Addresses
22. Full Copyright Statement

Nota Bene:

The publication of this specification is intended to freeze the
definition of PGM in the interest of fostering both ongoing and
prospective experimentation with the protocol. The intent of that
experimentation is to provide experience with the implementation and
deployment of a reliable multicast protocol of this class so as to be
able to feed that experience back into the longer-term
standardization process underway in the Reliable Multicast Transport
Working Group of the IETF. Appendix G provides more specific detail
on the scope and status of some of this experimentation. Reports of
experiments include [16-23]. Additional results and new
experimentation are encouraged.

1. Introduction and Overview

A variety of reliable protocols have been proposed for multicast data
delivery, each with an emphasis on particular types of applications,
network characteristics, or definitions of reliability ([1], [2],
[3], [4]). In this tradition, Pragmatic General Multicast (PGM) is a
reliable transport protocol for applications that require ordered or
unordered, duplicate-free, multicast data delivery from multiple
sources to multiple receivers.

PGM is specifically intended as a workable solution for multicast
applications with basic reliability requirements rather than as a
comprehensive solution for multicast applications with sophisticated
ordering, agreement, and robustness requirements. Its central design
goal is simplicity of operation with due regard for scalability and
network efficiency.

PGM has no notion of group membership. It simply provides reliable
multicast data delivery within a transmit window advanced by a source
according to a purely local strategy. Reliable delivery is provided
within a source's transmit window from the time a receiver joins the
group until it departs. PGM guarantees that a receiver in the group
either receives all data packets from transmissions and repairs, or
is able to detect unrecoverable data packet loss. PGM supports any
number of sources within a multicast group, each fully identified by
a globally unique Transport Session Identifier (TSI), but since these
sources/sessions operate entirely independently of each other, this
specification is phrased in terms of a single source and extends
without modification to multiple sources.

More specifically, PGM is not intended for use with applications that
depend either upon acknowledged delivery to a known group of
recipients, or upon total ordering amongst multiple sources.

Rather, PGM is best suited to those applications in which members may
join and leave at any time, and that are either insensitive to
unrecoverable data packet loss or are prepared to resort to
application recovery in the event. Through its optional extensions,
PGM provides specific mechanisms to support applications as disparate
as stock and news updates, data conferencing, low-delay real-time
video transfer, and bulk data transfer.

In the following text, transport-layer originators of PGM data
packets are referred to as sources, transport-layer consumers of PGM
data packets are referred to as receivers, and network-layer entities
in the intervening network are referred to as network elements.

Unless otherwise specified, the term "repair" will be used to
indicate both the actual retransmission of a copy of a missing packet
or the transmission of an FEC repair packet.

Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [14] and
indicate requirement levels for compliant PGM implementations.

1.1. Summary of Operation

PGM runs over a datagram multicast protocol such as IP multicast [5].
In the normal course of data transfer, a source multicasts sequenced
data packets (ODATA), and receivers unicast selective negative
acknowledgments (NAKs) for data packets detected to be missing from
the expected sequence. Network elements forward NAKs
PGM-hop-by-PGM-hop to the source, and confirm each hop by
multicasting a NAK confirmation (NCF) in response on the interface on
which the NAK was received. Repairs (RDATA) may be provided either by
the source itself or by a Designated Local Repairer (DLR) in response
to a NAK.

Since NAKs provide the sole mechanism for reliability, PGM is
particularly sensitive to their loss. To minimize NAK loss, PGM
defines a network-layer hop-by-hop procedure for reliable NAK
forwarding.

Upon detection of a missing data packet, a receiver repeatedly
unicasts a NAK to the last-hop PGM network element on the
distribution tree from the source. A receiver repeats this NAK until
it receives a NAK confirmation (NCF) multicast to the group from that
PGM network element. That network element responds with an NCF to
the first occurrence of the NAK and any further retransmissions of
that same NAK from any receiver. In turn, the network element
repeatedly forwards the NAK to the upstream PGM network element on
the reverse of the distribution path from the source of the original
data packet until it also receives an NCF from that network element.
Finally, the source itself receives and confirms the NAK by
multicasting an NCF to the group.

While NCFs are multicast to the group, they are not propagated by PGM
network elements since they act as hop-by-hop confirmations.

To avoid NAK implosion, PGM specifies procedures for subnet-based NAK
suppression amongst receivers and NAK elimination within network
elements. The usual result is the propagation of just one copy of a
given NAK along the reverse of the distribution path from any network
with directly connected receivers to a source.

The net effect is that unicast NAKs return from a receiver to a
source on the reverse of the path on which ODATA was forwarded, that
is, on the reverse of the distribution tree from the source. More
specifically, they return through exactly the same sequence of PGM
network elements through which ODATA was forwarded, but in reverse.
The reasons for handling NAKs this way will become clear in the
discussion of constraining repairs, but first it's necessary to
describe the mechanisms for establishing the requisite source path
state in PGM network elements.

To establish source path state in PGM network elements, the basic
data transfer operation is augmented by Source Path Messages (SPMs)
from a source, periodically interleaved with ODATA. SPMs function
primarily to establish source path state for a given TSI in all PGM
network elements on the distribution tree from the source. PGM
network elements use this information to address returning unicast
NAKs directly to the upstream PGM network element toward the source,
and thereby ensure that NAKs return from a receiver to a source on
the reverse of the distribution path for the TSI.

SPMs are sent by a source at a rate that serves to maintain
up-to-date PGM neighbor information. In addition, SPMs complement the
role of DATA packets in provoking further NAKs from receivers, and
maintaining receive window state in the receivers.

As a further efficiency, PGM specifies procedures for the constraint
of repairs by network elements so that they reach only those network
segments containing group members that did not receive the original
transmission. As NAKs traverse the reverse of the ODATA path
(upward), they establish repair state in the network elements which
is used in turn to constrain the (downward) forwarding of the
corresponding RDATA.

Besides procedures for the source to provide repairs, PGM also
specifies options and procedures that permit designated local
repairers (DLRs) to announce their availability and to redirect
repair requests (NAKs) to themselves rather than to the original
source. In addition to these conventional procedures for loss
recovery through selective ARQ, Appendix A specifies Forward Error
Correction (FEC) procedures for sources to provide and receivers to
request general error correcting parity packets rather than selective
retransmissions.

Finally, since PGM operates without regular return traffic from
receivers, conventional feedback mechanisms for transport flow and
congestion control cannot be applied. Appendix B specifies a
TCP-friendly, NE-based solution for PGM congestion control, and cites
a reference to a TCP-friendly, end-to-end solution for PGM congestion
control.

In its basic operation, PGM relies on a purely rate-limited
transmission strategy in the source to bound the bandwidth consumed
by PGM transport sessions and to define the transmit window
maintained by the source.

PGM defines four basic packet types: three that flow downstream
(SPMs, DATA, NCFs), and one that flows upstream (NAKs).

1.2. Design Goals and Constraints

PGM has been designed to serve that broad range of multicast
applications that have relatively simple reliability requirements,
and to do so in a way that realizes the much advertised but often
unrealized network efficiencies of multicast data transfer. The
usual impediments to realizing these efficiencies are the implosion
of negative and positive acknowledgments from receivers to sources,
repair latency from the source, and the propagation of repairs to
disinterested receivers.

1.2.1. Reliability

Reliable data delivery across an unreliable network is conventionally
achieved through an end-to-end protocol in which a source (implicitly
or explicitly) solicits receipt confirmation from a receiver, and the
receiver responds positively or negatively. While the frequency of
negative acknowledgments is a function of the reliability of the
network and the receiver's resources (and so, potentially quite low),
the frequency of positive acknowledgments is fixed at at least the
rate at which the transmit window is advanced, and usually more
often.

Negative acknowledgments primarily determine repairs and reliability.
Positive acknowledgments primarily determine transmit buffer
management.

When these principles are extended without modification to multicast
protocols, the result, at least for positive acknowledgments, is a
burden of positive acknowledgments transmitted to the source that
quickly threatens to overwhelm it as the number of receivers grows.
More succinctly, ACK implosion keeps ACK-based reliable multicast
protocols from scaling well.

One of the goals of PGM is to get as strong a definition of
reliability as possible from as simple a protocol as possible. ACK
implosion can be addressed in a variety of effective but complicated
ways, most of which require re-transmit capability from other than
the original source.

An alternative is to dispense with positive acknowledgments
altogether, and to resort to other strategies for buffer management
while retaining negative acknowledgments for repairs and reliability.
The approach taken in PGM is to retain negative acknowledgments, but
to dispense with positive acknowledgments and resort instead to
timeouts at the source to manage transmit resources.

The definition of reliability with PGM is a direct consequence of
this design decision. PGM guarantees that a receiver either receives
all data packets from transmissions and repairs, or is able to detect
unrecoverable data packet loss.

PGM includes strategies for repeatedly provoking NAKs from receivers,
and for adding reliability to the NAKs themselves. By reinforcing
the NAK mechanism, PGM minimizes the probability that a receiver will
detect a missing data packet so late that the packet is unavailable
for repair either from the source or from a designated local repairer
(DLR). Without ACKs and knowledge of group membership, however, PGM
cannot eliminate this possibility.

1.2.2. Group Membership

A second consequence of eliminating ACKs is that knowledge of group
membership is neither required nor provided by the protocol.
Although a source may receive some PGM packets (NAKs for instance)
from some receivers, the identity of the receivers does not figure in
the processing of those packets. Group membership MAY change during
the course of a PGM transport session without the knowledge of or
consequence to the source or the remaining receivers.

1.2.3. Efficiency

While PGM avoids the implosion of positive acknowledgments simply by
dispensing with ACKs, the implosion of negative acknowledgments is
addressed directly.

Receivers observe a random back-off prior to generating a NAK, during
which interval the NAK is suppressed (i.e., it is not sent, but the
receiver acts as if it had sent it) by the receiver upon receipt of a
matching NCF. In addition, PGM network elements eliminate duplicate
NAKs received on different interfaces on the same network element.
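
The back-off and suppression behaviour described above can be
sketched as a small state holder for one missing sequence number.
This is only an illustration under stated assumptions, not the
receiver procedure the RFC specifies later: the class name
PendingNak, the max_backoff parameter, and the uniform back-off
distribution are all invented for the example.

```python
import random


class PendingNak:
    """Back-off state for one missing data packet (illustrative only)."""

    def __init__(self, sqn, now, max_backoff, rng=None):
        rng = rng or random.Random()
        self.sqn = sqn
        # Assumption: the back-off is drawn uniformly from [0, max_backoff].
        self.fire_at = now + rng.uniform(0.0, max_backoff)
        self.sent = False
        self.suppressed = False

    def on_ncf(self, ncf_sqn):
        """A matching NCF heard before the NAK is sent suppresses the NAK."""
        if ncf_sqn == self.sqn and not self.sent:
            self.suppressed = True

    def poll(self, now):
        """Return True exactly once, when the NAK should be transmitted."""
        if self.sent or self.suppressed or now < self.fire_at:
            return False
        self.sent = True
        return True
```

A receiver would create one such object when it detects a gap, feed
it every NCF it hears, and call poll() from its timer loop. A
suppressed NAK is never transmitted, which is the "acts as if it had
sent it" case in the text.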




The combination of these two strategies usually results in the source
receiving just a single NAK for any given lost data packet.

Whether a repair is provided from a DLR or the original source, it is
important to constrain that repair to only those network segments
containing members that negatively acknowledged the original
transmission rather than propagating it throughout the group. PGM
specifies procedures for network elements to use the pattern of NAKs
to define a sub-tree within the group upon which to forward the
corresponding repair so that it reaches only those receivers that
missed it in the first place.

1.2.4. Simplicity

PGM is designed to achieve the greatest improvement in reliability
(as compared to the usual UDP) with the least complexity. As a
result, PGM does NOT address conference control, global ordering
amongst multiple sources in the group, nor recovery from network
partitions.

1.2.5. Operability

PGM is designed to function, albeit with less efficiency, even when
some or all of the network elements in the multicast tree have no
knowledge of PGM. To that end, all PGM data packets can be
conventionally multicast routed by non-PGM network elements with no
loss of functionality, but with some inefficiency in the propagation
of RDATA and NCFs.

In addition, since NAKs are unicast to the last-hop PGM network
element and NCFs are multicast to the group, NAK/NCF operation is
also consistent across non-PGM network elements. Note that for NAK
suppression to be most effective, receivers should always have a PGM
network element as a first hop network element between themselves and
every path to every PGM source. If receivers are several hops
removed from the first PGM network element, the efficacy of NAK
suppression may degrade.

1.3. Options

In addition to the basic data transfer operation described above, PGM
specifies several end-to-end options to address specific application
requirements. PGM specifies options to support fragmentation, late
joining, redirection, Forward Error Correction (FEC), reachability,
and session synchronization/termination/reset. Options MAY be
appended to PGM data packet headers only by their original
transmitters. While they MAY be interpreted by network elements,
options are neither added nor removed by network elements.

All options are receiver-significant (i.e., they must be interpreted
by receivers). Some options are also network-significant (i.e., they
must be interpreted by network elements).

Fragmentation MAY be used in conjunction with data packets to allow a
transport-layer entity at the source to break up application-layer
data packets into multiple PGM data packets to conform with the
maximum transmission unit (MTU) supported by the network layer.

Late joining allows a source to indicate whether or not receivers may
request all available repairs when they initially join a particular
transport session.

Redirection MAY be used in conjunction with Poll Responses to allow a
DLR to respond to normal NCFs or POLLs with a redirecting POLR
advertising its own address as an alternative re-transmitter to the
original source.

FEC techniques MAY be applied by receivers to use source-provided
parity packets rather than selective retransmissions to effect loss
recovery.

2. Architectural Description

As an end-to-end transport protocol, PGM specifies packet formats and
procedures for sources to transmit and for receivers to receive data.
To enhance the efficiency of this data transfer, PGM also specifies
packet formats and procedures for network elements to improve the
reliability of NAKs and to constrain the propagation of repairs. The
division of these functions is described in this section and expanded
in detail in the next section.

2.1. Source Functions

Data Transmission

Sources multicast ODATA packets to the group within the
transmit window at a given transmit rate.

Source Path State

Sources multicast SPMs to the group, interleaved with ODATA if
present, to establish source path state in PGM network
elements.

NAK Reliability

Sources multicast NCFs to the group in response to any NAKs
they receive.

Repairs

Sources multicast RDATA packets to the group in response to
NAKs received for data packets within the transmit window.

Transmit Window Advance

Sources MAY advance the trailing edge of the window according
to one of a number of strategies. Implementations MAY support
automatic adjustments such as keeping the window at a fixed
size in bytes, a fixed number of packets or a fixed real time
duration. In addition, they MAY optionally delay window
advancement based on NAK-silence for a certain period. Some
possible strategies are outlined later in this document.

2.2. Receiver Functions

Source Path State

Receivers use SPMs to determine the last-hop PGM network
element for a given TSI to which to direct their NAKs.

Data Reception

Receivers receive ODATA within the transmit window and
eliminate any duplicates.

Repair Requests

Receivers unicast NAKs to the last-hop PGM network element (and
MAY optionally multicast a NAK with TTL of 1 to the local
group) for data packets within the receive window detected to
be missing from the expected sequence. A receiver MUST
repeatedly transmit a given NAK until it receives a matching
NCF.

NAK Suppression

Receivers suppress NAKs for which a matching NCF or NAK is
received during the NAK transmit back-off interval.

Receive Window Advance

Receivers immediately advance their receive windows upon
receipt of any PGM data packet or SPM within the transmit
window that advances the receive window.

2.3. Network Element Functions

Network elements forward ODATA without intervention.

Source Path State

Network elements intercept SPMs and use them to establish
source path state for the corresponding TSI before multicast
forwarding them in the usual way.

NAK Reliability

Network elements multicast NCFs to the group in response to any
NAK they receive. For each NAK received, network elements
create repair state recording the transport session identifier,
the sequence number of the NAK, and the input interface on
which the NAK was received.

Constrained NAK Forwarding

Network elements repeatedly unicast forward only the first copy
of any NAK they receive to the upstream PGM network element on
the distribution path for the TSI until they receive an NCF in
response. In addition, they MAY optionally multicast this NAK
upstream with TTL of 1.

Nota Bene: Once confirmed by an NCF, network elements discard NAK
packets; NAKs are NOT retained in network elements beyond this
forwarding operation, but state about the reception of them is
stored.

NAK Elimination

Network elements discard exact duplicates of any NAK for which
they already have repair state (i.e., that has been forwarded
either by themselves or a neighboring PGM network element), and
respond with a matching NCF.

Constrained RDATA Forwarding

Network elements use NAKs to maintain repair state consisting
of a list of interfaces upon which a given NAK was received,
and they forward the corresponding RDATA only on these
interfaces.

NAK Anticipation

If a network element hears an upstream NCF (i.e., on the
upstream interface for the distribution tree for the TSI), it
establishes repair state without outgoing interfaces in
anticipation of responding to and eliminating duplicates of the
NAK that may arrive from downstream.
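
The network element functions listed above (NAK reliability, NAK
elimination, constrained RDATA forwarding, and NAK anticipation) all
revolve around one table of repair state. The sketch below shows one
way such a table could be organised. It is a simplification under
stated assumptions: the class and method names are hypothetical,
interfaces are plain strings, and the timers, retransmission of
forwarded NAKs, and expiry of repair state are left out.

```python
class RepairState:
    """Per-network-element repair state keyed by (TSI, sequence number)."""

    def __init__(self):
        self._ifaces = {}  # (tsi, sqn) -> set of downstream interfaces

    def on_nak(self, tsi, sqn, in_iface):
        """Record a NAK; report where to send the NCF and whether to forward."""
        key = (tsi, sqn)
        first_copy = key not in self._ifaces
        self._ifaces.setdefault(key, set()).add(in_iface)
        # Every NAK is confirmed with an NCF on the interface it arrived on;
        # only the first copy is forwarded upstream (NAK elimination).
        return {"ncf_on": in_iface, "forward_upstream": first_copy}

    def on_upstream_ncf(self, tsi, sqn):
        """NAK anticipation: create state with no outgoing interfaces."""
        self._ifaces.setdefault((tsi, sqn), set())

    def rdata_interfaces(self, tsi, sqn):
        """Interfaces on which the matching RDATA may be forwarded."""
        # No repair state means an empty list, i.e. the repair is discarded.
        return sorted(self._ifaces.get((tsi, sqn), set()))
```

Keying the table by (TSI, sequence number) mirrors the repair state
fields named in the text; storing a set of interfaces per key is what
lets RDATA be forwarded only toward receivers that asked for it.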

3. Terms and Concepts

Before proceeding from the preceding overview to the detail in the
subsequent Procedures, this section presents some concepts and
definitions that make that detail more intelligible.

3.1. Transport Session Identifiers

Every PGM packet is identified by a:

TSI         transport session identifier

TSIs MUST be globally unique, and only one source at a time may act
as the source for a transport session. (Note that repairers do not
change the TSI in any RDATA they transmit). TSIs are composed of the
concatenation of a globally unique source identifier (GSI) and a
source-assigned data-source port.
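
As a small illustration of the concatenation described above, the
following sketch builds a TSI from a GSI and a port. The field widths
used here (a 6-byte GSI and a 16-bit port) are assumptions made for
the example, since this part of the text does not state them, and the
function name make_tsi is hypothetical.

```python
import struct


def make_tsi(gsi: bytes, source_port: int) -> bytes:
    """Concatenate a GSI and a source-assigned port into a TSI."""
    if len(gsi) != 6:
        raise ValueError("this sketch assumes a 6-byte GSI")
    if not 0 <= source_port <= 0xFFFF:
        raise ValueError("this sketch assumes a 16-bit port")
    # Network byte order: 6 raw GSI bytes followed by the 16-bit port.
    return struct.pack("!6sH", gsi, source_port)
```

Because the TSI is just the two values side by side, a receiver or
network element can echo it back unchanged without interpreting
either part.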

Since all PGM packets originated by receivers are in response to PGM
packets originated by a source, receivers simply echo the TSI heard
from the source in any corresponding packets they originate.

Since all PGM packets originated by network elements are in response
to PGM packets originated by a receiver, network elements simply echo
the TSI heard from the receiver in any corresponding packets they
originate.

3.2. Sequence Numbers

PGM uses a circular sequence number space from 0 through ((2**32) -
1) to identify and order ODATA packets. Sources MUST number ODATA
packets in unit increments in the order in which the corresponding
application data is submitted for transmission. Within a transmit or
receive window (defined below), a sequence number x is "less" or
"older" than sequence number y if it numbers an ODATA packet
preceding ODATA packet y, and a sequence number y is "greater" or
"more recent" than sequence number x if it numbers an ODATA packet
subsequent to ODATA packet x.
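
Because the space is circular, "older" and "more recent" cannot be
decided with a plain integer comparison once the numbering wraps.
The sketch below uses the common half-space comparison rule, which is
a swapped-in technique rather than something this section spells
out; it relies on windows being smaller than half the sequence space.
The function names are hypothetical.

```python
SQN_MOD = 1 << 32   # size of the circular sequence number space
SQN_HALF = 1 << 31  # half of that space


def sqn_older(x: int, y: int) -> bool:
    """True if x is "less"/"older" than y within a window."""
    # Forward distance from x to y, modulo the sequence space.
    forward = (y - x) % SQN_MOD
    return 0 < forward < SQN_HALF


def sqn_next(x: int) -> int:
    """Sequence number following x, wrapping from 2**32 - 1 to 0."""
    return (x + 1) % SQN_MOD
```

For example, sequence number 2**32 - 1 is treated as older than 0,
since 0 is the next number a source assigns after the wrap.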

3.3. Transmit Window

The description of the operation of PGM rests fundamentally on the
definition of the source-maintained transmit window. This definition
in turn is derived directly from the amount of transmitted data (in
seconds) a source retains for repair (TXW_SECS), and the maximum
transmit rate (in bytes/second) maintained by a source to regulate
its bandwidth utilization (TXW_MAX_RTE).

In terms of sequence numbers, the transmit window is the range of
sequence numbers consumed by the source for sequentially numbering
and transmitting the most recent TXW_SECS of ODATA packets. The
trailing (or left) edge of the transmit window (TXW_TRAIL) is defined
as the sequence number of the oldest data packet available for repair
from a source. The leading (or right) edge of the transmit window
(TXW_LEAD) is defined as the sequence number of the most recent data
packet a source has transmitted.

The size of the transmit window in sequence numbers (TXW_SQNS) (i.e.,
the difference between the leading and trailing edges plus one) MUST
be no greater than half the PGM sequence number space less one.

When TXW_TRAIL is equal to TXW_LEAD, the transmit window size is one.
When TXW_TRAIL is equal to TXW_LEAD plus one, the transmit window is
empty.
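
The window size rule and its two boundary cases can be checked with a
few lines of modular arithmetic. This is a sketch only; the function
names are hypothetical, and the modulo operation is how the example
handles wrap-around in the circular sequence space.

```python
SQN_MOD = 1 << 32
TXW_SQNS_MAX = (SQN_MOD // 2) - 1  # half the sequence space less one


def txw_sqns(txw_trail: int, txw_lead: int) -> int:
    """Window size in sequence numbers: (lead - trail) + 1, modulo 2**32."""
    return (txw_lead - txw_trail + 1) % SQN_MOD


def txw_is_valid(txw_trail: int, txw_lead: int) -> bool:
    """True if the window respects the maximum size stated above."""
    return txw_sqns(txw_trail, txw_lead) <= TXW_SQNS_MAX
```

With TXW_TRAIL equal to TXW_LEAD the function yields one, and with
TXW_TRAIL equal to TXW_LEAD plus one it yields zero, matching the two
cases in the text.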

3.4. Receive Window

The receive window at the receivers is determined entirely by PGM
packets from the source. That is, a receiver simply obeys what the
source tells it in terms of window state and advancement.

For a given transport session identified by a TSI, a receiver
maintains:

RXW_TRAIL   the sequence number defining the trailing edge of the
            receive window, the sequence number (known from data
            packets and SPMs) of the oldest data packet available
            for repair from the source

RXW_LEAD    the sequence number defining the leading edge of the
            receive window, the greatest sequence number of any
            received data packet within the transmit window

The receive window is the range of sequence numbers a receiver is
expected to use to identify receivable ODATA.

A data packet is described as being "in" the receive window if its
sequence number is in the receive window.

The receive window is advanced by the receiver when it receives an
SPM or ODATA packet within the transmit window that increments
RXW_TRAIL. Receivers also advance their receive windows upon receipt
of any PGM data packet within the receive window that advances the
receive window.

3.5. Source Path State

To establish the repair state required to constrain RDATA, it's
essential that NAKs return from a receiver to a source on the reverse
of the distribution tree from the source. That is, they must return
through the same sequence of PGM network elements through which the
ODATA was forwarded, but in reverse. There are two reasons for this,
the less obvious one being by far the more important.

The first and obvious reason is that RDATA is forwarded on the same
path as ODATA and so repair state must be established on this path if
it is to constrain the propagation of RDATA.

The second and less obvious reason is that in the absence of repair
state, PGM network elements do NOT forward RDATA, so the default
behavior is to discard repairs. If repair state is not properly
established for interfaces on which ODATA went missing, then
receivers on those interfaces will continue to NAK for lost data and
ultimately experience unrecoverable data loss.

The principal function of SPMs is to provide the source path state
required for PGM network elements to forward NAKs from one PGM
network element to the next on the reverse of the distribution tree
for the TSI, establishing repair state each step of the way. This
source path state is simply the address of the upstream PGM network
element on the reverse of the distribution tree for the TSI. That
upstream PGM network element may be more than one subnet hop away.
SPMs establish the identity of the upstream PGM network element on
the distribution tree for each TSI in each group in each PGM network
element, a sort of virtual PGM topology. So although NAKs are
unicast addressed, they are NOT unicast routed by PGM network
elements in the conventional sense. Instead PGM network elements use
the source path state established by SPMs to direct NAKs
PGM-hop-by-PGM-hop toward the source. The idea is to constrain NAKs
to the pure PGM topology spanning the more heterogeneous underlying
topology of both PGM and non-PGM network elements.

The result is repair state in every PGM network element between the
receiver and the source so that the corresponding RDATA is never
discarded by a PGM network element for lack of repair state.

SPMs also maintain transmit window state in receivers by advertising
the trailing and leading edges of the transmit window (SPM_TRAIL and
SPM_LEAD). In the absence of data, SPMs MAY be used to close the
transmit window in time by advancing the transmit window until
SPM_TRAIL is equal to SPM_LEAD plus one.

3.6. Packet Contents

This section just provides enough short-hand to make the Procedures
intelligible. For the full details of packet contents, please refer
to Packet Formats below.

3.6.1. Source Path Messages

3.6.1.1. SPMs

SPMs are transmitted by sources to establish source-path state in PGM
network elements, and to provide transmit-window state in receivers.

SPMs are multicast to the group and contain:

SPM_TSI     the source-assigned TSI for the session to which the
            SPM corresponds

SPM_SQN     a sequence number assigned sequentially by the source
            in unit increments and scoped by SPM_TSI

Nota Bene: this is an entirely separate sequence than is used to
number ODATA and RDATA.

SPM_TRAIL   the sequence number defining the trailing edge of the
            source's transmit window (TXW_TRAIL)

SPM_LEAD    the sequence number defining the leading edge of the
            source's transmit window (TXW_LEAD)

SPM_PATH    the network-layer address (NLA) of the interface on
            the PGM network element on which the SPM is forwarded

3.6.2. Data Packets

3.6.2.1. ODATA - Original Data

ODATA packets are transmitted by sources to send application data to
receivers.

ODATA packets are multicast to the group and contain:

OD_TSI      the globally unique source-assigned TSI

OD_TRAIL    the sequence number defining the trailing edge of the
            source's transmit window (TXW_TRAIL)

            OD_TRAIL makes the protocol more robust in the face of
            lost SPMs. By including the trailing edge of the
            transmit window on every data packet, receivers that
            have missed any SPMs that advanced the transmit window
            can still detect the case, recover the application,
            and potentially re-synchronize to the transport
            session.

OD_SQN      a sequence number assigned sequentially by the source
            in unit increments and scoped by OD_TSI

3.6.2.2. RDATA - Repair Data

RDATA packets are repair packets transmitted by sources or DLRs in
response to NAKs.

RDATA packets are multicast to the group and contain:

RD_TSI      OD_TSI of the ODATA packet for which this is a repair

RD_TRAIL    the sequence number defining the trailing edge of the
            source's transmit window (TXW_TRAIL). This is updated
            to the most current value when the repair is sent, so
            it is not necessarily the same as OD_TRAIL of the
            ODATA packet for which this is a repair

RD_SQN      OD_SQN of the ODATA packet for which this is a repair
3.6.3. Negative Acknowledgments

3.6.3.1. NAKs - Negative Acknowledgments

NAKs are transmitted by receivers to request repairs for missing data
packets.

NAKs are unicast (PGM-hop-by-PGM-hop) to the source and contain:

NAK_TSI     OD_TSI of the ODATA packet for which a repair is
            requested

NAK_SQN     OD_SQN of the ODATA packet for which a repair is
            requested

NAK_SRC     the unicast NLA of the original source of the missing
            ODATA

NAK_GRP     the multicast group NLA

3.6.3.2. NNAKs - Null Negative Acknowledgments

NNAKs are transmitted by a DLR that receives NAKs redirected to it by
either receivers or network elements to provide flow-control
feed-back to a source.

NNAKs are unicast (PGM-hop-by-PGM-hop) to the source and contain:

NNAK_TSI    NAK_TSI of the corresponding re-directed NAK

NNAK_SQN    NAK_SQN of the corresponding re-directed NAK

NNAK_SRC    NAK_SRC of the corresponding re-directed NAK

NNAK_GRP    NAK_GRP of the corresponding re-directed NAK

3.6.4. Negative Acknowledgment Confirmations

3.6.4.1. NCFs - NAK Confirmations

NCFs are transmitted by network elements and sources in response to
NAKs.

NCFs are multicast to the group and contain:

NCF_TSI     NAK_TSI of the NAK being confirmed

NCF_SQN     NAK_SQN of the NAK being confirmed

NCF_SRC     NAK_SRC of the NAK being confirmed

NCF_GRP     NAK_GRP of the NAK being confirmed

3.6.5. Option Encodings

OPT_LENGTH       0x00 - Option's Length
OPT_FRAGMENT     0x01 - Fragmentation
OPT_NAK_LIST     0x02 - List of NAK entries
OPT_JOIN         0x03 - Late Joining
OPT_REDIRECT     0x07 - Redirect
OPT_SYN          0x0D - Synchronization
OPT_FIN          0x0E - Session Fin
OPT_RST          0x0F - Session Reset
OPT_PARITY_PRM   0x08 - Forward Error Correction Parameters
OPT_PARITY_GRP   0x09 - Forward Error Correction Group Number
OPT_CURR_TGSIZE  0x0A - Forward Error Correction Group Size
OPT_CR           0x10 - Congestion Report
OPT_CRQST        0x11 - Congestion Report Request
OPT_NAK_BO_IVL   0x04 - NAK Back-Off Interval
OPT_NAK_BO_RNG   0x05 - NAK Back-Off Range
OPT_NBR_UNREACH  0x0B - Neighbor Unreachable
OPT_PATH_NLA     0x0C - Path NLA
OPT_INVALID      0x7F - Option invalidated
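
As an illustration only, the option codes listed above could be held in an enumeration so that a parser can map a received code to its name. The class name PgmOption and the helper option_name are hypothetical and not part of this specification; only the option names and numeric codes are taken from the list.

```python
from enum import IntEnum


class PgmOption(IntEnum):
    """Option codes as listed in 3.6.5 (class name is illustrative)."""
    OPT_LENGTH = 0x00
    OPT_FRAGMENT = 0x01
    OPT_NAK_LIST = 0x02
    OPT_JOIN = 0x03
    OPT_NAK_BO_IVL = 0x04
    OPT_NAK_BO_RNG = 0x05
    OPT_REDIRECT = 0x07
    OPT_PARITY_PRM = 0x08
    OPT_PARITY_GRP = 0x09
    OPT_CURR_TGSIZE = 0x0A
    OPT_NBR_UNREACH = 0x0B
    OPT_PATH_NLA = 0x0C
    OPT_SYN = 0x0D
    OPT_FIN = 0x0E
    OPT_RST = 0x0F
    OPT_CR = 0x10
    OPT_CRQST = 0x11
    OPT_INVALID = 0x7F


def option_name(code):
    """Return the option name for a code, or a placeholder if unlisted."""
    try:
        return PgmOption(code).name
    except ValueError:
        return "unknown(0x%02X)" % code
```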

4. Procedures - General

Since SPMs, NCFs, and RDATA must be treated conditionally by PGM
network elements, they must be distinguished from other packets in
the chosen multicast network protocol if PGM network elements are to
extract them from the usual switching path.

The most obvious way for network elements to achieve this is to
examine every packet in the network for the PGM transport protocol
and packet types. However, the overhead of this approach is costly
for high-performance, multi-protocol network elements. An
alternative, and a requirement for PGM over IP multicast, is that
SPMs, NCFs, and RDATA MUST be transmitted with the IP Router Alert
Option [6]. This option gives network elements a network-layer
indication that a packet should be extracted from IP switching for
more detailed processing.

5. Procedures - Sources

5.1. Data Transmission

Since PGM relies on a purely rate-limited transmission strategy in
the source to bound the bandwidth consumed by PGM transport sessions,
an assortment of techniques is assembled here to make that strategy
as conservative and robust as possible. These techniques are the
minimum REQUIRED of a PGM source.

5.1.1. Maximum Cumulative Transmit Rate

A source MUST number ODATA packets in the order in which they are
submitted for transmission by the application. A source MUST
transmit ODATA packets in sequence and only within the transmit
window beginning with TXW_TRAIL at no greater a rate than
TXW_MAX_RTE.

TXW_MAX_RTE is typically the maximum cumulative transmit rate of SPM,
ODATA, and RDATA. Different transmission strategies MAY define
TXW_MAX_RTE as appropriate for the implementation.

5.1.2. Transmit Rate Regulation

To regulate its transmit rate, a source MUST use a token bucket
scheme or any other traffic management scheme that yields equivalent
behavior. A token bucket [7] is characterized by a continually
sustainable data rate (the token rate) and the extent to which the
data rate may exceed the token rate for short periods of time (the
token bucket size). Over any arbitrarily chosen interval, the number
of bytes the source may transmit MUST NOT exceed the token bucket
size plus the product of the token rate and the chosen interval.

In addition, a source MUST bound the maximum rate at which successive
packets may be transmitted using a leaky bucket scheme drained at a
maximum transmit rate, or equivalent mechanism.
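
The token bucket bound described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the class name TokenBucket, its method names, and the choice to pass the current time explicitly are assumptions made for the example. The leaky bucket that bounds successive packets is not modelled here.

```python
class TokenBucket:
    """Illustrative token bucket: `rate` bytes/second sustained,
    with bursts of up to `size` bytes (names are assumptions)."""

    def __init__(self, rate, size, now=0.0):
        self.rate = float(rate)    # token rate, bytes per second
        self.size = float(size)    # token bucket size, bytes
        self.tokens = float(size)  # start with a full bucket
        self.last = float(now)     # time of the last refill

    def _refill(self, now):
        # Add tokens for the elapsed time, never exceeding the bucket size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.size, self.tokens + elapsed * self.rate)
        self.last = now

    def try_send(self, nbytes, now):
        """Consume tokens and return True if nbytes may be sent at `now`."""
        self._refill(now)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```

Because tokens never accumulate beyond the bucket size, the bytes admitted over any interval cannot exceed the bucket size plus the token rate multiplied by that interval.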





5.1.3. Outgoing Packet Ordering

To preserve the logic of PGM's transmit window, a source MUST
strictly prioritize sending of pending NCFs first, pending SPMs
second, and only send ODATA or RDATA when no NCFs or SPMs are
pending. The priority of RDATA versus ODATA is application
dependent. The sender MAY implement weighted bandwidth sharing
between RDATA and ODATA. Note that strict prioritization of RDATA
over ODATA may stall progress of ODATA if there are receivers that
keep generating NAKs so as to always have RDATA pending (e.g. a
steady stream of late joiners with OPT_JOIN). Strictly prioritizing
ODATA over RDATA may lead to a larger portion of receivers getting
unrecoverable losses.
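
One way to picture the ordering rule above is a scheduler with one queue per packet type. The sketch below is illustrative only: the class OutgoingScheduler and the rdata_first switch are hypothetical, and the switch stands in for the application-dependent choice between RDATA and ODATA (weighted sharing is not shown).

```python
from collections import deque


class OutgoingScheduler:
    """Illustrative strict-priority scheduler: NCFs, then SPMs, then data."""

    def __init__(self, rdata_first=True):
        self.queues = {t: deque() for t in ("NCF", "SPM", "RDATA", "ODATA")}
        # Application-dependent choice between RDATA and ODATA.
        self.rdata_first = rdata_first

    def enqueue(self, ptype, packet):
        self.queues[ptype].append(packet)

    def next_packet(self):
        """Return (type, packet) for the next packet to send, or None."""
        if self.rdata_first:
            data_order = ("RDATA", "ODATA")
        else:
            data_order = ("ODATA", "RDATA")
        for ptype in ("NCF", "SPM") + data_order:
            if self.queues[ptype]:
                return ptype, self.queues[ptype].popleft()
        return None
```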

5.1.4. Ambient SPMs

Interleaved with ODATA and RDATA, a source MUST transmit SPMs at a
rate at least sufficient to maintain current source path state in PGM
network elements. Note that source path state in network elements
does not track underlying changes in the distribution tree from a
source until an SPM traverses the altered distribution tree. The
consequence is that NAKs may go unconfirmed both at receivers and
amongst network elements while changes in the underlying distribution
tree take place.

5.1.5. Heartbeat SPMs

In the absence of data to transmit, a source SHOULD transmit SPMs at
a decaying rate in order to assist early detection of lost data, to
maintain current source path state in PGM network elements, and to
maintain current receive window state in the receivers.

In this scheme [8], a source maintains an inter-heartbeat timer
IHB_TMR which times the interval between the most recent packet
(ODATA, RDATA, or SPM) transmission and the next heartbeat
transmission. IHB_TMR is initialized to a minimum interval IHB_MIN
after the transmission of any data packet. If IHB_TMR expires, the
source transmits a heartbeat SPM and initializes IHB_TMR to double
its previous value. The transmission of consecutive heartbeat SPMs
doubles IHB each time up to a maximum interval IHB_MAX. The
transmission of any data packet initializes IHB_TMR to IHB_MIN once
again. The effect is to provoke prompt detection of missing packets
in the absence of data to transmit, and to do so with minimal
bandwidth overhead.
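
The doubling behaviour of IHB_TMR can be sketched as below. The class HeartbeatTimer and its method names are assumptions for illustration; the sketch tracks only the interval to which the timer is next initialized and leaves the actual timer and SPM transmission to the caller.

```python
class HeartbeatTimer:
    """Illustrative tracker for the inter-heartbeat interval (IHB)."""

    def __init__(self, ihb_min, ihb_max):
        self.ihb_min = ihb_min
        self.ihb_max = ihb_max
        self.interval = ihb_min

    def on_data_sent(self):
        """A data packet was sent: re-initialize the interval to IHB_MIN."""
        self.interval = self.ihb_min
        return self.interval

    def on_expiry(self):
        """IHB_TMR expired: the caller sends a heartbeat SPM, and the next
        interval is double the previous one, capped at IHB_MAX."""
        self.interval = min(self.interval * 2, self.ihb_max)
        return self.interval
```

With IHB_MIN of 100 and IHB_MAX of 1000 (in any one time unit), successive expiries yield intervals of 200, 400, 800, 1000, 1000, and so on until data is sent again.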







5.1.6. Ambient and Heartbeat SPMs

Ambient and heartbeat SPMs are described as driven by separate timers
in this specification to highlight their contrasting functions.
Ambient SPMs are driven by a count-down timer that expires regularly
while heartbeat SPMs are driven by a count-down timer that keeps
being reset by data, and the interval of which changes once it begins
to expire. The ambient SPM timer is just counting down in real-time
while the heartbeat timer is measuring the inter-data-packet
interval.

In the presence of data, no heartbeat SPMs will be transmitted since
the transmission of data keeps setting the IHB_TMR back to its
initial value. At the same time however, ambient SPMs MUST be
interleaved into the data as a matter of course, not necessarily as a
heartbeat mechanism. This ambient transmission of SPMs is REQUIRED
to keep the distribution tree information in the network current and
to allow new receivers to synchronize with the session.

An implementation SHOULD de-couple ambient and heartbeat SPM timers
sufficiently to permit them to be configured independently of each
other.

5.2. Negative Acknowledgment Confirmation

A source MUST immediately multicast an NCF in response to any NAK it
receives. The NCF is REQUIRED since the alternative of responding
immediately with RDATA would not allow other PGM network elements on
the same subnet to do NAK anticipation, nor would it allow DLRs on
the same subnet to provide repairs. A source SHOULD be able to
detect a NAK storm and adopt countermeasures to protect the network
against a denial of service. A possible countermeasure is to send
the first NCF immediately in response to a NAK and then delay the
generation of further NCFs (for identical NAKs) by a small interval,
so that identical NCFs are rate-limited, without affecting the
ability to suppress NAKs.
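
The countermeasure suggested above might be sketched as follows. This is one possible interpretation, not a specified algorithm: the class NcfPacer, the (TSI, SQN) key used to recognize identical NAKs, and the decision to coalesce NAKs that arrive while an NCF is already scheduled are all assumptions.

```python
class NcfPacer:
    """Illustrative pacing of NCFs for identical NAKs.

    schedule() returns the time at which the NCF answering a NAK
    should be sent: immediately for the first NAK, and no sooner than
    `interval` after the previous NCF for identical NAKs."""

    def __init__(self, interval):
        self.interval = interval
        self.ncf_time = {}  # (tsi, sqn) -> time of last sent/scheduled NCF

    def schedule(self, tsi, sqn, now):
        key = (tsi, sqn)
        last = self.ncf_time.get(key)
        if last is None or now >= last + self.interval:
            send_at = now                   # first NCF, or pacing gap elapsed
        elif last > now:
            send_at = last                  # an NCF is already scheduled
        else:
            send_at = last + self.interval  # delay this NCF
        self.ncf_time[key] = send_at
        return send_at
```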

5.3. Repairs

After multicasting an NCF in response to a NAK, a source MUST then
multicast RDATA (while respecting TXW_MAX_RTE) in response to any NAK
it receives for data packets within the transmit window.

In the interest of increasing the efficiency of a particular RDATA
packet, a source MAY delay RDATA transmission to accommodate the
arrival of NAKs from the whole loss neighborhood. This delay SHOULD
NOT exceed twice the greatest propagation delay in the loss
neighborhood.

6. Procedures - Receivers

6.1. Data Reception

Initial data reception

A receiver SHOULD initiate data reception beginning with the first
data packet it receives within the advertised transmit window. This
packet's sequence number (ODATA_SQN) temporarily defines the trailing
edge of the transmit window from the receiver's perspective. That
is, it is assigned to RXW_TRAIL_INIT within the receiver, and until
the trailing edge sequence number advertised in subsequent packets
(SPMs or ODATA or RDATA) increments past RXW_TRAIL_INIT, the receiver
MUST only request repairs for sequence numbers subsequent to
RXW_TRAIL_INIT. Thereafter, it MAY request repairs anywhere in the
transmit window. This temporary restriction on repair requests
prevents receivers from requesting a potentially large amount of
history when they first begin to receive a given PGM transport
session.

Note that the JOIN option, discussed later, MAY be used to provide a
different value for RXW_TRAIL_INIT.

Receiving and discarding data packets

Within a given transport session, a receiver MUST accept any ODATA or
RDATA packets within the receive window. A receiver MUST discard any
data packet that duplicates one already received in the transmit
window. A receiver MUST discard any data packet outside of the
receive window.

Contiguous data

Contiguous data is comprised of those data packets within the receive
window that have been received and are in the range from RXW_TRAIL up
to (but not including) the first missing sequence number in the
receive window. The most recently received data packet of contiguous
data defines the leading edge of contiguous data.

As its default mode of operation, a receiver MUST deliver only
contiguous data packets to the application, and it MUST do so in the
order defined by those data packets' sequence numbers. This provides
applications with a reliable ordered data flow.
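
The default in-order delivery mode can be sketched as a small reordering buffer. The class ContiguousDelivery is hypothetical; for brevity it uses plain integer comparison and so ignores sequence number wrap-around and the upper bound of the receive window, both of which a real receiver would have to handle.

```python
class ContiguousDelivery:
    """Illustrative buffer that releases only contiguous data, in order."""

    def __init__(self, rxw_trail):
        self.next_sqn = rxw_trail  # first sequence number not yet delivered
        self.buffer = {}           # received but not yet contiguous packets

    def receive(self, sqn, payload):
        """Store a data packet and return the payloads that have become
        deliverable, in sequence number order."""
        if sqn < self.next_sqn or sqn in self.buffer:
            return []              # behind the window or a duplicate: discard
        self.buffer[sqn] = payload
        delivered = []
        while self.next_sqn in self.buffer:
            delivered.append(self.buffer.pop(self.next_sqn))
            self.next_sqn += 1
        return delivered
```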








Non contiguous data

PGM receiver implementations MAY optionally provide a mode of
operation in which data is delivered to an application in the order
received. However, the implementation MUST only deliver complete
application protocol data units (APDUs) to the application. That is,
APDUs that have been fragmented into different TPDUs MUST be
reassembled before delivery to the application.

6.2. Source Path Messages

Receivers MUST receive and sequence SPMs for any TSI they are
receiving. An SPM is in sequence if its sequence number is greater
than that of the most recent in-sequence SPM and within half the PGM
number space. Out-of-sequence SPMs MUST be discarded.
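
The in-sequence test above can be sketched with modular arithmetic. The 32-bit width of the sequence number space is an assumption of this example, as are the names sqn_gt and SpmSequencer.

```python
SQN_BITS = 32                 # assumed width of the sequence number space
SQN_MOD = 1 << SQN_BITS
SQN_HALF = 1 << (SQN_BITS - 1)


def sqn_gt(a, b):
    """True if a is greater than b and within half the number space."""
    diff = (a - b) % SQN_MOD
    return 0 < diff < SQN_HALF


class SpmSequencer:
    """Illustrative per-TSI filter that discards out-of-sequence SPMs."""

    def __init__(self):
        self.last = None      # sequence number of the last in-sequence SPM

    def accept(self, spm_sqn):
        if self.last is None or sqn_gt(spm_sqn, self.last):
            self.last = spm_sqn
            return True
        return False          # out of sequence: discard
```

Under this comparison an SPM numbered 0 is in sequence after one numbered 0xFFFFFFFF, since the difference is taken modulo the number space.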

For each TSI, receivers MUST use the most recent SPM to determine the
NLA of the upstream PGM network element for use in NAK addressing. A
receiver MUST NOT initiate repair requests until it has received at
least one SPM for the corresponding TSI.

Since SPMs require per-hop processing, it is likely that they will be
forwarded at a slower rate than data, and that they will arrive out
of sync with the data stream. In this case, the window information
that the SPMs carry will be out of date. Receivers SHOULD expect
this to be the case and SHOULD detect it by comparing the packet lead
and trail values with the values the receivers have stored for lead
and trail. If the SPM packet values are less, they SHOULD be
ignored, but the rest of the packet SHOULD be processed as normal.

6.3. Data Recovery by Negative Acknowledgment

Detecting missing data packets

Receivers MUST detect gaps in the expected data sequence in the
following manners:

by comparing the sequence number on the most recently received
ODATA or RDATA packet with the leading edge of contiguous data

by comparing SPM_LEAD of the most recently received SPM with the
leading edge of contiguous data

In both cases, if the receiver has not received all intervening data
packets, it MAY initiate selective NAK generation for each missing
sequence number.

In addition, a receiver may detect a single missing data packet by
receiving an NCF or multicast NAK for a data packet within the
transmit window which it has not received. In this case it MAY
initiate selective NAK generation for the said sequence number.

In all cases, receivers SHOULD temper the initiation of NAK
generation to account for simple mis-ordering introduced by the
network. A possible mechanism to achieve this is to assume loss only
after the reception of N packets with sequence numbers higher than
those of the (assumed) lost packets. A possible value for N is 2.
This method SHOULD be complemented with a timeout based mechanism
that handles the loss of the last packet before a pause in the
transmission of the data stream. The leading edge field in SPMs
SHOULD also be taken into account in the loss detection algorithm.
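
The "N packets with higher sequence numbers" heuristic can be sketched as below. The function presumed_lost is hypothetical, uses plain integer comparison (no wrap-around), and leaves out the complementary timeout and the SPM leading edge, which the text above says should also be considered.

```python
def presumed_lost(received, trail, n=2):
    """Return the missing sequence numbers, starting at `trail`, for which
    at least n packets with higher sequence numbers have been received."""
    received = set(received)
    if not received:
        return []
    lost = []
    for sqn in range(trail, max(received)):
        if sqn in received:
            continue
        higher = sum(1 for r in received if r > sqn)
        if higher >= n:
            lost.append(sqn)
    return lost
```

For example, with packets 0, 1 and 3 received, packet 2 is not yet presumed lost when N is 2; once packet 4 also arrives, it is.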

Generating NAKs

NAK generation follows the detection of a missing data packet and is
the cycle of:

waiting for a random period of time (NAK_RB_IVL) while listening
for matching NCFs or NAKs

transmitting a NAK if a matching NCF or NAK is not heard

waiting a period (NAK_RPT_IVL) for a matching NCF and recommencing
NAK generation if the matching NCF is not received

waiting a period (NAK_RDATA_IVL) for data and recommencing NAK
generation if the matching data is not received

The entire generation process can be summarized by the following
state machine:

                           |
                           | detect missing tpdu
                           |   - clear data retry count
                           |   - clear NCF retry count
                           V
   matching NCF |--------------------------|
<---------------|      BACK-OFF_STATE      | <----------------------
|               | start timer(NAK_RB_IVL)  |          ^            ^
|               |                          |          |            |
|               |--------------------------|          |            |
|      matching |         | timer expires             |            |
|        NAK    |         |   - send NAK              |            |
|               |         |                           |            |
|               V         V                           |            |
|               |--------------------------|          |            |
|               |      WAIT_NCF_STATE      |          |            |
| matching NCF  | start timer(NAK_RPT_IVL) |          |            |
|<--------------|                          |------------>          |
|               |--------------------------| timer expires         |
|                    |          |      ^     - increment NCF       |
|    NAK_NCF_RETRIES |          |      |       retry count         |
|       exceeded     |          |      |                           |
|                    V          --------                           |
|                Cancellation    matching NAK                      |
|                                  - restart timer(NAK_RPT_IVL)    |
|                                                                  |
|                                                                  |
V               |--------------------------|                       |
--------------->|      WAIT_DATA_STATE     |----------------------->
                |start timer(NAK_RDATA_IVL)|    timer expires
                |                          |     - increment data
                |--------------------------|       retry count
                     |          |      ^
    NAK_DATA_RETRIES |          |      |
       exceeded      |          |      |
                     |          --------
                     |           matching NCF or NAK
                     V             - restart timer(NAK_RDATA_IVL)
                 Cancellation


In any state, receipt of matching RDATA or ODATA completes data
recovery and successful exit from the state machine. State
transition stops any running timers.

In any state, if the trailing edge of the window moves beyond the
sequence number, data recovery for that sequence number terminates.
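
The state machine above can be sketched as an event-driven class tracking one missing sequence number. This is an illustration under stated assumptions: the class NakGeneration, the event names ("ncf", "nak", "timer", "data", "trail_advanced"), the terminal labels RECOVERED and CANCELLED, and the reading of "exceeded" as "retry count greater than the limit" are not taken from this specification. Timers are not modelled; the caller is expected to start the timer named for each state and to report its expiry as a "timer" event.

```python
BACK_OFF = "BACK-OFF_STATE"
WAIT_NCF = "WAIT_NCF_STATE"
WAIT_DATA = "WAIT_DATA_STATE"
RECOVERED = "RECOVERED"      # illustrative terminal label
CANCELLED = "CANCELLED"      # illustrative terminal label


class NakGeneration:
    """Illustrative NAK generation state for one missing sequence number."""

    def __init__(self, nak_ncf_retries, nak_data_retries):
        self.nak_ncf_retries = nak_ncf_retries
        self.nak_data_retries = nak_data_retries
        self.ncf_retry_count = 0   # cleared when the loss is detected
        self.data_retry_count = 0  # cleared when the loss is detected
        self.naks_sent = 0
        self.timer_restarts = 0
        self.state = BACK_OFF      # NAK_RB_IVL is running

    def on_event(self, event):
        if self.state in (RECOVERED, CANCELLED):
            return self.state
        if event == "data":                # matching RDATA or ODATA
            self.state = RECOVERED
        elif event == "trail_advanced":    # window moved past the SQN
            self.state = CANCELLED
        elif self.state == BACK_OFF:
            if event == "ncf":
                self.state = WAIT_DATA
            elif event == "nak":
                self.state = WAIT_NCF
            elif event == "timer":         # NAK_RB_IVL expired: send NAK
                self.naks_sent += 1
                self.state = WAIT_NCF
        elif self.state == WAIT_NCF:
            if event == "ncf":
                self.state = WAIT_DATA
            elif event == "nak":           # restart NAK_RPT_IVL
                self.timer_restarts += 1
            elif event == "timer":
                self.ncf_retry_count += 1
                if self.ncf_retry_count > self.nak_ncf_retries:
                    self.state = CANCELLED
                else:
                    self.state = BACK_OFF
        elif self.state == WAIT_DATA:
            if event in ("ncf", "nak"):    # restart NAK_RDATA_IVL
                self.timer_restarts += 1
            elif event == "timer":
                self.data_retry_count += 1
                if self.data_retry_count > self.nak_data_retries:
                    self.state = CANCELLED
                else:
                    self.state = BACK_OFF
        return self.state
```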





During NAK_RB_IVL a NAK is said to be pending. When awaiting data or
an NCF, a NAK is said to be outstanding.

Backing off NAK transmission

Before transmitting a NAK, a receiver MUST wait some interval
NAK_RB_IVL chosen randomly over some time period NAK_BO_IVL. During
this period, receipt of a matching NAK or a matching NCF will suspend
NAK generation. NAK_RB_IVL is counted down from the time a missing
data packet is detected.
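
A minimal sketch of choosing the random back-off follows. The uniform distribution is an assumption of this example (the text above only requires that the interval be chosen randomly over NAK_BO_IVL), and the function name choose_nak_rb_ivl is hypothetical.

```python
import random


def choose_nak_rb_ivl(nak_bo_ivl, rng=random):
    """Pick NAK_RB_IVL at random within [0, NAK_BO_IVL].

    A uniform choice is assumed here; `rng` may be a random.Random
    instance so that the choice is reproducible in tests."""
    return rng.uniform(0.0, nak_bo_ivl)
```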

A value for NAK_BO_IVL learned from OPT_NAK_BO_IVL (see 16.4.1 below)
MUST NOT be used by a receiver (i.e., the receiver MUST NOT NAK)
unless either NAK_BO_IVL_SQN is zero, or the receiver has seen
POLL_RND == 0 for POLL_SQN =< NAK_BO_IVL_SQN within half the sequence
number space.

When a parity NAK (Appendix A, FEC) is being generated, the back-off
interval SHOULD be inversely biased with respect to the number of
parity packets requested. This way NAKs requesting larger numbers of
parity packets are likely to be sent first and thus suppress other
NAKs. A NAK for a given transmission group suppresses another NAK
for the same transmission group only if it is requesting an equal or
larger number of parity packets.

When a receiver has to transmit a sequence of NAKs, it SHOULD
transmit the NAKs in order from oldest to most recent.

Suspending NAK generation

Suspending NAK generation just means waiting for either NAK_RB_IVL,
NAK_RPT_IVL or NAK_RDATA_IVL to pass. A receiver MUST suspend NAK
generation if a duplicate of the NAK is already pending from this
receiver or the NAK is already outstanding from this or another
receiver.

NAK suppression

A receiver MUST suppress NAK generation and wait at least
NAK_RDATA_IVL before recommencing NAK generation if it hears a
matching NCF or NAK during NAK_RB_IVL. A matching NCF must match
NCF_TSI with NAK_TSI, and NCF_SQN with NAK_SQN.

Transmitting a NAK

Upon expiry of NAK_RB_IVL, a receiver MUST unicast a NAK to the
upstream PGM network element for the TSI specifying the transport
session identifier and missing sequence number. In addition, it MAY
multicast a NAK with TTL of 1 to the group, if the PGM parent is not
directly connected. It also records both the address of the source
of the corresponding ODATA and the address of the group in the NAK
header.

It MUST repeat the NAK at a rate governed by NAK_RPT_IVL up to
NAK_NCF_RETRIES times while waiting for a matching NCF. It MUST then
wait NAK_RDATA_IVL before recommencing NAK generation. If it hears a
matching NCF or NAK during NAK_RDATA_IVL, it MUST wait anew for
NAK_RDATA_IVL before recommencing NAK generation (i.e. matching NCFs
and NAKs restart NAK_RDATA_IVL).

Completion of NAK generation

NAK generation is complete only upon the receipt of the matching
RDATA (or even ODATA) packet at any time during NAK generation.

Cancellation of NAK generation

NAK generation is cancelled upon the advancing of the receive window
so as to exclude the matching sequence number of a pending or
outstanding NAK, or NAK_DATA_RETRIES / NAK_NCF_RETRIES being
exceeded. Cancellation of NAK generation indicates unrecoverable
data loss.

Receiving NCFs and multicast NAKs

A receiver MUST discard any NCFs or NAKs it hears for data packets
outside the transmit window or for data packets it has received.
Otherwise they are treated as appropriate for the current repair
state.

7. Procedures - Network Elements

7.1. Source Path State

Upon receipt of an in-sequence SPM, a network element records the
Source Path Address SPM_PATH with the multicast routing information
for the TSI. If the receiving network element is on the same subnet
as the forwarding network element, this address will be the same as
the address of the immediately upstream network element on the
distribution tree for the TSI. If, however, non-PGM network elements
intervene between the forwarding and the receiving network elements,
this address will be the address of the first PGM network element
across the intervening network elements.

The network element then forwards the SPM on each outgoing interface
for that TSI. As it does so, it encodes the network address of the
outgoing interface in SPM_PATH in each copy of the SPM it forwards.
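
The record-then-rewrite handling of SPM_PATH can be sketched as follows. The class PgmNetworkElement, its fields, and the representation of addresses as strings are assumptions for illustration; the in-sequence check is presumed to have been done by the caller.

```python
class PgmNetworkElement:
    """Illustrative source path state kept by a PGM network element."""

    def __init__(self, interface_addrs):
        self.interface_addrs = interface_addrs  # interface name -> address
        self.upstream_path = {}                 # TSI -> recorded SPM_PATH

    def on_in_sequence_spm(self, tsi, spm_path, out_interfaces):
        """Record the upstream SPM_PATH for the TSI and return, for each
        outgoing interface, the SPM_PATH to place in the forwarded copy."""
        self.upstream_path[tsi] = spm_path
        return {ifc: self.interface_addrs[ifc] for ifc in out_interfaces}
```

The recorded upstream address is what the element would later use to forward NAKs toward the source, while each downstream neighbor learns this element's outgoing interface address from the rewritten SPM_PATH.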

7.2. NAK Confirmations

Network elements MUST immediately transmit an NCF in response to any
unicast NAK they receive. The NCF MUST be multicast to the group on
the interface on which the NAK was received.

Nota Bene: In order to avoid creating multicast routing state for
PGM network elements across non-PGM-capable clouds, the network-
header source address of NCFs transmitted by network elements MUST
be set to the ODATA source's NLA, not the network element's NLA as
might be expected.

Network elements should be able to detect a NAK storm and adopt
countermeasures to protect the network against a denial of service.
A possible countermeasure is to send the first NCF immediately in
response to a NAK and then delay the generation of further NCFs (for
identical NAKs) by a small interval, so that identical NCFs are
rate-limited, without affecting the ability to suppress NAKs.

Simultaneously, network elements MUST establish repair state for the
NAK if such state does not already exist, and add the interface on
which the NAK was received to the corresponding repair