As per Relevance of the word sampling, we have this rfc below:











Network Working Group J.
Request for Comments: 2762
Category: Experimental H.
Columbia U
February 2000


Sampling of the Group Membership in

Status of this

This memo defines an Experimental Protocol for the
community. It does not specify an Internet standard of any kind
Discussion and suggestions for improvement are requested
Distribution of this memo is unlimited

Copyright

Copyright (C) The Internet Society (2000). All Rights Reserved



In large multicast groups, the size of the group membership
maintained by RTP (Real Time Transport Protocol) participants
become unwieldy, particularly for embedded devices with
memory and processing power. This document discusses mechanisms
sampling of this group membership table in order to reduce the
requirements. Several mechanisms are proposed, and the performance
each is considered

1

RTP, the Real Time Transport Protocol [1], mandates that RTCP
be transmitted from each participant with a period
proportional to the group size. The group size is obtained by
a table, containing an entry for each unique SSRC seen in RTP
RTCP packets. As members leave or time out, entries are deleted,
as new members join, entries are added. The table is thus
dynamic

For large multicast sessions, such as an mbone broadcast or IP-
TV distribution, group sizes can be extremely large, on the order
hundreds of thousands to millions of participants. In
environments, RTCP may not always be used, and thus the
membership table isn't needed. However, it is highly desirable
RTP to scale well for groups with one member to groups with
million members, without human intervention to "turn off" RTCP
it's no longer appropriate. This means that the same tools



Rosenberg & Schulzrinne Experimental [Page 1]

RFC 2762 RTP Sampling February 2000


systems can be used for both small conferences and TV broadcasts in
smooth, scalable fashion

Previous work [2] has identified three major scalability
with RTP. These are

1. Congestion due to floods of RTCP packets in highly dynamic groups

2. Large delays between receipt of RTCP packets from a single user

3. Large size of the group membership table

The reconsideration algorithm [2] helps to alleviate the first
these. This document addresses the third, that of large group
tables

Storage of an SSRC table with one million members, for example
requires at least four megabytes. As a result, embedded devices
small memory capacity may have difficulty under these conditions.
solve this problem, SSRC sampling has been proposed. SSRC
uses statistical sampling to obtain a stochastic estimate of
group membership. There are many issues that arise when this is done
This document reviews these issues and discusses the mechanisms
can be applied by implementors. In particular, it focuses on
methods for adapting the sampling probability as the group
varies. It is important to note that the IETF has been notified
intellectual property rights claimed in regard to some or all of
specification contained in this document, and in particular to one
the three mechanisms: the binning algorithm described below. For
information consult the online list of claimed rights. The two
approaches presented are inferior to the binning algorithm, but
included as they are believed to be unencumbered by IPR

2 Basic

The basic idea behind SSRC sampling is simple. Each
maintains a key K of 32 bits, and a mask M of 32 bits. Assume that
of the bits in the mask are 1, and the remainder are zero. When
RTCP packet arrives with some SSRC S, rather than placing it in
table, it is first sampled. The sampling is performed by ANDing
key and the mask, and also ANDing the SSRC and the mask.
resulting values are compared. If equal, the SSRC is stored in
table. If not equal, the SSRC is rejected, and the packet is
as if it has never been received

The key can be anything, but is usually derived from the SSRC of
user who is performing the sampling




Rosenberg & Schulzrinne Experimental [Page 2]

RFC 2762 RTP Sampling February 2000


This sampling process can be described mathematically as

D = (K*M == S*M

Where the * operator denotes AND and the == operator denotes a
for equality. D represents the sampling decision

According to the RTP specification, the SSRC's used by
participants are chosen randomly. If the distribution is
uniform, it is easy to see that the above filtering will cause 1
of 2**m SSRC's to be placed in the table, where m is the number
bits in the mask, M, which are one. Thus, the sampling probability
is 2**-m

Then, to obtain an actual group size estimate, L, the number
entries in the table N is multiplied by 2**m

L = N * 2**

Care must be taken when choosing which bits to set to 1 in the mask
Although the RTP specification mandates randomly chosen SSRC,
are many known implementations which do not conform to this.
particular, the ITU H.323 [3] series of recommendations allows
central control element, the gatekeeper, to assign the
significant 8 bits of the SSRC, while the most significant
randomly chosen by RTP participants

The safest way to handle this problem is to first hash the SSRC
a cryptographically secure hash, such as MD5 [4], and then choose 32
of the bits in the result as the SSRC used in the above computation
This provides much better randomness, and doesn't require
knowledge about how various implementations actually set the SSRC

2.1

The estimate is more accurate as the value of m decreases,
accurate as it increases. This can be demonstrated analytically.
the actual group size is G, the ratio of the standard deviation
mean of the estimate L (coefficient of variation) is

sqrt((2**m - 1)/G

This equation can be used as a guide for selecting the thresholds
when to change the sampling factor, as discussed below. For example
if the target is a 1% standard deviation to mean, the






Rosenberg & Schulzrinne Experimental [Page 3]

RFC 2762 RTP Sampling February 2000


probability p=2**-m should be no smaller than .5 when there are
thousand group members. More generally, to achieve a desired
deviation to mean ratio of T, the sampling probability should be
less than

p > 1 / (1 + G*(T**2))

3 Increasing the Sampling

The above simple sampling procedure would work fine if the group
was static. However, it is not. A participant joining an RTP
will initially see just one participant (themselves). As packets
received, the group size as seen by that participant will increase
To handle this, the sampling probability must be made dynamic,
will need to increase and decrease as group sizes vary

The procedure for increasing the sampling probability is easy.
participant starts with a mask with m=0. Under these conditions
every received SSRC will be stored in the table, so there
effectively no sampling. At some point, the value of m is
by one. This implies that approximately half of the SSRC already
the table will no longer match the key under the masking operation
In order to maintain a correct estimate, these SSRC must be
from the table. New SSRC are only added if they match the key
the new mask

The decision about when to increase the number of bits in the mask
also simple. Let's say an RTP participant has a memory with
capacity to store C entries in the table. The best estimate of
group is obtained by the largest sampling probability. This
means that the best estimate is obtained the fuller the table is.
reasonable approach is therefore to increase the number of bits
the mask just as the table fills to C. This will generally cause
contents to be reduced by half on average. Once the table
again, the number of bits in the mask is further increased

4 Reducing the Sampling

If the group size begins to decrease, it may be necessary to
the number of one bits in the mask. Not doing so will result
extremely poor estimates of the group size. Unfortunately,
the number of bits in the mask is more difficult than
them

When the number of bits in the mask increases, the user
by removing those SSRC which no longer match. When the number of
decreases, the user should theoretically add back those users
SSRC now match. However, these SSRC are not known, since the



Rosenberg & Schulzrinne Experimental [Page 4]

RFC 2762 RTP Sampling February 2000


point of sampling was to not have to remember them. Therefore, if
number of bits in the mask is just reduced without any changes in
membership table, the group estimate will instantly drop by
half

To compensate for this, some kind of algorithm is needed.
approaches are presented here: a corrective-factor solution, and
binning solution. The binning solution is simpler to understand
performs better. However, we include a discussion of the corrective
factor solution for completeness and comparison, and also because
is believed to be unencumbered by IPR

4.1 Corrective

The idea with the corrective factors is to take one of
approaches. In the first, a corrective factor is added to the
size estimate, and in the second, the group size estimate
multiplied by a corrective factor. In both cases, the purpose is
compensate for the change in sample mask. The corrective
should decay as the "fudged" members are eventually learned about
actually placed in the membership list

The additive factor starts at the difference between the group
estimate before and after the number of bits in the mask is reduced
and decays to 0 (this is not always half the group size estimate,
the corrective factors can be compounded, see below).
multiplicative corrective factor starts at 2, and gradually decays
one. Both factors decay over a time of cL(ts-), where c is
average RTCP packet size divided by the RTCP bandwidth for receivers
and L(ts-) is the group size estimate just before the change in
number of bits in the mask at time ts. The reason for this
is as follows. In the case where the actual group membership has
changed, those members which were forgotten will still be
RTCP packets. The amount of time it will take to hear an RTCP
from each of them is the average RTCP interval, which is cL(ts-).
Therefore, by cL(ts-) seconds after the change in the mask,
users who were fudged by the corrective factor should have sent
packet and thus appear in the table. We chose to decay both
linearly. This is because the rate of arrival of RTCP packets
linear

What happens if the number of bits in the mask is reduced once
before the previous corrective factor has expired? In that case,
compound the factors by using yet another one. Let fi() represent
ith additive correction function, and gi() the ith
correction function. If ts is the time when the number of bits in
mask is reduced, we can describe the additive correction factor as




Rosenberg & Schulzrinne Experimental [Page 5]

RFC 2762 RTP Sampling February 2000


/ 0 , t <
| ts + cL(ts-) -
fi(t) = |( L(ts-) - L(ts+)) ---------------- , ts < t < ts+cL(ts-)
| cL(ts-)
| 0 , t > ts + cL(ts-)
\



and the multiplicative factor as


/ 1 , t <
|
| ts + 2cL(ts-) -
gi(t) | ----------------- , ts < t < ts + cL(ts-)
| cL(ts-)
|
\ 1 , t > ts + cL(ts-)

Note that in these equations, L(t) denotes the group size
obtained including the corrective factors except for the new factor
ts- is the time right before the reduction in the number of bits,
ts+ the time after. As a result, L(ts-) represents the group
estimate before the reduction, and L(ts+) the estimate right after
but not including the new factor

Finally, the actual group size estimate L(t) is given by

-----
\
L(t) = / fi(t) + N*(2**m
-----


for the additive factor, and

------
| |
| |
L(t)= | | N*(2**m)*gi(t

for the multiplicative factor

Simulations showed that both algorithms performed equally well,
both tended to seriously underestimate the group size when the
membership was rapidly declining [5]. This is demonstrated in
performance data below



Rosenberg & Schulzrinne Experimental [Page 6]

RFC 2762 RTP Sampling February 2000


As an example, consider computation of the additive factor. The
size is 1000, c is 1 second, and m is two. With a mask of this size
a participant will, on average, observe 250 (N = 250) users. At t=0,
the user decides to reduce the number of bits in the mask to 1. As
result, L(0-) is 1000, and L(0+) is 500. The additive
therefore starts at 500, and decays to zero at time ts + cL(ts-) =
1000. At time 500, lets assume N has increased to 375 (this will,
average, be the case if the actual group size has not changed).
time 500, the additive factor is 250. This is added to 2**m times N
which is 750, resulting in a group size estimate of 1000. Now,
user decides to reduce the number of bits in the mask again, so
m=0. Another additive factor is computed. This factor starts
L(ts-) (which is 1000), minus L(ts+). L(ts+) is computed without
new factor; it is the first additive factor at this time (250)
2**m (1) times N (375). This is 625. As a result, the new
factor starts at 1000 - 625 (375), and decays to 0 in 1000 seconds

4.2 Binning

In order to more correctly estimate the group size even when it
rapidly decreasing, a binning algorithm can be used. The
works as follows. There are 32 bins, same as the number of bits
the sample mask. When an RTCP packet from a new user arrives
SSRC matches the key under the masking operation, it is placed in
mth bin (where m is the number of ones in the mask) otherwise it
discarded

When the number of bits in the mask is to be increased, those
in the bin who still match after the new mask are moved into the
higher bin. Those who don't match are discarded. When the number
bits in the mask is to be decreased, nothing is done. Users in
various bins stay where they are. However, when an RTCP packet for
user shows up, and the user is in a bin with a higher value than
current number of bits in the mask, it is moved into the
corresponding to the current number of bits in the mask. Finally,
group size estimate L(t) is obtained by

31
----
\
L(t) = / B(i) * 2**
----
i=0



Where B(i) are the number of users in the ith bin




Rosenberg & Schulzrinne Experimental [Page 7]

RFC 2762 RTP Sampling February 2000


The algorithm works by basically keeping the old estimate when
number of bits in the mask drops. As users arrive, they are
moved into the lower bin, reducing the amount that the higher
contributes to the total estimate. However, the old estimate is
updated in the sense that users which timeout are removed from
higher bin, and users who send BYE packets are also removed from
higher bin. This allows the older estimate to still adapt,
gradually phasing it out. It is this adaptation which makes
perform much better than the corrective algorithms. The algorithm
also extremely simple

4.3

The algorithms are all compared via simulation in Table 1. In
simulation, 10,001 users join a group at t=0. At t=10,000, 5000
them leave. At t=20,000, another 5000 leave. All implement an
sampling algorithm, unconditional forward reconsideration and
reconsideration. The table depicts the group size estimate from
20,000 to time 25,000 as seen by the single user present
the entire session. In the simulation, a memory size of 1000 SSRC
assumed. The performance without sampling, and with sampling with
additive, multiplicative, and bin-based correction are depicted

As the table shows, the bin based algorithm performs
well at capturing the group size estimate towards the tail end of
simulation

























Rosenberg & Schulzrinne Experimental [Page 8]

RFC 2762 RTP Sampling February 2000


Time No Sampling Binned Additive
---- ----------- ------ -------- --------------
20000 5001 5024 5024 5024
20250 4379 4352 4352 4352
20500 3881 3888 3900 3853
20750 3420 3456 3508 3272
21000 3018 2992 3100 2701
21250 2677 2592 2724 2225
21500 2322 2272 2389 1783
21750 2034 2096 2125 1414
22000 1756 1760 1795 1007
22250 1476 1472 1459 582
22500 1243 1232 1135 230
22750 1047 1040 807 80
23000 856 864 468 59
23250 683 704 106 44
23500 535 512 32 32
23750 401 369 24 24
24000 290 257 17 17
24250 198 177 13 13
24500 119 129 11 11
24750 59 65 8 8
25000 18 1 2 2

4.4 Sender

Care must be taken in handling senders when using SSRC sampling
Since the number of senders is generally small, and they
significantly to the computation of the RTCP interval,
should not be applied to them. However, they must be kept in
separate table, and not be "counted" as part of the general
membership. If they are counted as part of the general
membership, and are not sampled, the group size estimate will
inflated to overemphasize the senders

This is easily demonstrated analytically. Let Ns be the number
senders, and Nr be the number of receivers. The membership table
contain all Ns senders and (1/2)**m of the receivers. The total
size estimate in the current memo is obtained by 2**m times
number of entries in the table. Therefore, the group size
becomes

L(t) = (2**m) Ns +

which exponentially weights the senders






Rosenberg & Schulzrinne Experimental [Page 9]

RFC 2762 RTP Sampling February 2000


This is easily compensated for in the binning algorithm. A sender
always placed in the 0th bin. When a sender becomes a receiver, it
moved into the bin corresponding to the current value of m, if
SSRC matches the key under the masked comparison operation

5 Security

The use of SSRC sampling does not appear to introduce any
security considerations beyond those described in [1]. In fact,
sampling, as described above, can help somewhat in reducing
effect of certain attacks

RTP, when used without authentication of RTCP packets, is
to a spoofing attack. Attackers can inject many RTCP packets into
group, each with a different SSRC. This will cause RTP
to believe the group membership is much higher than it actually is
The result is that each participant will end up transmitting
packets very infrequently, if ever. When SSRC sampling is used,
problem can be amplified if a participant is not applying a hash
the SSRC before matching them against their key. This is because
attacker can send many packets, each with different SSRC, that
the key. This would cause the group size to inflate exponentially
However, with a random hash applied, an attacker cannot guess
SSRC which will match against the key. In fact, an attacker will
to send 2**m different SSRC before finding one that matches,
average. Of course, the effect of a match causes an increase of
membership by 2**m. But, the use of sampling means that an
will have to send many packets before an effect can be observed

6

The authors wish to thank Bill Fenner and Vern Paxson for
comments

7

[1] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP
a transport protocol for real-time applications", RFC 1889,
January 1996.

[2] J. Rosenberg and H. Schulzrinne, "Timer reconsideration
enhanced RTP scalability", IEEE Infocom, (San Francisco
California), March/April 1998.








Rosenberg & Schulzrinne Experimental [Page 10]

RFC 2762 RTP Sampling February 2000


[3] International Telecommunication Union, "Visual telephone
and equipment for local area networks which provide a non
guaranteed quality of service," Recommendation H.323,
Telecommunication Standardization Sector of ITU, Geneva
Switzerland, May 1996.

[4] Rivest, R., "The MD5 message-digest algorithm", RFC 1321,
1992.

[5] Rosenberg, J., "Protocols and Algorithms for
Distributed Internet Telephony," PhD Thesis, Columbia University
Dec. 1999. Work in Progress

8 Authors'

Jonathan

200 Executive
West Orange, NJ 07052


EMail: jdrosen@dynamicsoft.


Henning
Columbia
M/S 0401
1214 Amsterdam Ave
New York, NY 10027-7003


EMail: schulzrinne@cs.columbia.



















Rosenberg & Schulzrinne Experimental [Page 11]

RFC 2762 RTP Sampling February 2000


9 Full Copyright

Copyright (C) The Internet Society (2000). All Rights Reserved

This document and translations of it may be copied and furnished
others, and derivative works that comment on or otherwise explain
or assist in its implementation may be prepared, copied,
and distributed, in whole or in part, without restriction of
kind, provided that the above copyright notice and this paragraph
included on all such copies and derivative works. However,
document itself may not be modified in any way, such as by
the copyright notice or references to the Internet Society or
Internet organizations, except as needed for the purpose
developing Internet standards in which case the procedures
copyrights defined in the Internet Standards process must
followed, or as required to translate it into languages other
English

The limited permissions granted above are perpetual and will not
revoked by the Internet Society or its successors or assigns

This document and the information contained herein is provided on
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED,
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE



Funding for the RFC Editor function is currently provided by
Internet Society



















Rosenberg & Schulzrinne Experimental [Page 12]








if you see any problems within the linking, don't worry be happy,
this is version 0.1 of the Relevance System and you gotta expect some crappy subroutines sometimes,
just be content we did not write this in Java, which would have made this "bigger and better" HAHAHHA.




RFC documents can be found at I.E.T.F.



Relevance System Copyright © 2002 Spectrum WorldResearch
other technical nosh by ServerMasters Corporation
collaboration of BobX







Spectrum