Re: [R-C] Congestion Control BOF

I'm responding to the new list, but CCing people for now until they have a chance to join; if you want to continue in this discussion, join the list at http://www.alvestrand.no/mailman/listinfo/rtp-congestion

On 10/8/2011 5:13 PM, Harald Alvestrand wrote:
On 10/08/2011 04:53 PM, Randell Jesup wrote:
On 10/8/2011 7:26 AM, Harald Alvestrand wrote:
Randell, this seems like a very good start!
Two thoughts from random discussion:
- WRT timestamps, there's a header extension proposed in the TFRC-for-RTP draft that would tag each packet at sending time rather than at frame generation time. Could we switch to recommending this timestamp, and using it if it's available?
That could mess up A/V sync, I'm afraid.

How? The A/V timestamps would still be available for syncing, but we'd use the send-time timestamps to compute throughput (and then would have the ability to measure each packet's transit time individually, rather than looking at frame generation -> frame arrival timing).
Well, I'm probably being overly-worried about processing delays (and in particular differing delays for audio and video). Let's say audio gets sampled at X, and (ignoring other processing steps) takes 1ms to encode. It gets to the wire at X + <other steps> + 1. Let's say video is also sampled at X, and (ignoring other processing steps) takes 10ms to encode. It gets to the wire at X + <other steps> + 10. So we've added a 9ms offset to all our A/V sync, and in this case it's in the "wrong" direction (people are more sensitive to early-audio than early-video). And if "other steps" on each side don't balance (and they may not), it could be worse. I also worry more that in a browser, with no access to true RT_PRI processing, the delays could be significantly variable (we get preempted by some other process/thread for 10 or 20ms, etc). Also, if the receiver isn't careful it could be tricked into skipping frames it should be displaying, due to jitter in the packet-to-packet timestamps.

So perhaps I'm not being overly-worried. I realize that I'm trading off accuracy in bandwidth estimation (or if you prefer, reaction speed) for ease in getting a consistent framerate and best-possible A/V sync. In a perfect world we'd record the sampling time and the delta until it was submitted to sendto(), so we'd have both. (You could use a header extension to do that.)

Also... I've been thinking more about my two listed options for feeding the combined Kalman filter. I *think* the first option might not work in practice (treat all the packets as in one stream, and calculate arrival deltas for each packet received against the previous one). Without a hard relationship between the two timestamp clocks (in particular their relative offset), you can't calculate inter-packet send times properly. You can choose an arbitrary offset between them and update it periodically, but that will not match reality; it will be offset in one direction or the other.

This means that A,B will be biased in one direction, and B,A will be biased in the other - and A,A or B,B won't be biased. This might "balance out" and happen to work, but I'm distrustful without more info about timestamp-synchronization being passed. We should look into how well option 1 works with sync sources, because in some cases they are synchronized or could be. Ironically, Harald's suggestion above (or the header-extension variant) would help work around this.

-- Randell Jesup randell-ietf@jesup.org
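The cross-stream bias described here can be sketched numerically. In the toy example below (all clock values and offsets are invented for illustration, not from any implementation), the receiver guesses a wrong offset between the two streams' timestamp clocks; A->B send deltas then absorb the error with one sign, B->A deltas with the opposite sign, while same-stream deltas stay unbiased:

```python
# Two streams A and B share a wall clock, but their RTP timestamps carry
# different (unknown) offsets. The receiver can only guess the offset.
TRUE_OFFSET_B = 500      # B's timestamp clock really leads A's by 500 units
ASSUMED_OFFSET_B = 300   # receiver's arbitrary guess (wrong by 200)

def send_delta(prev, cur):
    """Inter-send delta computed with the receiver's assumed offset."""
    def to_common(stream, ts):
        return ts if stream == "A" else ts - ASSUMED_OFFSET_B
    return to_common(*cur) - to_common(*prev)

# Packets truly sent 10 units apart, alternating A and B; each carries its
# own stream's timestamp.
packets = [("A", 0), ("B", 10 + TRUE_OFFSET_B), ("A", 20), ("B", 30 + TRUE_OFFSET_B)]

for prev, cur in zip(packets, packets[1:]):
    print(prev[0], "->", cur[0], "delta =", send_delta(prev, cur))
# A->B deltas come out biased +200, B->A deltas biased -200, even though
# the true inter-send spacing is 10 throughout; A->A (or B->B) would be
# unbiased, as the thread notes.
```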

On Sat, Oct 8, 2011 at 10:39 PM, Randell Jesup <randell-ietf@jesup.org> wrote:
I'm responding to the new list, but CCing people for now until they have a chance to join; if you want to continue in this discussion, join the list at http://www.alvestrand.no/mailman/listinfo/rtp-congestion
On 10/8/2011 5:13 PM, Harald Alvestrand wrote:
On 10/08/2011 04:53 PM, Randell Jesup wrote:
On 10/8/2011 7:26 AM, Harald Alvestrand wrote:
Randell, this seems like a very good start!
Two thoughts from random discussion:
- WRT timestamps, there's a header extension proposed in the TFRC-for-RTP draft that would tag each packet at sending time rather than at frame generation time. Could we switch to recommending this timestamp, and using it if it's available?
That could mess up A/V sync, I'm afraid.
How? The A/V timestamps would still be available for syncing, but we'd use the send-time timestamps to compute throughput (and then would have the ability to measure each packet's transit time individually, rather than looking at frame generation -> frame arrival timing).
Well, I'm probably being overly-worried about processing delays (and in particular differing delays for audio and video). Let's say audio gets sampled at X, and (ignoring other processing steps) takes 1ms to encode. It gets to the wire at X + <other steps> + 1. Let's say video is also sampled at X, and (ignoring other processing steps) takes 10ms to encode. It gets to the wire at X + <other steps> + 10. So we've added a 9ms offset to all our A/V sync, and in this case it's in the "wrong" direction (people are more sensitive to early-audio than early-video). And if "other steps" on each side don't balance (and they may not), it could be worse. I also worry more that in a browser, with no access to true RT_PRI processing, the delays could be significantly variable (we get preempted by some other process/thread for 10 or 20ms, etc). Also, if the receiver isn't careful it could be tricked into skipping frames it should be displaying due to jitter in the packet-to-packet timestamps.
So perhaps I'm not being overly-worried. I realize that I'm trading off accuracy in bandwidth estimation (or if you prefer, reaction speed) for ease in getting a consistent framerate and best-possible A/V sync. In a perfect world we'd record the sampling time and the delta until it was submitted to sendto(), so we'd have both. (You could use a header extension to do that).
There's a lot more going on here. The algorithmic delays for audio and video will often be different, the capture delays perhaps wildly so. In addition, you won't want to just dump the video directly onto the wire - typically it will be leaked out over some interval to avoid bandwidth spikes, and the audio will have to maintain some jitter buffer to prevent underrun - so I think the encoding processing deltas will be nominal compared to the other delays in the pipeline. I think this also does illustrate why having "time-on-wire" timestamping is really useful for increasing estimation accuracy :-)
Also...
I've been thinking more about my two listed options for feeding the combined Kalman filter. I *think* the first option might not work in practice (treat all the packets as in one stream, and calculate arrival deltas for each packet received against the previous one). Without a hard relationship between the two timestamp clocks (in particular relative offset), you can't calculate inter-packet send times properly. You can choose an arbitrary offset between them and update it periodically, but that will not match reality; it will be offset in one direction or the other. This means that A,B will be biased in one direction, and B,A will be biased in the other - and A,A or B,B won't be biased. This might "balance out" and happen to work, but I'm distrustful without more info about timestamp-synchronization being passed. We should look into it and how well option 1 works with sync sources, because in some cases they are synchronized or could be.
Ironically Harald's suggestion above (or the header-extension variant) would help work around this.
-- Randell Jesup randell-ietf@jesup.org

On 10/8/2011 11:29 PM, Justin Uberti wrote:
On Sat, Oct 8, 2011 at 10:39 PM, Randell Jesup <randell-ietf@jesup.org <mailto:randell-ietf@jesup.org>> wrote:
Well, I'm probably being overly-worried about processing delays (and in particular differing delays for audio and video). Let's say audio gets sampled at X, and (ignoring other processing steps) takes 1ms to encode. It gets to the wire at X + <other steps> + 1. Let's say video is also sampled at X, and (ignoring other processing steps) takes 10ms to encode. It gets to the wire at X + <other steps> + 10. So we've added a 9ms offset to all our A/V sync, and in this case it's in the "wrong" direction (people are more sensitive to early-audio than early-video). And if "other steps" on each side don't balance (and they may not), it could be worse. I also worry more that in a browser, with no access to true RT_PRI processing, the delays could be significantly variable (we get preempted by some other process/thread for 10 or 20ms, etc). Also, if the receiver isn't careful it could be tricked into skipping frames it should be displaying due to jitter in the packet-to-packet timestamps.
So perhaps I'm not being overly-worried. I realize that I'm trading off accuracy in bandwidth estimation (or if you prefer, reaction speed) for ease in getting a consistent framerate and best-possible A/V sync. In a perfect world we'd record the sampling time and the delta until it was submitted to sendto(), so we'd have both. (You could use a header extension to do that).
There's a lot more going on here. The algorithmic delays for audio and video will often be different, the capture delays perhaps wildly so. In addition, you won't want to just dump the video directly onto the wire - typically it will be leaked out over some interval to avoid bandwidth spikes, and the audio will have to maintain some jitter buffer to prevent underrun - so I think the encoding processing deltas will be nominal compared to the other delays in the pipeline.
Sure - though you have the sampling time of the audio and video, and if you do your job right on the playback side, they'll be rock-solid synced (and that can be done even if there's static drift between the audio and video timestamp clocks). So long as you don't use time-on-wire timestamps...
I think this also does illustrate why having "time-on-wire" timestamping is really useful for increasing estimation accuracy :-)
BTW, I was serious when I said you could improve on this with an RTP header extension carrying the "time-on-the-wire" delta from sample time. However, I don't think we need it here; since it would be totally optional (and ignorable), it could be added later.

-- Randell Jesup randell-ietf@jesup.org
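No such extension was defined at the time of this thread; purely as a sketch, a sample-to-wire delta element could be carried in the RFC 5285 one-byte-header extension form (the extension ID, the 24-bit width, and the microsecond unit below are all assumptions for illustration):

```python
def pack_sendtime_delta(ext_id: int, delta_us: int) -> bytes:
    """Pack a hypothetical 'delta from sampling to sendto()' element in
    RFC 5285 one-byte-header form: 4-bit ID, 4-bit (length - 1), payload."""
    assert 1 <= ext_id <= 14                # ID 0 reserved, 15 invalid
    assert 0 <= delta_us < 1 << 24          # 24-bit microsecond delta
    header = (ext_id << 4) | (3 - 1)        # length field is data bytes - 1
    return bytes([header]) + delta_us.to_bytes(3, "big")

# The 9 ms encode delay from the audio/video example earlier in the thread:
elem = pack_sendtime_delta(ext_id=5, delta_us=9_000)
print(elem.hex())  # 52002328
```

A 4-byte element per packet is in the same ballpark as the 8-vs-12-byte overhead discussed later in the thread.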

Hi,

I want to lift an important non-technical topic for discussion. Namely, what is needed in the specifications and how to get that to happen.

It is clear that RTCWEB can't be the home to specify any new congestion control algorithms or procedures. That needs to be defined elsewhere.

Looking at what is available from a specification point of view, I think the only one that is fully specified for RTP is TFRC in DCCP.

The technical discussion appears to be desiring something else. Thus I think we need to have the discussion of how to get that written up in a specification eventually. What is the plan here? From my perspective we will need something that is acceptable to get RTCWEB's RTP usage specification through the IESG. Especially as we have identified clear security threats from not having congestion control in the browser to prevent significant overuse.

So what are our alternatives here?

1) Pick TFRC for now while developing something better? Possibly ensure that the RTP mapping gets published in a reasonable time frame.
2) Try to write clear requirements on the implementation, but no specification, and hope that goes through?
3) Develop something, and delay the publication of any part that needs this until it is done?
4) ?

Regarding 3), I don't see how that is going to complete in less than 2, more likely 3, years. Spin up a TSV WG, develop a solution, simulate and discuss corner cases for a while before getting something good enough out.

So what are your views on this issue?

Cheers

Magnus Westerlund
----------------------------------------------------------------------
Multimedia Technologies, Ericsson Research EAB/TVM
----------------------------------------------------------------------
Ericsson AB                | Phone +46 10 7148287
Färögatan 6                | Mobile +46 73 0949079
SE-164 80 Stockholm, Sweden| mailto: magnus.westerlund@ericsson.com
----------------------------------------------------------------------

On 10/10/2011 11:22 AM, Magnus Westerlund wrote:
Hi,
I want to lift an important non-technical topic for discussion. Namely, what is needed in the specifications and how to get that to happen.
It is clear that RTCWEB can't be the home to specify any new congestion control algorithms or procedures. That needs to be defined elsewhere.
Looking at what is available from a specification point of view, I think the only one that is fully specified for RTP is TFRC in DCCP.
The technical discussion appears to be desiring something else. Thus I think we need to have the discussion of how to get that written up in a specification eventually. What is the plan here? From my perspective we will need something that is acceptable to get RTCWEB's RTP usage specification through the IESG. Especially as we have identified clear security threats from not having congestion control in the browser to prevent significant overuse.

To me, this logic looks somewhat convoluted. The fact that we have security threats that can be alleviated using congestion control means, to me, that for the good of the Internet, we need to have those mechanisms in place.
The IESG are supposed to care about the Internet, not about some obscure agenda irrelevant to real life.
So what are our alternatives here?
1) Pick TFRC for now while developing something better? Possibly ensure that the RTP mapping gets published in a reasonable time frame.
2) Try to write clear requirements on the implementation, but no specification, and hope that goes through?
3) Develop something and delay the publication of any part that needs this until it is done?
4) ?
Regarding 3), I don't see how that is going to complete in less than 2, more likely 3, years. Spin up a TSV WG, develop a solution, simulate and discuss corner cases for a while before getting something good enough out.

I would vote for 2) in parallel with 3): get a set of requirements together that can be tested, and have something that minimally passes the test (and ideally works well). Then publish, and let the chips fall where they may.
I'm not willing to concede the field to the "everything takes two years" meme. In the extreme, I'd be willing to concede that "the Internet runs on internet-drafts" and spend significant time iterating over essentially the same solutions before publication. If we don't attempt to boil the ocean and achieve perfection in our requirements, I think it's possible to get there from here.
So what are your views on this issue?
Cheers
Magnus Westerlund
_______________________________________________ Rtp-congestion mailing list Rtp-congestion@alvestrand.no http://www.alvestrand.no/mailman/listinfo/rtp-congestion

On 10 Oct 2011, at 10:22, Magnus Westerlund wrote:
I want to lift an important non-technical topic for discussion. Namely, what is needed in the specifications and how to get that to happen.
It is clear that RTCWEB can't be the home to specify any new congestion control algorithms or procedures. That needs to be defined elsewhere.
Looking at what is available from a specification point of view, I think the only one that is fully specified for RTP is TFRC in DCCP.
There's a TCP-like CCID for DCCP that could be used too, if we went down that route. TFWC could be written up as a CCID fairly easily too, as it's reasonably well specified. One of the strengths of DCCP is that it made the choice of congestion control algorithm something that is negotiated at connection setup time. Whether or not we adopt DCCP as a lower-layer protocol, it'd be valuable to put in the necessary hooks to negotiate different congestion control, since whatever we develop now is unlikely to be the final answer in this space.
The technical discussion appears to be desiring something else. Thus I think we need to have the discussion of how to get that written up in a specification eventually. What is the plan here? From my perspective we will need something that is acceptable to get RTCWEB's RTP usage specification through the IESG. Especially as we have identified clear security threats from not having congestion control in the browser to prevent significant overuse.
So what are our alternatives here?
1) Pick TFRC for now while developing something better? Possibly ensure that the RTP mapping gets published in a reasonable time frame.
2) Try to write clear requirements on the implementation, but no specification, and hope that goes through?
3) Develop something and delay the publication of any part that needs this until it is done?
4) ?
Regarding 3), I don't see how that is going to complete in less than 2, more likely 3, years. Spin up a TSV WG, develop a solution, simulate and discuss corner cases for a while before getting something good enough out.
So what are your views on this issue?
Requirements, with a specification developing in parallel if none of the existing solutions are sufficient (and I agree that they're probably not suitable as they stand).

-- Colin Perkins
http://csperkins.org/

On 10/10/2011 5:22 AM, Magnus Westerlund wrote:
So what are our alternatives here?
1) Pick TFRC for now while developing something better? Possibly ensure that the RTP mapping gets published in a reasonable time frame.
2) Try to write clear requirements on the implementation, but no specification, and hope that goes through?
3) Develop something and delay the publication of any part that needs this until it is done?
4) ?
Regarding 3), I don't see how that is going to complete in less than 2, more likely 3, years. Spin up a TSV WG, develop a solution, simulate and discuss corner cases for a while before getting something good enough out.
I vote for 2, and provide a sample implementation and let people innovate. I would also support trying to standardize the sample implementation under 3 in parallel, with the understanding it will take A Long Time.

Here's a proposed stab at requirements for rtcweb implementations. As part of rtcweb, congestion control must be addressed:

1. All WebRTC media and data streams MUST be congestion-controlled.

2. The congestion algorithms used MUST cause WebRTC streams to act fairly with TCP and other congestion-controlled flows, such as DCCP and TFRC, and with other WebRTC flows. Note that WebRTC involves multiple data flows which "normally" would be separately congestion-controlled.

3. In order to support better overall user experiences and to allow applications to have better interaction with congestion control, a new AVPF feedback message [ insert name here ] shall be defined to allow reporting of total predicted bandwidth for receiving data, as opposed to TMMBR, which requests a sending rate for a single SSRC flow. [ This is roughly equivalent to b=CT:xxx ] We may want to give the estimation algorithm the option to include or exclude the data-channel bandwidth, but it SHOULD include it.

4. In order to facilitate better operation of bandwidth-estimation algorithms on the receiving side, the sending side MAY include a transmit-time RTP header extension (TBD) on some or all media streams. Note that this will add about 12 bytes to each RTP packet. An optimization may be to only include these timestamps if they deviate by more than [ some amount TBD from the running average and from the number of bytes preceding it with the same timestamp ]. This is based on the fact that for many devices, the sample->send interval is fairly consistent at the levels of accuracy needed here, and so significant bandwidth savings can be made.

5. The receiver SHOULD attempt to minimize the number of bandwidth reports when there is little or no change, while reporting quickly when there is a significant change.

6. Congestion control MUST work even if there are no media channels, or if the media channels are inactive in one or both directions.

7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media-stream end-to-end delays between the participants.

8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.

Not part of our IETF requirements: at the JS level, bandwidth changes should be reported to the application, so that it has the option to make changes that we can't make automatically, such as removing or adding a stream, or controlling the parameters of a stream (frame rate, etc). Note that if the application doesn't do anything, the automatic adaptation will still occur.

-- Randell Jesup randell-ietf@jesup.org
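Requirement 8's sender-side reaction can be sketched as a budget split. The policy below (protect a small audio budget first, give video the remainder) is purely an assumption for illustration; the requirement deliberately leaves the allocation policy to the sender, and the feedback message itself is still unnamed:

```python
def allocate(total_bps: int, audio_min=12_000, audio_max=64_000,
             data_reserved=0) -> dict:
    """Split a receiver-reported aggregate budget across audio, video, data.
    All limits are illustrative defaults, not values from the proposal."""
    # Give audio ~10% of the budget, clamped to a usable codec range.
    audio = min(audio_max, max(audio_min, total_bps // 10))
    # Reserve whatever the application asked for on the data channel.
    data = min(data_reserved, max(0, total_bps - audio))
    # Video soaks up the remainder.
    video = max(0, total_bps - audio - data)
    return {"audio": audio, "video": video, "data": data}

print(allocate(1_000_000))  # {'audio': 64000, 'video': 936000, 'data': 0}
```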

On 26/10/11 01:57 PM, Randell Jesup wrote:
Not part of our IETF requirements, at the JS level bandwidth changes should be reported to the application along so that it has the option to make changes that we can't make automatically, such as removing or adding a stream, or controlling the parameters of a stream (frame rate, etc). Note that if the application doesn't do anything, the automatic adaptation will still occur.
Web developers are also generally very interested in metrics like this, so they can keep track of call quality.

-r

Commenting on Randell's proposal, not on the alternatives, so changing subject....

On 10/26/2011 01:57 PM, Randell Jesup wrote:
I vote for 2, and provide a sample implementation and let people innovate. I would also support trying to standardize the sample implementation under 3 in parallel, with the understanding it will take A Long Time.
Here's a proposed stab at requirements for rtcweb implementations:
As part of rtcweb, congestion control must be addressed:
1. All WebRTC media and data streams MUST be congestion-controlled.
2. The congestion algorithms used MUST cause WebRTC streams to act fairly with TCP and other congestion-controlled flows, such as DCCP and TFRC, and other WebRTC flows. Note that WebRTC involves multiple data flows which "normally" would be separately congestion-controlled.
I'd use "reasonably fairly" to reduce (slightly) the chances of getting lost in the definition of "fair" and the number of angels who can dance on the head of a pin.
3. In order to support better overall user experiences and to allow applications to have better interaction with congestion control, a new AVPF feedback message [ insert
Suggest moving the part before the comma to an intro paragraph. It's an overall goal.
name here] shall be defined to allow reporting of total predicted bandwidth for receiving data, as opposed to
"reporting of a recipient's estimate of available bandwidth for receiving data"
TMMBR, which requests a sending rate for a single SSRC flow. [ This is roughly equivalent to b=CT:xxx ]
We may want to give the estimation algorithm the option to include or exclude the data-channel bandwidth, but it SHOULD include it.
4. In order to facilitate better operation of bandwidth-estimation algorithms on the receiving side, the sending side MAY include a transmit-time RTP header extension (TBD) to some or all media streams. Note that this will add about 12 bytes to each RTP packet.
8 bytes in the proposal I'm editing in another window....
An optimization may be to only include these timestamps if they deviate by more than [ some amount TBD from the running average and from the number of bytes preceding it with the same timestamp ]. This is based on the fact that for many devices, the sample->send interval is fairly consistent at the levels of accuracy needed here, and so significant bandwidth savings can be made.
It would make more sense to me to only include it if it varied more than N microseconds from (the time specified by the RTP timestamp for the frame + a constant delay). I suggest omitting this from what you propose to the larger group.
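Harald's suggested inclusion rule can be sketched in a few lines. The threshold and constant pipeline delay below are invented values (the "N microseconds" and the constant are left open in his comment):

```python
# Carry the send-time extension only when the actual send time deviates more
# than THRESHOLD_US from the time implied by the frame's RTP timestamp plus
# a constant pipeline delay. Both constants are illustrative assumptions.
THRESHOLD_US = 500
PIPELINE_DELAY_US = 9_000

def needs_send_timestamp(rtp_sample_time_us: int, actual_send_time_us: int) -> bool:
    expected = rtp_sample_time_us + PIPELINE_DELAY_US
    return abs(actual_send_time_us - expected) > THRESHOLD_US

assert not needs_send_timestamp(0, 9_200)   # within 500 us: omit extension
assert needs_send_timestamp(0, 10_000)      # 1 ms late: include extension
```

This captures the bandwidth-saving idea from the original requirement 4: when the sample->send interval is consistent, most packets need no explicit timestamp.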
5. The receiver SHOULD attempt to minimize the number of bandwidth reports when there is little or no change, while reporting quickly when there is a significant change.
6. Congestion control MUST work even if there are no media channels, or if the media channels are inactive in one or both directions.
What does "work" mean if there is no data?
7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media- stream end-to-end delays between the participants.
Not sure I understand this. If I understand it correctly, I suggest rewriting it as:

7. The congestion control algorithm SHOULD attempt to minimize the media-stream end-to-end delays between the participants, by controlling bandwidth appropriately.
8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.
Suggest "shall" -> "may".
Not part of our IETF requirements, at the JS level bandwidth changes should be reported to the application along so that it has the option to make changes that we can't make automatically, such as removing or adding a stream, or controlling the parameters of a stream (frame rate, etc). Note that if the application doesn't do anything, the automatic adaptation will still occur.
There may be adjustments that need communicating with the other end too (renegotiation). It's reasonable to assume that these are only performed when requested through the JS layer.

Hi,

Comments inline.

On Fri, Oct 28, 2011 at 00:04, Harald Alvestrand <harald@alvestrand.no> wrote:
Commenting on Randell's proposal, not on the alternatives, so changing subject....
On 10/26/2011 01:57 PM, Randell Jesup wrote:
I vote for 2, and provide a sample implementation and let people innovate. I would also support trying to standardize the sample implementation under 3 in parallel, with the understanding it will take A Long Time.
One thing about 1) (TFRC) is that it is fair, but it has rate oscillations that degrade call quality etc. While I agree 1) is sub-optimal, it depends on what we mean by fairness and how much we emphasize it. Otherwise, I agree with the process of starting on 2) now and 3) in parallel for the long term.
Here's a proposed stab at requirements for rtcweb implementations:
As part of rtcweb, congestion control must be addressed:
1. All WebRTC media and data streams MUST be congestion-controlled.
2. The congestion algorithms used MUST cause WebRTC streams to act fairly with TCP and other congestion-controlled flows, such as DCCP and TFRC, and other WebRTC flows. Note that WebRTC involves multiple data flows which "normally" would be separately congestion-controlled.
I'd use "reasonably fairly" to reduce (slightly) the chances of getting lost in the definition of "fair" and the number of angels who can dance on the head of a pin.
I agree with Harald on this.
3. In order to support better overall user experiences and to allow applications to have better interaction with congestion control, a new AVPF feedback message [ insert
Suggest moving the part before the comma to an intro paragraph. It's an overall goal.
name here] shall be defined to allow reporting of total predicted bandwidth for receiving data, as opposed to
"reporting of a recipient's estimate of available bandwidth for receiving data"
Using "data" is a bit confusing, because "data" elsewhere in the text means the data channel. Suggest: "reporting of a recipient's estimate of available bandwidth for receiving the combined media and data streams".

Just to confirm: the receiver calculates the rate per stream and then adds them up, CT_combined = R_audio + R_video + R_data, with the audio and video channels calculating the rate as described in the proposal. The sender uses the receiver's CT_combined and the current sending rate of each channel to re-allocate the distribution. Is the aim to try and get the distribution similar to what the receiver envisioned, or is the sender free to do whatever?
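The aggregation being asked about, plus one possible answer to the reallocation question, can be sketched as follows. The proportional rescaling is only one candidate sender policy (the thread leaves open whether the sender must match the receiver's per-stream split), and the rates are stand-in numbers, not estimator output:

```python
def combined_estimate(per_stream_bps: dict) -> int:
    """CT_combined = R_audio + R_video + R_data (receiver side)."""
    return sum(per_stream_bps.values())

def reallocate(ct_combined: int, current_bps: dict) -> dict:
    """One possible sender policy: scale each flow proportionally to its
    current sending rate so the total fits the reported aggregate."""
    total = sum(current_bps.values())
    return {k: v * ct_combined // total for k, v in current_bps.items()}

current = {"audio": 48_000, "video": 900_000, "data": 52_000}
target = reallocate(800_000, current)   # receiver reported 800 kbps total
print(target)  # {'audio': 38400, 'video': 720000, 'data': 41600}
```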
TMMBR, which requests a sending rate for a single SSRC flow. [ This is roughly equivalent to b=CT:xxx ]
We may want to give the estimation algorithm the option to include or exclude the data-channel bandwidth, but it SHOULD include it.
Is the data sent using the same congestion control mechanism?
4. In order to facilitate better operation of bandwidth-estimation algorithms on the receiving side, the sending side MAY include a transmit-time RTP header extension (TBD) to some or all media streams. Note that this will add about 12 bytes to each RTP packet.
8 bytes in the proposal I'm editing in another window....
I agree with the MAY as it is yet to be proven useful.
An optimization may be to only include these timestamps if they deviate by more than [ some amount TBD from the running average and from the number of bytes preceding it with the same timestamp ]. This is based on the fact that for many devices, the sample->send interval is fairly consistent at the levels of accuracy needed here, and so significant bandwidth savings can be made.
would make more sense to me to only include it if it varied more than N microseconds from (the time specified by the RTP timestamp for the frame + a constant delay). Suggest omitting this for what you propose to the larger group.
5. The receiver SHOULD attempt to minimize the number of bandwidth reports when there is little or no change, while reporting quickly when there is a significant change.
Will we propose an algorithm, or a lower or upper bound for this? For example: not quicker than once per RTT, even if the 5% rule allows it. Or: send an early report when loss, inter-packet delay, etc. exceeds a given threshold.
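The bounds being asked about can be sketched as a small gate. The 5% change threshold and once-per-RTT floor come from the question above; the state machine wrapped around them is an assumption, and the early-report-on-loss path is omitted:

```python
class ReportGate:
    """Suppress bandwidth reports for small changes; never report faster
    than once per RTT; report promptly on a significant change."""
    def __init__(self, rtt_s: float, change_frac: float = 0.05):
        self.rtt_s = rtt_s
        self.change_frac = change_frac
        self.last_sent_bps = None
        self.last_sent_at = float("-inf")

    def should_report(self, now_s: float, estimate_bps: int) -> bool:
        if self.last_sent_bps is None:
            changed = True                      # always send the first report
        else:
            changed = (abs(estimate_bps - self.last_sent_bps)
                       > self.change_frac * self.last_sent_bps)
        if changed and now_s - self.last_sent_at >= self.rtt_s:
            self.last_sent_bps = estimate_bps
            self.last_sent_at = now_s
            return True
        return False

gate = ReportGate(rtt_s=0.1)
assert gate.should_report(0.0, 1_000_000)     # first estimate: report
assert not gate.should_report(0.05, 700_000)  # big change, but < 1 RTT
assert gate.should_report(0.11, 700_000)      # past the RTT floor: report
assert not gate.should_report(0.3, 710_000)   # ~1.4% change: suppress
```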
6. Congestion control MUST work even if there are no media channels, or if the media channels are inactive in one or both directions.
What does "work" mean if there is no data?
MUST be enabled?
7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media- stream end-to-end delays between the participants.
Not sure I understand this. If I understand it, suggest to rewrite as
7. The congestion control algorithm SHOULD attempt to minimize the media-stream end-to-end delays between the participants, by controlling bandwidth appropriately.
The receiver doesn't know the end-to-end delay, the RTT is calculated at the sender. So is the sender making this decision or the receiver?
8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.
Suggest "shall" -> "may".
Does the application know a priori how much data will be sent on the data channel? Because it is likely that the sender would want to use all of the signaled bandwidth for audio and video. (Especially if the data channel is used for IM, which may be used sporadically.)
Not part of our IETF requirements, at the JS level bandwidth changes should be reported to the application along so that it has the option to make changes that we can't make automatically, such as removing or adding a stream, or controlling the parameters of a stream (frame rate, etc). Note that if the application doesn't do anything, the automatic adaptation will still occur.
There may be adjustments that need communicating with the other end too (renegotiation). It's reasonable to assume that these are only performed when requested through the JS layer.
_______________________________________________ Rtp-congestion mailing list Rtp-congestion@alvestrand.no http://www.alvestrand.no/mailman/listinfo/rtp-congestion

On 10/28/2011 6:00 AM, Varun Singh wrote:
Here's a stab at a set of proposed requirements for rtcweb implementations:
As part of rtcweb, congestion control must be addressed:
1. All WebRTC media and data streams MUST be congestion-controlled.
2. The congestion algorithms used MUST cause WebRTC streams to act fairly with TCP and other congestion-controlled flows, such as DCCP and TFRC, and other WebRTC flows. Note that WebRTC involves multiple data flows which "normally" would be separately congestion-controlled.
I'd use "reasonably fairly" to reduce (slightly) the chances of getting lost in the definition of "fair" and the number of angels who can dance on the head of a pin.
I agree with Harald on this.
I was trying to avoid defining 'fair' at all costs. Hopefully no one will want to define "reasonable"...
3. In order to support better overall user experiences and to allow applications to have better interaction with congestion control, a new AVPF feedback message [ insert
Suggest moving the part before the comma to an intro paragraph. It's an overall goal.
Sure.
name here] shall be defined to allow reporting of total predicted bandwidth for receiving data, as opposed to
"reporting of a recipient's estimate of available bandwidth for receiving data"
Using "data" is a bit confusing because "data" elsewhere in the text means the data channel.
"reporting of a recipient's estimate of available bandwidth for receiving the combined media and data streams"
Yes, thanks
Just to confirm: the receiver calculates the rate per stream and then adds them up: CT_combined = R_audio + R_video + R_data and the audio and video channels calculate the rate as described in the proposal.
Or it calculates a total rate directly - or it makes a wild-assed-guess. :-) We're not specifying how it gets these numbers here (in the normative wordings). And if running them separately, one will 'notice' congestion before the others, always, so you need to think about how you'd merge individual reports - doing as you mention (with the straight algorithm in Harald's draft) would likely lead to not going far enough down until the other channels noticed the congestion, if they did. If as I suggested you use the 'slope' of the incoming packet rate to estimate the amount under/over bandwidth, even with Harald's algorithm, you could apply that slope correction factor across all the estimates. Without it, you could apply a correction factor based on recent state changes in the individual readings; more complex than just adding them.
The sender uses the receiver CT_combined and the current sending rate of each channel to re-allocate the distribution. Is the aim to try and get the distribution similar to what the receiver envisioned or is the sender free to do whatever?
In this case the sender has no idea what the receiver envisioned - but even if it did, I'd say it's free to do whatever. If the receiver wants to control individual channels more (no guarantees), it should use individual TMMBR reports.
TMMBR, which requests a sending rate for a single SSRC flow. [ This is roughly equivalent to b=CT:xxx ]
We may want to give the estimation algorithm the option to include or exclude the data-channel bandwidth, but it SHOULD be included.
Is the data sent using the same congestion control mechanism?
We would like it to be included or at least interactive with the algorithm, but that's not currently required and we can't guarantee it will happen. SCTP does allow for variant CC algorithms, so we certainly have the option if we can work it out.
4. In order to facilitate better operation of bandwidth-estimation algorithms on the receiving side, the sending side MAY include a transmit-time RTP header extension (TBD) to some or all media streams. Note that this will add about 12 bytes to each RTP packet.
8 bytes in the proposal I'm editing in another window....
I agree with the MAY as it is yet to be proven useful.
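The exact on-the-wire format of the transmit-time extension is still TBD in the drafts under discussion, so the following is purely illustrative: a sketch of packing a send timestamp as a one-byte-header RTP extension element (RFC 5285 layout), assuming a hypothetical 24-bit 6.18 fixed-point seconds field. The extension ID and field width are assumptions, not taken from any draft.

```python
import struct

def pack_send_time_ext(ext_id, send_time_secs):
    """Pack a send timestamp as a one-byte-header RTP extension element.

    Assumes a 24-bit 6.18 fixed-point seconds value (wraps every 64 s);
    the ID and field width are illustrative, not taken from the draft.
    """
    fixed = int(send_time_secs * (1 << 18)) & 0xFFFFFF  # 6.18 fixed point
    # One-byte header: 4-bit ID, 4-bit length field = payload bytes - 1.
    header = ((ext_id & 0x0F) << 4) | (3 - 1)
    return struct.pack('!B', header) + fixed.to_bytes(3, 'big')

def unpack_send_time_ext(data):
    """Inverse of pack_send_time_ext; returns (ext_id, seconds)."""
    header = data[0]
    ext_id = header >> 4
    fixed = int.from_bytes(data[1:4], 'big')
    return ext_id, fixed / (1 << 18)
```

Note the element itself is only 4 bytes here; the 8-vs-12-byte figures above presumably also count the extension-header overhead on packets that carry no other extensions.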
An optimization may be to only include these timestamps if they deviate by more than [ some amount TBD from the running average and from the number of bytes preceding it with the same timestamp ]. This is based on the fact that for many devices, the sample->send interval is fairly consistent at the levels of accuracy needed here, and so significant bandwidth savings can be made.
would make more sense to me to only include it if it varied more than N microseconds from (the time specified by the RTP timestamp for the frame + a constant delay). Suggest omitting this for what you propose to the larger group.
Ok, but we'll need to resolve this issue later. I'm fine with not mentioning optimizations here.
5. The receiver SHOULD attempt to minimize the number of bandwidth reports when there is little or no change, while reporting quickly when there is a significant change.
Will we propose an algorithm or lower or upper-bound for this? like: not quicker than once per RTT even if the 5% rule allows it. Or send an early report when loss, inter packet delay, etc exceeds a given threshold.
My inclination would be to not put any limit on it. Even the RTT thing isn't necessarily a good idea; I may have a more accurate estimate shortly after sending a preliminary report. I wouldn't (as a sender) increase my sending rate more than once per RTT (actually probably slower than that), but I might decrease it in less than an RTT - think satellite with 1s RTT...
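A sketch of the asymmetric throttling described above: decreases go out immediately, increases are paced. The 5% figure comes from the earlier discussion; the class shape and the pacing interval for increases are illustrative assumptions, not proposed normative behavior.

```python
class ReportThrottle:
    """Decide when to emit a receiver bandwidth report.

    Decreases are reported immediately (congestion must be acted on
    quickly); increases are reported at most once per min_interval and
    only when the estimate grew by more than rel_change.
    """

    def __init__(self, rel_change=0.05, min_interval=1.0):
        self.rel_change = rel_change      # the 5% rule from the discussion
        self.min_interval = min_interval  # hypothetical pacing for increases
        self.last_reported = None
        self.last_time = None

    def should_report(self, estimate, now):
        if self.last_reported is None:
            self.last_reported, self.last_time = estimate, now
            return True
        if estimate < self.last_reported:          # decrease: report at once
            self.last_reported, self.last_time = estimate, now
            return True
        grown = estimate > self.last_reported * (1 + self.rel_change)
        if grown and now - self.last_time >= self.min_interval:
            self.last_reported, self.last_time = estimate, now
            return True
        return False
```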
6. Congestion control MUST work even if there are no media channels, or if the media channels are inactive in one or both directions.
What does "work" mean if there is no data?
MUST be enabled?
6. Data channels MUST be congestion-controlled even if there are no media channels, or if the media channels are inactive in one or both directions.
7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media- stream end-to-end delays between the participants.
Not sure I understand this. If I understand it, suggest to rewrite as
7. The congestion control algorithm SHOULD attempt to minimize the media-stream end-to-end delays between the participants, by controlling bandwidth appropriately.
The receiver doesn't know the end-to-end delay, the RTT is calculated at the sender. So is the sender making this decision or the receiver?
Almost by definition the sender. The receiver is reporting data for the sender to use. (You can consider the receiver part of the algorithm, of course.)
8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.
Suggest "shall" -> "may".
For any CC algorithm to work, it really *should* try. It may not be able to comply, the algorithm may include some smoothing, and the sender may have additional information, so it's not MUST, but wouldn't MAY be too weak?
Does the application know a priori how much data would be sent on the data channel? Because it is likely that the sender would want to use all of the signaled bandwidth for audio and video (especially if the data channel is used for IM, which may be used sporadically).
It may or may not know ahead of time. It controls how much data IS sent, however.
Not part of our IETF requirements, but at the JS level bandwidth changes should be reported to the application so that it has the option to make changes that we can't make automatically, such as removing or adding a stream, or controlling the parameters of a stream (frame rate, etc). Note that if the application doesn't do anything, the automatic adaptation will still occur.
There may be adjustments that need communicating with the other end too (renegotiation). It's reasonable to assume that these are only performed when requested through the JS layer.
-- Randell Jesup randell-ietf@jesup.org

Rest of comments look fine, I think.... On 10/28/2011 06:18 AM, Randell Jesup wrote:
8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.
Suggest "shall" -> "may".
For any CC algorithm to work, it really *should* try. It may not be able to comply, the algorithm may include some smoothing, and the sender may have additional information, so it's not MUST, but wouldn't MAY be too weak?
I was thinking that in some applications over some codecs, it may simply drop packets, layers or channels, without adjusting any parameters. Suggest that we insert ", for instance" in front of "by adjusting" - so that it becomes "the sender shall attempt to comply with the overall bandwidth requirements, for instance by adjusting parameters it can control".

8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements, for instance by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.

This should be OK for the first cut - long term, we have to discuss what "comply" means ... whether a leaky bucket algorithm (which would permit some over-bandwidth burstiness) is an appropriate model, and if so, how deep the bucket should be, or whether that's a tunable parameter that can be freely set by the implementation within some boundaries, or...... for now, let's get a simple statement that is understandable to the main list.
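For concreteness, a minimal sketch of the leaky-bucket reading of "comply" mentioned above, in token-bucket form. The rate and depth values are illustrative tunables; whether this is even the right model is exactly the open question.

```python
class TokenBucket:
    """Token-bucket compliance check for the sender.

    rate is the reported bandwidth (bytes/sec); depth bounds how much
    over-bandwidth burstiness is permitted.  Both would be tunable, and
    the model itself is only one candidate, per the discussion.
    """

    def __init__(self, rate, depth):
        self.rate, self.depth = rate, depth
        self.tokens = depth   # start with a full bucket
        self.last = 0.0

    def allow(self, nbytes, now):
        # Refill tokens for elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False
```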

Hi, Comments inline, apologies for the delay in response. On Fri, Oct 28, 2011 at 16:18, Randell Jesup <randell-ietf@jesup.org> wrote:
On 10/28/2011 6:00 AM, Varun Singh wrote:
Just to confirm: the receiver calculates the rate per stream and then adds them up: CT_combined = R_audio + R_video + R_data and the audio and video channels calculate the rate as described in the proposal.
Or it calculates a total rate directly - or it makes a wild-assed-guess. :-) We're not specifying how it gets these numbers here (in the normative wordings). And if running them separately, one will 'notice' congestion before the others, always, so you need to think about how you'd merge individual reports - doing as you mention (with the straight algorithm in Harald's draft) would likely lead to not going far enough down until the other channels noticed the congestion, if they did. If as I suggested you use the 'slope' of the incoming packet rate to estimate the amount under/over bandwidth, even with Harald's algorithm, you could apply that slope correction factor across all the estimates. Without it, you could apply a correction factor based on recent state changes in the individual readings; more complex than just adding them.
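A sketch contrasting the two merge strategies discussed here: the naive per-stream sum versus applying a slope-based correction factor across the whole estimate. The function shape and the slope convention are assumptions for illustration only; nothing here is normative.

```python
def combine_estimates(per_stream, slopes=None):
    """Merge per-stream receive-rate estimates into one CT figure.

    per_stream: dict of stream name -> estimated rate (bps).
    slopes: optional dict of stream name -> relative over-bandwidth
            slope (e.g. +0.1 means the inter-arrival spacing suggests
            that stream's path is ~10% over capacity).

    Naively, CT_combined = R_audio + R_video + R_data.  As discussed,
    one stream usually notices congestion first; if a slope signal is
    available, apply the worst (most congested) slope as a correction
    across the whole sum rather than trusting each stream's own
    estimate in isolation.
    """
    total = sum(per_stream.values())
    if slopes:
        worst = max(slopes.values())      # largest over-bandwidth signal
        if worst > 0:
            total = total / (1.0 + worst)
    return total
```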
The sender uses the receiver CT_combined and the current sending rate of each channel to re-allocate the distribution. Is the aim to try and get the distribution similar to what the receiver envisioned or is the sender free to do whatever?
In this case the sender has no idea what the receiver envisioned - but even if it did, I'd say it's free to do whatever. If the receiver wants to control individual channels more (no guarantees), it should use individual TMMBR reports.
So is using individual TMMBR reports acceptable? If it is possible, then the above text should reflect that. If not, then the above text is okay.
TMMBR, which requests a sending rate for a single SSRC flow. [ This is roughly equivalent to b=CT:xxx ]
We may want to give the estimation algorithm the option to include or exclude the data-channel bandwidth, but it SHOULD be included.
Is the data sent using the same congestion control mechanism?
We would like it to be included or at least interactive with the algorithm, but that's not currently required and we can't guarantee it will happen. SCTP does allow for variant CC algorithms, so we certainly have the option if we can work it out.
Alright.
5. The receiver SHOULD attempt to minimize the number of bandwidth reports when there is little or no change, while reporting quickly when there is a significant change.
Will we propose an algorithm or lower or upper-bound for this? like: not quicker than once per RTT even if the 5% rule allows it. Or send an early report when loss, inter packet delay, etc exceeds a given threshold.
My inclination would be to not put any limit on it. Even the RTT thing isn't necessarily a good idea; I may have a more accurate estimate shortly after sending a preliminary report. I wouldn't (as a sender) increase my sending rate more than once per RTT (actually probably slower than that), but I might decrease it in less than an RTT - think satellite with 1s RTT...
I agree that to update the preliminary report, the endpoint will have to send it in less than an RTT.
6. Congestion control MUST work even if there are no media channels, or if the media channels are inactive in one or both directions.
What does "work" mean if there is no data?
MUST be enabled?
6. Data channels MUST be congestion-controlled even if there are no media channels, or if the media channels are inactive in one or both directions.
Ok.
7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media- stream end-to-end delays between the participants.
Not sure I understand this. If I understand it, suggest to rewrite as
7. The congestion control algorithm SHOULD attempt to minimize the media-stream end-to-end delays between the participants, by controlling bandwidth appropriately.
The receiver doesn't know the end-to-end delay, the RTT is calculated at the sender. So is the sender making this decision or the receiver?
Almost by definition the sender. The receiver is reporting data for the sender to use. (You can consider the receiver part of the algorithm, of course.)
Probably adding "sending-side" before "congestion control" would clarify it.
8. When receiving a [ insert new AVPF message here ], the sender shall attempt to comply with the overall bandwidth requirements by adjusting parameters it can control, such as codec bitrates and modes, and how much data is sent on the data channels.
Suggest "shall" -> "may".
For any CC algorithm to work, it really *should* try. It may not be able to comply, the algorithm may include some smoothing, and the sender may have additional information, so it's not MUST, but wouldn't MAY be too weak?
Does the application know a priori how much data would be sent on the data channel? Because it is likely that the sender would want to use all of the signaled bandwidth for audio and video (especially if the data channel is used for IM, which may be used sporadically).
It may or may not know ahead of time. It controls how much data IS sent, however.
Okay. -- http://www.netlab.tkk.fi/~varun/

Changing the subject again, since I'm diving into one little corner of the thread... On 10/28/2011 03:00 AM, Varun Singh wrote:
7. The congestion control algorithm SHOULD attempt to keep the total bandwidth controlled so as to minimize the media- stream end-to-end delays between the participants.
Not sure I understand this. If I understand it, suggest to rewrite as
7. The congestion control algorithm SHOULD attempt to minimize the media-stream end-to-end delays between the participants, by controlling bandwidth appropriately.

The receiver doesn't know the end-to-end delay; the RTT is calculated at the sender. So is the sender making this decision or the receiver?

We shouldn't make this decision as part of the requirements list, but:
It's not necessary to know the absolute value of something in order to attempt to minimize it; the delay-based algorithm is jiggling things around and seeing if the sender->receiver end-to-end delay increases or decreases. There's a floor somewhere based on speed of light in fiber, distance and clocking intervals, but it's not necessary to know what that floor is in order to try to approach it. I would argue that RTT is in fact a distraction for this optimization; the time it takes packets to go from receiver to sender does not give any information about the one-way end-to-end delay from sender to receiver. (Imagine using this with a typical cable TV network with a fat, uncongested downlink and an anemic, highly congested uplink - if optimizing a downlink stream based on RTT measurements, we would experience wild swings in RTT because of the upstream congestion delaying the feedback packets, but tweaking the sending rate based on this information would downgrade, not improve, the experience.)
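To illustrate the point that the trend, not the absolute value, is what matters, here is a bare-bones sketch (deliberately not the Kalman-filter formulation from Harald's draft): the unknown sender-to-receiver clock offset cancels when you difference consecutive packets' (arrival - send) times, so a persistent positive trend reveals a filling queue without ever measuring RTT.

```python
def delay_trend(send_times, recv_times):
    """Estimate whether one-way queuing delay is growing.

    The absolute sender->receiver offset is unknown (unsynchronized
    clocks), but it cancels when differencing: for consecutive packets,
    d_i = (recv_i - recv_{i-1}) - (send_i - send_{i-1}).  A persistently
    positive sum means the bottleneck queue is filling; RTT is never
    needed.  This is a toy illustration, not the draft's algorithm.
    """
    deltas = [
        (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        for i in range(1, len(send_times))
    ]
    return sum(deltas)
```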
participants (7)
- Colin Perkins
- Harald Alvestrand
- Justin Uberti
- Magnus Westerlund
- Ralph Giles
- Randell Jesup
- Varun Singh