
Randell-

Yup, this all makes sense. Regarding Netflix and other ABR flows... I would add that the increasing prevalence of bursty Adaptive BitRate video (HTTP get-get-get of 2-10 second chunks of video) makes the detection of spare link capacity and/or cross traffic much more difficult. The ABR traffic pattern boils down to a square wave: a totally saturated last mile for a few seconds followed by an idle link for a few seconds. The square wave actually has the TCP sawtooth modulated on top of it, so there are secondary effects. Throw in a few instances of ABR video on a given last mile and things get very interesting.

The solution to this problem is not in scope for the RTCWeb/RTP work, but I sure wish that the ABR folks would find a way to smooth out their flows. We have been knocking around some ideas in this area in other discussions, so if anybody is interested in this please drop me a note.

bvs

-----Original Message-----
From: rtp-congestion-bounces@alvestrand.no [mailto:rtp-congestion-bounces@alvestrand.no] On Behalf Of Randell Jesup
Sent: Friday, May 04, 2012 12:33 PM
To: rtp-congestion@alvestrand.no
Subject: Re: [R-C] Packet loss response - but how?

On 5/4/2012 9:50 AM, Bill Ver Steeg (versteb) wrote:
> The RTP timestamps are certainly our friends.
> I am setting up to run some experiments with the various common buffer management algorithms to see what conclusions can be drawn from inter-packet arrival times. I suspect that the results will vary wildly from the RED-like algorithms to the more primitive tail-drop-like algorithms. In the case of RED-like algorithms, we will hopefully not get too much delay/bloat before the drop event provides a trigger. For the tail-drop-like algorithms, we may have to use the increasing delay/bloat trend as a trigger.
This would match my experience. As mentioned, I found that access-link congestion loss (especially when not competing with sustained TCP flows, which is pretty normal for home use, especially if no one is watching Netflix...) results in a sawtooth delay with losses at the delay drops. This also happens (with more noise and often a faster ramp) when competing, especially when competing with small numbers of flows. Not really unexpected. As this sort of drop corresponds to a full buffer, it's pretty much a 'red flag' for a realtime flow.

RED drops I found to be more useful in avoiding delay (of course). My general mechanism was to drop the transmission rate (bandwidth estimate) by an amount proportional to the drop rate, with tail-queue-type drops (sawtooth) causing much sharper bandwidth drops. I simply assume all drops are in some way related to congestion.
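A minimal sketch of the kind of response described above: cut the bandwidth estimate in proportion to the observed loss rate, with a much sharper cut when the losses look like tail-drop (the sawtooth "red flag"). The specific factors (0.5 vs. 2.0, the 10% floor) are illustrative assumptions, not the actual constants from any implementation discussed here.

```python
def respond_to_loss(bw_estimate_kbps, loss_rate, tail_drop_suspected):
    """Return a reduced bandwidth estimate after a loss report.

    loss_rate is the fraction of packets lost in the reporting interval.
    Tail-drop-style (sawtooth) losses get roughly 4x the reduction of
    RED-style/random losses, since they indicate a full buffer.
    """
    factor = 2.0 if tail_drop_suspected else 0.5      # illustrative constants
    reduction = min(0.9, factor * loss_rate)           # never cut below 10%
    return bw_estimate_kbps * (1.0 - reduction)
```

For example, at a 5% loss rate a 1000 kbps estimate drops to 975 kbps for a RED-like drop, but to 900 kbps when the sawtooth pattern suggests a full tail-drop queue.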
> As I think about the LEDBAT discussions, I am concerned about the interaction between the various algorithms - but some data should be informative.
Absolutely.
> We may even be able to differentiate between error-driven loss and congestion driven loss, particularly if the noise is on the last hop of the network and thus downstream of the congested queue (which is typically where the noise occurs). In my tiny brain, you should be able to see a gap in the time record corresponding to a packet that was dropped due to last-mile noise. A packet dropped in the queue upstream of the last mile bottleneck would not have that type of time gap. You do need to consider cross traffic in this thought exercise, but statistical methods may be able to separate persistent congestion from persistent noise-driven loss.
Exactly the mechanism I used to differentiate "fishy" losses from "random" ones; "fishy" losses, as mentioned, cause bigger responses. I still dropped bandwidth on "random" drops, which can be congestion drops from RED in a core router so long as the router queue isn't too long. You'd also see those from "minimal queue" tail-drop routers. I used a separate jitter buffer for determining losses (and for my filter info), apart from the normal jitter buffer, which, being adaptive, might not hold the data long enough for me. I actually kept around a second of delay/loss data on the video channel, and *if there was no loss* or large delay ramp I only reported stats every second or two.
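A hypothetical sketch of the gap test described above, not taken from any actual implementation: a packet dropped by last-mile noise leaves a hole in the arrival-time record, because the bottleneck already spent a serialization slot on it before it was lost; a packet dropped in a queue upstream of the bottleneck does not, so the survivors arrive back to back. The helper name and the slack factor are illustrative assumptions.

```python
def classify_loss(arrivals, expected_spacing_ms, slack=0.5):
    """Classify a single packet loss from arrival times around it.

    arrivals: list of (seq, arrival_ms) pairs for the packets bracketing
    the loss. expected_spacing_ms is the per-packet serialization/pacing
    interval at the bottleneck.

    Returns 'noise' if the arrival gap spans the missing packet's slot
    (loss downstream of the queue), 'queue' if the surviving packets
    arrived back to back (loss upstream of the bottleneck).
    """
    for (s0, t0), (s1, t1) in zip(arrivals, arrivals[1:]):
        if s1 - s0 == 2:  # exactly one sequence number missing here
            gap = t1 - t0
            # Noise drop: gap ~ 2x spacing, since the lost packet's slot
            # was consumed at the bottleneck before the packet vanished.
            if gap > (2 - slack) * expected_spacing_ms:
                return 'noise'
            return 'queue'
    return 'no-loss'
```

As the original text notes, cross traffic perturbs these gaps, so in practice this would be a statistical judgment over many losses rather than a per-packet verdict.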
> TL;DR - We can probably tell that we have queues building prior to the actual loss event, particularly when we need to overcome limitations of poor buffer management algorithms.
If the queues are large enough, or if the over-bandwidth is low enough, yes. If there's a heavy burst of traffic (think modern browsers minimizing pageload time to sharded servers), then you may not get a chance; you may go from no delay to 200ms of taildrop delay in an RTT or two - or even between two 20 or 30ms packets. And you need to filter enough to decide whether it's jitter or delay. (You can make an argument that 'jitter' is really the sum of deltas in path queues, but that doesn't help you much in deciding whether to react to it or not.) Data would be useful... :-)

Generally, for realtime media you really want to be undershooting slightly most of the time in order to make sure the queues stay at/near 0. The more uncertainty you have, the more you want to undershoot. A stable delay signal makes it fairly safe to probe for additional bandwidth, because you'll get a quick response; and if the probe is a "small" step relative to current bandwidth, then the time to recognize the filtered delay signal, inform the other side, and have them adapt is short (roughly filter delay + RTT + encoding delay). High jitter can be the result of wireless or cross-traffic, unfortunately.

Also, especially near startup, my equivalent of slow-start was much more aggressive initially to find the safe point, but with each overshoot (and drop back below the apparent rate) in the same bandwidth range I would reduce the magnitude of the next probes until we had pretty much determined the safe rate. This is most effective in finding the channel bandwidth when there's no significant sustained competing traffic on the bottleneck link. If I believed I'd found the channel bandwidth, I would remember it, and be much less likely to probe over that limit, though I would do so occasionally to see if there had been a change. This allowed for faster recovery from short-duration competing traffic (the most common case) without overshooting the channel bandwidth.
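The startup probing heuristic above can be sketched roughly as follows. This is a hypothetical illustration, not Randell's actual code: probe upward aggressively, and after each overshoot in the same bandwidth range drop back below the apparent rate and halve the next probe step until the step is negligible, at which point the apparent rate is remembered as the channel limit. The class name and all constants (initial 50% step, 2% cutoff, 10% back-off) are assumptions for the sketch.

```python
class BandwidthProber:
    """Aggressive-then-cautious startup probing for a realtime flow."""

    def __init__(self, start_kbps, initial_step=0.5, min_step=0.02):
        self.rate = start_kbps
        self.step = initial_step   # probe size as a fraction of current rate
        self.min_step = min_step
        self.safe_rate = None      # remembered channel bandwidth, once found

    def next_probe(self):
        """Rate to attempt on the next probe."""
        return self.rate * (1.0 + self.step)

    def on_probe_result(self, overshoot, apparent_rate_kbps):
        if overshoot:
            # Drop back below the apparent rate and shrink future probes,
            # so repeated overshoots in the same range converge quickly.
            self.rate = apparent_rate_kbps * 0.9
            self.step = max(self.min_step, self.step / 2)
            if self.step == self.min_step:
                # Probes have converged: treat this as the channel limit
                # and only rarely probe above it afterward.
                self.safe_rate = apparent_rate_kbps
        else:
            self.rate = self.next_probe()
```

As the text notes, remembering the converged rate makes recovery from short-lived competing traffic fast without repeatedly overshooting the channel bandwidth; and the better the queue-detection signal, the less this heuristic is needed.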
Note that the more effective your queue detection logic is, the less you need that sort of heuristic; it may have been overkill on my part.

--
Randell Jesup
randell-ietf@jesup.org

_______________________________________________
Rtp-congestion mailing list
Rtp-congestion@alvestrand.no
http://www.alvestrand.no/mailman/listinfo/rtp-congestion