Bufferbloat is a drag, indeed. The way I see it, a delay-sensing CC is bound to lose the battle against TCP flows over "bufferbloated" devices, at least in the steady-state case (i.e., persistent TCP flows). Otherwise we will have created an algorithm that may just as well fill up the buffers for itself, even in a situation without competing flows. I'm pessimistic that there is any way around this other than QoS.

For the transient case that you are describing here, I think that we still cannot do much about the actual filling of the buffers; we are not to blame for it. But as you point out, we can probably do some more about how to respond when the transient cross-traffic hits. I'm not sure how, though.

/Henrik



On Wed, Oct 12, 2011 at 7:12 AM, Randell Jesup <randell-ietf@jesup.org> wrote:
Jim:  We're moving this discussion to the newly-created mailing sub-list -
  Rtp-congestion@alvestrand.no
  http://www.alvestrand.no/mailman/listinfo/rtp-congestion

If you'd like to continue this discussion (and I'd love you to do so), please join the mailing list.  (Patrick, you may want to join too and read the very small backlog of messages (perhaps 10 so far)).

On 10/11/2011 4:17 PM, Jim Gettys wrote:
On 10/11/2011 03:11 AM, Henrik Lundin wrote:


I do not agree with you here. When an over-use is detected, we propose
to measure the /actual/ throughput (over the last 1 second), and set
the target bitrate to beta times this throughput. Since the measured
throughput is a rate that evidently was feasible (at least during that
1 second), any beta < 1 should ensure that the buffers get drained,
but of course at different rates depending on the magnitude of beta.
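
A minimal sketch of this backoff rule (for illustration only; this is not
the actual GIPS/Google code, and the 1-second window and beta = 0.9 below
are assumed values):

#include <cstddef>
#include <cstdint>
#include <deque>

// Tracks bytes received over a sliding 1-second window so that, on an
// over-use signal, the target can be set to beta * measured throughput.
class ReceiveRateTracker {
 public:
  void OnPacket(int64_t now_ms, std::size_t bytes) {
    window_.push_back({now_ms, bytes});
    Trim(now_ms);
  }

  // Throughput actually received over the last second, in bits per second.
  double BitrateBps(int64_t now_ms) {
    Trim(now_ms);
    std::size_t total_bytes = 0;
    for (const auto& p : window_) total_bytes += p.bytes;
    return total_bytes * 8.0 * 1000.0 / kWindowMs;
  }

 private:
  struct Packet { int64_t arrival_ms; std::size_t bytes; };

  void Trim(int64_t now_ms) {
    while (!window_.empty() &&
           now_ms - window_.front().arrival_ms > kWindowMs)
      window_.pop_front();
  }

  static constexpr int64_t kWindowMs = 1000;  // "the last 1 second"
  std::deque<Packet> window_;
};

// On over-use: any beta < 1 cuts the target below a rate that was evidently
// feasible, so the bottleneck queue should drain (faster for smaller beta).
double TargetOnOveruse(ReceiveRateTracker& tracker, int64_t now_ms,
                       double beta = 0.9) {
  return beta * tracker.BitrateBps(now_ms);
}
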
Take a look at the data from the ICSI netalyzr: you'll find scatter
plots at:

http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/

Note the different coloured lines.  They represent the amount of
buffering measured in the broadband edge in *seconds*.  Also note that
for various reasons, the netalyzr data is actually likely
underestimating the problem.

Understood.  Though that's not directly relevant to this problem, since the congestion-control mechanisms we're using/designing here are primarily buffer-sensing algorithms that attempt to keep the buffers in a drained state.  If there's no competing traffic at the bottleneck, they're likely to do so fairly well, though more simulation and real-world tests are needed.  I'll note that several organizations (Google/GIPS, Radvision, and my old company WorldGate) have found that these types of congestion-control algorithms are quite effective in practice.

However, it isn't irrelevant to the problem either:

This class of congestion-control algorithms is subject to "losing" if faced with a sustained high-bandwidth TCP flow like some of your tests, since they back off while TCP isn't yet seeing any restriction (loss).  Eventually TCP will fill the buffers.

More importantly, perhaps, bufferbloat combined with the high 'burst' nature of browser network systems (and websites) optimizing for page-load time means you can get a burst of data at a congestion point that isn't normally the bottleneck.

The basic scenario goes like this:

1. Established UDP flow running near the bottleneck limit on the
  far-end upstream link.
2. Near-end browser (or browser on another machine in the same house)
  initiates a page-load.
3. Near-end browser opens "many" TCP connections to the site and
  other sites that serve pieces (ads, images, etc.) of the page.
4. Rush of response data saturates the downstream link to the
  near-end, which was not previously the bottleneck.  Due to
  bufferbloat, this can cause a significant amount of data to be
  temporarily buffered, delaying competing UDP data significantly
  (tenths of a second, perhaps >1 second in some cases).  This is
  hard to model accurately; real-world tests are important.
5. Congestion-control algorithm notices the transition to buffer-
  induced delay and tells the far side to back off (a rough sketch
  of this detection step follows the list).  The latency of this
  decision may help us avoid over-reacting, since we have to see
  increasing delay, which takes a number of packets (at least
  1/10 second, and easily more).  Also, the above "inrush"/
  pageload-induced latency may not trigger the congestion
  mechanisms we discuss here, as we might see a BIG jump in delay
  followed by steady delay or a ramp down (if the buffer has
  suddenly jumped from drained to full, all it can do is stay
  stable or drain).
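
Rough sketch of the detection in step 5, assuming an accumulate-and-
threshold scheme over the delay gradient; the ~100 ms minimum observation
window and 25 ms threshold are illustrative assumptions, not values from
any shipped implementation:

#include <cstdint>

// Compares the spacing of packet arrivals to the spacing of their send
// times; a sustained positive trend means a queue is building somewhere
// on the path.
class DelayGradientDetector {
 public:
  // Feed one packet's send and arrival timestamps (ms).  Returns true once
  // enough delay growth has accumulated to call it over-use; the caller
  // should Reset() after acting on the signal.
  bool OnPacket(int64_t send_ms, int64_t arrival_ms) {
    if (have_prev_) {
      // Positive d => this packet was delayed more than the previous one.
      double d = static_cast<double>((arrival_ms - prev_arrival_ms_) -
                                     (send_ms - prev_send_ms_));
      accumulated_delay_ms_ += d;
      if (accumulated_delay_ms_ < 0) accumulated_delay_ms_ = 0;
      observation_ms_ += arrival_ms - prev_arrival_ms_;
    }
    have_prev_ = true;
    prev_send_ms_ = send_ms;
    prev_arrival_ms_ = arrival_ms;
    return observation_ms_ >= kMinObservationMs &&
           accumulated_delay_ms_ >= kOveruseThresholdMs;
  }

  void Reset() {
    accumulated_delay_ms_ = 0;
    observation_ms_ = 0;
  }

 private:
  static constexpr int64_t kMinObservationMs = 100;    // "at least 1/10 second"
  static constexpr double kOveruseThresholdMs = 25.0;  // assumed
  bool have_prev_ = false;
  int64_t prev_send_ms_ = 0;
  int64_t prev_arrival_ms_ = 0;
  double accumulated_delay_ms_ = 0;
  int64_t observation_ms_ = 0;
};

Note that the "big jump then flat" case above contributes only one large
sample to such a detector, so it may indeed never cross the threshold.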

Note that Google's current algorithm (which you comment on above) uses recent history for choosing the reduction; in this case it's hard to say what the result would be.  If it invokes the backoff at the start of the pageload, then the bandwidth received recently is the current bandwidth, so the new bandwidth is the current bandwidth minus small_delta.  If it happens after data has queued behind the burst of TCP traffic, then by the time the backoff is generated we'll have gotten almost no data through "recently", and we may back off all the way to the minimum bandwidth; an over-reaction, depending on the time constant and on how fast that burst can fill the downstream buffers.
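
To make the two outcomes concrete, here's a small sketch assuming a
backoff of the form new_target = beta * recently_received_bitrate with an
assumed minimum-bitrate floor (the constants are made up, not taken from
the actual implementation):

#include <algorithm>
#include <cstdio>

// If the backoff fires before the TCP burst queues ahead of us, the recent
// receive rate is still roughly the sending rate, so the cut is small.  If
// it fires after our packets are stuck behind the burst, the recent receive
// rate is near zero and, without the floor, the target collapses.
double NewTarget(double recent_receive_bps, double beta, double min_bps) {
  return std::max(beta * recent_receive_bps, min_bps);
}

int main() {
  const double beta = 0.85;     // assumed
  const double min_bps = 30e3;  // assumed floor
  std::printf("backoff at start of pageload:    %.0f bps\n",
              NewTarget(1.0e6, beta, min_bps));  // ~current rate, small cut
  std::printf("backoff after queue starvation:  %.0f bps\n",
              NewTarget(20e3, beta, min_bps));   // collapses to the floor
  return 0;
}

The floor doesn't prevent the over-reaction; it only bounds how far down it goes.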

Now, in practice this is likely messier and the pageload doesn't generate one huge sudden block of data that fills the buffers, so there's some upward slope to the delay as you head toward saturation of the downstream buffers.  There's very little you can do about this, though backing off hard may help: the less data you put onto the end of this overloaded queue (assuming the pageload flow has ended or soon will), the sooner the queue will drain and low latency will be re-established.

Does the ICSI data call out *where* the bufferbloat occurs?

Then realise that when congested, nothing you do can react faster than
the RTT including the buffering.

So if your congestion is in the broadband edge (where it often/usually
is), you are in a world of hurt, and you can't use any algorithm that
has fixed time constants, even one as long as 1 second.

Wish this weren't so, but it is.

Bufferbloat is a disaster...

Given the loss-based algorithms for TCP/etc, yes.  We have to figure out how to (as reliably *as possible*) deliver low-latency data in this environment.


--
Randell Jesup
randell-ietf@jesup.org