Bufferbloat is a drag, indeed. The way I see it, a delay-sensing CC is bound to lose the battle against TCP flows over "bufferbloated" devices, at least in the steady-state case (i.e., persistent TCP flows). Otherwise we will have created an algorithm that may just as well fill up the buffers for itself, even in a situation without competing flows. I'm pessimistic that there is any way around this other than QoS.

For the transient case that you are describing here, I think that we still cannot do much about the actual filling of the buffers; we are not to blame for it. But as you point out, we can probably do some more about how to respond when the transient cross-traffic hits. I'm not sure how, though.

/Henrik



On Wed, Oct 12, 2011 at 7:12 AM, Randell Jesup <randell-ietf@jesup.org> wrote:
Jim:  We're moving this discussion to the newly-created mailing sub-list -
  Rtp-congestion@alvestrand.no
  http://www.alvestrand.no/mailman/listinfo/rtp-congestion

If you'd like to continue this discussion (and I'd love you to do so), please join the mailing list.  (Patrick, you may want to join too and read the very small backlog of messages (perhaps 10 so far)).

On 10/11/2011 4:17 PM, Jim Gettys wrote:
On 10/11/2011 03:11 AM, Henrik Lundin wrote:


I do not agree with you here. When an over-use is detected, we propose
to measure the /actual/ throughput (over the last 1 second), and set
the target bitrate to beta times this throughput. Since the measured
throughput is a rate that evidently was feasible (at least during that
1 second), any beta < 1 should ensure that the buffers get drained,
but of course at different rates depending on the magnitude of beta.
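
A minimal sketch of this backoff rule (for illustration only; this is not
the actual GIPS/Google code, and the 1-second window and beta = 0.9 below
are assumed values):

#include <cstddef>
#include <cstdint>
#include <deque>

// Tracks bytes received over a sliding 1-second window so that, on an
// over-use signal, the target can be set to beta * measured throughput.
class ReceiveRateTracker {
 public:
  void OnPacket(int64_t now_ms, std::size_t bytes) {
    window_.push_back({now_ms, bytes});
    Trim(now_ms);
  }

  // Throughput actually received over the last second, in bits per second.
  double BitrateBps(int64_t now_ms) {
    Trim(now_ms);
    std::size_t total_bytes = 0;
    for (const auto& p : window_) total_bytes += p.bytes;
    return total_bytes * 8.0 * 1000.0 / kWindowMs;
  }

 private:
  struct Packet { int64_t arrival_ms; std::size_t bytes; };

  void Trim(int64_t now_ms) {
    while (!window_.empty() &&
           now_ms - window_.front().arrival_ms > kWindowMs)
      window_.pop_front();
  }

  static constexpr int64_t kWindowMs = 1000;  // "the last 1 second"
  std::deque<Packet> window_;
};

// On over-use: any beta < 1 cuts the target below a rate that was evidently
// feasible, so the bottleneck queue should drain (faster for smaller beta).
double TargetOnOveruse(ReceiveRateTracker& tracker, int64_t now_ms,
                       double beta = 0.9) {
  return beta * tracker.BitrateBps(now_ms);
}
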
Take a look at the data from the ICSI netalyzr: you'll find scatter
plots at:

http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-throw-stones-at-another/

Note the different coloured lines.  They represent the amount of
buffering measured in the broadband edge in *seconds*.  Also note that
for various reasons, the netalyzr data is actually likely
underestimating the problem.

Understood.  Though that's not directly relevant to this problem, since the congestion-control mechanisms we're using/designing here are primarily buffer-sensing algorithms that attempt to keep the buffers in a drained state.  If there's no competing traffic at the bottleneck, they're likely to do so fairly well, though more simulation and real-world tests are needed.  I'll note that several organizations (Google/GIPS, Radvision, and my old company WorldGate) have found that these types of congestion-control algorithms are quite effective in practice.

However, it isn't irrelevant to the problem either:

This class of congestion-control algorithms is subject to "losing" if faced with a sustained high-bandwidth TCP flow like some of your tests, since they back off while TCP isn't yet seeing any restriction (loss).  Eventually TCP will fill the buffers.

More importantly, perhaps, bufferbloat combined with the high 'burst' nature of browser network systems (and websites) optimizing for page-load time means you can get a burst of data at a congestion point that isn't normally the bottleneck.

The basic scenario goes like this:

1. Established UDP flow running near the bottleneck limit on the
  far-end upstream link.
2. Near-end browser (or browser on another machine in the same house)
  initiates a page-load.
3. Near-end browser opens "many" TCP connections to the site and
  other sites that serve pieces (ads, images, etc.) of the page.
4. Rush of response data saturates the downstream link to the
  near-end, which was not previously the bottleneck.  Due to
  bufferbloat, this can cause a significant amount of data to be
  temporarily buffered, delaying competing UDP data significantly
  (tenths of a second, perhaps >1 second in some cases).  This is
  hard to model accurately; real-world tests are important.
5. Congestion-control algorithm notices the transition to buffer-
  induced delay and tells the far side to back off (a rough sketch
  of this detection step follows the list).  The latency of this
  decision may help us avoid over-reacting, since we have to see
  increasing delay, which takes a number of packets (at least
  1/10 second, and easily more).  Also, the above "inrush"/
  pageload-induced latency may not trigger the congestion
  mechanisms we discuss here, as we might see a BIG jump in delay
  followed by steady delay or a ramp down (if the buffer has
  suddenly jumped from drained to full, all it can do is stay
  stable or drain).
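
Rough sketch of the detection in step 5, assuming an accumulate-and-
threshold scheme over the delay gradient; the ~100 ms minimum observation
window and 25 ms threshold are illustrative assumptions, not values from
any shipped implementation:

#include <cstdint>

// Compares the spacing of packet arrivals to the spacing of their send
// times; a sustained positive trend means a queue is building somewhere
// on the path.
class DelayGradientDetector {
 public:
  // Feed one packet's send and arrival timestamps (ms).  Returns true once
  // enough delay growth has accumulated to call it over-use; the caller
  // should Reset() after acting on the signal.
  bool OnPacket(int64_t send_ms, int64_t arrival_ms) {
    if (have_prev_) {
      // Positive d => this packet was delayed more than the previous one.
      double d = static_cast<double>((arrival_ms - prev_arrival_ms_) -
                                     (send_ms - prev_send_ms_));
      accumulated_delay_ms_ += d;
      if (accumulated_delay_ms_ < 0) accumulated_delay_ms_ = 0;
      observation_ms_ += arrival_ms - prev_arrival_ms_;
    }
    have_prev_ = true;
    prev_send_ms_ = send_ms;
    prev_arrival_ms_ = arrival_ms;
    return observation_ms_ >= kMinObservationMs &&
           accumulated_delay_ms_ >= kOveruseThresholdMs;
  }

  void Reset() {
    accumulated_delay_ms_ = 0;
    observation_ms_ = 0;
  }

 private:
  static constexpr int64_t kMinObservationMs = 100;    // "at least 1/10 second"
  static constexpr double kOveruseThresholdMs = 25.0;  // assumed
  bool have_prev_ = false;
  int64_t prev_send_ms_ = 0;
  int64_t prev_arrival_ms_ = 0;
  double accumulated_delay_ms_ = 0;
  int64_t observation_ms_ = 0;
};

Note that the "big jump then flat" case above contributes only one large
sample to such a detector, so it may indeed never cross the threshold.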

Note that Google's current algorithm (which you comment on above) uses recent history for choosing the reduction; in this case it's hard to say what the result would be.  If it invokes the backoff at the start of the pageload, then the bandwidth received recently is the current bandwidth, so the new bandwidth is the current bandwidth minus small_delta.  If it happens after data has queued behind the burst of TCP traffic, then by the time the backoff is generated we'll have gotten almost no data through "recently", and we may back off all the way to the minimum bandwidth; an over-reaction, depending on the time constant and on how fast that burst can fill the downstream buffers.
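
To make the two outcomes concrete, here's a small sketch assuming a
backoff of the form new_target = beta * recently_received_bitrate with an
assumed minimum-bitrate floor (the constants are made up, not taken from
the actual implementation):

#include <algorithm>
#include <cstdio>

// If the backoff fires before the TCP burst queues ahead of us, the recent
// receive rate is still roughly the sending rate, so the cut is small.  If
// it fires after our packets are stuck behind the burst, the recent receive
// rate is near zero and, without the floor, the target collapses.
double NewTarget(double recent_receive_bps, double beta, double min_bps) {
  return std::max(beta * recent_receive_bps, min_bps);
}

int main() {
  const double beta = 0.85;     // assumed
  const double min_bps = 30e3;  // assumed floor
  std::printf("backoff at start of pageload:    %.0f bps\n",
              NewTarget(1.0e6, beta, min_bps));  // ~current rate, small cut
  std::printf("backoff after queue starvation:  %.0f bps\n",
              NewTarget(20e3, beta, min_bps));   // collapses to the floor
  return 0;
}

The floor doesn't prevent the over-reaction; it only bounds how far down it goes.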

Now, in practice this is likely messier and the pageload doesn't generate one huge sudden block of data that fills the buffers, so there's some upward slope to the delay as you head toward saturation of the downstream buffers.  There's very little you can do about this, though backing off hard may help: the less data you put onto the end of this overloaded queue (assuming the pageload flow has ended or soon will), the sooner the queue will drain and low latency will be re-established.

Does the ICSI data call out *where* the bufferbloat occurs?

Then realise that when congested, nothing you do can react faster than
the RTT including the buffering.

So if your congestion is in the broadband edge (where it often/usually
is), you are in a world of hurt, and you can't use any algorithm that
has fixed time constants, even one as long as 1 second.

Wish this weren't so, but it is.

Bufferbloat is a disaster...

Given the loss-based algorithms for TCP/etc, yes.  We have to figure out how to (as reliably *as possible*) deliver low-latency data in this environment.


--
Randell Jesup
randell-ietf@jesup.org