Re: [R-C] Congestion Control BOF

Randell,
This is a good start, indeed! Some comments on the details of your initial email are below.
About simulation: Does anyone have any suggestions for suitable simulation tools? ns2 is one choice of course, but I'm not very pleased with it.
/Henrik & Stefan @ Google
On Sat, Oct 8, 2011 at 12:38 AM, Randell Jesup <randell@jesup.org> wrote:
This email is being sent to everyone who expressed an interest in a congestion-control BOF group or who responded to the previous thread on the subject. Feel free to ignore it or ask to be dropped if you're not interested.
If people feel we should, I'm happy to take this back to the main rtcweb list.
So, per my posting on congestion control, I'm advocating combining all the media streams into one congestion control domain, and modifying the Google/GIPS algorithm to take the changes this brings into account. (I have a few other suggested tweaks as well.) This goes into more detail on exactly how one would modify the algorithm.
First, some comments on the current algorithm:
1. Frames with internal packet loss should not be fed as an input to the Kalman filter. Especially if the packet was lost after the bottleneck, it may give a false input (if lost at/before the bottleneck it probably would be ok). In general, sequences with packet loss may give suspect results; see point 4 below.
Agree. The reason we are using frames with losses in the current implementation is mainly code technicality.
2. I would use something more adaptive than the current _beta in RemoteRateControl. Currently if we're over-bandwidth, it cuts bandwidth by 5 or 10% (depending on where we are relative to the "known good max") and hopes that's below the bottleneck bandwidth. On a large change (or bad mis-estimation) the bandwidth might still be above bottleneck rate, and you *really* want to get down below it to drain the buffers fast.
I would prefer to use the effective incoming "slope" (which the magnitude of the Kalman filter should tell us) to estimate how far over-bandwidth we are. If we're receiving 500Kbps, and 33ms frames are coming in at an average of 36ms, the bandwidth estimate would be ~458Kbps, so to be safe and allow those buffers to drain, I'd probably want to target something like 430Kbps until they drain. If they came in at 45ms, which implies ~366Kbps available, I'd drop to somewhere around 300 at a guess - dropping to 430 would cause delay to continue to increase. Also, if there's loss I feel you should drop the rate more, since the bottleneck router (if it has short queues) may have started to tail-drop - and if it random-drops, your estimate may be too high.
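As a concrete sketch of the arithmetic above (the function name is hypothetical; the 500Kbps / 33ms / 36ms numbers are the ones from the paragraph):

    // If frames spaced 33 ms at the sender arrive 36 ms apart on average, the
    // channel is delivering roughly 33/36 of what is being sent.
    double EstimateAvailableBitrateBps(double receive_rate_bps,
                                       double avg_send_interval_ms,
                                       double avg_arrival_interval_ms) {
      return receive_rate_bps * (avg_send_interval_ms / avg_arrival_interval_ms);
    }
    // Example: 500 kbps * 33/36 ~= 458 kbps, so a target around 430 kbps leaves
    // headroom for the bottleneck queue to drain; at 45 ms arrivals the estimate
    // is ~366 kbps and a target of 430 would keep the queue growing.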
I do not agree with you here. When an over-use is detected, we propose to measure the *actual* throughput (over the last 1 second), and set the target bitrate to beta times this throughput. Since the measured throughput is a rate that evidently was feasible (at least during that 1 second), any beta < 1 should assert that the buffers get drained, but of course at different rates depending on the magnitude of beta.
3. Similarly, I'd want to allow for a smaller 'beta' (faster cuts, alpha for faster increases) at the start of a call. It's unclear to me from the code if that's the case currently, though if it is it's a side-effect of other things like the rcRegion and the maxHoldRate.
I'm not sure I fully understand what you mean by "faster cuts", but I gather that you're aiming at speeding up the convergence (up or down) during the startup phase. In general, we do agree that it's a good idea to let the algorithm be more agile in the beginning. Regarding the beta value, see my comment to #2 above.
Overall, the general approach in RateIncreaseFactor for alpha and the code that calls it is good and matches my understanding of reasonable approaches. I might also want to slow the rate of increase (especially over the max rate where we didn't see delay increase) every time we get knocked back, which allows for a fast converging search for the channel bandwidth, again especially at the start of a call, though this could be in play after any significant change.
This was already implemented in the algorithm, although not described in the document.
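As an illustration of the back-off-aware increase idea in the point above (a sketch with assumed numbers and names, not the webrtc RateIncreaseFactor code):

    #include <algorithm>

    // Shrink the multiplicative increase step each time we get knocked back, so
    // the search for the channel rate converges instead of oscillating.
    struct IncreaseController {
      double increase_factor = 1.16;  // aggressive at call start (assumed value)
      void OnOveruseBackoff() {
        increase_factor = std::max(1.02, (increase_factor - 1.0) * 0.5 + 1.0);
      }
      double NextTargetBps(double current_bps) const {
        return current_bps * increase_factor;
      }
    };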
4. If a drop occurs at or before the bottleneck, you'll get an unusually-short arrival delta compared to the Ts delta. I would consider such drops "fishy" as they may indicate tail-dropping somewhere in the chain, and if we see "too many" of them drop bandwidth by a larger amount, even if the Kalman filter indicates Hold.
This would be especially important if the bottleneck has a (very) short queue - it may build up, drop, build up, drop, etc and to the Kalman filter it might look like you weren't over-bandwidth, just a lot of jitter.
This is probably an important corner case (maybe not even a corner...). We would have to come up with some extended loss processing at the receiver. It's probably necessary to test this in a simulation environment where the bottleneck queues can be varied at will.
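A sketch of the kind of "fishy drop" heuristic point 4 describes (the 0.8 ratio and the count threshold are arbitrary placeholders, not values from the algorithm):

    // A loss at/before the bottleneck tends to make the next arrival delta look
    // short relative to the RTP timestamp delta; count such frames and ask for an
    // extra rate cut if they pile up, even when the detector says Hold.
    struct FishyDropDetector {
      int fishy_count = 0;
      // Returns true when the caller should apply an extra rate reduction.
      bool OnFrame(bool frame_had_loss, double arrival_delta_ms, double ts_delta_ms) {
        if (frame_had_loss && arrival_delta_ms < 0.8 * ts_delta_ms)
          ++fishy_count;
        if (fishy_count >= 5) {
          fishy_count = 0;
          return true;
        }
        return false;
      }
    };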
5. It appears that the code in remote_rate_control.cc (i.e.receiver) currently starts at the max-configured bitrate; is this correct? What's the impact of this; what does the sender-side start at?
This is correct, and means that the receiver side does not enforce or ask for any rate reduction until it has experienced the first over-bandwidth.
Ok, now on to merging the media stream congestion control states.
The overall idea is that all the packets from all media streams are fed into the same 'bin' as if they were parts of the same source. There are two primary ways to do this:
1) calculate the delta for each stream separately (produce CompensatedTimeDelta(), which is in ms), and feed into a single Kalman filter
2) calculate the delta as if all the packets are one stream, by mapping the timestamps of the separate streams to one drift-compensated NTP time base, then produce CompensatedTimeDelta() and feed to Kalman.
(see http://code.google.com/p/webrtc/source/browse/trunk/src/modules/rtp_rtcp/sou... )
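For option 2, a minimal sketch of mapping each stream's RTP timestamps onto a common wall-clock timebase, using the (NTP, RTP) pair from that stream's RTCP sender reports (RFC 3550); the struct and field names are assumptions, and 32-bit timestamp wrap-around is ignored for brevity:

    #include <cstdint>

    struct SenderReportInfo {
      double ntp_seconds;    // NTP time carried in the last SR
      uint32_t rtp_ts;       // RTP timestamp corresponding to that NTP time
      double clock_rate_hz;  // e.g. 90000.0 for video
    };

    // Convert an RTP timestamp (assumed at/after the SR) to sender NTP seconds,
    // so packets from all streams can be ordered on one timebase and fed to a
    // single filter.
    double RtpToCommonTime(uint32_t rtp_ts, const SenderReportInfo& sr) {
      double elapsed = static_cast<double>(rtp_ts - sr.rtp_ts) / sr.clock_rate_hz;
      return sr.ntp_seconds + elapsed;
    }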
#2 (probably) needs drift compensation since we can't guarantee that the timestamps of different streams are sourced from the same clock. It's possible that for "normal" drifts this wouldn't be an issue, but I'm leery since I've seen some PBXes and devices that have significant drift (though in this case we only really care about differential drift).
Note that currently the UpdateKalman() function has the drift compensation inside of it; for either of these it would need to move out of UpdateKalman().
If you follow the code, you'll see that the CurrentDrift() method is a dummy always returning 1.0. TBD :)
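Since CurrentDrift() is a stub, here is one simple way a per-stream drift estimate could be formed; it only measures how fast a stream's RTP clock runs relative to the NTP clock in its sender reports, which is one component of the differential drift discussed above (names are assumptions, not the webrtc code):

    #include <cstdint>

    // Ratio of elapsed RTP time to elapsed NTP time between two sender reports.
    // 1.0 means no drift; wrap-around handling and outlier rejection are omitted.
    double EstimateDrift(uint32_t rtp_ts_prev, double ntp_prev_s,
                         uint32_t rtp_ts_curr, double ntp_curr_s,
                         double clock_rate_hz) {
      double rtp_elapsed_s =
          static_cast<double>(rtp_ts_curr - rtp_ts_prev) / clock_rate_hz;
      double ntp_elapsed_s = ntp_curr_s - ntp_prev_s;
      return ntp_elapsed_s > 0.0 ? rtp_elapsed_s / ntp_elapsed_s : 1.0;
    }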
A primary change needed (in both of these) is that the current Kalman filter is based around an assumption of 30fps source rate. From page 6 of Harald's draft:
The variance var_v = sigma(v,i)^2 is estimated using an exponential averaging filter, modified for variable sampling rate
var_v_hat = beta*sigma(v,i-1)^2 + (1-beta)*z(i)^2
beta = (1-alpha)^(30/(1000 * f_max))
where f_max = max {1/(T(j) - T(j-1))} for j in i-K+1...i is the highest rate at which frames have been captured by the camera the last K frames and alpha is a filter coefficient typically chosen as a number in the interval [0.1, 0.001]. Since our assumption that v(i) should be zero mean WGN is less accurate in some cases, we have introduced an additional outlier filter around the updates of var_v_hat. If z(i) > 3*sqrt(var_v_hat) the filter is updated with 3*sqrt(var_v_hat) rather than z(i). In a similar way, Q(i) is chosen as a diagonal matrix with main diagonal elements given by
diag(Q(i)) = (30/(1000 * f_max)) * [10^-10 10^-2]^T
It is necessary to scale these filter parameters with the frame rate to make the detector respond as quickly at low frame rates as at high frame rates.
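As a concrete reading of the quoted equations (an illustrative sketch with assumed names; the actual webrtc code may differ):

    #include <algorithm>
    #include <cmath>
    #include <deque>

    // One update of the noise variance estimate. ts_deltas_ms holds the last K
    // frame timestamp deltas (ms); alpha is the coefficient in [0.001, 0.1].
    double UpdateNoiseVariance(double var_v_hat, double z,
                               const std::deque<double>& ts_deltas_ms,
                               double alpha) {
      // f_max = 1 / shortest recent timestamp delta, i.e. the highest frame rate.
      double min_delta_ms =
          *std::min_element(ts_deltas_ms.begin(), ts_deltas_ms.end());
      double f_max = 1.0 / min_delta_ms;  // frames per millisecond
      // At 30 fps the exponent is ~1; at lower frame rates it grows, so the
      // filter forgets faster per update to compensate for fewer updates.
      double beta = std::pow(1.0 - alpha, 30.0 / (1000.0 * f_max));
      // Outlier clamp: never update with more than 3 standard deviations.
      double z_used = std::min(std::fabs(z), 3.0 * std::sqrt(var_v_hat));
      return beta * var_v_hat + (1.0 - beta) * z_used * z_used;
    }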
This assumes that the frame rate == 1/shortest_ts_diff, which is a slightly loose assumption, even without merging streams - though I agree most encoders encode at a fairly fixed rate, and 30 is the most common. But it still could be well off especially at transition points (say from still to moving) - the encoder might run at 30fps (or more) when still, and drop to 5 or 10 fps (or lower) when the scene transitions into high motion.
I'll also note that the noise estimate is also based on a 30fps assumption, though that might be easier to modify. The noise estimate slews more quickly for the first ~ 10 seconds.
The 30 fps that you see in the equations is because the tuning of the design parameters was done for a nominal 30 fps. But the formulae are actually there to compensate the parameters such that the system should work for other frame rates too. It looks hard-coded, but it's not. You could argue that using the max frame rate (over some window) is not correct. We tried with mean frame rate at one stage, but had some problems with bias. Probably we just didn't gather enough facts and understanding to resolve the issue properly.
I'm not familiar enough with Kalman filters to *know* if it would be best to change the scale factor to be based on the recent actual frame rate (instead of recent max frame rate); I suspect it's better but not "right".
From checking on the theory of Kalman filters, there's no inherent reason that it can't take inputs at different timestamp deltas - for that matter, with some adjustments, it should be possible to use every packet as an input to the filter, not just the last packet of a frame. I'm not sure if in practice this would improve the output or not, but it might be worth trying.
So, without going into line-by-line changes, my suggestion is to remove the reliance on 30fps, to use an average observed frame rate instead of the recent max frame rate, and to feed all the inputs into one Kalman filter (perhaps trying both variants above), and then see how it works in practice and in simulation.
I'll note that in practice I've found that you don't need a very good filter to identify over-bandwidth situations; even a simple averaging filter run over a second or so can usually pick up on bottleneck buffer states pretty accurately. A Kalman filter certainly should be better than that and provide faster response, but my point is that it doesn't have to be perfect to work.
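The kind of "dumb" detector referred to above, as a sketch (the window length and threshold are assumptions):

    #include <deque>
    #include <numeric>

    // Average the one-way delay gradient (arrival delta minus timestamp delta)
    // over roughly a second; a persistently positive mean means the bottleneck
    // queue is growing.
    class SimpleOveruseDetector {
     public:
      bool Update(double arrival_delta_ms, double ts_delta_ms) {
        gradients_.push_back(arrival_delta_ms - ts_delta_ms);
        if (gradients_.size() > 30)  // ~1 s of frames at 30 fps
          gradients_.pop_front();
        double mean =
            std::accumulate(gradients_.begin(), gradients_.end(), 0.0) /
            gradients_.size();
        return mean > 1.0;  // > 1 ms of queue growth per frame, on average
      }
     private:
      std::deque<double> gradients_;
    };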
BTW - there's lots of code in there to generate matlab output, to run it standalone (though that seems turned off), etc. Any guidance or pointers to docs on how to build and run these tests, and how to get these stats out of running calls? I'm sure I can root around and figure it out, but a pointer would save a bunch of time. Thanks!
Once upon a time, we used this code for real-time visualization of the bandwidth estimator using a Matlab Engine. It worked well on Windows with the ancient Matlab 5 or 6, or so. We're currently working in a linux environment with the latest and greatest Matlab release, and now the real-time plotting is quite far from real-time. There are unfortunately no docs that describe what's going on. I can see what old code I can dig out.
Thanks! I think we can help make something here that will really raise the bar.
Randell Jesup (sent from my private email address)

FYI - if anyone is getting this and hasn't joined the R-C (congestion control) list, please do so as I'm going to stop CC-ing individuals at some point today.
On 10/11/2011 3:11 AM, Henrik Lundin wrote:
Randell,
This is a good start, indeed! Some comments on the details of your initial email are below.
Thanks!
About simulation: Does anyone have any suggestions for suitable simulation tools? ns2 is one choice of course, but I'm not very pleased with it.
I don't have strong opinions; I've used both dummynet (for simple testing, though the latest ones are better for simulating variable delay and loss), and NetEm, which I'm not an expert with, but at least provides normal distributions of delay, correlation of delays/loss, etc. You'd need to use it with a rate control discipline to model a router/NAT/modem. For modern uses of dummynet in PlantLab, etc see http://info.iet.unipi.it/~luigi/dummynet/ and http://info.iet.unipi.it/~luigi/papers/20100316-cc-preprint.pdf More comments in-line...
2. I would use something more adaptive than the current _beta in RemoteRateControl. Currently if we're over-bandwidth, it cuts bandwidth by 5 or 10% (depending on where we are relative to the "known good max") and hopes that's below the bottleneck bandwidth. On a large change (or bad mis-estimation) the bandwidth might still be above bottleneck rate, and you *really* want to get down below it to drain the buffers fast.
I would prefer to use the effective incoming "slope" (which the magnitude of the Kalman filter should tell us) to estimate how far over-bandwidth we are. If we're receiving 500Kbps, and 33ms frames are coming in at an average of 36ms, the bandwidth estimate would be ~458Kbps, so to be safe and allow those buffers to drain, I'd probably want to target something like 430Kbps until they drain. If they came in at 45ms, which implies ~366Kbps available, I'd drop to somewhere around 300 at a guess - dropping to 430 would cause delay to continue to increase. Also, if there's loss I feel you should drop the rate more, since the bottleneck router (if it has short queues) may have started to tail-drop - and if it random-drops, your estimate may be too high.
I do not agree with you here. When an over-use is detected, we propose to measure the /actual/ throughput (over the last 1 second), and set the target bitrate to beta times this throughput. Since the measured throughput is a rate that evidently was feasible (at least during that 1 second), any beta < 1 should assert that the buffers get drained, but of course at different rates depending on the magnitude of beta.
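For concreteness, a minimal reading of the decrease rule described above (the 1-second window and beta < 1 come from the thread; the rate meter and names are assumptions, not the webrtc implementation):

    #include <cstdint>
    #include <deque>
    #include <utility>

    class IncomingBitrateMeter {
     public:
      void OnPacket(int64_t now_ms, int bytes) {
        packets_.emplace_back(now_ms, bytes);
        while (!packets_.empty() && packets_.front().first < now_ms - 1000)
          packets_.pop_front();
      }
      double BitrateBps() const {
        double total_bytes = 0.0;
        for (const auto& p : packets_) total_bytes += p.second;
        return total_bytes * 8.0;  // bytes over the last 1000 ms -> bits/s
      }
     private:
      std::deque<std::pair<int64_t, int>> packets_;
    };

    // On over-use: target = beta * throughput actually received in the last second.
    double TargetAfterOveruse(const IncomingBitrateMeter& meter, double beta) {
      return beta * meter.BitrateBps();  // e.g. beta = 0.95
    }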
For static analysis (detecting the bandwidth of the modem, etc.) I agree with you. However, for mid-call detection of changes in bandwidth, and particularly of cross-traffic, using recent throughput as a guide will be wrong. The point is that the bandwidth has changed, and if it changed by more than beta you'll still be over. Even in startup you may find recent traffic is a bad predictor (and you have less to go on), especially if you "start low" and move rapidly up-bandwidth, as suggested by myself and Magnus, instead of starting high and cutting back (see below).
3. Similarly, I'd want to allow for a smaller 'beta' (faster cuts, alpha for faster increases) at the start of a call. It's unclear to me from the code if that's the case currently, though if it is it's a side-effect of other things like the rcRegion and the maxHoldRate.
I'm not sure I fully understand what you mean by "faster cuts", but I gather that you're aiming at speeding up the convergence (up or down) during the startup phase. In general, we do agree that it's a good idea to let the algorithm be more agile in the beginning. Regarding the beta value, see my comment to #2 above.
"faster cuts" means faster reduction in bandwidth; so yes I'm referring to running faster convergence in startup. It also may be smart to return to this region on interface changes (definitely) and on major cross-traffic changes (which often are a spike of web-page-loading, not a long-term flow).
Overall, the general approach in RateIncreaseFactor for alpha and the code that calls it is good and matches my understanding of reasonable approaches. I might also want to slow the rate of increase (especially over the max rate where we didn't see delay increase) every time we get knocked back, which allows for a fast converging search for the channel bandwidth, again especially at the start of a call, though this could be in play after any significant change.
This was already implemented in the algorithm, although not described in the document.
Good. I didn't see anything that directly looked at how recently/often we'd had to reduce bitrate.
4. If a drop occurs at or before the bottleneck, you'll get an unusually-short arrival delta compared to the Ts delta. I would consider such drops "fishy" as they may indicate tail-dropping somewhere in the chain, and if we see "too many" of them drop bandwidth by a larger amount, even if the Kalman filter indicates Hold.
This would be especially important if the bottleneck has a (very) short queue - it may build up, drop, build up, drop, etc and to the Kalman filter it might look like you weren't over-bandwidth, just a lot of jitter.
This is probably an important corner case (maybe not even a corner...). We would have to come up with some extended loss processing at the receiver. It's probably necessary to test this in a simulation environment where the bottleneck queues can be varied at will.
This is important in my experience - not every device implements bufferbloat. ;-) I also believe that there are router/bottleneck configurations that will cause earlier bottleneck signals to be "lost" later on due to router packet scheduling and cross traffic. This is accentuated by the common bottleneck being the local upstream link, so you have many routers and cross-traffic following it.
5. It appears that the code in remote_rate_control.cc (i.e.receiver) currently starts at the max-configured bitrate; is this correct? What's the impact of this; what does the sender-side start at?
This is correct, and means that the receiver side does not enforce or ask for any rate reduction until it has experienced the first over-bandwidth.
Ok, as per above and Magnus's comments, I tend to disagree with this - it almost guarantees you'll be in recovery right at the start of the call, which is not the best experience for 1-to-1 communication, IMHO (and in my experience). See my extensive discussion of starting bandwidths on rtcweb (I should copy it here for archival).
#2 (probably) needs drift compensation since we can't guarantee that the timestamps of different streams are sourced from the same clock. It's possible that for "normal" drifts this wouldn't be an issue, but I'm leery since I've seen some PBXes and devices that have significant drift (though in this case we only really care about differential drift).
Note that currently the UpdateKalman() function has the drift compensation inside of it; for either of these it would need to move out of UpdateKalman().
If you follow the code, you'll see that the CurrentDrift() method is a dummy always returning 1.0. TBD :)
Hmmm. Oh well. I guarantee there are devices out there that drift a lot... And even have time go backwards (great fun there).
A primary change needed (in both of these) is that the current Kalman filter is based around an assumption of 30fps source rate. From page 6 of Harald's draft:
The variance var_v = sigma(v,i)^2 is estimated using an exponential averaging filter, modified for variable sampling rate
var_v_hat = beta*sigma(v,i-1)^2 + (1-beta)*z(i)^2
beta = (1-alpha)^(30/(1000 * f_max))
where f_max = max {1/(T(j) - T(j-1))} for j in i-K+1...i is the highest rate at which frames have been captured by the camera the last K frames and alpha is a filter coefficient typically chosen as a number in the interval [0.1, 0.001]. Since our assumption that v(i) should be zero mean WGN is less accurate in some cases, we have introduced an additional outlier filter around the updates of var_v_hat. If z(i) > 3*sqrt(var_v_hat) the filter is updated with 3*sqrt(var_v_hat) rather than z(i). In a similar way, Q(i) is chosen as a diagonal matrix with main diagonal elements given by
diag(Q(i)) = (30/(1000 * f_max)) * [10^-10 10^-2]^T
It is necessary to scale these filter parameters with the frame rate to make the detector respond as quickly at low frame rates as at high frame rates.
This assumes that the frame rate == 1/shortest_ts_diff, which is a slightly loose assumption, even without merging streams - though I agree most encoders encode at a fairly fixed rate, and 30 is the most common. But it still could be well off especially at transition points (say from still to moving) - the encoder might run at 30fps (or more) when still, and drop to 5 or 10 fps (or lower) when the scene transitions into high motion.
I'll also note that the noise estimate is also based on a 30fps assumption, though that might be easier to modify. The noise estimate slews more quickly for the first ~ 10 seconds.
The 30 fps that you see in the equations is because the tuning of the design parameters was done for a nominal 30 fps. But the formulae are actually there to compensate the parameters such that the system should work for other frame rates too. It looks hard-coded, but it's not.
Ok, I'll look to the code, though I thought I saw the same hard-coded assumptions.
You could argue that using the max frame rate (over some window) is not correct. We tried with mean frame rate at one stage, but had some problems with bias. Probably we just didn't gather enough facts and understanding to resolve the issue properly.
A simple weighted moving average of some sort is probably good enough. Otherwise detection right after a sudden drop in frame rate could be compromised (and sudden drops in frame rate may also correlate with increases in bandwidth usage, depending on the encoder).
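The sort of weighted moving average suggested above, as a sketch (the 0.9/0.1 weights are an assumption), which could then be used in place of the recent-max frame rate when scaling the filter parameters:

    // Exponentially weighted moving average of the observed frame rate.
    class FrameRateEstimator {
     public:
      void OnFrame(double ts_delta_ms) {
        double inst_fps = 1000.0 / ts_delta_ms;
        fps_ = (fps_ == 0.0) ? inst_fps : 0.9 * fps_ + 0.1 * inst_fps;
      }
      double fps() const { return fps_; }
     private:
      double fps_ = 0.0;
    };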
BTW - there's lots of code in there to generate matlab output, to run it standalone (though that seems turned off), etc. Any guidance or pointers to docs on how to build and run these tests, and how to get these stats out of running calls? I'm sure I can root around and figure it out, but a pointer would save a bunch of time. Thanks!
Once upon a time, we used this code for real-time visualization of the bandwidth estimator using a Matlab Engine. It worked well on Windows with the ancient Matlab 5 or 6, or so. We're currently working in a linux environment with the latest and greatest Matlab release, and now the real-time plotting is quite far from real-time. There are unfortunately no docs that describe what's going on. I can see what old code I can dig out.
Well, I don't own Matlab, so it won't help me anyway. And if it's just for plotting, there have to be some simpler ways to do real-time data plotting on Linux (or others).
-- Randell Jesup randell-ietf@jesup.org

On Tue, Oct 11, 2011 at 5:17 PM, Randell Jesup <randell-ietf@jesup.org>wrote:
FYI - if anyone is getting this and hasn't joined the R-C (congestion control) list, please do so as I'm going to stop CC-ing individuals at some point today.
On 10/11/2011 3:11 AM, Henrik Lundin wrote:
Randell,
This is a good start, indeed! Some comments on the details of your initial email are below.
Thanks!
About simulation: Does anyone have any suggestions for suitable
simulation tools? ns2 is one choice of course, but I'm not very pleased with it.
I don't have strong opinions; I've used both dummynet (for simple testing, though the latest ones are better for simulating variable delay and loss), and NetEm, which I'm not an expert with, but at least provides normal distributions of delay, correlation of delays/loss, etc. You'd need to use it with a rate control discipline to model a router/NAT/modem.
For modern uses of dummynet in PlantLab, etc see http://info.iet.unipi.it/~luigi/dummynet/ and http://info.iet.unipi.it/~luigi/papers/20100316-cc-preprint.pdf
More comments in-line...
I was primarily looking for offline simulation tools. NetEm and dummynet are both tools for emulating network impairments to real-time traffic, right?
2. I would use something more adaptive than the current _beta in
RemoteRateControl. Currently if we're over-bandwidth, it cuts bandwidth by 5 or 10% (depending on where we are relative to the "known good max") and hopes that's below the bottleneck bandwidth. On a large change (or bad mis-estimation) the bandwidth might still be above bottleneck rate, and you *really* want to get down below it to drain the buffers fast.
I would prefer to use the effective incoming "slope" (which the magnitude of the Kalman filter should tell us) to estimate how far over-bandwidth we are. If we're receiving 500Kbps, and 33ms frames are coming in at an average of 36ms, the bandwidth estimate would be ~458Kbps, so to be safe and allow those buffers to drain, I'd probably want to target something like 430Kbps until they drain. If they came in at 45ms, which implies ~366Kbps available, I'd drop to somewhere around 300 at a guess - dropping to 430 would cause delay to continue to increase. Also, if there's loss I feel you should drop the rate more, since the bottleneck router (if it has short queues) may have started to tail-drop - and if it random-drops, your estimate may be too high.
I do not agree with you here. When an over-use is detected, we propose to measure the /actual/ throughput (over the last 1 second), and set the
target bitrate to beta times this throughput. Since the measured throughput is a rate that evidently was feasible (at least during that 1 second), any beta < 1 should assert that the buffers get drained, but of course at different rates depending on the magnitude of beta.
For static analysis (detect the bandwidth of the modem, etc) I agree with you. However, for mid-call detection of changes in bandwidth and particularly cross-traffic using recent throughput as a guide will be wrong. The point is that bandwidth has changed, and if it changed by more than beta you'll still be over.
Even in startup you may find recent traffic is a bad predictor (and you have less to go on), especially if you "start low" and move rapidly up-bandwidth, as suggested by myself and Magnus, instead of starting high and cutting back. (see below).
3. Similarly, I'd want to allow for a smaller 'beta' (faster cuts,
alpha for faster increases) at the start of a call. It's unclear to me from the code if that's the case currently, though if it is it's a side-effect of other things like the rcRegion and the maxHoldRate.
I'm not sure I fully understand what you mean by "faster cuts", but I gather that you're aiming at speeding up the convergence (up or down) during the startup phase. In general, we do agree that it's a good idea to let the algorithm be more agile in the beginning. Regarding the beta value, see my comment to #2 above.
"faster cuts" means faster reduction in bandwidth; so yes I'm referring to running faster convergence in startup. It also may be smart to return to this region on interface changes (definitely) and on major cross-traffic changes (which often are a spike of web-page-loading, not a long-term flow).
Overall, the general approach in RateIncreaseFactor for alpha and
the code that calls it is good and matches my understanding of reasonable approaches. I might also want to slow the rate of increase (especially over the max rate where we didn't see delay increase) every time we get knocked back, which allows for a fast converging search for the channel bandwidth, again especially at the start of a call, though this could be in play after any significant change.
This was already implemented in the algorithm, although not described in the document.
Good. I didn't see anything that directly looked at how recently/often we'd had to reduce bitrate.
4. If a drop occurs at or before the bottleneck, you'll get an
unusually-short arrival delta compared to the Ts delta. I would consider such drops "fishy" as they may indicate tail-dropping somewhere in the chain, and if we see "too many" of them drop bandwidth by a larger amount, even if the Kalman filter indicates Hold.
This would be especially important if the bottleneck has a (very) short queue - it may build up, drop, build up, drop, etc and to the Kalman filter it might look like you weren't over-bandwidth, just a lot of jitter.
This is probably an important corner case (maybe not even a corner...). We would have to come up with some extended loss processing at the receiver. It's probably necessary to test this in a simulation environment where the bottleneck queues can be varied at will.
This is important in my experience - not every device implements bufferbloat. ;-) I also believe that there are router/bottleneck configurations that will cause earlier bottleneck signals to be "lost" later on due to router packet scheduling and cross traffic. This is accentuated by the common bottleneck being the local upstream link, so you have many routers and cross-traffic following it.
5. It appears that the code in remote_rate_control.cc (i.e.receiver)
currently starts at the max-configured bitrate; is this correct? What's the impact of this; what does the sender-side start at?
This is correct, and means that the receiver side does not enforce or ask for any rate reduction until it has experienced the first over-bandwidth.
Ok, as per above and Magnus's comments, I tend to disagree with this - it almost guarantees you'll be in recovery right at the start of the call, which is not the best experience for 1-to-1 communication, IMHO (and in my experience). See my extensive discussion of starting bandwidths on rtcweb (I should copy it here for archival).
I think there has been a misunderstanding here. It is true that the code in the receiver-side of the CC algorithm (specifically in remote_rate_control.cc) starts at the max-configured rate. This is while waiting for it to detect the first over-use. However, and this is where the catch is, it does not mean that the sender starts sending at the max rate. In our current implementation, the sender decides the actual start rate. Further, since the original implementation was done based on TMMBR, we had to implement a stand-alone CC at the sender too, since it's not allowed to rely on TMMBR alone. Thus, as long as the TMMBR feedback is larger than what the sender's internal CC says, it will not listen to the TMMBR. Therefore, the receive-side initialization that you see does not mean that we start each call at ludicrous speed. Some of the things in the implementation are legacy...
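Paraphrasing the sender-side behaviour described above as a one-liner (names are placeholders): the sender follows its own congestion controller and treats the receiver's TMMBR only as a cap.

    #include <algorithm>
    #include <cstdint>

    uint32_t TargetSendBitrateBps(uint32_t sender_cc_bps, uint32_t tmmbr_bps) {
      return std::min(sender_cc_bps, tmmbr_bps);  // TMMBR can only lower the rate
    }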
#2 (probably) needs drift compensation since we can't guarantee that
the timestamps of different streams are sourced from the same clock. It's possible that for "normal" drifts this wouldn't be an issue, but I'm leery since I've seen some PBXes and devices that have significant drift (though in this case we only really care about differential drift).
Note that currently the UpdateKalman() function has the drift compensation inside of it; for either of these it would need to move out of UpdateKalman().
If you follow the code, you'll see that the CurrentDrift() method is a dummy always returning 1.0. TBD :)
Hmmm. Oh well. I guarantee there are devices out there that drift a lot... And even have time go backwards (great fun there).
I'm sure there are. So far we have not seen drift large enough to actually offset the over-/under-use detection.

On Wed, Oct 12, 2011 at 18:25, Henrik Lundin <henrik.lundin@webrtc.org> wrote:
On Tue, Oct 11, 2011 at 5:17 PM, Randell Jesup <randell-ietf@jesup.org> wrote:
On 10/11/2011 3:11 AM, Henrik Lundin wrote:
Randell,
This is a good start, indeed! Some comments on the details of your initial email are below.
Thanks!
About simulation: Does anyone have any suggestions for suitable simulation tools? ns2 is one choice of course, but I'm not very pleased with it.
I don't have strong opinions; I've used both dummynet (for simple testing, though the latest ones are better for simulating variable delay and loss), and NetEm, which I'm not an expert with, but at least provides normal distributions of delay, correlation of delays/loss, etc. You'd need to use it with a rate control discipline to model a router/NAT/modem.
For modern uses of dummynet in PlantLab, etc see http://info.iet.unipi.it/~luigi/dummynet/ and http://info.iet.unipi.it/~luigi/papers/20100316-cc-preprint.pdf
More comments in-line...
I was primarily looking for offline simulation tools. NetEm and dummynet are both tools for emulating network impairments to real-time traffic, right?
NetEm and Dummynet can emulate network characteristics like changes in link capacity, one-way delay, and bit-error rate. One can enable these characteristics on a specific source interface/IP address or port, or apply them to outgoing or incoming packets - I think even to specific protocols such as just TCP or UDP. We have run simulations using ns2 but they take a lot of time because ns2 is event driven and therefore the clock is not real-time (not in sync with the system clock). A 10-minute video call may take anywhere from 1-3 hours or even longer to simulate, depending on the scenario (amount of cross traffic, type of simulation, etc.). Our code is not available but some researchers use EvalVid-RA for NS2. The sources are available at http://www.item.ntnu.no/~arnelie/Evalvid-RA.htm.

On 10/12/2011 11:25 AM, Henrik Lundin wrote:
About simulation: Does anyone have any suggestions for suitable simulation tools? ns2 is one choice of course, but I'm not very pleased with it.
I don't have strong opinions; I've used both dummynet (for simple testing, though the latest ones are better for simulating variable delay and loss), and NetEm, which I'm not an expert with, but at least provides normal distributions of delay, correlation of delays/loss, etc. You'd need to use it with a rate control discipline to model a router/NAT/modem.
For modern uses of dummynet in PlantLab, etc see
PlanetLab, sorry
http://info.iet.unipi.it/~luigi/dummynet/ and http://info.iet.unipi.it/~luigi/papers/20100316-cc-preprint.pdf
More comments in-line...
I was primarily looking for offline simulation tools. NetEm and dummynet are both tools for emulating network impairments to real-time traffic, right?
I honestly don't have a lot of love for fully offline, non-real-time simulations like ns2 for this sort of work - the realtime algorithms are often painful to rework to run paced by the simulation, and to be honest I want a reasonable approximation of a real network, where subjective measurements and evaluation can be quickly done. Plus I find Dummynet/Netem more intuitive than what I know about using ns2 (which I've never actually used, but the comments from Varun Singh make me shudder at using it as a primary tool).
My preference is for netem and/or dummynet across a bench, and also for connecting a device to a corporate or high-speed residential network while simulating a much lower-speed or loaded access link. Hook up a switch or NAT and a PC to run transfers, make VoIP calls, run bittorrent(!!), randomly browse, watch Youtube, etc. while making video calls on the target. We also used to have regular residential cable modems as well. Make lots of actual calls where you're talking to someone; humans are very good at picking out most types of errors - especially the ones that matter, and you can also collect statistics while doing so. Combine those tools with a few good testers and you'll find all sorts of funny edge conditions that most straight simulations miss. IMHO.
5. It appears that the code in remote_rate_control.cc (i.e.receiver) currently starts at the max-configured bitrate; is this correct? What's the impact of this; what does the sender-side start at?
This is correct, and means that the receiver side does not enforce or ask for any rate reduction until it has experienced the first over-bandwidth.
Ok, as per above and Magnus's comments, I tend to disagree with this - it almost guarantees you'll be in recovery right at the start of the call, which is not the best experience for 1-to-1 communication, IMHO (and in my experience). See my extensive discussion of starting bandwidths on rtcweb (I should copy it here for archival).
I think there has been a misunderstanding here. It is true that the code in the receiver-side of the CC algorithm (specifically in remote_rate_control.cc) starts at the max-configured rate. This is while waiting for it to detect the first over-use. However, and this is where the catch is, it does not mean that the sender starts sending at the max rate. In our current implementation, the sender decides the actual start rate. Further, since the original implementation was done based on TMMBR, we had to implement a stand-alone CC at the sender too, since it's not allowed to rely on TMMBR alone. Thus, as long as the TMMBR feedback is larger than what the sender's internal CC says, it will not listen to the TMMBR. Therefore, the receive-side initialization that you see does not mean that we start each call at ludicrous speed. Some of the things in the implementation are legacy...
Ok, good - I thought I'd asked that question when we all were chatting and I probably wasn't specific enough.
Hmmm. Oh well. I guarantee there are devices out there that drift a lot... And even have time go backwards (great fun there).
I'm sure there are. So far we have not seen drift large enough to actually offset the over-/under-use detection.
I'm told some PC audio cards/interfaces have significantly drifty timebases - but perhaps not at a level that matters here. When working with the old GIPS code, we kept a PLL tracking the apparent timebase of each far-end stream, updated on RTCP reports, and reset if something "odd" happens.
-- Randell Jesup randell-ietf@jesup.org

On Wed, Oct 12, 2011 at 23:48, Randell Jesup <randell-ietf@jesup.org> wrote:
On 10/12/2011 11:25 AM, Henrik Lundin wrote:
I honestly don't have a lot of love for fully offline, non-real-time simulations like ns2 for this sort of work - the realtime algorithms are often painful to rework to run paced by the simulation, and to be honest I want a reasonable approximation of a real network, where subjective measurements and evaluation can be quickly done. Plus I find Dummynet/Netem more intuitive than what I know about using ns2 (which I've never actually used, but the comments from Varun Singh make me shudder at using it as a primary tool).
Just to clarify, in our simulation setup, we hooked up an encoder and decoder to the NS2 application. If we were simulating dummy packets then this simulation would run more quickly. The encoder (which runs outside the NS2 context) provides the H.264 frames to the ns2 application, where they are fragmented, encapsulated in RTP packets, and transmitted over the NS2 topology. At the receiver-side ns2 application, the RTP packets are passed to a real-world decoder. The main reason for the slowness of the simulation is that the ns2 application had to routinely time-sync with the codec (outside NS2). It was more like NS2 polling the codec: "now is 0 ms, do you have any data?", then 5 ms, 10 ms, 15 ms, etc. The advantage of using NS2 is that you can have complex topologies, change the type of links (DSL, WLAN, 3G), queues (drop tail, RED), and type of cross-traffic (FTP, CBR, starting and stopping TCP flows), and easily measure throughput for each flow (making analysis easier). In our setup, we also got a YUV file at the decoder with which we could calculate PSNR and/or play back next to the original YUV.
My preference is for netem and/or dummynet across a bench and also for connecting a device to a corporate or high-speed residential network while simulating a much lower-speed or loaded access link. Hook up a switch or NAT and a PC to run transfers, make VoIP calls, run bittorrent(!!), randomly browse, watch Youtube, etc while making video calls on the target. We also used to have regular residential cable modems as well. Make lots of actual calls where you're talking to someone; humans are very good at picking out most types of errors - especially the ones that matter, and you can also collect statistics while doing so. Combine those tools with a few good testers and you'll find all sorts of funny edge conditions that most straight simulations miss.
IMO dummynet would work, but to properly analyze the rate control we must also analyse the end-to-end throughput (and/or other metrics) of the rtcweb media, the link, and the other traffic, to make sure that the media is fair to the cross traffic and that the cross traffic doesn't starve the media (e.g., in the case of BitTorrent or too many TCP flows). For most scenarios this should be possible using tcpdump. PlanetLab will be useful if we want to emulate more complex scenarios (like different types of routers etc.). For simple cases, using dummynet on a single hop (or at the end-points) may be good enough.

On Thu, Oct 13, 2011 at 12:18 AM, Varun Singh <vsingh.ietf@gmail.com> wrote:
On Wed, Oct 12, 2011 at 23:48, Randell Jesup <randell-ietf@jesup.org> wrote:
On 10/12/2011 11:25 AM, Henrik Lundin wrote:
I honestly don't have a lot of love for fully offline, non-real-time simulations like ns2 for this sort of work - the realtime algorithms are often painful to rework to run paced by the simulation, and to be honest I want a reasonable approximation of a real network, where subjective measurements and evaluation can be quickly done. Plus I find Dummynet/Netem more intuitive than what I know about using ns2 (which I've never actually used, but the comments from Varun Singh make me shudder at using it as a primary tool).
Just to clarify, in our simulation setup, we hooked up an encoder and decoder to the NS2 application. If we were simulating dummy packets then this simulation would run more quickly. The encoder (which runs outside the NS2 context) provides the H.264 frames to the ns2 application where it is fragmented and encapsulated in RTP packets and transmitted over the NS2 topology. At the receiver side ns2 application the RTP packet is passed to a real-world decoder. The main reason for slowness of the simulation is that the ns2 application had to routinely time sync with the codec (outside the NS2). It was more like NS2 polling the codec and saying now is 0ms do you have any data, 5ms, 10ms, 15ms, etc.
The advantage of using NS2 is that you can have complex topologies, change the type of links (DSL, WLAN, 3G), queues (drop tail, RED), type of cross-traffic (FTP, CBR, start and stop TCP flows) and easily measure throughput for each flow (making analysis easier). In our setup, we also got a YUV file at the decoder with which we could calculate PSNR and/or play back next to the original YUV.
This is what I am looking for. I was primarily considering using some kind of dummy load generator (video and/or audio) to produce flows with the right characteristics (frame rate, frame size variability, I frames, etc.) and inject this into the simulated network. Looking at metrics such as throughput, end-to-end delay and loss rate would provide a lot of insight into how a real media stream would be affected. Thanks for the feedback. It seems ns2 is the weapon of choice.

Hi Henrik, a comment inline.
On Thu, Oct 13, 2011 at 09:37, Henrik Lundin <henrik.lundin@webrtc.org> wrote:
On Thu, Oct 13, 2011 at 12:18 AM, Varun Singh <vsingh.ietf@gmail.com> wrote:
On Wed, Oct 12, 2011 at 23:48, Randell Jesup <randell-ietf@jesup.org> wrote:
On 10/12/2011 11:25 AM, Henrik Lundin wrote:
I honestly don't have a lot of love for fully offline, non-real-time simulations like ns2 for this sort of work - the realtime algorithms are often painful to rework to run paced by the simulation, and to be honest I want a reasonable approximation of a real network, where subjective measurements and evaluation can be quickly done. Plus I find Dummynet/Netem more intuitive than what I know about using ns2 (which I've never actually used, but the comments from Varun Singh make me shudder at using it as a primary tool).
Just to clarify, in our simulation setup, we hooked up an encoder and decoder to the NS2 application. If we were simulating dummy packets then this simulation would run more quickly. The encoder (which runs outside the NS2 context) provides the H.264 frames to the ns2 application where it is fragmented and encapsulated in RTP packets and transmitted over the NS2 topology. At the receiver side ns2 application the RTP packet is passed to a real-world decoder. The main reason for slowness of the simulation is that the ns2 application had to routinely time sync with the codec (outside the NS2). It was more like NS2 polling the codec and saying now is 0ms do you have any data, 5ms, 10ms, 15ms, etc.
The advantage of using NS2 is that you can have complex topologies, change the type of links (DSL, WLAN, 3G), queues (drop tail, RED), type of cross-traffic (FTP, CBR, start and stop TCP flows) and easily measure throughput for each flow (making analysis easier). In our setup, we also got a YUV file at the decoder with which we could calculate PSNR and/or play back next to the original YUV.
This is what I am looking for. I was primarily considering using some kind of dummy load generator (video and/or audio) to produce flows with the right characteristics (frame rate, frame size variability, I frames, etc.) and inject this into the simulated network. Looking at metrics such as throughput, end-to-end delay and loss rate would provide a lot of insight into how a real media stream would be affected. Thanks for the feedback. It seems ns2 is the weapon of choice.
We wrote a very simple packet generator that simulates I and P packets (no B frames). It uses a customizable but fixed GOP size and a varying bit rate. IMO this doesn't work well for complex/stress testing, because in the real world the codecs don't work in such a deterministic way. A lot also depends on the type of content. I agree with Randell Jesup that with a real-world implementation it is easier to make out video artifacts etc. Our recent experiments with Dummynet and NS2 show that there is some mismatch between the simulations. Dummynet simulations are turning out to be worse off. I am not ruling out that we may have bugs in our simulation setup/RTP code (our NS2 code is more mature), but simple things like RTT variation are much larger with dummynet than in the modelled NS2 case for constant capacity and CBR media - which I think may also be true for real-world deployments.
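A toy version of the I/P generator described above, for flavour (no B frames, fixed GOP, sizes jittered around a target bitrate); all the numbers and names here are assumptions, not the actual generator:

    #include <algorithm>
    #include <random>
    #include <vector>

    // Returns per-frame sizes in bytes; frames 0, gop, 2*gop, ... are I-frames.
    // Assumes gop_size > 1.
    std::vector<int> GenerateFrameSizesBytes(int num_frames, int gop_size,
                                             double target_bps, double fps,
                                             unsigned seed = 1) {
      std::mt19937 rng(seed);
      double mean_frame_bytes = target_bps / 8.0 / fps;
      // Make I-frames ~4x the average while keeping the GOP average on target.
      double i_bytes = 4.0 * mean_frame_bytes;
      double p_bytes = (gop_size * mean_frame_bytes - i_bytes) / (gop_size - 1);
      std::normal_distribution<double> jitter(1.0, 0.1);  // ~10% size variation
      std::vector<int> sizes;
      for (int i = 0; i < num_frames; ++i) {
        double base = (i % gop_size == 0) ? i_bytes : p_bytes;
        sizes.push_back(static_cast<int>(base * std::max(0.1, jitter(rng))));
      }
      return sizes;
    }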

On 10/11/2011 03:11 AM, Henrik Lundin wrote:
I do not agree with you here. When an over-use is detected, we propose to measure the /actual/ throughput (over the last 1 second), and set the target bitrate to beta times this throughput. Since the measured throughput is a rate that evidently was feasible (at least during that 1 second), any beta < 1 should assert that the buffers get drained, but of course at different rates depending on the magnitude of beta.
Take a look at the data from the ICSI netalyzr: you'll find scatter plots at: http://gettys.wordpress.com/2010/12/06/whose-house-is-of-glasse-must-not-thr... Note the different coloured lines. They represent the amount of buffering measured in the broadband edge in *seconds*. Also note that for various reasons, the netalyzr data is actually likely underestimating the problem. Then realise that when congested, nothing you do can react faster than the RTT including the buffering. So if your congestion is in the broadband edge (where it often/usually is), you are in a world of hurt, and you can't use any algorithm that has fixed time constants, even one as long as 1 second. Wish this weren't so, but it is. Bufferbloat is a disaster... - Jim
participants (4):
- Henrik Lundin
- Jim Gettys
- Randell Jesup
- Varun Singh