Re: [R-C] [ledbat] LEDBAT vs RTCWeb

On 4/23/2012 5:27 PM, Rolf Winter wrote:
>> I searched the email archives and didn't see anything discussing it (though I might have missed it) or discussing interaction with VoIP (other than that one paper, which also showed the sensitivity to the allowed delay and the advantage of a user willing to accept longer delays, plus how latecomers could misread the buffer state).
>> The problem is that a 100ms-delay-scavenger algorithm like this "poisons the well" for any application that desires/needs lower delay. That's why I believe there are only two reasonable targets: 0 (drained queues) and full queues (assuming AQM isn't implemented to keep queues low).

> I think worrying about a poisoned well is a little late with loss-based congestion control being deployed.
We know that when faced with deep buffers and a large loss-based flow, a delay-based flow will lose, or have to transition to loss-based and live with the delay (see Cx-TCP). We also know that in practice, the (residential) bottleneck links are almost always the access link (DSL, fiber, etc.) or sometimes the WiFi connection. Corporate/educational networks can be a bit different, but there's a better chance of AQM or service-based traffic shaping there. We can (mostly) deal with fighting TCP, and we expect it. We didn't expect scavenger protocols to be pumping the queues up when TCP is idle or bursty.
>> I don't think this is just a theoretical problem; I think it's a very serious issue, which if LEDBAT gets used a lot will cause a lot of problems. Perhaps it's *better* than old-style saturating bittorrent flows, in that VoIP will be possible without shutting down bittorrent, but that doesn't mean it's good, and the perception that bittorrent doesn't break other apps with the new protocols will encourage people to leave it running - and depending on the vagaries of bittorrent, it might be fine when you start the call and then slam you. I'm especially worried about OSes and apps using it for background update transfers with no user control or knowledge, etc.

> Well, this reads a bit too dramatic for my taste.
Perhaps. But I think a fundamental "how does this affect the network" case was glossed over/ignored.
> LEDBAT typically comes with the app so you're a bit more agile regarding the congestion control parameterization (and even the implementation as the BitTorrent folks have shown).
I note that LEDBAT is now in Linux and MacOS, so it's not just in application code anymore, and app developers (even OS developers) may assume it's safe to blindly use without user involvement. (And even if informed, assuming users understand the interaction of protocols is ... a stretch.)
> BTW, as long as update services do not use the uplink capacity of customers, I don't see much of a problem (assuming we agree that the biggest problem is home gateways).
It's not just uplinks, but also downlinks (though uplinks are typically the worst-configured). An update service for the OS or a large app can fill the downstream buffers with 100ms of packets, and then any incoming VoIP data is delayed with no real way to solve it until the update/download ends. In some cases that might be hours or even days. Other apps like cloud backup services might assume it's safe to use LEDBAT for their background transfers, and keep your upstream saturated at 100ms delay at all times.
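To put rough numbers on the 100ms figure above, here is a back-of-the-envelope sketch. The link rates and the 1500-byte packet size are illustrative assumptions, not measurements from this thread, and the function name is made up for the example. It only converts a standing queueing delay into the amount of data that has to sit in the bottleneck buffer to produce it.

```python
def standing_queue_bytes(link_rate_bps: int, queue_delay_ms: int) -> float:
    """Bytes that must sit in the bottleneck queue to add queue_delay_ms of delay.

    A queue of B bytes draining at R bits/s delays a newly arrived packet by
    8*B/R seconds, so B = R * delay / 8.
    """
    return link_rate_bps * queue_delay_ms / 8000.0


# Hypothetical residential rates (assumptions for illustration only).
ASSUMED_LINKS_BPS = {
    "1 Mbit/s uplink": 1_000_000,
    "16 Mbit/s downlink": 16_000_000,
}
ASSUMED_PACKET_BYTES = 1500  # typical Ethernet-sized packet, assumed

for name, rate in ASSUMED_LINKS_BPS.items():
    queued = standing_queue_bytes(rate, 100)
    packets = queued / ASSUMED_PACKET_BYTES
    print(f"{name}: {queued:.0f} bytes (~{packets:.1f} full-size packets) for 100 ms")
```

On the assumed 1 Mbit/s uplink this comes to 12500 bytes, i.e. fewer than ten full-size packets are enough to hold the queue at 100ms, and every VoIP packet arriving behind them waits that long regardless of how small it is.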
> Not sure how BITS compares in all of this either. In either case it beats TCP, and that is used without the users' knowledge, too. The best you could probably do is design better home gateways, which would solve many of the problems we see today (the better ones today actually implement traffic shaping with really good results, probably better than any end-to-end congestion control algorithm).
Note that no routers are likely to recognize rtcweb traffic as VoIP, since the signaling is opaque and application-dependent, so ALGs and the like have nothing to work against. A router would have to 'guess' that the packet stream's leading-byte pattern and STUN/ICE traffic signified use of the port for VoIP. Eventually they may learn, but it will take years.

I stand by the contention that this is a real, significant problem which could get far worse if non-user-centric services start using LEDBAT.

--
Randell Jesup
randell-ietf@jesup.org

My take on the 1000-foot level:

- The RTPCongestion effort will result in an algorithm that works in 2 modes:
  - "Queueing delay is comfortably low, and we're working to keep it that way"
  - "Queueing delay is uncomfortably high, and we're trying to get our fair share"
- TCP is the main contender for bandwidth *now*
- LEDBAT is a possible contender for bandwidth in the near future
- TCP and LEDBAT will (I hope) be detectable as different patterns of "conflict"
- An important output of RTPCongestion is *instrumentation* - knowing which mode we're in, and who we're contending with
- Once we get large scale RTCWEB deployment, and with it large scale deployment of RTPCongestion algorithms, in applications that read this instrumentation and harvest the results, my hope (perhaps unrealistic) is that we'll be in a better position to make *informed* decisions about where the problems are, and what the places of maximum bang-for-the-buck are to "make things better".

Very short summary of the above: Let's put instrumentation in as a primary goal.

Harald
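As a purely illustrative sketch of what "which mode are we in" instrumentation might look like: the class below is not any RTPCongestion algorithm. The names, the 50ms threshold, and the choice of estimating queueing delay as the current one-way delay minus the lowest delay seen so far are all assumptions made for the example. It does not attempt to tell TCP from LEDBAT; it only records the mode and how often it changes.

```python
from collections import Counter
from enum import Enum


class Mode(Enum):
    LOW_DELAY = "low_delay"   # queueing delay comfortably low
    COMPETING = "competing"   # queueing delay uncomfortably high


class ModeInstrumentation:
    """Hypothetical per-flow counters; threshold_ms is an assumed tuning knob."""

    def __init__(self, threshold_ms: float = 50.0):
        self.threshold_ms = threshold_ms
        self.base_delay_ms = None          # lowest one-way delay seen so far
        self.mode = None                   # current Mode, None before first sample
        self.samples_per_mode = Counter()  # how many samples landed in each mode
        self.transitions = 0               # number of mode changes observed

    def observe(self, one_way_delay_ms: float) -> Mode:
        # Track the minimum delay as a stand-in for the empty-queue delay.
        if self.base_delay_ms is None or one_way_delay_ms < self.base_delay_ms:
            self.base_delay_ms = one_way_delay_ms
        queueing_ms = one_way_delay_ms - self.base_delay_ms
        new_mode = Mode.LOW_DELAY if queueing_ms < self.threshold_ms else Mode.COMPETING
        if self.mode is not None and new_mode is not self.mode:
            self.transitions += 1
        self.mode = new_mode
        self.samples_per_mode[new_mode] += 1
        return new_mode


# Usage with made-up delay samples (milliseconds).
stats = ModeInstrumentation()
for delay in [20.0, 25.0, 130.0, 140.0, 22.0]:
    stats.observe(delay)
print(stats.mode, dict(stats.samples_per_mode), stats.transitions)
```

Counters like these are cheap to keep per flow and could be harvested by the application, which is the kind of data the summary above asks for.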
Participants (2): Harald Alvestrand, Randell Jesup