
On 10/14/2011 04:28 PM, Randell Jesup wrote:
On 10/13/2011 7:43 PM, Jim Gettys wrote:
On 10/13/2011 06:46 PM, Randell Jesup wrote:
Yes - though in my case, for desktops the bottleneck is generally the main Internet link or the other end's downstream, and for wireless it's usually the 802.11 hop (I have FiOS at something like 30 or 35 Mbps down, 20 Mbps up).
The problem is that wireless bandwidth is highly variable and roughly comparable to broadband bandwidth, so the bottleneck moves back and forth between the two (particularly since wireless is a shared medium, and other devices on the same wireless network can reduce the bandwidth available to you).
The best strategy for most home users is to get the bottleneck firmly into the broadband link and use bandwidth shaping to control the buffering there, since the host OS is not under your control. So you have a good excuse to go buy the shiny 802.11n router you have been lusting after and haven't convinced your wife/husband to buy..... If you do that, you can get really good behaviour today (until you wander too far from your AP).
That's fine for me, but that advice doesn't generally help our users.
Yeah, ergo shining the light on the problem.
Yup. Ergo the screed, trying to get people to stop before making things worse. The irony is that, were it not for the fact that browsers have long since discarded HTTP's two-connection rule, it might be a good idea and might help encourage better behaviour.
SPDY might help some here (though part of SPDY's purpose is to continue to saturate that TCP connection even better, so maybe not).
It helps the transient problem. It won't help if you are using SPDY for bulk download of something the way HTTP is often abused for.
And it takes time for the buffers to fill, so it might help quite a lot. The buffers fill at roughly one packet per ACK, I gather; the ACKs get further and further apart as the buffer fills.
So, it might help if for no other reason than reducing the number of TCP connections and startups, and reducing the number of congestion-control streams.
Even one TCP stream will fill the buffers... Using fewer connections reduces the transient problem.
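To put rough numbers on how much delay a single filled buffer adds (illustrative arithmetic only; the buffer size and uplink rates below are assumptions, not measurements from this thread):

    # Illustrative sketch: delay added once a FIFO buffer ahead of the
    # bottleneck link is kept full by a bulk TCP transfer.
    def standing_delay_ms(buffer_bytes, uplink_bits_per_sec):
        return buffer_bytes * 8 * 1000.0 / uplink_bits_per_sec

    # An assumed 256 KB buffer on a 2 Mbit/s uplink:
    print(standing_delay_ms(256 * 1024, 2 * 10**6))    # ~1049 ms
    # The same assumed buffer on a 20 Mbit/s FiOS-class uplink:
    print(standing_delay_ms(256 * 1024, 20 * 10**6))   # ~105 ms

And once that buffer is full, it stays full for the life of the transfer, which is why a single stream is enough.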
We can't control other browsers/devices on the same connection; we may be able to control other code within the same browser.
My point is that while external flows are outside our control, internal browser TCP flows are within our control.
We can't really make our jitter buffers big enough to give decent audio/video when bufferbloat is present, unless you like talking to someone halfway to the moon (or further). Netalyzr shows the problem in broadband, but our OSes and home routers are often even worse.
The jitter buffers don't have to be that large - in steady state you have a lot of delay, but steady delay in the network doesn't directly affect the jitter buffer; you do have to manage it somewhat. Transitions in and out of bufferbloat will affect it, but the jitter buffer should handle those.
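To illustrate the point (a hypothetical sketch, not code from any stack discussed here): the playout delay only needs to track the variation in transit time, in the spirit of the RFC 3550 interarrival jitter calculation, so a large but steady bloat-induced delay doesn't by itself force a huge jitter buffer.

    # Hypothetical sketch of an adaptive playout delay; the class name and
    # safety factor are assumptions.  Timestamps are in seconds.
    class JitterEstimator:
        def __init__(self):
            self.prev_transit = None  # (arrival - send), incl. unknown clock offset
            self.jitter = 0.0         # smoothed |delay variation|, per RFC 3550

        def on_packet(self, send_time, arrival_time):
            transit = arrival_time - send_time
            if self.prev_transit is not None:
                d = abs(transit - self.prev_transit)
                self.jitter += (d - self.jitter) / 16.0
            self.prev_transit = transit

        def playout_delay(self, safety_factor=4.0):
            # Constant network delay cancels out above; only variation
            # (including transitions in and out of bufferbloat) shows up here.
            return safety_factor * self.jitter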
I fear the spikes I see in my packet traces. I see multiple retransmits and a bunch of packets out of order each time I go through one of the buffer fill cycles.
Not something I expect with RTP data - we don't retransmit on drops. When looking at TCP data I would expect retransmits once those queues fill.
I'm referring to the fact that there are multiple packet losses very close together in time.
In the short/immediate term, mitigations are possible. My home network now works tremendously better than it did a year ago, and yours can immediately too, even with many existing home routers. But doing so is probably beyond non-network wizards today.
Yes. Useful for looking into, but not for solving the problem. (And for pressuring router makers - but that's a near-zero-margin game for most of them.)
Again, the approach I'm taking is to build a home router that actually works right - ergo CeroWrt; the vendors can pick up the results as they see fit.
That's about the only obvious way; they mostly license the base router code from the HW vendor or a 3rd-party SW vendor, then put their "corporate UI" and some features on top of it, from what I can tell.
The problem I would expect is that "hobbyist" router firmware is often not usable by manufacturers for licensing reasons, or, if it is, it's too hard to reskin in their corporate layout, or too hard for them to easily configure out the stuff they don't want, etc. And no one is *trying* to sell them on this, unless you can get the SoC/reference-design people to pick it up.
Actually, OpenWrt is the "upstream" for some of the smaller router vendors already. And yes, I'm trying to get people to realise that having a good upstream is better than where they are today. Only time will tell if we succeed. And it's our way to get changes/fixes into the upstream projects that are used by everybody, though the large commercial vendors currently ship bits that have fermented (rotted) for 5 years or more. So the way I look at it is that at worst, the fixes eventually trickle into the commercial code base; and some will ship much faster.
o exposing the bloat problem so that blame can be apportioned is *really* important. Timestamps in RTP would help greatly in doing so. Modern TCPs (may) have the TCP timestamp option turned on (I know modern Linux systems do), so I don't know of anything needed there beyond ensuring the TCP information is made available somehow, if it isn't already. Being able to reliably tell people "The network is broken, you need to fix your OS/your router/your broadband gear" is productive. And to deploy IPv6 we're looking at deploying new home kit anyway.
We can look into that. Suggestions welcome.
The first step is detection: simple timestamps get you that.
We can detect delay already (at least RTT delay; one-way is tough, but we can approximate how much we are above the low point of one-way delay).
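For example (a sketch with hypothetical names, assuming per-packet sender timestamps and no clock synchronization): the absolute one-way delay is unknowable without synchronized clocks, but the rise above the minimum observed transit time is a usable estimate of the queuing delay.

    # Sketch: relative one-way (queuing) delay from sender timestamps,
    # without synchronized clocks.  Names are hypothetical.
    class QueuingDelayEstimator:
        def __init__(self):
            self.min_transit = None   # lowest transit time seen so far

        def on_packet(self, send_timestamp, recv_timestamp):
            transit = recv_timestamp - send_timestamp  # includes unknown clock offset
            if self.min_transit is None or transit < self.min_transit:
                self.min_transit = transit
            # The clock offset cancels in the difference, leaving how far
            # we are above the path's low point - i.e. buffering delay.
            return transit - self.min_transit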
The next step is to locate the hop: basically, a traceroute-like algorithm that looks for the hop where the latency goes up unexpectedly. There is a commercial tool called "pingplotter" which does roughly this and plots the result graphically.
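Roughly, such a tool could do something like this (a sketch under assumptions: a Unix ping in PATH, a hop list taken from a prior traceroute, and an arbitrary jump threshold; this is not pingplotter's actual method):

    # Sketch: find the first hop where latency jumps sharply.
    import re
    import subprocess

    def median_rtt_ms(host, count=5):
        out = subprocess.run(["ping", "-n", "-c", str(count), host],
                             capture_output=True, text=True).stdout
        rtts = sorted(float(m) for m in re.findall(r"time=([\d.]+)", out))
        return rtts[len(rtts) // 2] if rtts else None

    def find_suspect_hop(hops, jump_ms=100.0):
        prev = 0.0
        for hop in hops:
            rtt = median_rtt_ms(hop)
            if rtt is None:
                continue                      # hop doesn't answer pings
            if rtt - prev > jump_ms:
                return hop, prev, rtt         # latency rose sharply at this hop
            prev = rtt
        return None

    # Hop addresses below are placeholders for a real traceroute's output.
    print(find_suspect_hop(["192.168.1.1", "203.0.113.1", "198.51.100.7"]))

Run both under load and at idle, the comparison points at which box's buffer is filling.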
So diagnostic tools.
o designing good congestion avoidance that will work in an unbroken, unbloated network is clearly needed. But I don't think heroic engineering around bufferbloat is worthwhile right now for RTP; that effort is better put into the solutions outlined above, I think. Trying to do so when we've already lost the war (teleconferencing isn't interesting when you're talking halfway to the moon) is not productive, and getting stable servo systems to work not just at the 100 ms level but at the multi-second level, when multi-second latency isn't even usable for the application, is a waste. RTP == Real-Time Transport Protocol; when the network is no longer real time, it's an oxymoron.
In practice it really does work most of the time. But not all.
Yes, but I worry that as more applications that move big stuff around get deployed, and Windows XP retires, the situation is only going to get worse.
Could be.