
I'm responding to the new list, but CCing people for now until they have a chance to join; if you want to continue this discussion, join the list at http://www.alvestrand.no/mailman/listinfo/rtp-congestion

On 10/8/2011 5:13 PM, Harald Alvestrand wrote:
> On 10/08/2011 04:53 PM, Randell Jesup wrote:
>> On 10/8/2011 7:26 AM, Harald Alvestrand wrote:
>>> Randell, this seems like a very good start!
>>> Two thoughts from random discussion:
>>> - WRT timestamps, there's a header extension proposed in the TFRC-for-RTP draft that would tag each packet at sending time rather than at frame generation time. Could we switch to recommending this timestamp, and using it if it's available?
>> That could mess up A/V sync, I'm afraid.
> How? The A/V timestamps would still be available for syncing, but we'd use the send-time timestamps to compute throughput (and then would have the ability to measure each packet's transit time individually, rather than looking at frame generation -> frame arrival timing).

Well, I'm probably being overly worried about processing delays (and in particular differing delays for audio and video).

Let's say audio gets sampled at X, and (ignoring other processing steps) takes 1ms to encode. It gets to the wire at X + <other steps> + 1. Let's say video is also sampled at X, and (ignoring other processing steps) takes 10ms to encode. It gets to the wire at X + <other steps> + 10. So we've added a 9ms offset to all our A/V sync, and in this case it's in the "wrong" direction (people are more sensitive to early-audio than early-video). And if "other steps" on each side don't balance (and they may not), it could be worse.

I also worry more that in a browser, with no access to true RT_PRI processing, the delays could be significantly variable (we get preempted by some other process/thread for 10 or 20ms, etc). Also, if the receiver isn't careful it could be tricked into skipping frames it should be displaying due to jitter in the packet-to-packet timestamps. So perhaps I'm not being overly worried.

I realize that I'm trading off accuracy in bandwidth estimation (or if you prefer, reaction speed) for ease in getting a consistent framerate and best-possible A/V sync. In a perfect world we'd record the sampling time and the delta until it was submitted to sendto(), so we'd have both. (You could use a header extension to do that).

Also... I've been thinking more about my two listed options for feeding the combined Kalman filter. I *think* the first option might not work in practice (treat all the packets as in one stream, and calculate arrival deltas for each packet received against the previous one). Without a hard relationship between the two timestamp clocks (in particular relative offset), you can't calculate inter-packet send times properly. You can choose an arbitrary offset between them and update it periodically, but that will not match reality; it will be offset in one direction or the other.
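
A minimal sketch of this effect, with made-up numbers: the 3ms offset error, the packet send times, and the function names are all hypothetical and purely illustrative. It treats stream A's clock as the reference, maps stream B through a guessed offset that is slightly wrong, and compares the reconstructed inter-packet send deltas against the true ones.

```python
# Hypothetical illustration only: names and numbers are assumptions,
# not taken from any draft or implementation.
OFFSET_ERROR_MS = 3.0  # assumed error in the receiver's guess of the A-to-B clock offset

# (stream, true send time in ms) for packets interleaved on the wire
packets = [("A", 0.0), ("B", 5.0), ("A", 20.0), ("A", 40.0), ("B", 45.0), ("B", 85.0)]


def estimated_send_time(stream, true_ms):
    """Send time as the receiver reconstructs it on a common timeline.

    Stream A is the reference; stream B is mapped through a guessed
    offset that is wrong by OFFSET_ERROR_MS.
    """
    return true_ms + (OFFSET_ERROR_MS if stream == "B" else 0.0)


def delta_errors(pkts):
    """Return (pair, error_ms) for each consecutive pair of packets."""
    out = []
    for (s0, t0), (s1, t1) in zip(pkts, pkts[1:]):
        true_delta = t1 - t0
        est_delta = estimated_send_time(s1, t1) - estimated_send_time(s0, t0)
        out.append((s0 + "," + s1, est_delta - true_delta))
    return out


errors = delta_errors(packets)
for pair, err in errors:
    print(f"{pair}: send-delta error {err:+.1f} ms")
```

In this toy model the size of the error on a mixed pair equals the offset error itself, regardless of how far apart the two packets were sent.
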
This means that the send delta for an A,B pair (a packet from stream A followed by one from stream B) will be biased in one direction, and a B,A pair will be biased in the other - and A,A or B,B pairs won't be biased. This might "balance out" and happen to work, but I'm distrustful without more info about timestamp-synchronization being passed.

We should look into it and how well option 1 works with sync sources, because in some cases they are synchronized or could be. Ironically, Harald's suggestion above (or the header-extension variant) would help work around this.

--
Randell Jesup
randell-ietf@jesup.org