Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)

I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
Henry
On 12/22/10 5:46 PM, "David Singer" <singer@apple.com> wrote:
I should say that I am perfectly happy to have the discussion about whether, and what, to mandate codec(s). If the placeholder is there to spark the discussion, you succeeded!
For me, significant IPR risk is worse than a payment -- free but risky is not an improvement on costs-but-low-risk. But others may differ, of course.
I do think we should be looking carefully at layered design. Ideally, we define the missing technology bits -- HTML, Javascript/DOM, and so on; and then, perhaps, we write an umbrella specification that uses those, and other technology pieces, to achieve an interoperable end. But the pieces should 'make sense' by themselves, and be usable with other assemblages, I think, if the specs are to have 'legs' and survive over years. The pieces are ideally stable standards/publications in their own right (and I agree, sometimes, rarely, something is stable and well-known enough, such as ZIP or ID3 tags, without being a 'standard').
I also agree that RF codecs are happening and are here to stay. Those of you who know me from MPEG will have heard this before. They will exist, and they will have a place. MPEG is working (slowly) on RF codecs as well.
As to what MPEG-LA is doing, I am afraid I don't actually know. We'd have to ask them, and they tend not to reply. The silence is strange, but I don't think that mitigates the possibility that there is an IPR entanglement.
Despite Henry's position (that mentioning VP8 results in no rat-holes and flames, and that mentioning H.264 will) I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
The link between open-source and royalty-free is not perfect; there are quality open-source implementations of non-free codecs, for a start, and there are companies who license proprietary systems royalty-free. Let's not confuse the two, even if they often occur together.
David Singer Multimedia and Software Standards, Apple Inc.

Heinrich,
'best' is not always IPR-cost-free. Sometimes it is, sometimes it isn't. You seem unable to see any other possibility than your own, alas. I could wish for 'fates' for any number of technologies, but I don't: I choose them when they suit, and others when they don't. I suggest we do the same.
I have no objection to the development and deployment of new codecs, with varying terms, quality, complexity, and so on. This is a varied market that deserves varied tools. I do object to making decisions based on only one criterion, however.
On Dec 26, 2010, at 18:12 , Heinrich Sinnreich wrote:
I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
David Singer Multimedia and Software Standards, Apple Inc.

For video codecs, "self-interest" may be influenced by a number of factors. For example, for a mobile applications developer, "self-interest" may focus on aspects such as performance, battery life and maintenance costs. If a given codec is supported in the hardware or operating system of their target platform, then the developer may perceive it as being low "cost" to them.
For a chipset manufacturer, "self-interest" may be determined by the demand for chipsets incorporating a given codec, as well as the associated licensing fees. Typically the goal is to maximize revenue minus cost, not just to minimize "cost".
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
-----Original Message----- From: rtc-web-bounces@alvestrand.no [mailto:rtc-web-bounces@alvestrand.no] On Behalf Of David Singer Sent: Wednesday, December 29, 2010 8:08 PM To: Heinrich Sinnreich Cc: rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
Heinrich,
'best' is not always IPR-cost-free. Sometimes it is, sometimes it isn't. You seem unable to see any other possibility than your own, alas. I could wish for 'fates' for any number of technologies, but I don't: I choose them when they suit, and others when they don't. I suggest we do the same.
I have no objection to the development and deployment of new codecs, with varying terms, quality, complexity, and so on. This is a varied market that deserves varied tools. I do object to making decisions based on only one criterion, however.
On Dec 26, 2010, at 18:12 , Heinrich Sinnreich wrote:
I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
David Singer Multimedia and Software Standards, Apple Inc.

Hi,
A couple of questions and comments on draft-alvestrand-dispatch-rtcweb-datagram-00:
* Section 4 defines four channel types: UDP, TCP, TLS and DTLS. Is it expected that all clients MUST support all of these? I suppose the reason why both UDP and TCP are included is that depending on the types of middleboxes the peers are behind, they may get just one or the other working. I.e. first try out UDP, if it does not work, attempt TCP. Is that correct?
* Section 4.5 states that TURN and relaying are needed. How about things like HTTP/TLS tunneling? In many cases that is the only way to get the transport channel working. TURN may be helpful in a hopefully increasing number of cases, but that alone will still leave a lot of corporate users unserved.
* Section 3 mentions that things like pseudoTCP can be run over the datagram transport. In case only UDP works end-to-end that seems useful. However, if we can get a "native" TCP connection up, it seems natural to use that as is rather than via some kind of generic datagram abstraction layer. I think we probably should define both datagram and bytestream services separately. Bytestream would be either TCP or pseudoTCP/UDP. If we only do datagram service, leave the whole pseudoTCP reference out.
* Section 6 defines a URI with a number of IP address:transport:port candidates. I'm not too clear on how that URI would be used. It looks to me as something that is received as a result of gathering the candidates (with STUN, TURN etc.) as the first step of ICE. If Jingle or SDP offer/answer or some proprietary protocol were used to pass the candidate information to the other peer, that information would be encoded according to that particular protocol. So is this specific URI just meant for local representation at the API and not something that is passed over the network as such? What's the need or benefit of making it a URI?
Thanks, Markus

On 12/30/10 16:52, Markus.Isomaki@nokia.com wrote:
Hi,
A couple of questions and comments on draft-alvestrand-dispatch-rtcweb-datagram-00:
* Section 4 defines four channel types: UDP, TCP, TLS and DTLS. Is it expected that all clients MUST support all of these? I suppose the reason why both UDP and TCP are included is that depending on the types of middleboxes the peers are behind, they may get just one or the other working. I.e. first try out UDP, if it does not work, attempt TCP. Is that correct?
I started out by defining the types I thought might be needed - so this is not the "must implement" list. The "Must implement" list is still a matter for discussion - not having UDP in there is certainly a non-starter, so UDP has to be "must implement". For others, I'd like to see arguments.
* Section 4.5 states that TURN and relaying are needed. How about things like HTTP/TLS tunneling? In many cases that is the only way to get the transport channel working. TURN may be helpful in a hopefully increasing number of cases, but that alone will still leave a lot of corporate users unserved.
At Google, we've implemented a form of HTTP/TLS tunnelling using a TURN variant (see the libjingle source for details), so I didn't differentiate between those options - I see HTTP/TLS as a form of relaying; I don't believe simultaneous-open TCP is going to work reliably enough for our purposes.
* Section 3 mentions that things like pseudoTCP can be run over the datagram transport. In case only UDP works end-to-end that seems useful. However, if we can get a "native" TCP connection up, it seems natural to use that as is rather than via some kind of generic datagram abstraction layer. I think we probably should define both datagram and bytestream services separately. Bytestream would be either TCP or pseudoTCP/UDP. If we only do datagram service, leave the whole pseudoTCP reference out.
I agree with your way of putting it. The reference is intended as informative; bytestream COULD be layered over the datagram service, but doesn't need to be.
Background: I had a discussion with a coworker who wanted to have a single API to both streaming and datagram service; I believe it makes more sense to define an API to a datagram service and an API to a streaming service separately - the big argument for PseudoTCP is the client-to-client connection case, where one often finds that UDP works and TCP doesn't. The argument for defining TCP over datagram, rather than TCP over UDP, is that we might easily find ourselves needing functions like ICE establishment or TURN relaying for client-to-client TCP sessions; doing it this way saves us one duplicate reference set. (The concept of pseudoTCP running over datagrams tunneled over HTTP/TLS somewhat boggles the mind, though....)
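A minimal sketch of the split argued for here - one API for datagrams, a separate one for byte streams, with the latter possibly layered on the former via PseudoTCP. It is illustrative only; the class and method names (DatagramChannel, ByteStreamChannel, send_datagram, write) are invented and do not come from draft-alvestrand-dispatch-rtcweb-datagram-00 or any W3C API.

```python
# Hypothetical sketch only: names are invented for illustration and are not
# part of draft-alvestrand-dispatch-rtcweb-datagram-00 or any W3C API.

from abc import ABC, abstractmethod
from typing import Callable


class DatagramChannel(ABC):
    """Unordered, unreliable transport that preserves message boundaries."""

    @abstractmethod
    def send_datagram(self, payload: bytes) -> None:
        """Queue one datagram; it is delivered whole or not at all."""

    @abstractmethod
    def on_datagram(self, callback: Callable[[bytes], None]) -> None:
        """Register a callback invoked once per received datagram."""


class ByteStreamChannel(ABC):
    """Reliable, ordered byte stream (native TCP, or PseudoTCP layered over
    a DatagramChannel); message boundaries are not preserved."""

    @abstractmethod
    def write(self, data: bytes) -> None:
        """Append bytes to the stream."""

    @abstractmethod
    def on_data(self, callback: Callable[[bytes], None]) -> None:
        """Register a callback receiving arbitrarily sized chunks, in order."""
```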
* Section 6 defines a URI with a number of IP address:transport:port candidates. I'm not too clear on how that URI would be used. It looks to me as something that is received as a result of gathering the candidates (with STUN, TURN etc.) as the first step of ICE. If Jingle or SDP offer/answer or some proprietary protocol were used to pass the candidate information to the other peer, that information would be encoded according to that particular protocol. So is this specific URI just meant for local representation at the API and not something that is passed over the network as such? What's the need or benefit of making it a URI?
This is an example of speculative standardization, or "trial balloon" - W3C folks have had a habit of using URIs for any form of structured parameter in APIs; if that is what the W3C effort concludes that they want, this is intended to show that it can be given to them. That said - the fact that this is intended to represent, exactly, the semantics of a=candidate lines from SDP needs to be made clear (as I said in another thread). Let's not have more different semantics than we need to.
Thanks, Markus
Thanks for the review!
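As a purely hypothetical illustration of the one-to-one mapping onto a=candidate semantics described here: the URI shape below is invented (it is not the syntax from section 6 of the draft); only the output line follows the ICE/SDP a=candidate format of RFC 5245, and the foundation, component and priority values are placeholders.

```python
# Hypothetical sketch: the candidate URI shape used here is invented and is
# NOT the syntax defined in section 6 of the draft. Only the output follows
# the a=candidate line format of RFC 5245; foundation, component-id and
# priority are placeholder values.

from urllib.parse import parse_qs, urlparse


def candidate_uri_to_sdp(uri: str, foundation: int = 1, component: int = 1,
                         priority: int = 2130706431) -> str:
    """Map a made-up candidate URI onto an equivalent a=candidate line."""
    query = parse_qs(urlparse(uri).query)
    ip = query["ip"][0]
    transport = query["transport"][0].upper()
    port = query["port"][0]
    cand_type = query.get("type", ["host"])[0]
    return (f"a=candidate:{foundation} {component} {transport} "
            f"{priority} {ip} {port} typ {cand_type}")


print(candidate_uri_to_sdp(
    "rtcweb:candidate?ip=192.0.2.1&transport=udp&port=54321&type=host"))
# a=candidate:1 1 UDP 2130706431 192.0.2.1 54321 typ host
```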

On Sat, Jan 1, 2011 at 10:04 PM, Harald Alvestrand <harald@alvestrand.no> wrote:
On 12/30/10 16:52, Markus.Isomaki@nokia.com wrote:
Hi,
A couple of questions and comments on draft-alvestrand-dispatch-rtcweb-datagram-00:
* Section 4 defines four channel types: UDP, TCP, TLS and DTLS. Is it expected that all clients MUST support all of these? I suppose the reason why both UDP and TCP are included is that depending on the types of middleboxes the peers are behind, they may get just one or the other working. I.e. first try out UDP, if it does not work, attempt TCP. Is that correct?
I started out by defining the types I thought might be needed - so this is not the "must implement" list. The "Must implement" list is still a matter for discussion - not having UDP in there is certainly a non-starter, so UDP has to be "must implement". For others, I'd like to see arguments.
* Section 4.5 states that TURN and relaying are needed. How about things
like HTTP/TLS tunneling? In many cases that is the only way to get the transport channel working. TURN may be helpful in a hopefully increasing number of cases, but that alone will still leave a lot of corporate users unserved.
At Google, we've implemented a form of HTTP/TLS tunnelling using a TURN variant (see libjingle source for details), so I didn't differentiate between those options - I see HTTP/TLS as a form of relaying; I don't believe simultaneous-open TCP is going to work reliably enough for our purposes.
* Section 3 mentions that things like pseudoTCP can be run over the
datagram transport. In case only UDP works end-to-end that seems useful. However, if we can get a "native" TCP connection up, it seems natural to use that as is rather than via some kind of generic datagram abstraction layer. I think we probably should define both datagram and bytestream services separately. Bytestream would be either TCP or pseudoTCP/UDP. If we only do datagram service, leave the whole pseudoTCP reference out.
I agree with your way of putting it. The reference is intended as informative; bytestream COULD be layered over the datagram service, but doesn't need to be.
The issue with a "native" TCP connection is that we need to perform ICE-y checks on it to enforce the same-origin policy. When using PseudoTCP, the ICE checks are done automatically as part of the datagram protocol. However, as Harald points out, if we are using the "native" TCP for a connection to a relay or conference server, we can simply run our datagram protocol over top of the native TCP connection. Basically what I am saying is that while we may use "native" TCP connections internally for handling scenarios where UDP is blocked, I'm not sure we can expose them from the API.
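Running a datagram protocol on top of a native TCP connection, as described here, comes down to re-adding message boundaries to the byte stream. A minimal sketch of the usual approach - a length prefix per datagram, similar in spirit to the 16-bit framing RFC 4571 uses for RTP over TCP; the function names are invented for illustration.

```python
# Illustrative sketch: carrying datagrams over a byte stream by prefixing
# each one with a 2-byte length (similar in spirit to RFC 4571 framing).
# Function names are invented; this is not code from any draft.

import io
import struct


def frame_datagram(payload: bytes) -> bytes:
    """Prepend a 2-byte big-endian length so boundaries survive the stream."""
    if len(payload) > 0xFFFF:
        raise ValueError("datagram too large for a 16-bit length prefix")
    return struct.pack("!H", len(payload)) + payload


def read_datagram(stream: io.BufferedIOBase) -> bytes:
    """Read exactly one length-prefixed datagram back out of the stream."""
    (length,) = struct.unpack("!H", stream.read(2))
    return stream.read(length)


# Round-trip through an in-memory stream: boundaries are recovered intact.
buf = io.BytesIO(frame_datagram(b"abc") + frame_datagram(b"def"))
print(read_datagram(buf))  # b'abc'
print(read_datagram(buf))  # b'def'
```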
Background: I had a discussion with a coworker who wanted to have a single API to both streaming and datagram service; I believe it makes more sense to define an API to a datagram service and an API to a streaming service separately - the big argument for PseudoTCP is the client-to-client connection case, where one often finds that UDP works and TCP doesn't.
I had been thinking the datagram and streaming services should use a single API - much of the API ends up being the same, and that's how BSD sockets work. There are a few differences for streaming services, but they are mostly (entirely?) additive (flow control being the main one).
The argument for defining TCP over datagram, rather than TCP over UDP, is that we might easily find ourselves needing functions like ICE establishment or TURN relaying for client-to-client TCP sessions; doing it this way saves us one duplicate reference set. (The concept of pseudoTCP running over datagrams tunneled over HTTP/TLS somewhat boggles the mind, though....)
* Section 6 defines a URI with a number of IP address:transport:port
candidates. I'm not too clear on how that URI would be used. It looks to me as something that is received as a result of gathering the candidates (with STUN, TURN etc.) as the first step of ICE. If Jingle or SDP offer/answer or some proprietary protocol were used to pass the candidate information to the other peer, that information would be encoded according to that particular protocol. So is this specific URI just meant for local representation at the API and not something that is passed over the network as such? What's the need or benefit of making it a URI?
This is an example of speculative standardization, or "trial balloon" - W3C folks have had a habit of using URIs for any form of structured parameter in APIs; if that is what the W3C effort concludes that they want, this is intended to show that it can be given to them. That said - the fact that this is intended to represent, exactly, the semantics of a=candidate lines from SDP needs to be made clear (as I said in another thread). Let's not have more different semantics than we need to.
Thanks, Markus
Thanks for the review!

On 01/06/11 08:05, Justin Uberti wrote:
Background: I had a discussion with a coworker who wanted to have a single API to both streaming and datagram service; I believe it makes more sense to define an API to a datagram service and an API to a streaming service separately - the big argument for PseudoTCP is the client-to-client connection case, where one often finds that UDP works and TCP doesn't.
I had been thinking the datagram and streaming services should use a single API - much of the API ends up being the same, and that's how BSD sockets work. There are a few differences for streaming services, but they are mostly (entirely?) additive (flow control being the main one).
The huge difference is that a streaming API gives you a sequence of bytes, in strict order, while a datagram API gives you a (not necessarily ordered) sequence of datagrams. Each datagram is guaranteed to be treated as a unit, but order is not preserved. There is no guarantee about block boundaries in a stream (you have to create them yourself if you want them), but order is strictly preserved.
For most of the rest of the functions, the APIs can (and should) be the same, but the sending and reception of data is different.
Harald

On Thu, Jan 6, 2011 at 7:38 AM, Harald Alvestrand <harald@alvestrand.no> wrote:
On 01/06/11 08:05, Justin Uberti wrote:
Background: I had a discussion with a coworker who wanted to have a single API to both streaming and datagram service; I believe it makes more sense to define an API to a datagram service and an API to a streaming service separately - the big argument for PseudoTCP is the client-to-client connection case, where one often finds that UDP works and TCP doesn't.
I had been thinking the datagram and streaming services should use a single API - much of the API ends up being the same, and that's how BSD sockets work. There are a few differences for streaming services, but they are mostly (entirely?) additive (flow control being the main one).
The huge difference is that a streaming API gives you a sequence of bytes, in strict order, while a datagram API gives you a (not necessarily ordered) sequence of datagrams. Each datagram is guaranteed to be treated as a unit, but order is not preserved. There is no guarantee about block boundaries in a stream (you have to create them yourself if you want them), but order is strictly preserved.
For most of the rest of the functions, the APIs can (and should) be the same, but the sending and reception of data is different.
Sure, but that doesn't seem like something the API has to worry about, other than to allow the mode to be specified (datagram or streaming) when the transport is created. For example, BSD sockets use the same APIs for stream and datagram operations; send and recv each take a pointer and a length, although the semantics are slightly different. (Note that sendto and recvfrom also exist in BSD sockets, but aren't relevant in this case, where the destination is fixed as a result of the ICE process).
Harald

On 01/06/11 22:32, Justin Uberti wrote:
On Thu, Jan 6, 2011 at 7:38 AM, Harald Alvestrand <harald@alvestrand.no> wrote:
On 01/06/11 08:05, Justin Uberti wrote:
Background: I had a discussion with a coworker who wanted to have a single API to both streaming and datagram service; I believe it makes more sense to define an API to a datagram service and an API to a streaming service separately - the big argument for PseudoTCP is the client-to-client connection case, where one often finds that UDP works and TCP doesn't.
I had been thinking the datagram and streaming services should use a single API - much of the API ends up being the same, and that's how BSD sockets work. There are a few differences for streaming services, but they are mostly (entirely?) additive (flow control being the main one).
The huge difference is that a streaming API gives you a sequence of bytes, in strict order, while a datagram API gives you a (not necessarily ordered) sequence of datagrams. Each datagram is guaranteed to be treated as a unit, but order is not preserved. There is no guarantee about block boundaries in a stream (you have to create them yourself if you want them), but order is strictly preserved.
For most of the rest of the functions, the APIs can (and should) be the same, but the sending and reception of data is different.
Sure, but that doesn't seem like something the API has to worry about, other than to allow the mode to be specified (datagram or streaming) when the transport is created. For example, BSD sockets use the same APIs for stream and datagram operations; send and recv each take a pointer and a length, although the semantics are slightly different. (Note that sendto and recvfrom also exist in BSD sockets, but aren't relevant in this case, where the destination is fixed as a result of the ICE process).
The semantics are a whole lot different; send(3 bytes) + send(3 bytes) on a streaming socket can get you a recv(6 bytes); send(3 bytes) + send(3 bytes) on a datagram socket gives you recv(3 bytes), twice. The fact that the BSD socket interface tries to hide this distinction is a weakness of BSD sockets. (My master's thesis work back in 1984 involved an email exchange with Jon Postel to verify the semantics of TCP wrt packet boundaries - it seems that it had been an argument within the ARPA community for quite some time before that, too.)
The programmer has to be completely clear on whether packet boundaries will be preserved over the API or not. This is clearest when it's visible in the API.
Harald
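The 3-byte example can be reproduced directly with the BSD socket API, here through Python's socket module (assumes a POSIX system for the AF_UNIX socketpair; the byte values are arbitrary):

```python
# Demonstration of the boundary semantics described above, using BSD sockets
# via Python's socket module. Assumes a POSIX system (AF_UNIX socketpair).

import socket

# Streaming socket: two 3-byte sends, one 6-byte receive.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
a.sendall(b"abc")
a.sendall(b"def")
print(b.recv(1024))   # b'abcdef' here -- order is kept, but a stream makes
                      # no promise about where the boundaries fall

# Datagram socket: each send arrives (if at all) as its own unit.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.connect(rx.getsockname())
tx.send(b"abc")
tx.send(b"def")
print(rx.recv(1024))  # b'abc' -- one datagram per recv call
print(rx.recv(1024))  # b'def'
```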

Hi,
A few comments and questions on rtcweb-protocols-00:
In general I believe we need a framework document that describes how the various protocol and API specs are used to build an interoperable "RTC-Web" implementation. In my opinion that document should not be like RFC 5411 (the Hitchhiker's Guide to SIP), which is merely a listing of different SIP-related standards. Instead, the RTC-Web framework document should define the minimum set that everyone needs to implement to get interop at a reasonable level. (What that reasonable level is needs to be agreed upon, of course. I assume it includes at least transport and NAT traversal for audio/video streams.)
I understand that the current document is not yet intended for that purpose, but is just trying to get the discussion started. But if we pick this draft as a baseline going forward, it would be useful to make the use of language more consistent. At the moment some things are defined with "MUST" statements while others are more vague. I think the "MUST" statements are good, and everything that is really required of an implementation needs to be expressed that way. (It is naturally good to have some descriptive text around the exact requirements, as long as the requirements are clear.)
Section 4 defines data framing and security. I believe the challenging part of data framing will be to ensure we get video calls interoperable. I don't have that much experience with the details, but I know that getting the RTP/AVPF stuff implemented and interoperable is not that trivial. Various groups such as IMTC and UCIF have worked on interoperability profiles for video calls. The easy approach might be just to mandate the basics (RTP/AVP), get that working well across browsers, and then extend based on that experience. If those other groups come up with something useful and public, those profiles could be borrowed.
Section 6 is about connection management. That will be the really hard part of this exercise. I do support the notion that at least initially we should focus on transport, framing and formats of media, and say that those can be set up in proprietary ways (presumably with the browser using HTTP or websocket as transport for the actual setup). For that the APIs would "only" need to support what is input/output to ICE/STUN/TURN and codec selection, while the rest happens in Javascript within the application. But going forward I think we do need to pick up either SIP/SDP or XMPP/Jingle as the baseline, in order to make things easy to use.
Markus

On 12/30/10 17:10, Markus.Isomaki@nokia.com wrote:
Hi,
A few comments and questions on rtcweb-protocols-00:
In general I believe we need a framework document that describes how the various protocol and API specs are used to build an interoperable "RTC-Web" implementation. In my opinion that document should not be like RFC 5411 (the Hitchhiker's Guide to SIP), which is merely a listing of different SIP-related standards. Instead, the RTC-Web framework document should define the minimum set that everyone needs to implement to get interop at a reasonable level. (What that reasonable level is needs to be agreed upon, of course. I assume it includes at least transport and NAT traversal for audio/video streams.)
I agree, and that is where I want this to go. In addition, we need use case documentation that allows us to verify that the use cases we envision are satisfiable within the framework.
I understand that the current document is not yet intended for that purpose, but is just trying to get the discussion started. But if we pick this draft as a baseline going forward, it would be useful to make the use of language more consistent. At the moment some things are defined with "MUST" statements while others are more vague. I think the "MUST" statements are good, and everything that is really required of an implementation needs to be expressed that way. (It is naturally good to have some descriptive text around the exact requirements, as long as the requirements are clear.)
Yes, the vagueness is in direct proportion to how far I'm away from being able to read some sort of consensus that stuff needs inclusion. As discussion goes forward, I expect it to be sharpened. Note that even the MUSTs are merely trial balloons at the moment - I wouldn't be unhappy with carrying most of them forward, but it is highly likely that some of them are wrong, and others may want to be left open rather than mandated. Discussion will show.
Section 4 defines data framing and security. I believe the challenging part of data framing will be to ensure we get video calls interoperable. I don't have that much experience with the details, but I know that getting the RTP/AVPF stuff implemented and interoperable is not that trivial. Various groups such as IMTC and UCIF have worked on interoperability profiles for video calls. The easy approach might be just to mandate the basics (RTP/AVP), get that working well across browsers, and then extend based on that experience. If those other groups come up with something useful and public, those profiles could be borrowed.
Agreed 100%. It might even help interoperability. Do you have citable references to the current state of that work? I'm worried about not including AVPF, since there is stuff in AVPF (including the various NACKs and reference frame requests) that is vital for getting efficient, reliable communication over lossy networks using some of the potential codecs. Matter for further discussion.
Section 6 is about connection management. That will be the really hard part of this exercise. I do support the notion that at least initially we should focus on transport, framing and formats of media, and say that those can be set up in proprietary ways (presumably with the browser using HTTP or websocket as transport for the actual setup). For that the APIs would "only" need to support what is input/output to ICE/STUN/TURN and codec selection, while the rest happens in Javascript within the application. But going forward I think we do need to pick up either SIP/SDP or XMPP/Jingle as the baseline, in order to make things easy to use.
If we can have a common definition of "connection" that we can offer up (through APIs) to SIP/SDP engines or XMPP/Jingle engines as "the object to be manipulated", I think we've already won a lot. I don't want to trap us at the forefront of more battlefronts than we have to have. While it would be nice to have only one engine to relate to, I am afraid of bringing this discussion into this group. Markus
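As a side note on the AVPF point above: what RTP/AVPF adds over plain RTP/AVP is, among other things, the rtcp-fb feedback attributes of RFC 4585 and RFC 5104 that carry NACK, picture-loss and full-intra-frame requests. The fragment below is only illustrative; the payload type number and the VP8 codec name are placeholders, not a proposal from this thread.

```python
# Illustrative SDP fragment only: shows the kind of a=rtcp-fb attributes
# (RFC 4585, RFC 5104) that the AVPF profile enables. Payload type 96 and
# the codec name are placeholders, not a proposal from this thread.

AVPF_VIDEO_SECTION = "\r\n".join([
    "m=video 49170 RTP/AVPF 96",  # AVPF profile instead of plain RTP/AVP
    "a=rtpmap:96 VP8/90000",
    "a=rtcp-fb:96 nack",          # generic NACK: request retransmission
    "a=rtcp-fb:96 nack pli",      # picture loss indication
    "a=rtcp-fb:96 ccm fir",       # codec control: full intra request
])

print(AVPF_VIDEO_SECTION)
```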

In general I believe we need a framework document that describes how the various protocol and API specs are used to build an interoperable "RTC-Web" implementation. In my opinion that document should not be like RFC 5411 (the Hitchhiker's Guide to SIP), which is merely a listing of different SIP-related standards. Instead, the RTC-Web framework document should define the minimum set that everyone needs to implement to get interop at a reasonable level. (What that reasonable level is needs to be agreed upon, of course. I assume it includes at least transport and NAT traversal for audio/video streams.)
I agree, and that is where I want this to go. In addition, we need use case documentation that allows us to verify that the use cases we envision are satisfiable within the framework.
We also need to decide where such a framework document should be developed. The IETF could be a natural place, but it is quite close in definition to the "Profile" recommendation proposed by the draft W3C charter.
//Stefan
-----Original Message----- From: rtc-web-bounces@alvestrand.no [mailto:rtc-web-bounces@alvestrand.no] On Behalf Of Harald Alvestrand Sent: den 2 januari 2011 07:45 To: Markus.Isomaki@nokia.com Cc: rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] Comments on draft-alvestrand-dispatch-rtcweb-protocols-00
On 12/30/10 17:10, Markus.Isomaki@nokia.com wrote:
Hi,
A few comments and questions on rtcweb-protocols-00:
In general I believe we need a framework document that describes how the various protocol and API specs are used to build an interoperable "RTC-Web" implementation. In my opinion that document should not be like RFC 5411 (the Hitchhiker's Guide to SIP), which is merely a listing of different SIP-related standards. Instead, the RTC-Web framework document should define the minimum set that everyone needs to implement to get interop at a reasonable level. (What that reasonable level is needs to be agreed upon, of course. I assume it includes at least transport and NAT traversal for audio/video streams.)
I agree, and that is where I want this to go. In addition, we need use case documentation that allows us to verify that the use cases we envision are satisfiable within the framework.
I understand that the current document is not yet intended for that purpose, but is just trying to get the discussion started. But if we pick this draft as a baseline going forward, it would be useful to make the use of language more consistent. At the moment some things are defined with "MUST" statements while others are more vague. I think the "MUST" statements are good, and everything that is really required of an implementation needs to be expressed that way. (It is naturally good to have some descriptive text around the exact requirements, as long as the requirements are clear.)
Yes, the vagueness is in direct proportion to how far I'm away from being able to read some sort of consensus that stuff needs inclusion. As discussion goes forward, I expect it to be sharpened. Note that even the MUSTs are merely trial balloons at the moment - I wouldn't be unhappy with carrying most of them forward, but it is highly likely that some of them are wrong, and others may want to be left open rather than mandated. Discussion will show.
Section 4 defines data framing and security. I believe the challenging part of data framing will be to ensure we get video calls interoperable. I don't have that much experience with the details, but I know that getting the RTP/AVPF stuff implemented and interoperable is not that trivial. Various groups such as IMTC and UCIF have worked on interoperability profiles for video calls. The easy approach might be just to mandate the basics (RTP/AVP), get that working well across browsers, and then extend based on that experience. If those other groups come up with something useful and public, those profiles could be borrowed.
Agreed 100%. It might even help interoperability. Do you have citable references to the current state of that work? I'm worried about not including AVPF, since there is stuff in AVPF (including the various NACKs and reference frame requests) that is vital for getting efficient, reliable communication over lossy networks using some of the potential codecs. Matter for further discussion.
Section 6 is about connection management. That will be the really hard part of this exercise. I do support the notion that at least initially we should focus on transport, framing and formats of media, and say that those can be set up in proprietary ways (presumably with the browser using HTTP or websocket as transport for the actual setup). For that the APIs would "only" need to support what is input/output to ICE/STUN/TURN and codec selection, while the rest happens in Javascript within the application. But going forward I think we do need to pick up either SIP/SDP or XMPP/Jingle as the baseline, in order to make things easy to use.
If we can have a common definition of "connection" that we can offer up (through APIs) to SIP/SDP engines or XMPP/Jingle engines as "the object to be manipulated", I think we've already won a lot. I don't want to trap us at the forefront of more battlefronts than we have to have. While it would be nice to have only one engine to relate to, I am afraid of bringing this discussion into this group. Markus

This should mark good progress for the discussion ending this year, and hopefully we can all agree:
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
The interests of users must be first and foremost in mind when defining a standard for the default video codec. That includes not passing the cost of licensing along to users in perpetuity. Given the choice of open-source and/or free video codecs such as Theora and VP8, the IETF has several good options to choose from, as well as defining a video codec from the ground up.
Thanks, Henry
On 12/30/10 12:09 AM, "Bernard Aboba" <bernard_aboba@hotmail.com> wrote:
For video codecs, "self-interest" may be influenced by a number of factors.
For example, for a mobile applications developer, "self-interest" may focus on aspects such as performance, battery life and maintenance costs. If a given codec is supported in the hardware or operating system of their target platform, then the developer may perceive it as being low "cost" to them.
For a chipset manufacturer, "self-interest" may be determined by the demand for chipsets incorporating a given codec, as well as the associated licensing fees. Typically the goal is to maximize revenue minus cost, not just to minimize "cost".
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
-----Original Message----- From: rtc-web-bounces@alvestrand.no [mailto:rtc-web-bounces@alvestrand.no] On Behalf Of David Singer Sent: Wednesday, December 29, 2010 8:08 PM To: Heinrich Sinnreich Cc: rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
Heinrich,
'best' is not always IPR-cost-free. Sometimes it is, sometimes it isn't. You seem unable to see any other possibility than your own, alas. I could wish for 'fates' for any number of technologies, but I don't: I choose them when they suit, and others when they don't. I suggest we do the same.
I have no objection to the development and deployment of new codecs, with varying terms, quality, complexity, and so on. This is a varied market that deserves varied tools. I do object to making decisions based on only one criterion, however.
On Dec 26, 2010, at 18:12 , Heinrich Sinnreich wrote:
I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
David Singer Multimedia and Software Standards, Apple Inc.

I'm with Bernard and David on this one. This is different from the audio case, as hardware acceleration is much more important for video, particularly for mobile.
Stephen Botzko
On Thu, Dec 30, 2010 at 12:02 PM, Henry Sinnreich <henry.sinnreich@gmail.com> wrote:
This should mark good progress for the discussion ending this year, and hopefully we can all agree:
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
The interests of users must be first and foremost in mind when defining a standard for the default video codec.
That includes not passing the cost of licensing along to users in perpetuity.
Given the choice of open-source and/or free video codecs such as Theora and VP8, the IETF has several good options to choose from, as well as defining a video codec from the ground up.
Thanks,
Henry
On 12/30/10 12:09 AM, "Bernard Aboba" <bernard_aboba@hotmail.com> wrote:
For video codecs, "self-interest" may be influenced by a number of factors.
For example, for a mobile applications developer, "self-interest" may focus on aspects such as performance, battery life and maintenance costs. If a given codec is supported in the hardware or operating system of their target platform, then the developer may perceive it as being low "cost" to them.
For a chipset manufacturer, "self-interest" may be determined by the demand for chipsets incorporating a given codec, as well as the associated licensing fees. Typically the goal is to maximize revenue minus cost, not just to minimize "cost".
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
-----Original Message----- From: rtc-web-bounces@alvestrand.no [mailto: rtc-web-bounces@alvestrand.no] On Behalf Of David Singer Sent: Wednesday, December 29, 2010 8:08 PM To: Heinrich Sinnreich Cc: rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
Heinrich,
'best' is not always IPR-cost-free. Sometimes it is, sometimes it isn't. You seem unable to see any other possibility than your own, alas. I could wish for 'fates' for any number of technologies, but I don't: I choose them when they suit, and others when they don't. I suggest we do the same.
I have no objection to the development and deployment of new codecs, with varying terms, quality, complexity, and so on. This is a varied market that deserves varied tools. I do object to making decisions based on only one criterion, however.
On Dec 26, 2010, at 18:12 , Heinrich Sinnreich wrote:
I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
David Singer Multimedia and Software Standards, Apple Inc.

I also agree that codecs such as H.264 AVC need to be considered, because of interworking with non-RTC-web users, conference bridges, etc. An important part of the proposed charter is: "* interoperate with compatible voice and video systems that are not web based"
John
-----Original Message----- From: dispatch-bounces@ietf.org [mailto:dispatch-bounces@ietf.org] On Behalf Of Stephen Botzko Sent: 30 December 2010 17:39 To: Henry Sinnreich Cc: Bernard Aboba; rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [dispatch] [RTW] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
I'm with Bernard and David on this one.
This is different from the audio case, as hardware acceleration is much more important for video, particularly for mobile.
Stephen Botzko
On Thu, Dec 30, 2010 at 12:02 PM, Henry Sinnreich <henry.sinnreich@gmail.com> wrote:
This should mark good progress for the discussion ending this year, and hopefully we can all agree:
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
The interests of users must be first and foremost in mind when defining a standard for the default video codec.
That includes not passing the cost of licensing along to users in perpetuity.
Given the choice of open-source and/or free video codecs such as Theora and VP8, the IETF has several good options to choose from, as well as defining a video codec from the ground up.
Thanks,
Henry
On 12/30/10 12:09 AM, "Bernard Aboba" <bernard_aboba@hotmail.com> wrote:
For video codecs, "self-interest" may be influenced by a number of factors.
For example, for a mobile applications developer, "self-interest" may focus on aspects such as performance, battery life and maintenance costs. If a given codec is supported in the hardware or operating system of their target platform, then the developer may perceive it as being low "cost" to them.
For a chipset manufacturer, "self-interest" may be determined by the demand for chipsets incorporating a given codec, as well as the associated licensing fees. Typically the goal is to maximize revenue minus cost, not just to minimize "cost".
These concepts of "self-interest" do not necessarily align with each other, let alone with the "self-interest" of users, who may primarily care about how many other users they can connect with.
-----Original Message----- From: rtc-web-bounces@alvestrand.no [mailto:rtc-web-bounces@alvestrand.no] On Behalf Of David Singer Sent: Wednesday, December 29, 2010 8:08 PM To: Heinrich Sinnreich Cc: rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
Heinrich,
'best' is not always IPR-cost-free. Sometimes it is, sometimes it isn't. You seem unable to see any other possibility than your own, alas. I could wish for 'fates' for any number of technologies, but I don't: I choose them when they suit, and others when they don't. I suggest we do the same.
I have no objection to the development and deployment of new codecs, with varying terms, quality, complexity, and so on. This is a varied market that deserves varied tools. I do object to making decisions based on only one criterion, however.
On Dec 26, 2010, at 18:12 , Heinrich Sinnreich wrote:
I think we should consider the balance between cost, risk, quality, and existing adoption, and it would be foolish to omit cost-bearing codecs from that analysis, as H.264 is widely used already.
I am not sure where this discussion is going, though it reminds us of the earlier discussions about SIP vs. H.323 in the IETF. "Everybody" was shipping H.323 in overwhelming quantity, but somehow the IETF did not buy it.
As a hopeless optimist, maybe H.264 will have the same fate, since at least its considerable IP baggage is so well known...
It is hard to imagine that the IETF, and indeed the market, will ignore the creativity of all the codec developers out there and the evolving technology that empowers them. Plain self-interest should motivate embracing new IP-free a/v codecs for the RTC Web. They will arrive anyway, one way or another.
[Well-deployed technology has a proven way of making it over the threshold into history :-)]
David Singer Multimedia and Software Standards, Apple Inc.

On 01/07/11 10:56, Elwell, John wrote:
I also agree that codecs such as H.264 AVC need to be considered, because of interworking with non-RTC-web users, conference bridges, etc. An important part of the proposed charter is: "* interoperate with compatible voice and video systems that are not web based"
This can turn out to be seriously problematic if we don't constrain it carefully - when we wrote this, my thinking was that it meant "if devices send and receive media in formats that we support, and the setup is performed in a reasonable way through intermediaries, we should be able to send media directly to them".
I see the use case that we *have* to support as the browser-to-browser use case. If we are able to support other use cases too, that is a good thing, but very much a lower priority to me. Opinions may differ.
Harald

-----Original Message----- From: Harald Alvestrand [mailto:harald@alvestrand.no] Sent: 07 January 2011 13:07 To: Elwell, John Cc: Stephen Botzko; Henry Sinnreich; Bernard Aboba; rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
On 01/07/11 10:56, Elwell, John wrote:
I also agree that codecs such as H.264 AVC need to be considered, because of interworking with non-RTC-web users, conference bridges, etc. An important part of the proposed charter is: "* interoperate with compatible voice and video systems that are not web based"
This can turn out to be seriously problematic if we don't constrain it carefully - when we wrote this, my thinking was that it meant "if devices send and receive media in formats that we support, and the setup is performed in a reasonable way through intermediaries, we should be able to send media directly to them".
I see the use case that we *have* to support as the browser-to-browser use case. If we are able to support other use cases too, that is a good thing, but very much a lower priority to me. Opinions may differ.
[JRE] I disagree. I believe the ability to work with non-RTC-web users is equally important. Take enterprises, for example - they don't want a flag day when every user changes to RTC-web at the same time - they need to migrate users at a convenient pace.
One of the benefits of reusing existing protocols such as RTP is that interworking with non-RTC-web users and other equipment (such as MCUs) should be feasible. But this also means using appropriate codecs, to avoid having to insert transcoders.
John
Harald

On 01/07/11 14:41, Elwell, John wrote:
-----Original Message----- From: Harald Alvestrand [mailto:harald@alvestrand.no] Sent: 07 January 2011 13:07 To: Elwell, John Cc: Stephen Botzko; Henry Sinnreich; Bernard Aboba; rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
On 01/07/11 10:56, Elwell, John wrote:
I also agree that codecs such as H.264 AVC need to be considered, because of interworking with non-RTC-web users, conference bridges, etc.. An important part of the proposed charter is: "* interoperate with compatible voice and video systems that are not web based" This can turn out to be seriously problematic if we don't constrain it carefully - when we wrote this, my thinking was that it meant "if devices send and receive media in formats that we support, and the setup is performed in a reasonable way through intermediaries, we should be able to send media directly to them".
I see the use case that we *have* to support as the browser-to-browser use case. If we are able to support other use cases too, that is a good thing, but very much a lower priority to me. Opinions may differ. [JRE] I disagree. I believe the ability to work with non-RTP-web users is equally important. Take enterprises for example - they don't want a flag day when every user changes to RTP-web at the same time - they need to migrate users at a convenient pace.
We have the same situation today when people migrate off PABXes onto either VOIP-phones or onto mobile phones; the answer in those cases seems to be gateways. Why wouldn't that work in this case?
One of the benefits of reusing existing protocols such as RTP is that interworking with non-RTC-web users and other equipment (such as MCUs) should be feasible. But this also means using appropriate codecs, to avoid having to insert transcoders.
Can you be more specific about what devices you're thinking of, and how many of them there are? In particular, which devices do you expect to see that don't support RTCWeb, but do support STUN?
Harald

-----Original Message----- From: Harald Alvestrand [mailto:harald@alvestrand.no] Sent: 07 January 2011 14:25 To: Elwell, John Cc: Stephen Botzko; Henry Sinnreich; Bernard Aboba; rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
On 01/07/11 14:41, Elwell, John wrote:
-----Original Message----- From: Harald Alvestrand [mailto:harald@alvestrand.no] Sent: 07 January 2011 13:07 To: Elwell, John Cc: Stephen Botzko; Henry Sinnreich; Bernard Aboba; rtc-web@alvestrand.no; dispatch@ietf.org Subject: Re: [RTW] [dispatch] Codec standardization (Re: Fwd: New Version Notification for draft-alvestrand-dispatch-rtcweb-protocols-00)
On 01/07/11 10:56, Elwell, John wrote:
I also agree that codecs such as H.264 AVC need to be considered, because of interworking with non-RTC-web users, conference bridges, etc.. An important part of the proposed charter is: "* interoperate with compatible voice and video systems that are not web based" This can turn out to be seriously problematic if we don't constrain it carefully - when we wrote this, my thinking was that it meant "if devices send and receive media in formats that we support, and the setup is performed in a reasonable way through intermediaries,
we should be
able to send media directly to them".
I see the use case that we *have* to support as the browser-to-browser use case. If we are able to support other use cases too, that is a good thing, but very much a lower priority to me. Opinions may differ.
[JRE] I disagree. I believe the ability to work with non-RTC-web users is equally important. Take enterprises, for example - they don't want a flag day when every user changes to RTC-web at the same time - they need to migrate users at a convenient pace.
We have the same situation today when people migrate off PABXes onto either VOIP-phones or onto mobile phones; the answer in those cases seems to be gateways. Why wouldn't that work in this case?
One of the benefits of reusing existing protocols such as RTP is that interworking with non-RTC-web users and other equipment (such as MCUs) should be feasible. But this also means using appropriate codecs, to avoid having to insert transcoders.
Can you be more specific about what devices you're thinking of, and how many of them there are? In particular, which devices do you expect to see that don't support RTCWeb, but do support STUN?
[JRE] It is true that a lot of existing devices do not support STUN, and rely on intermediaries (SBCs) to achieve NAT traversal. However, those intermediaries could be made to mediate between RTC-Web devices and other devices. Getting those intermediaries to handle STUN on the RTC-Web side is feasible - getting them to do transcoding, particularly for video, is an entirely different matter. So in other words, it depends on how much functionality we are prepared to put into the gateway.
John
Harald

Though gateways are a key component in the web architecture, and critical indeed for connectivity with other networks, I believe the scope as outlined here by Harald is the correct first step.
As for interoperability at the video codec level, several items have to be balanced:
1. The benefits for users (including cost and performance), first and foremost
2. Giving a chance to new and free codec technologies such as VP8 or Theora
3. Interoperability with existing H.264-based systems
Or does anyone think "H.264 for ever"?
Thanks, Henry
On 1/7/11 11:38 AM, "Elwell, John" <john.elwell@siemens-enterprise.com> wrote:

Theora is not a valid choice for a video chat/conferencing codec. It has far too much internal delay and doesn't come close to the real-time performance of vp8/libvpx or H.264.

-Aron

Aron Rosenberg
Sr. Director, Engineering
Logitech Inc. (SightSpeed Group)

On Fri, Jan 7, 2011 at 10:02 AM, Henry Sinnreich <henry.sinnreich@gmail.com> wrote:

On 7 January 2011 10:20, Aron Rosenberg <arosenberg@logitech.com> wrote:
Theora is not a valid choice for a video chat/conferencing codec. It has far too much internal delay and doesn't come close to the real-time performance of vp8/libvpx or H.264.
One frame is too much internal delay? Perhaps you're confusing theora/libtheora with its HTTP streaming performance in the Ogg container. If there really is an issue with libtheora's internal delay, could you please elaborate?

There are other issues, of course. For example, I'm not aware of a Theora encoder that supports rolling intra.

-r
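To put Aron's and Ralph's numbers side by side: codec-induced delay is usually counted in frame intervals, and rolling intra refresh is about bounding per-frame size rather than delay. The sketch below is illustrative only; the frame rates, lookahead counts, and macroblock-column numbers are assumptions chosen for the example, not measurements of libtheora, libvpx, or any H.264 encoder.

```python
# Illustrative sketch only: the numbers below are assumptions for the sake of
# example, not measurements of libtheora, libvpx, or any H.264 encoder.

def codec_delay_ms(frame_rate_fps: float, buffered_frames: int) -> float:
    """Approximate encode-side algorithmic delay: each frame of lookahead or
    internal buffering adds one frame interval of latency."""
    frame_interval_ms = 1000.0 / frame_rate_fps
    return buffered_frames * frame_interval_ms

def rolling_intra_schedule(total_columns: int, refresh_columns_per_frame: int):
    """Toy rolling-intra refresh: instead of sending a large keyframe, refresh a
    few macroblock columns per frame so every column is intra-coded once per cycle."""
    frame, schedule = 0, []
    for start in range(0, total_columns, refresh_columns_per_frame):
        cols = list(range(start, min(start + refresh_columns_per_frame, total_columns)))
        schedule.append((frame, cols))
        frame += 1
    return schedule

if __name__ == "__main__":
    # At 30 fps, one buffered frame is ~33 ms; three frames of lookahead are
    # already ~100 ms, before any network or jitter-buffer delay is added.
    for buffered in (1, 3):
        print(f"{buffered} frame(s) of buffering at 30 fps = "
              f"{codec_delay_ms(30.0, buffered):.1f} ms")

    # A 40-column frame refreshed 4 columns at a time completes a full intra
    # refresh every 10 frames without ever sending a full keyframe.
    for frame, cols in rolling_intra_schedule(40, 4)[:3]:
        print(f"frame {frame}: intra-refresh columns {cols[0]}..{cols[-1]}")
```

At 30 fps, the difference between one buffered frame and several is roughly the difference between 33 ms and 100 ms of algorithmic delay, which is why it matters whether the "internal delay" claim is about the codec itself or about container-level streaming.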

On Jan 7, 2011, at 10:02 , Henry Sinnreich wrote:
Though gateways are a key component in the web architecture and critical indeed for connectivity with other networks, I believe the scope as outlined here by Harald is the correct first step.
As for interoperability at the video codec level, several items have to be balanced
1. The benefits for users (includes cost and performance) first and foremost
2. Give a chance to new and free codec technologies such as VP8 or Theora
3. Interoperability with existing H.264 based systems
Or does anyone think "H.264 for ever"?
No, MPEG and ITU are always working on the next generation :-).

Your list is a start, but it also needs to consider the availability of acceleration and hardware support (maybe that's part of your first bullet), and (alas) IPR risk.

David Singer
Multimedia and Software Standards, Apple Inc.

On 1/7/11 12:23 PM, David Singer wrote:
Your list is a start, but it also needs to consider the availability of acceleration and hardware support (maybe that's part of your first bullet), and (alas) IPR risk.
You've brought up IPR risk a couple of times now, with the implication that a codec with known IPR entanglements is somehow safer than a codec without. This is a fallacy that we need to dispense with.

The argument with purportedly "IPR Free" technologies is that as-yet unidentified patents may apply, and be asserted at a later date. This is true.

However, the identification of one or more patents that *do* apply to a technology does nothing to mitigate the risk that additional as-yet unidentified patents may also apply to it. If you have a set of unknown size, finding some elements in the set does nothing to prove that all the elements have been found. All you know is that the set is not empty (which, in this case, is a drawback).

This shouldn't be news to anyone with even a passing familiarity with the situation. As I'm certain you are aware, the MPEG-LA licensing pool agreement for H.264 explicitly calls out these potentially unknown patents as a risk, and warns licensees that dealing with any resultant problems is the licensees' problem, not MPEG-LA's.

In other words: it is every bit as likely that an entity will assert a previously unidentified, valid patent against H.264 as it is that a company will assert such a patent against Theora or VP8. Implications to the contrary are propaganda.

/a

On Jan 7, 2011, at 11:33 , Adam Roach wrote:
On 1/7/11 12:23 PM, David Singer wrote:
Your list is a start, but it also needs to consider the availability of acceleration and hardware support (maybe that's part of your first bullet), and (alas) IPR risk.
You've brought up IPR risk a couple of times now, with the implication that a codec with known IPR entanglements is somehow safer than a codec without. This is a fallacy that we need to dispense with.
No, it's not quite that. It's that a codec that has been through formal standardization (and hence visibility and obligations), has had a formal industry-wide call for patents, has a license or licenses, and is widely deployed, has a lower (agreed, non-zero) IPR risk than a codec that has not. Those steps have done a lot to flush out patent claims into the open.

For example, participants in standards bodies that fail to disclose patents face risks if they later try to enforce those patents (recent court cases). So, for at least that set of companies (who form a reasonable proportion of the corpus of patent-holding companies in the field), we have lowered the risk by placing them under a disclosure obligation, and by having formal calls for IPR issued (both by the standards body and by those proposing pools).
This shouldn't be news to anyone with even a passing familiarity with the situation. As I'm certain you are aware, the MPEG-LA licensing pool agreement for H.264 explicitly calls out these potentially unknown patents as a risk, and warns licensees that dealing with any resultant problems is the licensees' problem, not MPEG-LA's.
Yes, this is true, because MPEG-LA is run by lawyers and lawyers always tell you that they aren't perfect (did we know that already?).
In other words: it is every bit as likely that an entity will assert a previously unidentified, valid patent against H.264 as it is that a company will assert such a patent against Theora or VP8. Implications to the contrary are propaganda.
No, I'm sorry, the risk levels are substantially different, and any implication to the contrary is...well, propaganda.

Take a hypothetical case: Imagine someone were to define a subset of MPEG-1 or MPEG-2 video for which they believe all the relevant patents have either expired or are RF-licensable. How much of a risk is there that some previously unknown patent would come up and snooker the situation? Why would someone sit on their patent through decades of potentially lucrative licensing, and only pop onto the radar when something free is defined? By leveraging the standards process, the pool process, and the wide temptation to collect money over decades, we could substantially lower the risk of a submarine. (I can't say it's zero, as submarine captains operate to a different definition of 'rational' from me, it seems!)

Contrast that with someone who believes that they might have a patent on Theora. While it's an open-source codec in development with limited deployment, where is the incentive (let alone requirement) to check? A statement that Theora needs a license would be unpopular (!), would take real time and effort to check (the analysis), and result in bad PR and no revenue, even if you win. On the other hand, if you wait, and then someone who is both (a) rich and (b) you don't like or wish to fight back against, deploys it, you now have the incentive to check and go after it. There are incentives to wait and see here, alas.

I believe that RF standards have their place, and I work hard to make them happen (e.g. I am Apple's AC Rep to the W3C and fully support their patent policy). But we have to be realistic about what we pursue. RF technologies will surely have a place in this project (many of the protocols seem to be in that category), and a complete RF 'profile' may well be achievable and meet some needs (for example, if you want to give away implementations and make your money some other way, you can't afford to pay per-copy on every give-away).

I just remain opposed to entering the project with the blinkers on that *only* RF technologies will be considered, because I think royalty-bearing ones may well have a place, and we should discuss the needs and trade-offs.

David Singer
Multimedia and Software Standards, Apple Inc.

On 1/7/11 1:53 PM, David Singer wrote:
On Jan 7, 2011, at 11:33 , Adam Roach wrote:
On 1/7/11 12:23 PM, David Singer wrote:
Your list is a start, but it also needs to consider the availability of acceleration and hardware support (maybe that's part of your first bullet), and (alas) IPR risk.

You've brought up IPR risk a couple of times now, with the implication that a codec with known IPR entanglements is somehow safer than a codec without. This is a fallacy that we need to dispense with.

No, it's not quite that. It's that a codec that has been through formal standardization (and hence visibility and obligations), has had a formal industry-wide call for patents, has a license or licenses, and is widely deployed, has a lower (agreed, non-zero) IPR risk than a codec that has not. Those steps have done a lot to flush out patent claims into the open.
I don't agree with the assessment of "lower IPR risk" (because of mitigating factors I discuss below), but the rest of your assertions are true of any technology, not just codecs. And yet we still favor unencumbered technologies. Why?

Well, in part because we need to balance these factors against the risk that holders of known patents may behave poorly; cf. http://www.intomobile.com/2010/11/10/motorola-microsoft-su/

This demonstrates how known encumbrances are predictable places where things can and do go wrong. If you're worried about tigers, it seems far more sensible to pitch your tent where people have been actively looking for (and not finding) tigers for over a decade than it is to pitch it in a tiger-laden jungle where some of the tigers happen to be on leashes. A blind hope that the leashes are strong and short enough is a bit naïve, especially when tiger maulings are regularly reported in the news (Microsoft v. Motorola, above; Lucent v. Gateway; Multimedia Patent Trust v. Microsoft; etc.)
Take a hypothetical case: Imagine someone were to define a subset of MPEG-1 or MPEG-2 video for which they believe all the relevant patents have either expired or are RF-licensable. How much of a risk is there that some previously unknown patent would come up and snooker the situation? Why would someone sit on their patent through decades of potentially lucrative licensing, and only pop onto the radar when something free is defined?
Like Unisys and the GIF format? Or the Network-1 PoE lawsuits? Or Net2Phone sitting on their VoIP patents until 2006, despite competing commercially profitable services as early as 2000? I don't know. Motivation for this kind of behavior eludes me. But regardless of whether we understand the motivation, the behavior doesn't go away.
Contrast that with someone who believes that they might have a patent on Theora. While it's an open-source codec in development with limited deployment, where is the incentive (let alone requirement) to check? A statement that Theora needs a license would be unpopular (!), would take real time and effort to check (the analysis), and result in bad PR and no revenue, even if you win. On the other hand, if you wait, and then someone who is both (a) rich and (b) you don't like or wish to fight back against, deploys it, you now have the incentive to check and go after it.
Google, which has embedded both Theora and VP8 in Chrome, has a $198B market cap and $23B of annual revenue. How large does a company have to be before you consider it to have deep enough pockets to be worth raiding? I mean, if we're talking about software companies, there are only three or so that can be legitimately considered "larger" by any financial metric.
I believe that RF standards have their place, and I work hard to make them happen (e.g. I am Apple's AC Rep to the W3C and fully support their patent policy).
I know, and I've read several of your posts on the topic of HTML5 and video codecs. You were a neutral voice of reason in those discussions. So perhaps you can understand that I'm a bit confused by your seemingly partisan intimation about IPR risk (quoted above), and your propagation of the MPEG-LA's unsubstantiated allegations of VP8 patent infringement (in your email on December 21st). I prefer the "neutral voice of reason" Dave Singer.
But we have to be realistic about what we pursue.
How is considering VP8 and/or Theora as a baseline unrealistic? I'm not saying either should be a foregone conclusion; I'd like to see a well-reasoned discussion around codec selection. But I don't think characterizing RF technologies as inherently having excessive IPR risk falls into the category of "well-reasoned," even if certain licensing entities have engaged in self-serving saber rattling on the topic.
I just remain opposed to entering the project with the blinkers on that *only* RF technologies will be considered, because I think royalty-bearing ones may well have a place, and we should discuss the needs and trade-offs.
I'm not proposing a change to standard operating procedure within the IETF (RFC 3979):

"In general, IETF working groups prefer technologies with no known IPR claims or, for technologies with claims against them, an offer of royalty-free licensing. But IETF working groups have the discretion to adopt technology with a commitment of fair and non-discriminatory terms, or even with no licensing commitment, if they feel that this technology is superior enough to alternatives with fewer IPR claims or free licensing to outweigh the potential cost of the licenses."

But your implication that we should penalize a technology *because* it is not known to be encumbered is exactly the opposite of this. If you'd like to reverse the IETF's position, I suggest you take the discussion to a broader forum, like ietf@ietf.org. I suspect support for your proposal will be limited.

To be crystal clear: I'm not proposing excluding royalty-bearing codecs from consideration. We need to evaluate each codec on its inherent benefits and drawbacks. What I'm objecting to is throwing unrelated ballast onto the scale when we're trying to make a reasonable comparison.

/a

On Jan 8, 2011, at 0:42 , Adam Roach wrote:
Contrast that with someone who believes that they might have a patent on Theora. While it's an open-source codec in development with limited deployment, where is the incentive (let alone requirement) to check? A statement that Theora needs a license would be unpopular (!), would take real time and effort to check (the analysis), and result in bad PR and no revenue, even if you win. On the other hand, if you wait, and then someone who is both (a) rich and (b) you don't like or wish to fight back against, deploys it, you now have the incentive to check and go after it.
Google, which has embedded both Theora and VP8 in Chrome, has a $198B market cap and $23B of annual revenue. How large does a company have to be before you consider it to have deep enough pockets to be worth raiding? I mean, if we're talking about software companies, there are only three or so that can be legitimately considered "larger" by any financial metric.
You address only one of my points, and answer a strawman. "IF the patent holder's only interest is money THEN why haven't they sued Google?". Perhaps they have a cross-license. Perhaps they are in negotiation with Google over this or other matters. Perhaps...
But we have to be realistic about what we pursue.
How is considering VP8 and/or Theora as a baseline unrealistic? I'm not saying either should be a foregone conclusion; I'd like to see a well-reasoned discussion around codec selection.
Then we agree; that's what I want also.
But I don't think characterizing RF technologies as inherently having excessive IPR risk falls into the category of "well-reasoned," even if certain licensing entities have engaged in self-serving saber rattling on the topic.
The IPR risk varies by codec and by how it was developed, and is only poorly correlated with any stated RF status.
But your implication that we should penalize a technology *because* it is not known to be encumbered is exactly the opposite of this.
That's not what I am saying. I am saying we should be careful of codecs that were developed in such a way that their encumbrance status is not as clear as we might like.

David Singer
Multimedia and Software Standards, Apple Inc.

Whether such work is supposedly IPR-free or not, there is no guarantee against future IPR claims. I'm not even convinced that a claim by the author that something is IPR-free gives any better guarantee. The only hope you have is that a particular solution has been worked on in an organisation which requires declaration of IPR from participants, and that all the players in the field with enough money to initiate an IPR claim were involved in that work.

So if I wanted the safest solution against future IPR claims, I'd go for the one that was worked on by the largest number of significant companies in an SDO with an appropriate IPR policy. Note that this is safest, not safe.

Keith
participants (14)
- Adam Roach
- Aron Rosenberg
- Bernard Aboba
- David Singer
- DRAGE, Keith (Keith)
- Elwell, John
- Harald Alvestrand
- Heinrich Sinnreich
- Henry Sinnreich
- Justin Uberti
- Markus.Isomaki@nokia.com
- Ralph Giles
- Stefan Håkansson LK
- Stephen Botzko