
Guys, the draft IETF charter below might be of interest to potential participants of the workshop. FYI...

* * * * * * * * * * * * * * * * * * * *

Proposed names for the working group:

MALT - Multi-stream Attributes for Lifelike Telepresence
COCKTAIL - Communication and Correlation of Key Telepresence Attributes for Interoperable Links
MAITAI - Multi-stream Attributes for Improving Telepresence Application Interoperability
TEQUILA - Telepresence Encoding of QUalifiers for Interoperable Lifelike Applications
MOJITO - Multi-stream Orientation for Joining of Interoperable Telepresence Operations

In the context of this WG, the term telepresence is used in a general manner to describe systems that provide high-definition, high-quality audio/video enabling a "being-there" experience. One example is an immersive telepresence system using specially designed, special-purpose rooms with multiple displays permitting life-size image reproduction, using multiple cameras, encoders, decoders, microphones, and loudspeakers.

Current telepresence systems are based on open standards such as RTP, SIP, H.264, and the H.323 suite; however, they cannot easily interoperate with each other without operator assistance and expensive additional equipment that translates from one vendor's system to another's.

A major factor in the inability of telepresence systems to interwork is that there is no standardized way to describe and negotiate the use of the multiple streams of audio and video that comprise the media flows. In addition, there is no standardized way to exchange semantic information about what each media stream represents.

The WG will create specifications for SIP-based conferencing systems to enable communication of enough information about each media stream that each receiving system or bridge can make reasonable decisions about selecting and rendering media streams. This enables systems to make display choices that optimize the "just like being there" experience.

This working group is chartered to specify the following information about media streams, communicated from one entity to another:

* Spatial relationships of cameras, displays, microphones, and speakers - in relation to each other and to likely positions of participants

* Specific characteristics, such as viewpoint and field of view/capture, for each camera, microphone, display, and speaker - so that senders and middleboxes can understand how best to compose streams for receivers, and each receiver knows the characteristics of its received streams

* Usage of the stream, for example whether the stream is a presentation or document camera output

* Aspect ratio of cameras and displays

* Which sources a receiver wants to receive. For example, it might want the source from the left camera, or the source chosen by VAD (Voice Activity Detection).

Sources and sinks will exchange information about media stream capabilities. The working group will define the semantics, syntax, and transport mechanism necessary for communicating this information. It will consider whether existing signaling mechanisms (e.g., SDP) can be extended or whether another messaging method should be used.

The scope of the work includes describing relatively static relationships between entities (participants and devices). It also includes handling more dynamic relationships, such as identifying the audio and video streams for the current speaker.
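The charter deliberately leaves the concrete syntax open, so the following is a minimal, speculative Python sketch of the kind of per-stream metadata and receiver-side selection it describes. Everything in it - the StreamDescription fields, the usage labels, the select_streams policy, and the "vad-selected" sentinel - is an assumption made for illustration, not anything the WG has defined:

# Hypothetical illustration only: none of these names or fields are
# defined by the draft charter; they merely model the attributes it
# enumerates (spatial placement, field of view, usage, aspect ratio,
# and receiver selection of sources).
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class StreamDescription:
    stream_id: str                         # identifier for the media stream
    media_type: str                        # "audio" or "video"
    usage: str                             # e.g. "main", "presentation", "document-camera"
    position: Tuple[float, float, float]   # assumed (x, y, z) placement of the capture device
    field_of_view: Optional[float] = None  # horizontal FoV in degrees (video only)
    aspect_ratio: Optional[str] = None     # e.g. "16:9" (video only)

def select_streams(offered: List[StreamDescription],
                   want_usage: str = "main",
                   prefer_vad: bool = False) -> List[str]:
    """Toy receiver policy: pick streams by declared usage, or ask the
    sender for whatever source its voice-activity detection selects."""
    if prefer_vad:
        # A sentinel the sender would interpret as "send the VAD-chosen source".
        return ["vad-selected"]
    return [s.stream_id for s in offered if s.usage == want_usage]

offer = [
    StreamDescription("v-left", "video", "main", (-1.0, 0.0, 0.0), 70.0, "16:9"),
    StreamDescription("v-center", "video", "main", (0.0, 0.0, 0.0), 70.0, "16:9"),
    StreamDescription("v-slides", "video", "presentation", (0.0, 0.0, 0.0)),
]
print(select_streams(offer))                   # ['v-left', 'v-center']
print(select_streams(offer, prefer_vad=True))  # ['vad-selected']

In a real system this information would travel in the session signaling, whether as an extension of an existing mechanism such as SDP or as a new message set; choosing between those options is precisely the question the charter assigns to the working group.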
The scope includes both systems that provide a fully immersive experience and systems that interwork with them and therefore need to understand the same multiple-stream semantics.

The focus of this work is on multiple audio and video streams. Other media types may be considered; however, development of methodologies for them is not within the scope of this work.

Interoperation with SIP and related standards for audio and video is required. However, backwards compatibility with existing non-standards-compliant telepresence systems is not required.

This working group is not currently chartered to work on issues of continuous conference control, including far-end camera control, indication of fast frame update for video codecs or other rapid switches, floor control, and conference roster.

Reuse of existing protocols and backwards compatibility with SIP-compliant audio/video endpoints are important factors for the working group to consider. The work will be closely coordinated with the appropriate areas and working groups, including the OPS Area, AVT, MMUSIC, MEDIACTRL, XCON, and SIPCORE.

Milestones

Nov 2010 - Submit informational draft on use cases and requirements to the IESG.

Nov 2011 - Submit standards-track specification to the IESG specifying the spatial relationships of screens, cameras (including variable field of view and orientation), speakers, and microphones, and the "usage" of a stream as defined in the charter. Semantics, language, and transport mechanism will be specified.

David Singer
Multimedia and Software Standards, Apple Inc.