RE: Media type versioning (was: Re: Scripting Media Types)

From: Bruce Lilly
Sent: Thursday, February 10, 2005 11:30 PM
On Thu February 10 2005 15:31, Larry Masinter wrote:
My opinion is that content negotiation for versioning capability using MIME type parameters is unworkable.
Negotiation isn't always possible; sometimes it's a one-way street.
The use of the MIME type is to describe the payload sufficiently so that you know HOW to process it, but the content-type label cannot (and thus should not) be extended in an attempt to use it to determine WHETHER a given processor knows enough to be able to process it.
We're in agreement that the media type/subtype tag should not be used for versioning (that would result in an unnecessary proliferation of tags).
Is this really so bad? There are all of 93 application/foo's registered.
So whether version information should or should not be in a media type parameter depends pretty much on whether there is an embedded, easy-to-find version indicator in the data itself; if there isn't, and processors need to know the version to choose between different processing methods, then the version parameter should be mandatory. There is no strong use case for an optional version parameter, or in general for duplicating, in MIME parameters, information that is readily obtained from the content itself.
Maybe; if transfer encoding is applied, that would entail decoding and opening the media content to determine version information. If the media type is specified via message/external-body or a similar mechanism, additional steps are necessary to obtain the media in order to determine version information. If a particular instance of content is large, that may result in a substantial waste of resources if it turns out to have an incompatible version.
I don't like the idea of embedded data being required to determine if the entity can accept the message. By that point in time, it is too late. If you need a data element to determine, at the protocol level, whether you can process a message, then you need that data element at the protocol level. E.g., a MIME parameter rather than an XML tag in a body part.
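To illustrate the point about having the data element at the protocol level, here is a minimal sketch of a receiver that decides from the headers alone, without decoding the body. It is only an illustration under stated assumptions: the media type "application/example", its "version" parameter, and the SUPPORTED_VERSIONS set are all hypothetical and do not come from any registration or from this thread.

```python
from email.parser import Parser
from email.policy import default

# Hypothetical message: "application/example" and its "version" parameter
# are made up for illustration, not taken from any registration.
RAW = (
    "Content-Type: application/example; version=2\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQ==\n"
)

SUPPORTED_VERSIONS = {"1", "2"}  # assumed capabilities of this receiver


def can_process(raw_message):
    """Decide from the headers alone; the base64 body is never decoded."""
    msg = Parser(policy=default).parsestr(raw_message, headersonly=True)
    version = msg.get_param("version")  # None if the parameter is absent
    return (
        msg.get_content_type() == "application/example"
        and version in SUPPORTED_VERSIONS
    )


print(can_process(RAW))
```

A message labelled with a version outside the assumed set, or with no version parameter at all, is rejected before any transfer decoding takes place.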
If you want to do content negotiation, then consider using media features and media feature negotiation; with media features you can negotiate not only version information, but other parameters that might also be necessary to know in order to determine interpretability, e.g., availability of compression modes, codecs, fonts, color capabilities, buffer size limitations, etc.
That's fine for protocols that support negotiation, but not all protocols do so (HTTP does, RFC 3297 layered on top of messaging can, but FTP does not).
Well, if you need negotiation, use a protocol that offers negotiation :)
Summary:
o media content might contain version information, but reliance on that can result in wasted resources
And a layer violation and asking for hacked stacks.
o negotiation via content-features may be a viable mechanism where negotiation is possible, but it is not always possible (and some means of indicating version must be available to the content negotiation mechanism)
Agreed.
o proliferation of type/subtypes tags for the sole purpose of versioning is undesirable
I'm not sold on this one. If a new version of a media type is not backwards-compatible, I would offer that it is a different media type. As noted above, in 8 years of MIME types, we have all of 93 application/foo registered types.
that leaves parameters associated with the media type as an efficient mechanism for indicating version information.
After saying that this is not the only method, I also would offer that one is ALWAYS free, in the definition of the MIME type, to list a parameter and describe normative behavior for processing that parameter. That could be something like "version". That said, I am really uneasy about saying that one MUST use a version tag. Some of the thread above suggested that one could generically use the combination of MIME type and "version" as an indicator of MIME type. For example, if the "version" is "bigger" than what I know, I know to generically barf on the message. Besides the issues that Ned pointed out (V1, V2, Vn vs. 1.2.3, 1.2.4, m.n.o vs. ...), some MIME types, like the PostScript example, still work on "older" processors, while others are completely different between versions. The result is that the semantics and normative behavior for receivers of the "version" parameter are MIME-type specific, and thus NOT subject to generic behavior specification.
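The difficulty with a generic "bigger than what I know" rule can be sketched as follows. This is only an illustration: both comparison functions are hypothetical, and the dotted-decimal convention is an assumption about one possible media type, not something any registration prescribes.

```python
def naive_is_newer(received, known):
    # Type-independent rule: plain string comparison of the parameter value.
    return received > known


def dotted_is_newer(received, known):
    # Type-specific rule, assuming (hypothetically) dotted decimal versions.
    def as_tuple(value):
        return tuple(int(part) for part in value.split("."))

    return as_tuple(received) > as_tuple(known)


# The two rules disagree about whether "1.10" is newer than "1.9".
print(naive_is_newer("1.10", "1.9"), dotted_is_newer("1.10", "1.9"))

# A "V2"-style value cannot be handled by the dotted rule at all.
try:
    dotted_is_newer("V2", "V1")
except ValueError as exc:
    print("dotted rule cannot parse:", exc)
```

Neither rule is right for every media type, which is why the comparison has to be defined per type rather than generically.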

My opinion is that content negotiation for versioning capability using MIME type parameters is unworkable. Negotiation isn't always possible; sometimes it's a one-way street.
I was using "negotiation" in a more general sense. If there are no choices to be made, then there's no point in sending any additional information. And my point isn't that doing content negotiation or, in general, making choices about content is impossible -- rather that the specific proposal of using MIME type _parameters_ is a bad design choice for conveying information as slippery as _version_.
The use of the MIME type is to describe the payload sufficiently so that you know HOW to process it, but the content-type label cannot (and thus should not) be extended in an attempt to use it to determine WHETHER a given processor knows enough to be able to process it. We're in agreement that the media type/subtype tag should not be used for versioning (that would result in an unnecessary proliferation of tags). Is this really so bad? There are all of 93 application/foo's registered.
The original point was about whether it is possible to decide whether a receiver is capable of interpreting something, in general, just based on the MIME type. But registering and using a new MIME type for a new version is often just the right thing: Unless the old MIME type was defined carefully with sufficient attention to extensibility so that all old processors will behave gracefully when presented with new content, registering a new type may be the only choice.
So whether version information should or should not be in a media type parameter depends pretty much on whether there is an embedded, easy-to-find version indicator in the data itself; if there isn't, and processors need to know the version to choose between different processing methods, then the version parameter should be mandatory. There is no strong use case for an optional version parameter, or in general for duplicating, in MIME parameters, information that is readily obtained from the content itself.
Maybe; if transfer encoding is applied, that would entail decoding and opening the media content to determine version information. If the media type is specified via message/external-body or a similar mechanism, additional steps are necessary to obtain the media in order to determine version information. If a particular instance of content is large, that may result in a substantial waste of resources if it turns out to have an incompatible version.
I don't think those are problems; aren't these forwarding bodies (that are looking at MIME types and parameters) also unwrapping transfer encodings and fetching external bodies? You're optimizing what I assume is mainly a failure case, anyway: you got sent something you can't read. In some cases, you're just deciding _which_ processor to send it to, so fetching the whole body isn't an extra overhead anyway.
I don't like the idea of embedded data being required to determine if the entity can accept the message. By that point in time, it is too late. If you need a data element to determine, at the protocol level, whether you can process a message, then you need that data element at the protocol level. E.g., a MIME parameter rather than an XML tag in a body part.
Using MIME parameters isn't the only place to put auxiliary information. If you need auxiliary information, put it somewhere else -- such as in a content-features header. Don't overload everything into the same header.
If you want to do content negotiation, then consider using media features and media feature negotiation; with media features you can negotiate not only version information, but other parameters that might also be necessary to know in order to determine interpretability, e.g., availability of compression modes, codecs, fonts, color capabilities, buffer size limitations, etc.
That's fine for protocols that support negotiation, but not all protocols do so (HTTP does, RFC 3297 layered on top of messaging can, but FTP does not).
FTP in most cases is done without MIME anyway -- the receiver guesses the MIME type based on the file extension and sniffing of information about the host operating system. I meant "negotiation" in a more general way: describing content in a way that subsequent processors can make decisions about whether or how to process without opening the content itself. In this case, I recommend putting this auxiliary data somewhere where it won't just confuse and pollute the processing that is necessary for the content-type itself. "Content-features" is useful in cases where there is no back-channel. You can ignore it if you want, but if you're in a situation where further processing of some content-types is expensive (expensive local processor, or forwarding agent to further processor) and you have some information about the next step's capabilities, you could try to match the next step's capabilities against the content-features.
Summary:
o media content might contain version information, but reliance on that can result in wasted resources
And a layer violation and asking for hacked stacks.
Hardly. Pulling out redundant information and sticking it on the wrapper to facilitate processing is an optimization, but it is also a "layer violation". Sometimes we tolerate layer violations if they're important for optimizing performance.
o negotiation via content-features may be a viable mechanism where negotiation is possible, but it is not always possible (and some means of indicating version must be available to the content negotiation mechanism)
Wrong. "Content-Features" can be used to carry version information as well as the parameters of "Content-Type". There is no situational difference.
o proliferation of type/subtypes tags for the sole purpose of versioning is undesirable
I'm not sold on this one. If a new version of a media type is not backwards-compatible, I would offer that it is a different media type. As noted above, in 8 years of MIME types, we have all of 93 application/foo registered types.
New subtypes are often necessary for new versions, although not always "desirable".
that leaves parameters associated with the media type as an efficient mechanism for indicating version information.
I disagree -- it's extraneous information, unnecessary and often lost in processing pipelines. We haven't really talked much about the terrible deployment experience and prospects for *any* MIME parameters. They usually get lost; most of the operating systems that support MIME type mappings to files don't support parameters (Windows file associations, MacOS, mime.types, etc.).
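As one illustration of an extension-based mapping that has no place for parameters, Python's standard mimetypes module (which reads mime.types-style tables) returns only a bare type/subtype and a content encoding. This is a minimal sketch; the file name is hypothetical, and Python is used here only as an example of such a mapping, not as one of the systems discussed in the thread.

```python
import mimetypes

# The file name is hypothetical; only its extension matters to the lookup.
media_type, encoding = mimetypes.guess_type("report.ps")

# The lookup yields a bare type/subtype string (or None) plus a content
# encoding; the return value has no slot for parameters such as "version".
has_parameters = media_type is not None and ";" in media_type
print(media_type, has_parameters)
```

Any version parameter attached to the content when it was received is therefore not recoverable from a mapping of this kind.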
After saying that this is not the only method, I also would offer that one is ALWAYS free, in the definition of the MIME type, to list a parameter and describe normative behavior for processing that parameter.
We could write standards about castles in the air all day, and I suppose it would be "free" in some sense of the word, but it is wrong, and detracts from the value of the standard. Don't specify things that either don't work or are undeployable. Otherwise, what's the point?
Larry
--
http://larry.masinter.net

On Thu March 10 2005 03:26, Larry Masinter wrote:
the specific proposal of using MIME type _parameters_ is a bad design choice for conveying information as slippery as _version_.
Please define "slippery" in this context, then please explain precisely how your pet preference (Content-Features) is a more suitable choice in that respect, including the specific registered feature tag that you propose using.
Maybe; if transfer encoding is applied, that would entail decoding and opening the media content to determine version information. If the media type is specified via message/external-body or a similar mechanism, additional steps are necessary to obtain the media in order to determine version information. If a particular instance of content is large, that may result in a substantial waste of resources if it turns out to have an incompatible version.
I don't think those are problems; aren't these forwarding bodies (that are looking at MIME types and parameters) also unwrapping transfer encodings and fetching external bodies?
No (i.e. not necessarily).
You're optimizing what I assume is mainly a failure case, anyway: you got sent something you can't read.
Bad assumption. One might have been sent one or more pointers to external data. One would like to be able to determine ahead of time whether any of them can be read *before* retrieving huge amounts of data, and that may depend on version information.
Using MIME parameters isn't the only place to put auxiliary information. If you need auxiliary information, put it somewhere else -- such as in a content-features header.
Version information is not a feature. Your suggestion requires registration of a feature tag (RFC 2506), waiting for that tag to be recognized (i.e. replacement of deployed implementations of feature tag parsers), generating an additional field (viz. Content-Features) in addition to Content-Type, hoping that that additional field isn't lost or mangled in transit, and extreme optimism regarding whether a receiving implementation will pay any attention at all to a Content-Features field, much less be able to interpret that tag *and* its associated value. That rigamarole is certainly not as light-weight as simply using a version parameter in the Content-Type field. N.B. there is nothing resembling "version" registered as a media feature tag http://www.iana.org/assignments/media-feature-tags
I meant "negotiation" in a more general way: describing content in a way that subsequent processors can make decisions about whether or how to process without opening the content itself. In this case, I recommend putting this auxiliary data somewhere where it won't just confuse and pollute the processing that is necessary for the content-type itself.
"Content-features" is useful in cases where there is no back-channel.
But not for version information. Content-Type is similarly useful (including for version information) in the absence of a negotiation mechanism.
Hardly. Pulling out redundant information and sticking it on the wrapper to facilitate processing is an optimization,
One might only have the wrapper (e.g. message/external-body), in which case there is no redundancy, and provision for appropriately dealing with the content is consistent with the Internet Architecture (RFC 1958, specifically section 3.1); assuming that all recipients will have unlimited (electrical) power, infinite communications bandwidth, gibibytes of local storage, and every conceivable version of every type of decoder is not consistent with that architecture. It is not an optimization; it is information which is essential to the operator of a low-power, limited bandwidth, small-memory device to be able to determine whether or not it is advisable to download some content of a specific media type.
o negotiation via content-features may be a viable mechanism where negotiation is possible, but it is not always possible (and some means of indicating version must be available to the content negotiation mechanism)
Wrong. "Content-Features" can be used to carry version information as well as the parameters of "Content-Type". There is no situational difference.
See above re. no registered feature tag for version.
that leaves parameters associated with the media type as an efficient mechanism for indicating version information.
I disagree -- it's extraneous information, unnecessary and often lost in processing pipelines.
It's not extraneous or unnecessary (often essential). And of course Content-Features fields are *always* preserved and considered, right?
We haven't really talked much about the terrible deployment experience and prospects for *any* MIME parameters. They usually get lost; most of the operating systems that support MIME type mappings to files don't support parameters (Windows file associations, MacOS, mime.types, etc.).
And that differs from Content-Features ... how? Why should we base design decisions on abysmally bad implementations of things (e.g. OSes) unrelated to what is being specified (media types)?
We could write standards about castles in the air all day, and I suppose it would be "free" in some sense of the word, but it is wrong, and detracts from the value of the standard. Don't specify things that either don't work or are undeployable. Otherwise, what's the point?
Ooh, that sounds like the sort of remarks that could come back to haunt a fellow :-).
Participants (3): Bruce Lilly, Eric Burger, Larry Masinter