Request for advice: sbml+xml Media Type

Hi, I work on SBML (Systems Biology Markup Language) at Caltech. We are thinking of proposing a new XML MIME media type. When learning about the process for getting a new IETF-tree media type approved, I was strongly advised to consult the members of the ietf-types and ietf-xml-mime discussion lists for advice before diving in. So here I am. I'll appreciate any advice you can give me, especially any that saves us from making some stupid mistakes that the biological modeling community will regret for years to come. Please forgive me if I reveal my ignorance in some questions below.

First, a little background. SBML is an XML format for representing systems of biochemical reactions. Making it a MIME media type would enable browser-based simulation tools to conveniently download, run, and edit models. Work is now beginning on a web infrastructure to make it easy for biological researchers to share models on the web, download models used in published papers, etc.

Two "levels" of SBML have been defined so far. Specifications, including the XML schemas, are at:

http://www.sbw-sbml.org/sbml/docs/papers/sbml-level-1-version-2/sbml-level-1...
http://www.sbw-sbml.org/sbml/docs/papers/sbml-level-2-version-1/sbml-level-2...

We are thinking that the ideal name would be either:

application/sbml+xml

or:

model/sbml+xml

Now, here are a few questions.

1. Would it be a bad idea if we used RFC3236 (The application/xhtml+xml Media Type) as a model for the document we write? I'm hoping that we don't need to explain the full semantics of SBML in the RFC, since there are already some weighty papers that do that (referenced above). At only 8 pages, RFC3236 seems like a model of simplicity and clarity that we would like to emulate. Or is it possible to get even simpler? Some of the docs I found for XML MIME media types seemed to do little more than list the name of the type and who submitted it.

2. We are thinking of including required parameters of "level" and "version". Anything to watch out for here? Is this a wrong idea? SBML has multiple levels to enable different simulation tools to interoperate at different levels of complexity and sophistication. Each level can come in different versions. More levels are planned.

3. Is it completely stupid to even consider model/sbml+xml? The other model/ media types have been for spatial models. SBML is primarily used for spatial models of reactions that occur within biological cells, and has some notions of spatial relation, but an SBML model does not necessarily have the minimum 3 orthogonal dimensions specified in RFC2077. We're wondering if SBML is still within the spirit of the model/ top-level content type, though. RFC2077 speaks of economic models, behavioral models, and so on, and seems to encourage a situation where modeling tools might work successfully on models from radically different domains.

4. Any other advice you'd care to offer?

Thanks in advance for your assistance,

Ben

--
Ben Kovitz
Systems Biology Workbench (SBW) Development Group, Caltech
http://www.sbw-sbml.org
bkovitz at caltech.edu

Hi Ben,

On Fri, Jul 04, 2003 at 08:22:00PM -0700, Ben Kovitz wrote:
1. Would it be a bad idea if we used RFC3236 (The application/xhtml+xml Media Type) as a model for the document we write? I'm hoping that we don't need to explain the full semantics of SBML in the RFC, since there are already some weighty papers that do that (referenced above). At only 8 pages, RFC3236 seems like a model of simplicity and clarity that we would like to emulate.
I'm not sure I'd agree, but thanks!
Or is it possible to get even simpler? Some of the docs I found for XML MIME media types seemed to do little more than list the name of the type and who submitted it.
There's a continuum, of course. For RFC 3236, Peter and I went out of our way to include all the pertinent information that we felt was required by the media type registration process. The problem with this, of course, is that some information in the specification needs to be duplicated in the registration.

Luckily, this issue was recognized by the IETF, and more recent W3C-initiated media type registrations are using a "shell" registration document which permits the IANA media type registration form to be included within the W3C-maintained specification. See, for example, the SOAP registration:

http://www.ietf.org/internet-drafts/draft-baker-soap-media-reg-03.txt

Unfortunately for you, I don't think this would apply, as I believe this arrangement is fairly unique between the W3C and IETF, or at least between the IETF and other similarly trusted organizations.
2. We are thinking of including required parameters of "level" and "version". Anything to watch out for here? Is this a wrong idea? SBML has multiple levels to enable different simulation tools to interoperate at different levels of complexity and sophistication. Each level can come in different versions. More levels are planned.
In my observation, these rarely work out as extensibility mechanisms. text/html used to have one ("level"), and it wasn't used, so wasn't included in RFC 2854. application/xhtml+xml has one ("profile"), but it's there for a single purpose only: to help WAP apps distinguish between XHTML Basic and XHTML 1.0. I'm not aware of anybody using it for other reasons.

You might want to consider whether prescribing sufficiently extensible processing behaviour, such that "higher level" (more complex) content may be properly processed by a "lower level" processor, would be adequate for your needs. Feel free to contact me off-line if you'd like some advice on how that might be done.

MB
--
Mark Baker. Ottawa, Ontario, CANADA. http://www.markbaker.ca
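
To make the parameter question concrete, here is a minimal sketch of how a receiver might read "level" and "version" from a Content-Type header, using Python's standard email.message module. The header value and the helper name parse_sbml_content_type are assumptions for illustration only; no such parameters have been registered for application/sbml+xml.

```python
from email.message import Message


def parse_sbml_content_type(header_value):
    """Return (media_type, level, version) from a Content-Type header value.

    Hypothetical helper: "level" and "version" are the parameters proposed
    in this thread, not registered ones. Missing parameters come back as None.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    media_type = msg.get_content_type()   # lower-cased type/subtype
    level = msg.get_param("level")        # None when the parameter is absent
    version = msg.get_param("version")
    return media_type, level, version


if __name__ == "__main__":
    # Assumed example header; the values are illustrative only.
    print(parse_sbml_content_type("application/sbml+xml; level=2; version=1"))
```

A receiver written this way still has to decide what to do when the parameters are absent, which is one reason required parameters tend to go unused in practice.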

On Fri, 4 Jul 2003, Ben Kovitz wrote:
Some of the docs I found for XML MIME media types seemed to do little more than list the name of the type and who submitted it.
This is indeed the case. However, these exist for historical reasons, and also because the responsible people are very pragmatic about things. When the IESG decides whether to pass the RFC or not, they will get back to you and ask you to produce an RFC that describes the content of this transport type (they did with me).

Thus the lengthy documents referred to (inaccessible to me right now) should preferably reside with some standards body: IETF RFCs are preferred, W3.org documents come next, and IEEE standards have also been referenced (e.g. audio/mpeg etc.).

I believe the reason why a standards body, and the IETF in particular, should be used as a storage holder for the standard spec is that it should be easily and readily accessible by any contemporary AND FUTURE implementors of this standard. Compared to the IETF, the current storage of the specification (Sourceforge in your case) is rather new and not generally known as an eternal document store. (My inability to obtain it right now is an indication of its reliability.) This means, e.g., that URIs referenced in your transport type could change and at a future date complicate the process of retrieval for an external party, and the IETF cannot guarantee access to these vital documents, which is bad.

If you do not want to submit the entire specification to the IETF as an RFC, prs.-types or vnd.-types should be used instead. If you can argue well for your case, then I believe eventually the IESG will make an exception, but this may take a considerable amount of time.

Most of this stems from personal experiences and opinions, so take it simply as a piece of anecdotal knowledge; other people here will probably amend and correct me extensively.

Linus