
* Chris Lilley wrote:
BH> Consider a *UTF-8 encoded* document
BH>
BH> Content-Type: application/xml;charset=iso-8859-1
BH>
BH> <?xml version="1.0"?>
BH> ...
BH> <!--Björn-->
BH> ...
BH>
BH> With no BOM and using only US-ASCII characters for the rest of the
BH> document,
So in this case, although the processor that generated it is non-conforming, the content is not non-conforming (but it should be), and the processor that receives it has two possibilities:
I actually asked in order to get a better understanding of how you intend to change RFC3023, yet I am afraid you did not really say what happens with the document above if RFC3023bis gets approved with your changes. I would appreciate knowing just a, b, c, or whatever else applies, for the various cases.
b) it can note that a required encoding declaration is not present, and throw a well-formedness error.
It actually can't: 0xC3 0xB6 is a legal byte sequence in both UTF-8 and ISO-8859-1. The processor would need to know that I meant to have "Björn" in the comment, which it cannot know.
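To illustrate the point about 0xC3 0xB6, here is a minimal Python sketch. It assumes the comment text "Björn" is the only non-ASCII content, and the variable names are purely illustrative. Both decodings succeed, so nothing at the byte level tells the processor which one was intended:

```python
# "Bj\u00f6rn" is "Björn"; the escape keeps this snippet ASCII-only.
data = "Bj\u00f6rn".encode("utf-8")    # the bytes actually sent

# Decoding as UTF-8 recovers the intended text.
as_utf8 = data.decode("utf-8")

# Decoding as ISO-8859-1 also succeeds, because every byte value
# is a valid ISO-8859-1 character; the result is just different text.
as_latin1 = data.decode("iso-8859-1")

print(data)        # shows the raw bytes, including 0xC3 0xB6
print(as_utf8)
print(as_latin1)
```

Under ISO-8859-1 the two bytes become two characters (U+00C3 and U+00B6) instead of one, which is legal content inside a comment.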
Consider an *8859-1 encoded* document
Content-Type: application/xml;charset=UTF-8
<?xml version="1.0"?>
...
<!--Björn-->
...
With your proposal, would the well-formedness error (bytes occur that cannot occur in UTF-8) be silently recovered from if the HTTP header overrides it, even for an XML processor, while it would continue to fail in other cases (such as server-side processing)?
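For the reverse case, a minimal Python sketch (again assuming "Björn" is the only non-ASCII content, with illustrative variable names) shows why this one is detectable. ISO-8859-1 encodes "ö" as the single byte 0xF6, which can never appear in UTF-8, so a strict UTF-8 decoder rejects the document:

```python
# "Bj\u00f6rn" is "Björn"; the escape keeps this snippet ASCII-only.
data = "Bj\u00f6rn".encode("iso-8859-1")   # the bytes actually sent

# Try to read the bytes as UTF-8, as the charset parameter claims.
try:
    data.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xF6 is not a valid UTF-8 byte, so strict decoding fails.
    utf8_ok = False

print(data)
print(utf8_ok)
```

This asymmetry is what separates the two examples: mislabelled ISO-8859-1 fails loudly when read as UTF-8, while mislabelled UTF-8 is silently accepted when read as ISO-8859-1.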
I do not really think I've made a proposal to change RFC3023, other than that the differences between text/xml and application/xml be removed to properly reflect running code. I can only tell you what XML 1.0 and RFC 3023 require in these cases, but you know that already.
It would sometimes be b) and sometimes c), depending on the particular software and whether it's reading from disk on the server or over the net. I frankly can't understand how you consider this lack of interoperability to be a desirable thing.
I am in fact most interested to learn how you think this can be improved upon, which is why I asked you about the impact of your proposal on the various cases I've mentioned. What applications currently do does not really help me understand that.