
* Chris Lilley wrote:
> As you yourself pointed out, per RFC3023
>
>   Processors generating XML MIME entities MUST NOT label conflicting
>   charset information between the MIME Content-Type and the XML
>   declaration.
>
> such content is already non conforming.

It does not actually apply to content...

> In terms of dealing with such content if it still occurs, the XML well
> formedness rules already handle that in an entirely satisfactory manner
> and nothing further need be added. These are already well implemented
> and highly interoperable.

Consider a *UTF-8 encoded* document

  Content-Type: application/xml;charset=iso-8859-1

  <?xml version="1.0"?>
  ...
  <!--Björn-->
  ...

with no BOM and using only US-ASCII characters for the rest of the
document. With your proposal, which of the following behaviors of
implementations would be considered conforming?

  a) it fails to process the document due to RFC3023bis/XML 1.0 errors
  b) it considers the comment to include "BjÃ¶rn"
  c) it considers the comment to include "Björn"

If none of these behaviors would be conforming, what would be conforming
instead? What would be the answers for your proposal if application/xml
is replaced by the following types and proper content as defined above:

  * application/xhtml+xml (with no update to RFC3236)
  * image/svg+xml (as you propose it)

For application/xml / application/xhtml+xml this would currently be b),
as the document includes 0xC3 0xB6 and the encoding is determined to be
ISO-8859-1, which means the sequence above represents "Ã¶".
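
The difference between b) and c) comes down to how the two bytes
0xC3 0xB6 are decoded. A minimal Python sketch, assuming the comment text
is stored as the UTF-8 bytes shown (the variable names are illustrative
and no real XML processor is modeled):

```python
# Illustrative only: the comment text as it appears on the wire (UTF-8).
comment_bytes = b"Bj\xc3\xb6rn"

# b) the charset parameter of the Content-Type header wins (ISO-8859-1):
#    each byte becomes one character, so 0xC3 0xB6 turns into two characters.
as_latin1 = comment_bytes.decode("iso-8859-1")

# c) the bytes are decoded as what they actually are (UTF-8):
#    0xC3 0xB6 is the two-byte sequence for U+00F6.
as_utf8 = comment_bytes.decode("utf-8")

print(" ".join(f"{b:02x}" for b in comment_bytes))  # 42 6a c3 b6 72 6e
print(as_latin1)  # BjÃ¶rn
print(as_utf8)    # Björn
```

Both decodings succeed without error, so a processor cannot tell from
these bytes alone which label was the intended one.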