
Thanks Ben! Your effort is much appreciated.

On Wed, 22 Oct 2003 ben@morrow.me.uk wrote:
In section 6 the draft states
| Refer to this DTD as:
|
|   <!ENTITY % SHF PUBLIC "-//IETF//DTD SHF//EN"
|       "http://www.foo.org/shf.dtd">
|   %SHF;
and in section 9.1
| There is no charset parameter.
; however in section 9.5 we have
| Second, neither the "XML" declaration (e.g., ) nor the "DOCTYPE"
| declaration (e.g., ) may be present. (Accordingly, if another
| character set other than UTF-8 is desired, then the "charset"
| parameter must be present.)
. These are inconsistent.
Sorry, I was confusing "charset" with "encoding". Will never do it again... Also, the paragraph is unclear, as you say. Could I write something like:

  Second, neither the "xml" processing instruction nor the "DOCTYPE" declaration needs to be present. (Accordingly, if a character set other than UTF-8 is desired, then the "encoding" parameter must be present in an "xml" processing instruction.)
I would suggest removing both restrictions listed in 9.5: their purpose is unclear.
The idea is: if you want to switch character set, do so in the processing instruction (<?xml version="1.0" encoding="foo" ?>). I hope the above conveys this clearly. So, a pedantically specified SHF file would begin:

  <?xml version="1.0" encoding="UTF-8" ?>
  <!DOCTYPE shf PUBLIC "-//IETF//DTD SHF//EN" "http://ietf.org/shf.dtd">
  <dump ... etc
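To illustrate the point about the encoding parameter, here is a minimal Python sketch (not part of the draft) showing that a parser picks up the character set from the "xml" processing instruction and otherwise falls back to UTF-8. The "name" attribute and the sample value are assumptions made up for this example; only the "dump" element is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical SHF-like fragment. The "name" attribute is an assumption for
# illustration; only the "dump" element name comes from this thread.
# The byte 0xE9 is "e with acute accent" in ISO-8859-1.
latin1_doc = (
    b'<?xml version="1.0" encoding="ISO-8859-1" ?>'
    b'<dump name="caf\xe9"></dump>'
)

# Same content with no "xml" line at all: the parser assumes UTF-8.
utf8_doc = '<dump name="caf\u00e9"></dump>'.encode("utf-8")


def dump_name(raw: bytes) -> str:
    """Parse raw bytes and return the (hypothetical) name attribute."""
    return ET.fromstring(raw).get("name")


# Both documents should yield the same decoded text.
print(dump_name(latin1_doc) == dump_name(utf8_doc))
```

Passing bytes rather than an already decoded string matters here, since that is what lets the parser honour the declared encoding.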
Two additional points: would it not be worth declaring an XML namespace for this format in addition to the DTD?
I have seen no standard for this. There are, however, two drafts about it, but since we don't know whether they will be published, we fashioned this after e.g. the BEEP standard.
and would it not be worth adding support for using hashes other than SHA-1, both for when the time surely comes that SHA-1 is insufficient security, and to allow simpler checksums in secure environments with limited processing power (such as embedded systems)?
We had this discussion, and SHA-1 is sort of an IETF standard (RFC 3174). The purpose of the SHA-1 checksum is plain checksumming of the contents (information integrity), not to counter compromise. The SHF file as a whole may be signed and checked by way of RFC 3275 if need be, as stated. Supporting several checksum algorithms would increase the complexity of implementations, so that option was removed to keep things simple. On processing power, see below.
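As a rough sketch of what "plain checksumming" amounts to on the host side, the following Python computes a SHA-1 digest over a block's bytes and compares it with a stated value. The function names, and the assumption that both the data and the checksum are carried as hexadecimal text, are made up for this example and not quoted from the draft.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as lowercase hexadecimal text."""
    return hashlib.sha1(data).hexdigest()


def block_is_intact(hex_text: str, expected_checksum: str) -> bool:
    """Check a block's contents against its stated checksum.

    Assumes (hypothetically) that the block data is hex text, possibly
    separated by spaces, and that the checksum is a hex string.
    """
    data = bytes.fromhex(hex_text)
    return sha1_hex(data) == expected_checksum.strip().lower()


# "61 62 63" is the byte string "abc", a test vector from RFC 3174.
print(block_is_intact("61 62 63", "a9993e364706816aba3e25717850c26c9cd0d89d"))
```

This only detects accidental corruption; anyone able to alter the data could also alter the checksum, which is why signing the whole file is a separate matter.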
More generally, although this may be out of the remit of this list, is an XML-based format not a little complex for a hex dump?
The main purpose is transport and storage. In reality, dumps are typically not transferred to embedded systems by way of textual formats anyway. Instead, a host program ("flasher" etc.) on some other machine will typically read the SHF file and transfer the data via a serial bus in some custom format:

  SHF file -> parser/converter -> 01011001001 -> device

In the future, as complexity increases in embedded systems, this may change, so that systems parse the SHF file directly. I will try to clarify this.

Thanks,
Linus