[M3devel] On magic numbers
Hendrik Boom
hendrik at topoi.pooq.com
Sat Jun 13 13:37:43 CEST 2015
On Fri, Jun 12, 2015 at 08:51:31PM +0200, Elmar Stellnberger wrote:
> Basically any random
> number should suffice as with 1.000.000 already registered file formats the
> probability for a clash would just be 1/4000. Nonetheless we could double-
> check against the database of the "file" program.
For more collision-freeness for the foreseeable future, I'd suggest a
64-bit random number. Even if there were a collision with someone
else's 32-bit number, the next 32 bits would likely resolve the issue.
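To put rough numbers on this, here is a small Python sketch of the arithmetic. It assumes the simplest possible model, which is mine and not from the thread: the already registered magics are distinct, uniformly spread, and of the same width as the new one. The function name and the default of 1,000,000 registered formats (the figure quoted above) are only illustrative.

```python
# Rough estimate: chance that one freshly drawn random magic number of a
# given width equals one of `registered` distinct magics already in use.
# Assumes all values are equally likely (a simplification).
def clash_probability(bits, registered=1_000_000):
    return registered / 2 ** bits

p32 = clash_probability(32)
p64 = clash_probability(64)

print(f"32-bit: about 1 in {1 / p32:,.0f}")
print(f"64-bit: about 1 in {1 / p64:.3g}")
```

Under that model the 32-bit case comes out a little above 1 in 4000, in line with the quoted estimate, while the 64-bit case is on the order of 1 in 10^13.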
It's not too far-fetched to assume that the number of different file
formats will continue increasing exponentially even as our world-wide
data storage increases.
And maybe it's time that the hash codes we use for data types also
increase in length. I've always considered 32 bits a bit too small for
this, especially in the days of *huge* program libraries. Maybe a
necessary evil as a concession to antiquated linkers, but it could
legitimately be made platform-dependent.
For backward compatibility, the compiler could just start checking for
the magic number. If it's present, skip it. If it's absent, go on as
at present.
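A minimal sketch of that check, in Python rather than in the compiler's own code. The magic value and the name `skip_magic` are made up for illustration; a real magic would be drawn at random once and then fixed.

```python
import io

# Hypothetical 64-bit magic, for illustration only.
MAGIC = bytes.fromhex("9f3c1e7a5b2d8046")

def skip_magic(stream):
    """Consume the magic if the stream starts with it; otherwise rewind.

    Returns True when the magic was present, False for an old-style file.
    """
    start = stream.tell()
    head = stream.read(len(MAGIC))
    if head == MAGIC:
        return True       # new-style file: reader now sits just past the magic
    stream.seek(start)    # old-style file: carry on from the beginning
    return False

new_file = io.BytesIO(MAGIC + b"payload")
old_file = io.BytesIO(b"payload")
print(skip_magic(new_file), new_file.read())
print(skip_magic(old_file), old_file.read())
```

Either way the reader ends up positioned at the start of the old-format content. The one caveat is an old file that happens to begin with the same bytes, which a random 64-bit value should make very unlikely.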
> Not all files have a completely random magic; f.i. pyc (compiled
> python files)
> have xx\r\ndddd as a header where xx is a 2-byte number and dddd must be
> a valid date. However if we can choose things from scratch I would speak for
> a fixed header f.i. FD,10,01,XX and add things like gcc, cm3 version numbers
> and timestamps in the following (*).
> It would be beneficial to have at least a cm3-middleend version number
> encoded since not every backend can be combined with any middle/front-end.
Of course this should still be appended to the 128 (or however many)
bits.
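One way this could look, as a sketch only: a fixed-width random magic first, then the version field. Everything concrete here is an assumption of mine: the 128-bit placeholder value, the little-endian byte order, the single 32-bit middle-end version field, and the function names.

```python
import struct

# Placeholder 128-bit value, NOT a real magic; a real one would be random.
MAGIC = bytes(range(0xA0, 0xB0))

# Hypothetical layout: 16-byte magic, then a little-endian uint32
# holding the cm3 middle-end version.
HEADER = struct.Struct("<16sI")

def pack_header(middle_end_version):
    return HEADER.pack(MAGIC, middle_end_version)

def unpack_header(data):
    magic, version = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("magic number missing")
    return version

blob = pack_header(7) + b"rest of file"
print(unpack_header(blob))
```

Further fields such as timestamps or a gcc version could follow in the same way. The point is only that the random part stays at the front and the structured part comes after it.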
-- hendrik