Thursday, July 05, 2007
Is BigTIFF TIFF?
An energetic discussion is developing on the TIFF mailing list on whether BigTIFF is a new version of TIFF or a new file format. There are good reasons for calling it a new format.
At the most obvious level, no existing application written to read TIFF files can read any BigTIFF file. The TIFF specification says that bytes 2-3 contain "An arbitrary but carefully chosen number (42) that further identifies the file as a TIFF file." BigTIFF files contain the value 43 in this location. The TIFF spec says that bytes 4-7 contain "The offset (in bytes) of the first IFD." BigTIFF uses these locations for other purposes, and places the offset of the first IFD starting at byte 8. In addition, a tag structure is 12 bytes long in TIFF; it's 20 bytes in BigTIFF, to accommodate larger counts and offsets.
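To make the header differences concrete, here is a minimal Python sketch of how a reader might tell the two apart. The function name identify_tiff_header is made up for illustration. The "II"/"MM" byte-order marks come from the TIFF specification. The BigTIFF use of bytes 4-7 (an offset size of 8 followed by a zero) reflects my reading of the BigTIFF proposal rather than anything stated above, so treat it as an assumption.

```python
import struct

TIFF_MAGIC = 42      # fixed identifier from the TIFF specification
BIGTIFF_MAGIC = 43   # value BigTIFF places in bytes 2-3

# Tag (IFD entry) layouts: tag, type, count, value/offset.
CLASSIC_ENTRY = "HHII"   # 2 + 2 + 4 + 4 = 12 bytes
BIGTIFF_ENTRY = "HHQQ"   # 2 + 2 + 8 + 8 = 20 bytes


def identify_tiff_header(header: bytes):
    """Return (kind, first_ifd_offset) for a TIFF or BigTIFF header.

    Hypothetical helper; raises ValueError for anything else.
    """
    if len(header) < 8:
        raise ValueError("header too short")

    # Bytes 0-1 give the byte order used by the rest of the file.
    if header[0:2] == b"II":
        endian = "<"   # little-endian
    elif header[0:2] == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("bad byte-order mark")

    (magic,) = struct.unpack(endian + "H", header[2:4])

    if magic == TIFF_MAGIC:
        # Classic TIFF: bytes 4-7 hold the 32-bit offset of the first IFD.
        (offset,) = struct.unpack(endian + "I", header[4:8])
        return "TIFF", offset

    if magic == BIGTIFF_MAGIC:
        if len(header) < 16:
            raise ValueError("BigTIFF header too short")
        # Assumption from the BigTIFF proposal: bytes 4-5 are the offset
        # size (8) and bytes 6-7 are zero.
        offset_size, reserved = struct.unpack(endian + "HH", header[4:8])
        if offset_size != 8 or reserved != 0:
            raise ValueError("unexpected BigTIFF header fields")
        # The 64-bit offset of the first IFD starts at byte 8.
        (offset,) = struct.unpack(endian + "Q", header[8:16])
        return "BigTIFF", offset

    raise ValueError("not TIFF or BigTIFF")
```

A classic TIFF reader performs only the first of these checks, so it rejects a BigTIFF file at the value 43 before it ever looks for an IFD.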
The BigTIFF proposal calls the value at offset 2 a "version number." But according to the TIFF specification, it isn't a version but a fixed identifier. Putting a different number there is a declaration that a file isn't TIFF.
File formats do change over time, and inevitably some files based on newer versions will not be readable by software based on older versions. But it's reasonable to expect that the newer files be recognizable as incompatible or broken instances of the format. When a format's signature information changes, apart from any part that identifies the version, it's no longer the same format.
BigTIFF does have a lot in common with TIFF, and it should be possible to modify a well-written TIFF reader to read BigTIFF without too much trouble. All tag values and definitions from TIFF are retained.
It's unfortunate that TIFF wasn't designed to include version information. The only way to tell a TIFF 6 file from a TIFF 4 file is by the features (tags and data types) it uses. This puts BigTIFF in a difficult situation. But from the standpoint of format identification, it makes more sense to call BigTIFF a new format derived from TIFF than to call it TIFF, and to give it its own MIME type and file extension(s).
The decision legally rests with Adobe, which hasn't shown much interest in the TIFF standard in a long time. If Adobe does nothing, then the new format isn't TIFF, since it's fundamentally different from any Adobe-approved specification. If Adobe does act, the call is its to make.