PDF is a very nicely designed format. A few things glossed over in the article (because they’re not relevant to the overall point):
It’s designed for non-destructive editing: that pointer to the xref table is not a pointer to the xref table, but to an xref table. This lets you append new objects and a new xref table without modifying the start of the file. To make things more fun, xref tables can refer to older ones, so your new xref table only needs to contain new or modified objects (modified means replaced with a newer version). This is how PDF can easily add annotations without losing any of the existing data.
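To make the lookup order concrete, here is a rough sketch of how chained xref tables resolve an object. It is a simplified in-memory model, not a parser: in a real file the chain is expressed through the startxref offset and the /Prev entry in the trailer, and the names XrefSection and resolve, along with the byte offsets, are made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class XrefSection:
    """One xref table: object number -> byte offset (hypothetical model)."""
    entries: Dict[int, int]
    prev: Optional["XrefSection"] = None  # stands in for the trailer's /Prev


def resolve(newest: XrefSection, obj_num: int) -> Optional[int]:
    """Walk from the newest table back to the oldest; first hit wins."""
    section = newest
    while section is not None:
        if obj_num in section.entries:
            return section.entries[obj_num]
        section = section.prev
    return None


# Original file: three objects.
original = XrefSection({1: 15, 2: 74, 3: 182})
# Appended update: object 3 replaced, object 4 added. Nothing above is touched.
updated = XrefSection({3: 950, 4: 1021}, prev=original)
```

Reading through updated gives the new version of object 3, while reading through original still gives the old one, which is why the earlier revision stays recoverable.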
Early PDF was basically this structure with each object being a subset of PostScript. By adding the structure, you could get most of the benefits of subroutine calls (e.g. draw the same glyph a thousand times by just referring to it) without the unbounded rendering time problems.
When people refer to Quartz as DisplayPDF, they’re really talking about the commands in the objects in a PDF document having a 1:1 mapping with CoreGraphics commands. This is a nice property because it means that there’s a trivial way of writing PDFs from anything that draws with CoreGraphics, but it sadly doesn’t mean that the display server supports versioned objects of vector drawing commands.
I particularly enjoyed the end:
Eventually I ended up with a PDF that Preview claimed is larger than the entire universe – approximately 37 trillion light years square. Admittedly it’s mostly empty space, but so is the universe.
Please don’t try to print it.
[edit] Oh, one additional thing:
PDF has drawing commands but doesn’t have any kind of quadtree or equivalent spatial partitioning structure. This means that, if you do generate a PDF the size of Germany, then you have to render every object and figure out which regions it’s in. This is why things like the London tube map PDF kill PDF renderers on small devices. That makes sense for a printer: laser printing requires you to rasterise a page, and you rasterise at a fixed pixel density and so the memory requirements are bounded. It’s less useful for a device where you want to zoom in and scroll.
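A rough sketch of the difference, assuming every drawable object has a known axis-aligned bounding box (x0, y0, x1, y1). The uniform grid here is only a stand-in for a quadtree; none of this exists in PDF, and all the function names are made up for illustration.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two boxes given as (x0, y0, x1, y1) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def visible_linear(boxes, viewport):
    """No index: every object has to be tested for every view."""
    return [i for i, box in enumerate(boxes) if overlaps(box, viewport)]


def cells(box, cell):
    """Yield the grid cells that a box touches."""
    x0, y0, x1, y1 = box
    for gx in range(int(x0 // cell), int(x1 // cell) + 1):
        for gy in range(int(y0 // cell), int(y1 // cell) + 1):
            yield gx, gy


def build_grid(boxes, cell):
    """Built once: map each grid cell to the objects touching it."""
    grid = defaultdict(list)
    for i, box in enumerate(boxes):
        for key in cells(box, cell):
            grid[key].append(i)
    return grid


def visible_grid(boxes, grid, cell, viewport):
    """With an index: only objects in the viewport's cells are tested."""
    candidates = set()
    for key in cells(viewport, cell):
        candidates.update(grid.get(key, ()))
    return sorted(i for i in candidates if overlaps(boxes[i], viewport))


# A thousand small boxes in a row, and a viewport covering three of them.
boxes = [(i * 10, 0, i * 10 + 5, 5) for i in range(1000)]
viewport = (100, 0, 130, 5)
grid = build_grid(boxes, cell=50)
```

Both functions return the same objects, but visible_linear touches all thousand boxes on every scroll or zoom, whereas visible_grid only looks at the handful sharing a cell with the viewport.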
Nice reflections, thank you! I also like PDF as a format, but am quite irritated by their use of imperial units.
I think that’s unavoidable for typography. PostScript points are the same as the consensus definition of a point that emerged after publishers (and manufacturers of printing presses) had all been using their own definitions. Font sizes are all defined in terms of points, so everything text-related needs to either use inches or deal with annoying scale factors. One point is 0.3527777778 mm, so rounding errors are going to compound in text processing if it’s done in SI units.
It’s possible that the publishing industry will move to some SI definition of a point, maybe 0.35 mm, at some point, but it’s a huge transition and I don’t see it happening any time soon.
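To put a number on the drift, here is a quick sketch using exact fractions. It assumes the usual 72 points per inch and 25.4 mm per inch; the 60 lines of 12 pt text and the 0.35 mm point are just illustrative choices.

```python
from fractions import Fraction

MM_PER_INCH = Fraction(254, 10)
PT_PER_INCH = 72
MM_PER_PT = MM_PER_INCH / PT_PER_INCH  # exactly 127/360 mm, about 0.3527777778


def pt_to_mm(points):
    """Exact conversion from PostScript points to millimetres."""
    return Fraction(points) * MM_PER_PT


# 60 lines at a 12 pt line height is 720 pt, i.e. exactly 10 inches.
exact = pt_to_mm(12 * 60)
# The same column if a point were redefined as 0.35 mm.
with_si_point = Fraction(35, 100) * 12 * 60
drift = exact - with_si_point
```

With these numbers the exact column height is 254 mm and the 0.35 mm version comes out at 252 mm, so a single column is already 2 mm off.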