1. 15

  2. 10

    This is a new PDF interpreter for Ghostscript, replacing the previous one that had been in use for decades. The previous interpreter was implemented in PostScript; the new one’s in C. The original was written in PostScript because PDF started life as a PostScript derivative, with a data model similar enough that PDF constructs could conveniently be implemented in terms of their PostScript equivalents.

    They give three reasons for a rewrite:

    • Large PostScript programs are difficult to maintain, and it’s increasingly hard to find expert PostScript programmers to maintain them.
    • PDF has continued to add features while PostScript hasn’t, so the convenience of using PostScript to implement PDF has lessened over time. Ghostscript has had to extend PostScript with undocumented GS-only constructs to mirror new features added to PDF.
    • Some readers (notably Adobe Acrobat) are extremely lenient in accepting malformed PDF files and fixing up errors on the fly, which means such files exist in the wild and users expect GS to handle them. But PostScript isn’t a great language for non-trivial error handling and recovery.

    On a meta level, I’m mostly impressed that a ground-up rewrite of something of this complexity seems to have been completed successfully.

    1. 6

      Do they have a publicly available document explaining why they chose C for interpreting a language with such a hostile attack surface? I would think any memory-safe language would be a better choice.

      1. 1

        Your thinking is entirely backwards. Ghostscript is a C project maintained by C programmers, and switching to anything else would need very strong reasons. Experience, portability, and control are all important factors.