1. 52
  1.  

  2. 10

    As someone who works with an EBCDIC platform, I can say that the default US EBCDIC codepage (CCSID 37, at least on i) is capable of representing accents.
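
    A quick way to check this, as a sketch: Python's standard library ships a `cp037` codec, which I'm assuming here is a fair stand-in for CCSID 37. The sample name is made up for illustration.

```python
# Assumption: Python's built-in "cp037" codec stands in for CCSID 37.
name = "Hélène Müller"  # hypothetical sample name

ebcdic = name.encode("cp037")        # raises UnicodeEncodeError if unrepresentable
round_trip = ebcdic.decode("cp037")

# Each character, accented or not, occupies exactly one byte.
print(len(name), len(ebcdic), round_trip == name)
```

    If the encode call succeeds and the round trip matches, the accented letters are representable in that codepage.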

    1. 7

      According to the bank’s lawyer, accents weren’t possible in 1995, when the system was put into service, and they “have been added since”. In 1995 it was “technically impossible to use accents”.

      A quick search, however, reveals that e.g. codepage 01047 was published in 1991, so I’m calling bullshit. They also store names in all-caps, by the way, and I’m fairly sure you could at least do lower-case letters before that. They also pull the “it’s based on punch cards!” argument. Technically correct, but so is ASCII and by extension UTF-8. We still have the delete control character in that weird position because delete meant “punch all the holes”.
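
      The DEL remark is easy to verify with a minimal check: in 7-bit ASCII, DEL is code 127, which is every one of the seven bits set, matching the "punch all the holes" story.

```python
DEL = 0x7F  # ASCII delete

# All seven bits are set, so "deleting" means overpunching every position.
del_bits = format(DEL, "07b")
print(del_bits)

# The other control characters occupy 0x00-0x1F; DEL sits alone at the
# very end of the 7-bit range.
is_last_code = DEL == 2**7 - 1
print(is_last_code)
```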

      I suspect they just migrated from a previous IBM system and that this is the real reason.

      From the article:

      Look, I’m not a lawyer (sorry mum!) so I’ve no idea whether this sort of ruling has any impact outside of this specific case.

      It sets a precedent in Belgium, not in other countries.

      1. 7

        In the mainframe/legacy world, it’s totally plausible they couldn’t update their systems to use the new codepage between ’91 and ’95. It’s not so much about updating a table that stores the customer’s name.

        There are probably hundreds of tables that store the customer’s name, and thousands of data feeds and interfaces that would need to be updated: first to be aware of codepages in the first place, then to support the extended character sets.

        And yeah, 100% guaranteed they migrated from an older system. That’s what mainframes do.

        1. 5

          Precedents de jure do not exist in EU law. It’s a civil and not a common law system. Not even another court in Belgium is bound by this ruling. They exist de facto though, in that courts are encouraged to look at other rulings. Which means this ruling can indeed and reasonably be used to interpret the GDPR in another country.

          See: https://academic.oup.com/icon/article/12/3/832/763797

          It’s a legal corner case though, and in my view, every customer would need to sue. (IANAL, etc.) So it may be feasible to just eat the cost.

          1. 1

            Maybe “precedent” isn’t exactly the right word, but previous rulings are usually considered AFAIK; there are a bunch of references to previous ones in this ruling as well. And while not legally binding in the same way as in e.g. the US, this ruling does empower Belgian consumers in a similar way, to some degree.

            1. 6

              Here’s a bit of precedent from Finland: the Parliamentary Ombudsman found in 2018 that the population register using ISO-8859-1 violated the Sami people’s rights because not all Sami names can be expressed in that encoding and it would have been possible to use Unicode. https://www.oikeusasiamies.fi/en/web/guest/-/vaestorekisterikeskus-laiminloi-saamelaisten-oikeudet
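
              To illustrate the encoding gap, here is a small sketch. The letter list is my own assumption (what I believe are the non-ASCII capitals of the Northern Sami alphabet), not something taken from the Ombudsman's decision.

```python
# Assumed list of Northern Sami capitals outside ASCII:
# A-acute, C-caron, D-stroke, Eng, S-caron, T-stroke, Z-caron.
sami_letters = "\u00c1\u010c\u0110\u014a\u0160\u0166\u017d"

def fits_latin1(ch: str) -> bool:
    """Return True if the character can be encoded in ISO-8859-1."""
    try:
        ch.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

# Letters that ISO-8859-1 cannot store.
missing = [ch for ch in sami_letters if not fits_latin1(ch)]
print(missing)
```

              Only the A-acute survives; the other six letters have no ISO-8859-1 code point at all.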

      2. 6

        It’s worth scrolling to the comments just for the first one:

        Très intéressant

        Presumably this ruling also means that ASCII systems would not be able to comply with the GDPR either. I had thought that EBCDIC was, like ASCII, a 7-bit encoding, but apparently I was wrong: it’s a family of 8-bit encodings with some characters in the same space in all (but, unlike ASCII-based encodings, not in the first 128 characters). For most European countries you will be able to pick a codepage that can store all names correctly, but you probably won’t be able to handle both Greek and French names with any 8-bit encoding. The old hack for this was to store names as both a string and a codepage index. That should be possible in this system, just expensive to implement (anyone who has to deal with EBCDIC deserves hazard pay).
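
        A rough sketch of that "string plus codepage index" hack, under loud assumptions: `TaggedName`, `store_name` and `CANDIDATE_CODEPAGES` are names I made up, and Python's `cp037` (Latin-1 EBCDIC) and `cp875` (Greek EBCDIC) codecs stand in for whatever code pages the real system has.

```python
from dataclasses import dataclass

# Assumption: these Python codec names stand in for the system's code pages.
CANDIDATE_CODEPAGES = ("cp037", "cp875")

@dataclass(frozen=True)
class TaggedName:
    raw: bytes      # the name as stored, one byte per character
    codepage: str   # tag saying which code page `raw` is encoded in

    def text(self) -> str:
        return self.raw.decode(self.codepage)

def store_name(name: str) -> TaggedName:
    """Encode with the first candidate code page that can represent the name."""
    for codepage in CANDIDATE_CODEPAGES:
        try:
            return TaggedName(name.encode(codepage), codepage)
        except UnicodeEncodeError:
            continue
    raise ValueError(f"no candidate code page can represent {name!r}")

french = store_name("H\u00e9l\u00e8ne")                 # Helene with accents
greek = store_name("\u0395\u039b\u0395\u039d\u0397")   # Eleni in Greek capitals
print(french.codepage, greek.codepage)
```

        The expensive part is not this function but making every table, feed and interface carry the tag along with the bytes.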

        1. 1

          BCDIC is (was, I guess we can say by now) 6-bit; EBCDIC jumped straight to 8-bit. The term “code page” originated inside IBM to name the different flavors for EBCDIC.

          1. 1

            Ah, thanks! I think that might explain a bit of the layout of EBCDIC: the layout of the upper-case bit looks like it’s designed to be easy to map from a 6-bit encoding, but it looks as if it’s a completely different layout from the ones the 1401, 1620, and contemporary IBM machines used, so that conversion would need to be done with a look-up table. I suppose even in the ’60s, a 64-byte lookup table wasn’t too much overhead…

        2. 4

          Time to break out UTF-EBCDIC

          1. 2

            Putting my EBCDIC hat back on: at least on i, no one uses UTF-EBCDIC. It’s much easier to use UCS-2 or UTF-8, considering tagging a column as such is trivial and RPG can use it nowadays. (Maybe it’s different on z, or god forbid, something like BS2000.)

          2. 4

            Oh boy, looking forward to airlines getting sued.

            1. 1

              Hmm, so what was the final outcome? Or are they still lawyering?