1. 8
  1.  

  2. 2

    It’s an interesting question, but I’m not sure I agree the issue raised is going to be the real issue for any online version control system or open source repository.

    And I’d be highly skeptical of anyone’s opinion of GDPR unless they’re actually working to implement it or preparing to enforce it (I’m in the process of implementing it).

    In this case GitHub would be both a controller and processor. GitHub would need to get consent from anyone with an account. But consider that an IP address is considered personal data and I can grab the URL for a repo and clone it locally without being logged in. Would GitHub need to put up a “Consent Wall” anytime someone looks at a repo while not logged in, and record consent? Or will we start needing this anytime you request something over the command line via ssh, sftp, scp, wget, or curl? What about cryptocurrencies or torrents?

    The owner of any repository hosted on GitHub would be the controller. Any consent given would need to be recorded for the controller. GitHub and any other services (CI/CD, static analysis, etc.) that have been given access to it would be processors. The owner of any repository who grants access to these other services is on the hook to verify that each service complies with GDPR, and to get consent if these new processors are using the personal data in new ways. I’m not even sure how I would handle consent for a git repo for something like this.

    And the reference to this applying only to people with EU citizenship is incorrect. GDPR is not about your citizenship, but rather where you are when your data is requested. As a US citizen, if I’m in an EU country (for business or school), GDPR should apply anytime anyone asks for my personal data.

    The question about the impact of someone writing a blog or book would likely be covered by Art. 85 GDPR, “Processing and freedom of expression and information”, and Recital 153, “Processing of personal data solely for journalistic purposes or for the purposes of academic, artistic or literary expression”.

    Sort of feels like we’ve gone back to the code is art/literature debate.

    1. 3

      I’m not sure I agree the issue raised is going to be the real issue for any online version control system or open source repository.

      It’s almost certainly not.

      But consider that an IP address is considered personal data and I can grab the URL for a repo and clone it locally without being logged in.

      An IP address is not considered personal data:

      A single household PC may have different family members using it under the same login identity. As a result, the IP address and cookies cannot be connected to a single user. Therefore it is unlikely that this information will be personal data.

      But let’s consider something that is almost certainly personal data, like name and address, such as would be found in a copyright notice. Recital 47 explains:

      The legitimate interests of a controller, including those of a controller to which the personal data may be disclosed, or of a third party, may provide a legal basis for processing, provided that the interests or the fundamental rights and freedoms of the data subject are not overriding, taking into consideration the reasonable expectations of data subjects based on their relationship with the controller.

      And once you’ve downloaded the repository and made a change, you may now ignore erasure and stop processing requests because Article 17 allows you to comply with other laws (including the copyright provisions).

      The owner of any repository who grants access to these other services is on the hook to verify that service meets GDPR and get consent if these new processors are using the personal data in new ways.

      No they’re not. Recital 26 explains:

      To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.

      GDPR is not about your citizenship, but rather where you’re at when your data was requested.

      Article 3 (2) (b) does indeed refer to personal data monitoring that occurs inside the union, but (1) protects everyone in the world from controllers and processors that are in the union (i.e. regardless of where the data subject is) and (2) (a) also protects European Citizens (wherever they are in the world) if the processor or controller is offering goods or services to the data subject.

    2. 1

      It does, but also on anything else code hosting repos do. Let’s say I sign up for GitHub, and agree that they can use my PII for the purposes of providing GitHub features. Now I create a new repo, set it as the remote of another local repo, then push all of the commits to GitHub.

      At this point, GitHub has to make sure that everyone else who has PII in that repo (in commit messages or otherwise) has opted in before the pushed content can be part of GitHub. For most of my repos that’s fine, as I am already every committer. But if I push up something like a clone of the Linux repository or gcc, that’s a lot of work contacting all the committers.
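
To get a rough feel for how many people that would mean contacting, one could count the distinct author identities recorded in a repository's history. This is a minimal sketch, assuming Python 3 and a local `git` binary; the helper names `count_identities` and `repo_identities` are hypothetical, and it only looks at author name and email, not at any other personal data that might sit in commit messages or file contents.

```python
import subprocess
from collections import Counter


def count_identities(log_output: str) -> Counter:
    """Count commits per identity, given one 'Name <email>' line per commit."""
    return Counter(
        line.strip() for line in log_output.splitlines() if line.strip()
    )


def repo_identities(repo_path: str = ".") -> Counter:
    """Collect author identities from a local clone (hypothetical helper).

    `git log --format=...` prints one line per commit; %an is the author
    name and %ae is the author email.
    """
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%an <%ae>"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_identities(result.stdout)
```

Calling `len(repo_identities("path/to/clone"))` would then give the number of distinct name and email pairs. The same person may appear under several addresses, so this is an upper bound on people rather than an exact count.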

      1. 4

        That is already a requirement today.

        GitHub’s ToS specifically requires rights that, for example, the GPL does not grant you. You cannot push any GPL project to GitHub without contacting every contributor, violating GitHub’s ToS, or breaking the law.

        I ended up doing exactly that; luckily, I personally know most of the people whose code I built upon. Larger projects such as the Linux kernel will just get a free pass. And projects somewhere in the middle?