Threads for finn

  1. 38

    A company “bought” Audacity and added spyware. The same company also did it to MuseScore.

    You know, it really was and still is a stretch to describe basic, opt-in telemetry as spyware just because they made the unfortunate decision to use Google Analytics as a backend.

    1. 19

      Also, from what I heard they are doing decent work, actually paying maintainers to work on the software. You know, the exact thing that OP is complaining about not happening.

      1. 5

        please explain how Google Analytics isn’t spyware? it is software that monitors user behavior and reports it to a 3rd party, all typically without user consent.

        1. 20

          Audacity/GA would be spyware if it was monitoring usage of other things the user was doing on their computer. Using the term to describe the app recording usage of itself is hyperbole.

          1. 5

            If my business was audio engineering, having a tool that started reporting on my usage of it would be problematic. I would immediately start looking for alternatives. Why should I have to look through the code to find out exactly what it’s logging? File names? My use of licensed plugins? The inference that the lead singer needs pitch correction on every single track, or that we brought in a pro “backup” singer who is actually 85% of the lead on the final mix?

            When I am editing my family’s amateur instrumental work, I think I can reasonably feel equally peeved at having my sound editor report back to base.

            Calling it spyware is not necessarily hyperbole.

            1. 5

              Fortunately the scenario you described doesn’t exist since the telemetry is opt-in.

          2. 19

            all typically without user consent

            Except here it is opt-in, as pekkavaa said.

            1. 2

              thanks, i missed that.

              I was curious what kind of consent was involved, and honestly it’s better than I expected. Based on the issue linked in the OP it seems Audacity now displays a dialog asking users to help “improve audacity” by allowing them to collect “anonymous” usage data. They don’t seem to mention that it also reports this to Google.

            2. 8

              Counting how many people clicked the big red button and bought products, or how many users have a 4K monitor, or how fast the page loads technically involves monitoring, but it’s not really the same as what you would typically imagine when you hear the word “spying”, is it?

              It’s rather silly to equate performance metrics, usability studies and marketing analytics to a secret agent carefully reading all your messages.

          1. 13

            I distribute my Go-based commercial software in an alpine-based Docker container which weighs in at 15MB. 5MB is my software, 3-4MB is ca-certificates. Alpine and musl are tiny and wonderful.

            1. 10

              Good to know I could still fit my livelihood on a few 1.44MB floppies. 🤣

              1. 5

                A single JAR file is a statically linked package for Java apps, for example. You can do the same thing with AWS Lambda packages (single zip).

                Are you using some custom ca? AFAIK, those certs are still around <1MB.

                du -sh /usr/share/ca-certificates/

                616.0K /usr/share/ca-certificates/

                1. 4

                  Why not use a scratch container? What does all that overhead get you?

                1. 5

                  I’ve been using Theia for development recently. As far as I can tell that’s basically the way to DIY this sort of thing. It’s pretty slick, basically vscode in a browser.

                  1. 33

                    Can we please stop normalizing spyware? I’m going to repost my comment from last time this bullshit was on this lobste.rs:

                    This site is claiming to offer a “standard for opting out of telemetry”, but that is something we already have: Unless I actively opt into telemetry, I have opted out. If I run your software and it reports on my behavior to you without my explicit consent, your software is spyware.

                    1. 4

                      A shame I can’t upvote this more than once. This madness needs to stop.

                      1. 2

                        I disagree. Telemetry, which is a feature which does not collect personally-identifiable information (or attempt to fingerprint) by definition, is not spyware.

                        Moreover, I disagree that “actively opt[ing] into telemetry” is needed for software to report (again, by definition) non-PII usage information about itself, for the benefit of the software developers, and by extension the rest of the community. This biases the data and makes it much less useful, and because telemetry is non-PII by definition, there’s no harm to the user.

                        Now, that’s on a philosophical level. On a practical level, telemetry from trustworthy actors (Mozilla, and open-source projects more generally) is usually trustworthy, while “telemetry” from non-trustworthy sources (e.g. Microsoft, Google) is often not merely telemetry and collects fingerprinting and/or PII information (which, again, is not “telemetry” to begin with) - and so, the only safe action to take is just to turn off every switch that you have.

                        This “console do not track” proposal doesn’t work for either the good or bad actors. It fails for the good actors because, while I don’t want ads (which a good actor wouldn’t include in the first place), I do want telemetry and crash reporting, and most people want automatic updates. It fails for the bad actors because they won’t respect this switch anyway.

                      1. 6

                        Another fact-free hit piece on Signal!

                        Can anyone point to any evidence that this isn’t simply a case of Signal not changing their server code for a while? When everything is end to end encrypted and all the server has to do is move encrypted data from one person to another, it’s hardly surprising to me that the server hasn’t changed in a while.

                        Articles like this don’t help anyone and just serve to spread FUD.

                        1. 16

                          Not sure who told the author of this piece that security by obscurity is bad, but what I have always heard is that security through obscurity is simply not to be relied upon. It’s not that you shouldn’t do it, but you should assume it will be defeated.

                          So if you want to change your SSH port, fine, but don’t leave password authentication enabled and go thinking you’re safe

                          1. 5

                            “Security by obscurity is bad” is the line that is parroted by many who don’t understand.

                            1. 1

                              There seems to be a consensus that “6!x8GWqufk-EL6tv_A4.E” is a stronger password than “letmein”. The only significant difference I see between these passwords is obscurity.

                              I wonder if this can be considered an example of “security by obscurity” that is widely considered neither “bad” nor likely to be defeated?

                              1. 20

                                There’s a long history of distinguishing obscure information like passwords or cryptographic keys from obscure methods like encryption algorithms. The key difference, I think, is that the only purpose of the secret information is to be secret, and you can measure its properties in that respect; that’s not true of code that’s meant to be secret, and competing requirements like “needs to run on someone else’s machine” make obscurity an unreliable crutch in many situations.

                                EDIT: Another key difference is that “obscurity” can be taken as “the information is still present in whatever the adversary can access, it’s just harder to read”, e.g. obfuscated source code in a JavaScript file. That’s also different from a secret like a password, which should be protected by not exposing it at all.

                                Like most maxims, “Security through obscurity is bad” is an oversimplification, but in my opinion it’s a good rule of thumb to be disregarded only when you know what you’re doing.

                                1. 3

                                  I think the original meaning of the “security by obscurity is bad” aphorism is quite a bit narrower than how it is commonly used: security by algorithmic obscurity is bad because one has to presume that a motivated attacker will be able to identify or acquire the algorithm. Therefore, any additional security from algorithmic obscurity is ephemeral, and sacrifices the very real benefit of allowing the cryptographic community to examine the algorithm for weakness (since weaknesses are often non-obvious, especially to the creator). As such, one could say that it’s a corollary to one of Kerckhoffs’s principles (rephrased by Shannon as simply [assume that] “the enemy knows the system”).

                                  The aphorism has been adopted by those lacking the technical knowledge to understand the full meaning and generalized further than it should be.

                                  The artificial distinction between “secrecy” (which is necessary to protect the key) and “obscurity” (which is generally used to apply to the system) is most important to understanding the aphorism and unfortunately the distinction appears non-obvious to the layman and leads to confusion.

                                  Edit: Ugh, just realized that this is essentially paraphrasing an old Robert Graham blog post: https://blog.erratasec.com/2016/10/cliche-security-through-obscurity-again.html. Also corrected a sentence in which I nonsensically used “security” in place of “obscurity.”

                                  1. 2

                                    That definition makes sense and clears up something I had been wondering about for a long time. Thanks!

                                  2. 3

                                    I think to rectify these definitions you need to have an idea of the system under test. The system expects, takes in, comments on the quality of its inputs and is required, when assumptions are satisfied, to produce trusted output.

                                    Security by obscurity says that the system is more difficult to break if the adversary doesn’t know what it is. This is generally true, it at least adds research costs to the adversary and may even substantially increase the effort required to make an attack.

                                    The general maxim is that security by obscurity should not be relied upon. In other words, you should have confidence that your system is still reliable even in the circumstance where your adversary knows everything about it.

                                    So, ultimately, the quality of the password isn’t really about the system. The system could, for instance, choose to reject bad passwords and improve its quality. The adversary knowing about the system now knows not to test a certain subset of weak passwords (no chance of success) but the system is still defensible.

                                    1. 2

                                      The difference is not only obscurity; it’s (quantifiable) cryptographic strength.

                                      Your website uses 256-bit AES, because it’s impossible to brute-force without using more energy than is contained in our solar system. You wouldn’t use 64-bit AES, though. Is the difference that the former algorithm’s key is more obscure?
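To put rough numbers on the key-size comparison above, here is a minimal sketch of the brute-force arithmetic. The attacker speed of 10^12 guesses per second is an assumption picked purely for illustration, and the 64-bit case is hypothetical as in the comment (AES itself only defines 128-, 192- and 256-bit keys).

```python
# Rough brute-force arithmetic for two key sizes.
# GUESSES_PER_SECOND is a hypothetical attacker speed, chosen only for illustration.
GUESSES_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def keyspace(bits: int) -> int:
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits


def years_to_exhaust(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key at `rate` guesses per second."""
    return keyspace(bits) / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (64, 256):
        print(f"{bits}-bit key: {keyspace(bits):.3e} keys, "
              f"{years_to_exhaust(bits):.3e} years to exhaust")
```

At that assumed rate the 64-bit keyspace is exhausted in under a year, while the 256-bit one needs on the order of 10^57 years. The difference is a measurable quantity, which is the point being made about strength versus obscurity.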

                                      1. 1

                                        An obscure system will be understood, and therefore cracked if its only advantage was obscurity. Passphrase-protected crypto systems are not obscure. Their operation is laid open for all to see, including what they do with passwords. If you can go from that to cracking specific cryptexts, that’s a flaw everyone will admit. However, if you must skip the system entirely and beat a passphrase out of someone in order to break the cryptext, that’s no flaw of the system under discussion. It might be a flaw of some larger system, but I believe it is universally acknowledged that, if you’re beating a passphrase out of someone and will only stop when you get the information you’re looking for or you kill the person you’re beating, the person will almost certainly give the passphrase before they die.

                                    1. 1

                                      What options are available for blocking Plausible? They seem to encourage website operators to CNAME a subdomain to them, specifically to avoid blocklists.

                                      Also, am I wrong in thinking Plausible is almost more immoral than Google Analytics? It seems like they’re trying to deliver spyware to people who have gone out of their way to block such things

                                      1. 2

                                        They seem to encourage website operators to CNAME a subdomain to them, specifically to avoid blocklists.

                                        That hasn’t worked against uBlock Origin or PiHole for months.

                                        1. 2

                                          How do they do that?

                                          1. 1

                                            uBlock Origin runs a CNAME query against everything before letting the request go through.

                                            PiHole is a DNS server, so it already knows about every recursive request.
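The CNAME check described above can be sketched roughly like this. This is not uBlock Origin's or Pi-hole's actual code: the record table, the host names and the blocklist are all hypothetical, and a real blocker would ask a resolver instead of reading a dict.

```python
# Hypothetical CNAME records; a real blocker would query DNS instead.
CNAME_RECORDS = {
    "stats.example.com": "custom.tracker.example.net",
    "custom.tracker.example.net": "tracker.example.net",
}
# Hypothetical blocklist of known tracker hosts.
BLOCKLIST = {"tracker.example.net"}


def cname_chain(host, records, max_depth=10):
    """Return host followed by every CNAME target it resolves through."""
    chain = [host]
    # max_depth guards against CNAME loops in malformed records.
    while chain[-1] in records and len(chain) <= max_depth:
        chain.append(records[chain[-1]])
    return chain


def is_blocked(host, records=CNAME_RECORDS, blocklist=BLOCKLIST):
    """Block if the host or any name in its CNAME chain is on the blocklist."""
    return any(name in blocklist for name in cname_chain(host, records))
```

With these made-up records, `stats.example.com` looks first-party but is blocked because its chain ends at a listed tracker, while an unrelated host passes through.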

                                            1. 1

                                              Interesting, thanks for the info.

                                      1. 1

                                        This is cool, and reminds me of something I’ve been wondering recently:

                                        Why do people choose Websockets in these scenarios, when things like EventSource/SSE exists? It seems like a much simpler protocol if you’re not doing bidirectional messages.
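For a sense of how small the SSE side is, here is a minimal sketch of the wire format a server writes on a `text/event-stream` response. The helper name `format_sse` is made up for illustration; only the `id:`, `event:` and `data:` fields and the blank-line terminator come from the EventSource format itself.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one server-sent event as text/event-stream lines."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads are sent as one "data:" field per line.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"
```

The server just keeps the HTTP response open and writes these strings as events occur. There is no upgrade handshake and no framing layer, which is most of the appeal when messages only flow one way.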

                                        1. 3

                                          In my case it’s because I didn’t realize http4s had support for them, but apparently it does! I will see about giving that a try

                                        1. 4

                                          I wonder if there is a Firefox extension to detect such behaviour and let me know about that.

                                          This has probably happened and I haven’t noticed…

                                          1. 3

                                            Not exactly what you’re looking for, but extensions like uMatrix should mostly block this type of attack. The defaults are to block loading media, scripts and XHR/websockets from 3rd party domains, so it breaks most websites and isn’t super user friendly

                                          1. 2

                                            I have a GXP2160 and have played with the XML applications mentioned at the bottom of this article. It’s similar to a conventional HTML-based browser application: user presses button to start the “application”, the phone makes an HTTP(S) request to your server and gets back a bunch of XML, which it renders to the screen. It can include some buttons for the user to press, and various other input options. Of course they have their own XML-based language that’s horribly documented, at least a few years ago when I was doing this. All documentation was in PDF form, with not much in the way of actual examples.

                                            I hacked together a little PHP script that ran on my PBX to allow me to arbitrarily reconfigure my outbound number, it’s quite handy.

                                            1. 3

                                              that’s really awesome. I think a lot of laypeople might underestimate how configurable enterprise hardware can be, and all the possibilities it could open.

                                              Almost makes me want to have a need for this kind of phone. Almost….

                                              1. 1

                                                I don’t want it, but I am surprised how affordable it actually is. Around the 80 Euros ballpark, I would’ve expected more for an ‘enterprise solution’.

                                            1. 65

                                              This site is claiming to offer a “standard for opting out of telemetry”, but that is something we already have: Unless I actively opt into telemetry, I have opted out. If I run your software and it reports on my behavior to you without my explicit consent, your software is spyware.

                                              1. 11

                                                but that is something we already have: Unless I actively opt into telemetry, I have opted out.

                                                I know this comes up a lot, but I disagree with that stance. The vast majority of people leave things on their defaults. The quality of information you get from opt-in telemetry is so much worse than from telemetry by default that it’s almost not worth it.

                                                The only way I could see “opt-in” telemetry actually working is caching values locally for a while and then being so obnoxiously annoying about “voluntarily” sending the data that people will do it just to shut the program up about it.

                                                1. 28

                                                  That comment acts like you deserve to have the data somehow? Why should you get telemetry data from all the people that don’t care about actively giving it to you?

                                                  1. 12

                                                    That comment acts like you deserve to have the data somehow?

                                                    I’ve got idiosyncratic views on what “deserving” is supposed to mean, but I’ll refrain from going into philosophy here.

                                                    Why should you get telemetry data from all the people that don’t care about actively giving it to you?

                                                    Because the data is better and more accurate. Better and more accurate data can be used to improve the program—which is something everyone will eventually benefit from. But not if you skew the data towards the kinds of people who opt into telemetry.

                                                    Without any telemetry, you’ll instead either (a) get the developers’ gut instinct (which may fail to reflect real-world usage), or (b) the minority that opens bug tickets dictate the UI improvements instead, possibly mixed with (a). Just as hardly anyone (in the large scale of things) bothers with opting into telemetry, hardly anyone bothers opening bug tickets. Neither group may be representative of the silent majority that just wants to get things done.

                                                    Consider the following example for illustration of what I mean (it is a deliberate oversimplification, debate my points above, not the illustration):

                                                    Assume you have a command-line program that has 500 users. Assume you have telemetry. You see that a significant percentage of invocations involve the subcommand check, but no such command exists; most such invocations are immediately followed by the correct info command. Therefore, you decide to add an alias. Curiously, nobody has told you about this yet. However, once the alias is there, everyone is happier and more productive.

                                                    Had you not had telemetry, you would not have found out (or at least not found out as quickly, only when someone got disgruntled enough to open an issue). The “quirk” in the interface may have scared off potential users to alternatives, not actually giving your program a fair shot because of it.
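The illustration above boils down to a small counting job. Here is a minimal sketch, assuming the telemetry arrives as an ordered list of attempted subcommand names. The set of known subcommands and the sample events are made up.

```python
from collections import Counter

# Hypothetical set of subcommands the program actually implements.
KNOWN = {"info", "list", "add"}


def suggest_aliases(events, known=KNOWN):
    """Count (unknown subcommand, next known subcommand) pairs.

    `events` is an ordered list of attempted subcommand names.
    """
    pairs = Counter()
    for attempted, following in zip(events, events[1:]):
        if attempted not in known and following in known:
            pairs[(attempted, following)] += 1
    return pairs


if __name__ == "__main__":
    sample = ["check", "info", "list", "check", "info", "add", "chekc", "info"]
    for (wrong, right), count in suggest_aliases(sample).most_common():
        print(f"{wrong!r} followed by {right!r}: {count} time(s)")
```

A pair that shows up often, like `check` followed by `info` here, is a candidate alias. One-off typos stay at the bottom of the list.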

                                                    1. 3

                                                      Bob really wants a new feature in a software he uses. Bob suggests it to developers, but they don’t care. As far as they can tell, Bob is the only one wanting it. Bob analyzes the telemetry-related communication and writes a simple script that imitates it.

                                                      Developers are concerned about privacy of their users and don’t store IP addresses (it’s less than useless to hash it), only making it easier for Bob to trick them. What appears as a slow growth of active users, and a common need for a certain feature, is really just Bob’s little fraud.

                                                      It’s possible to make this harder, but it takes effort. It takes extra effort to respect users’ privacy. Is developing a system to spy on the users really more worthy than developing the product itself?

                                                      You also (sort of) argued that opt-in telemetry is biased. That’s not exactly right, because telemetry is always biased. There are users with no Internet access, or at least an irregular one. And no, we don’t have to be talking about developing countries here. How do you know the majority of your users aren’t medical professionals or lawyers whose computers are not connected to the Internet for security reasons? I suspect it might be more common than we think. Then on the other hand, there are users with multiple devices. What can appear as n different users can really just be one.

                                                      It sort of depends on your general philosophical view. You don’t have to develop software for free, and if you do, it’s up to you to decide the terms and conditions and the level of participation you expect from your users. But if we talk about free software, I think that telemetry, if any, should be completely voluntary on a per-request basis, with a detailed listing of all information that’s to be sent in both human- and machine-readable form (maybe compared to average), and either smart enough to prevent fraudulent behavior, or treated with strong caution, because it may as well be utter garbage. Statistically speaking, it’s probably the case anyway.

                                                      I’m well aware that standing behind a big project, such as Firefox, is a huge responsibility and it would be really silly to advise developers to rather trust their guts instead of trying to collect at least some data. That’s why I also suggested how I imagine a decent telemetry. I believe users would be more than willing to participate if they saw, for example, that they used a certain feature above-average number of times, and that their vote could stop it from being removed. It’s also possible to secure per-request telemetry with a captcha (or something like that) to make it slightly more robust. If this came up once in a few months, “hey, dear users, we want to ask”, hardly anyone would complain. That’s how some software does it, after all.
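The per-request idea could look something like the sketch below: build the report, show it in human-readable form, and hand the machine-readable JSON to the network only after an explicit yes. All names here (`build_report`, `render_for_user`, `submit_if_consented`, the `schema` field) are hypothetical, and the fraud-resistance part (captcha and the like) is left out.

```python
import json


def build_report(usage_counts):
    """Machine-readable report: feature name -> local usage count."""
    return {"schema": 1, "features": dict(sorted(usage_counts.items()))}


def render_for_user(report):
    """Human-readable listing of exactly what would be sent."""
    lines = ["The following would be sent:"]
    for name, count in report["features"].items():
        lines.append(f"  {name}: used {count} times")
    return "\n".join(lines)


def submit_if_consented(report, consent, send):
    """Call send(json_text) only when the user explicitly agreed to this report."""
    if consent is not True:
        return False
    send(json.dumps(report, sort_keys=True))
    return True
```

`send` is passed in as a callable, so the same function can be pointed at a real uploader or at a list's `append` for inspection. Anything other than an explicit `True` means nothing leaves the machine.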

                                                      1. 1

                                                        The fraud thing is an interesting theory, but I am unaware how likely it is; you’ve theorised a Bob who can generate fraudulent analytics but couldn’t fake an IP address or use multiple real IP addresses or implement the feature he actually wants.

                                                        1. 0

                                                          It’s not that he couldn’t do it, it’s just much simpler without that. It’s really about the cost. It’s easy to curl, it’s more time consuming or expensive to use proxies, and even more so to solve captchas (or any other puzzles). The lower the cost, the higher the potential inaccuracy. And similarly, with higher cost, even legitimate users might be less willing to participate.

                                                          I don’t have some universal solution or anything. It’s just something to consider. Sometimes it might be reasonable to put effort into making a robust telemetric system, sometimes none at all would be preferred. I’m trying to think of a case “in between”, but don’t see a single situation where jokingly-easy-to-fake results could be any good.

                                                      2. 2

                                                        Telemetry benefits companies, otherwise companies wouldn’t use it. Perhaps it can benefit users, if the product is improved as a result of telemetry. But it also harms users by compromising their privacy.

                                                        The question is whether the benefits to users outweigh the costs.

                                                        Opt-out telemetry-using companies obviously aren’t concerned about the costs to users, compared to the benefits they (the companies) glean from telemetry-by-default. They are placing their own interests first, ahead of their users. That’s why they resort to dark patterns like opt-out.

                                                    2. 13

                                                      You assume that we actually need telemetry to develop good software. I’m not so sure. We developed good software for decades without telemetry; why do we need it now?

                                                      When I hear the word “telemetry”, I’m reminded of an article by Joel Spolsky where he compared Sun’s attempts at developing a GUI toolkit for Java (as of 2002) to Star Trek aliens watching humans through a telescope. The article is long-winded, but search for “telescope” to find the relevant passage. It’s no coincidence that telemetry and telescope share the same prefix. With telemetry, we’re measuring our users’ behavior from a distance. There’s not a lot of signal there, and probably a lot of noise.

                                                      It helps if we can develop UsWare, not ThemWare. And I think this is why it’s important for software development teams to be diverse in every way. If our teams have people from diverse backgrounds, with diverse abilities and perspectives, then we don’t need telemetry to understand the mysterious behaviors of those mysterious people out there.

                                                      (Disclaimer: I work at Microsoft on the Windows team, and we do collect telemetry on a de-facto opt-out basis, but I’m posting my own opinion here.)

                                                      1. 5

                                                        we don’t need telemetry to understand the mysterious behaviors of those mysterious people out there

                                                        Telemetry usually is not about people’s behaviors, it’s about the mysterious environments the software runs in, the weird configurations and hardware combinations and outdated machines and so on.

                                                        Behavioral data should not be called telemetry.

                                                        1. 3

                                                          One concrete benefit of telemetry: “How many people are using this deprecated feature? Should we delete it in this version or leave it in a while longer?”

                                                          We developed good software for decades without telemetry; why do we need it now?

                                                          Decades-old software is carrying decades-old cruft that we could probably delete, but we just don’t know for sure. And we all pay the complexity costs one paper cut at a time.

                                                          I’m as opposed to surveillance as anybody else in this forum. But there’s a steelman question here.

                                                        2. 14

                                                          The quality of information you get from opt-in telemetry is so much worse than from telemetry by default that it’s almost not worth it.

                                                          A social scientist could likewise say: “The quality of information you get from observing humans in a lab is so much worse than when you plant video cameras in their home without them knowing.”

                                                          How is this an argument that it’s ok?

                                                          1. 1

                                                            There are three differences as far as I can tell:

                                                            The data from a hidden camera is not anonymizable. Telemetry, if done correctly (anonymization of data as much as possible, no persistent identifiers, transparency as to what data is and has been sent in the past), cannot be linked to a natural person or an individual handle. Therefore, I see no harm to the individual caused by telemetry implemented in accordance with best data protection practices.

                                                            Furthermore, the data from the hidden camera cannot cause corrective action. The scientist can publish a paper, maybe it’ll even have revolutionary insight, but can take no direct action. The net benefit is therefore slower to be achieved and very commonly much less than the immediate, corrective action that a software developer can take for their own software.

                                                            Finally, it is (currently?) unreasonable to expect a hidden camera in your own home, but there is an increased amount of awareness of the public that telemetry exists and settings should be inspected if this poses a problem. People who do care to opt out will try to find out how to opt out.

                                                            1. 2

                                                              Finally, it is (currently?) unreasonable to expect a hidden camera in your own home, but there is an increased amount of awareness of the public that telemetry exists and settings should be inspected if this poses a problem. People who do care to opt out will try to find out how to opt out.

                                                              I think this is rather deceptive. Basically it’s saying: “we know people would object to this, but if we slowly and covertly add it everywhere we can eventually say that we’re doing it because everyone is doing it and you’ve just got to deal with it”.

                                                              1. 1

                                                                I still disagree but I upvoted your post for clearly laying out your argument in a reasonable way.

                                                            2. 3

                                                              You seem to miss a very easy, obvious, opt-in only strategy that worked for the longest time without feeling like your software was that creepy uncle in the corner undressing everyone. As you pointed out everyone keeps the defaults, you know what else most normies do? Click next until they can start their software. So you add a dialog in that first run dialog that is supposed to be there to help the users and it has a simple “Hey we use telemetry to improve our software (here is where you can see your data)[https://yoursoftware.com/data] and our (privacy policy)[https://yoursoftware.com/privacy]. By checking this box you agree to telemetry and data collection as outlined in our (data collection policy)[https://yoursoftware.com/data_collection] [X]”

                                                              and boom you satisfy both conditions, the one where people don’t go out of their way to opt into data collection and the other where you’re not the creepy uncle in the corner undressing everyone.

                                                            3. 4

                                                              You can also view this as a standardized way for opt-in, which isn’t currently available either.

                                                              1. 3

                                                                No, it is not. It is a standardized way for opt-out.

                                                              2. 3

                                                                This is a bad comment, because it doesn’t add anything except for “I think non-consensual tracking is bad”, and is only tangentially related to OP insofar as OP is used as a soapbox for the above sentiment. Therefore I have flagged the comment as “Me-too”, regardless of how much I may agree with it.

                                                                1. 23

                                                                  Except that in the European Union, the GDPR requires opt-in in most cases. IANAL, but I think it applies to the analytics that Homebrew collects as well. From the Homebrew website:

                                                                  A Homebrew analytics user ID, e.g. 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB. This is generated by uuidgen and stored in the repository-specific Git configuration variable homebrew.analyticsuuid within $(brew --repository)/.git/config.

                                                                  https://docs.brew.sh/Analytics

                                                                  From the GDPR:

                                                                  The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier or one of several special characteristics, which expresses the physical, physiological, genetic, mental, commercial, cultural or social identity of these natural persons.

                                                                  I am pretty sure that this UUID falls under identification number or online identifier. Personally identifiable information may not be collected without consent:

                                                                  Consent should be given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject’s agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement.

                                                                  So, I am pretty sure that Homebrew is violating the GDPR and EU citizens can file a complaint. They can collect the data, but then they should have an explicit step during the installation, and the default (e.g. when the user just hits RETURN) should be to disable analytics.

                                                                  The other interesting implication (if this is indeed collection of personal information under the GDPR) is that any user can ask Homebrew which data they collected and/or to remove the data. To which they should comply.
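The install-time step suggested above could look roughly like this. It is a minimal sketch, not anything Homebrew actually ships: the function names and the prompt wording are made up for illustration. The only point is that an empty answer (the user just hitting RETURN) counts as "no".

```python
def parse_consent(answer: str) -> bool:
    """Return True only for an explicit yes; empty input (RETURN) means no."""
    return answer.strip().lower() in {"y", "yes"}


def ask_for_analytics(read_line=input) -> bool:
    # Hypothetical prompt text; the capital N marks "no" as the default.
    prompt = "Enable usage analytics? [y/N] "
    return parse_consent(read_line(prompt))
```

Passing `read_line` in as a parameter keeps the decision logic testable without a terminal; an installer would simply call `ask_for_analytics()`.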

                                                                  1. 3

                                                                    The data subjects are identifiable if they can be directly or indirectly identified, especially by […]

                                                                    As far as I can tell, you’re not actually citing the GDPR (CELEX 32016R0679), but rather a website that tries to make it more understandable.

                                                                    GDPR article 1(1):

                                                                    This Regulation lays down rules relating to the protection of natural persons with regard to the processing of personal data and rules relating to the free movement of personal data.

                                                                    GDPR article 4(1) defines personal data (emphasis mine):

                                                                    ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;


                                                                    Thus it does not apply to data about people that are neither identified nor identifiable. An opaque identifier like 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB is not per se identifiable, but as per recital 26, determining whether a person is identifiable should take into account all means reasonably likely to be used, such as singling out, suggesting that “identifiable” in article 4(1) needs to be interpreted in a very practical sense. Recitals are not technically legally binding, but are commonly referred to for interpretation of the main text.

                                                                    Additionally, if IP addresses are stored along with the identifier (e.g. in logs), it’s game over in any case; even before GDPR, IP addresses (including dynamically assigned ones) were ruled by the ECJ to be personal data in Breyer v. Germany (ECLI:EU:C:2016:779 case no. C-582/14).

                                                                    1. 9

                                                                      Sorry for the short answer in my other comment. I was on my phone.

                                                                      Thus it does not apply to data about people that are neither identified nor identifiable. An opaque identifier like 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB is not per se identifiable,

                                                                      The EC thinks differently:

                                                                      Examples of personal data

                                                                      a cookie ID;

                                                                      the advertising identifier of your phone;

                                                                      https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en

                                                                      It seems to me that a UUID is similar to a cookie ID or advertising identifier. Using the identifier, it would also be trivially possible to link data. They use Google Analytics. Google could in principle cross-reference some application installs with Google searches and time frames. Based on the UUID they could then see all other applications that you have installed. Of course, Google does not do this, but this thought experiment shows that such identifiers are not really anonymous (as pointed out in the working party opinion of 2014, linked on the EC page above).

                                                                      Again, IANAL, but it would probably be OK to report installs without any identifier linking the installations. They could also easily do this: make it opt-in, report everyone who didn’t opt in under a single shared identifier, and generate a random identifier for people who do opt in.
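A rough sketch of that identifier scheme, under stated assumptions: `SHARED_ID` and `analytics_id` are hypothetical names, not part of Homebrew, and whether such a scheme would actually satisfy the GDPR is a legal question this code does not answer. It only shows that installs from people who did not opt in cannot be told apart by identifier.

```python
import uuid
from typing import Optional

# Hypothetical shared value reported for everyone who has not opted in.
SHARED_ID = "00000000-0000-0000-0000-000000000000"


def analytics_id(opted_in: bool, stored_id: Optional[str] = None) -> str:
    """Pick the identifier to attach to an analytics event."""
    if not opted_in:
        # Everyone who did not opt in looks identical, so events cannot be linked.
        return SHARED_ID
    # Opted-in users keep their stored random ID, or get a fresh one.
    return stored_id or str(uuid.uuid4())
```

The caller would persist the returned value for opted-in users and pass it back as `stored_id` on later runs.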

                                                                      1. 4

                                                                        They locked the PR talking about it and accused me of implying a legal threat for bringing it up. The maintainer who locked the thread seems really defensive about analytics.

                                                                        1. 3

                                                                          Once you pop, you can’t stop.

                                                                          I, too, thought that your pointing out their EU-illegal activity was distinct from a legal threat (presumably you are not a prosecutor), and that they were super lame for both mischaracterizing your statement and freaking out like that.

                                                                          1. 3

                                                                            The maintainer who locked the thread seems really defensive about analytics.

                                                                            It seems this is just a general trait. See e.g. this

                                                                          2. 1

                                                                            Now I really wish I had an ECJ decision to cite because at this point it’s an issue of interpretation. What is an advertising identifier in the sense that the EC understood it when they wrote that page—Is it persistent and can it be correlated with some other data to identify a person? Did they take into account web server logs when noting down the cookie ID?

                                                                            Interesting legal questions, but unfortunately nothing I have a clear answer to.

                                                                          3. 1

                                                                            Please cite the rest of paragraph 4, definitions:

                                                                            ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

                                                                            https://eur-lex.europa.eu/legal-content/en/TXT/?uri=CELEX%3A32016R0679

                                                                            Which was what I quoted.

                                                                            1. 1

                                                                              Your comment makes the following quotations:

                                                                              The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier or one of several special characteristics, which expresses the physical, physiological, genetic, mental, commercial, cultural or social identity of these natural persons.

                                                                              Please ^F this entire string in the GDPR. I fail to find it as-is. They only start matching up in the latter half starting at “an identifier” and ending with “social identity”.

                                                                              (1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

                                                                              I agree it’s pedantic of me, but it’s not a 1:1 quote from the GDPR if a sentence is modified, no matter how small.


                                                                              I’ve edited in the second half in any case though. I do not, however, see any way that modification would invalidate any of the points I’ve made there.

                                                                          4. 2

                                                                            If that is true, consider submitting a PR, because GDPR violations are serious business.

                                                                            1. 3

                                                                              Or don’t submit a PR. As the project has stated:

                                                                              Do not open new threads on this topic.

                                                                              People have been banned from the project for doing exactly this.

                                                                              1. 7

                                                                                “We don’t want to hear complaints” is not a new stance for Homebrew.

                                                                                1. 2

                                                                                  Yeah, I got the impression that they are pretty hardline on this. I hope that they’ll reconsider before someone files a GDPR complaint.

                                                                                  Personally, I don’t really have a stake in this anymore, since I barely use my Mac.

                                                                                  I guess a more creative solution would be to fork the main repo and disable the analytics code and point people to that.

                                                                                  Edit: the linked PR is from before the GDPR though.

                                                                              2. 1

                                                                                But the above user didn’t post that, did they? Your comment was meaningful and useful, but theirs was just sentimental. A law violation is a law violation, but OP just posted their own feelings about what they think is spyware and didn’t say anything about GDPR.

                                                                              3. 4

                                                                                Hmm, I disagree. The OP is claiming that we should have a unified standard for “Do_Not_Track”. Finn is arguing that we shouldn’t need such a standard because unless I specifically state that I would like to be tracked, I should not be tracked, and that any attempt to track is a violation of consent. Finn here is specifically disagreeing with the website in question. Should we organize against attempts to track without explicit consent, or give a unified way to opt out? These are fundamentally different questions and are actually directly related. If I say everyone should be allowed into any yard unless it has a private property sign, that may cause real concern for people who feel that no yard should permit entry without explicit permission. They are different concerns that are related, and are more nuanced than “thing is bad”.

                                                                              4. 1

                                                                                Okay. By your (non-accepted) definition, spyware abounds and is in common use.

                                                                                Simply calling it “spyware” and throwing up your hands doesn’t work. They have knobs to turn the spying off, to opt-out. I just want all those knobs to have the same label.

                                                                              1. 52

                                                                                Upvoted not because I think this is a good idea, but because I’m curious to get others’ opinions on it.

                                                                                This seems like a terrible, terrible idea. It’s yet another way of soft-forcing Google’s hegemony on the Web. Specifically:

                                                                                Badging is intended to identify when sites are authored in a way that makes them slow generally

                                                                                I’m pretty sure this actually means “badging is intended to identify when sites are authored in a way that makes them slow on Chrome…”

                                                                                And this isn’t like flagging a site that has an expired certificate or something. That is a legitimate security concern. This is making a value judgement on the content of the site itself, and making it seem like there’s something fundamentally wrong with the site if Google doesn’t like it.

                                                                                Nope.

                                                                                1. 21

                                                                                  I’m with you.

                                                                                  when sites are authored in a way that makes them slow on Chrome

                                                                                  And further, “badging is intended to identify when sites are authored without using AMP” - or whatever else Google tries to force people into using.

                                                                                  Seems like yet another way for Google to pretend to care whilst pushing their own agenda.

                                                                                  1. 9

                                                                                    They refer to two tools, one a Chrome extension (Lighthouse) that I didn’t bother to install, the other a website (PageSpeed Insights). I went for the latter to test a page that has no AMP or other “Google friendly” fluff and is otherwise quite lightweight (https://www.coreboot.org) and got a few recommendations on what to improve: compress some assets, set up caching policies and defer some JavaScript.

                                                                                    If that’s all they want, that seems fair to me.

                                                                                    (Disclosure: I work on Chrome OS firmware, but I have no insight into what the browser level folks are doing)

                                                                                    1. 4

                                                                                      Yeah, within 2 seconds of loading this link I was worried they were just pushing AMP. If they really are just pushing best practices then I’m cautiously optimistic about this change, and the fact that they didn’t mention AMP and instead linked to those tools gives me hope… but it’s Google, so who knows.

                                                                                      1. 11

                                                                                        I’m definitely still wary. I personally feel that Google sticking badges on websites they approve of is never going to end well, regardless of how scientific it may seem at the beginning.

                                                                                        I really feel like there are major parallels to be drawn between Google and the rules of Animal Farm.

                                                                                      2. 3

                                                                                        But that’s not all they want. Google and Chrome are now positioning themselves to visually tell users whether or not a site is “slow” (according to whatever metrics Google wants, which can of course change over time). As with most Google things, it will probably look reasonable on the surface, but long term just result in Google having even more control over websites and what they can and can’t do.

                                                                                        I would agree with you if it weren’t for Google’s long history of questionable decisions and abuses of their position as the (effective) gate keeper of the web.

                                                                                      3. 2

                                                                                        whatever else Google tries to force people into using.

                                                                                        QUIC?

                                                                                        1. 3

                                                                                          Not to mention that and SPDY being the basis of the “next” versions of HTTP (HTTP/2 and HTTP/3) which will no doubt be rabidly pushed for.

                                                                                          1. 2

                                                                                            Are there technical problems with HTTP/{2,3} or are you just worried because Google created them?

                                                                                            1. 1

                                                                                              Specifically the fact Google created them. As tinfoil-rambling as it sounds, they already deal out a rather extensive spying/targeted advertising network, a possibly-manipulable gateway to information, the most popular video service, the most popular browser, some rather popular programming languages (Go, Dart), power a large portion of Web services (Chrome’s V8 in Node.js, that and Blink for Electron), many alternative browsers being Chromium/Blink-based, the AMP kerfuffle, ReCAPTCHA, and maybe the future protocols as to how their vision of the Web works.

                                                                                              They keep encompassing more and more parts of the Web, both technical and nontechnical, and people keep guzzling that down like the newest Slurm flavor. That’s what worries me the most.

                                                                                              1. 3

                                                                                                people keep guzzling that down like the newest Slurm

                                                                                                HTTP/2 and HTTP/3 are the result of a multi-party standardization process. It’s not SPDY with just a new label.

                                                                                          2. 1

                                                                                            Yes.

                                                                                        2. 8

                                                                                          Generally I’m against prejudging companies like this, but Google has earned it, and then some.

                                                                                          Some sites that I use have Recaptcha. I use Firefox, and I can only pass the captcha if I log into Gmail first. Honestly, what kind of Orwellian horseshit is that?

                                                                                          1. 4

                                                                                            So, I can at least comprehend this one, even if I hate it.

                                                                                            The whole point of recaptcha is to make it hard to pass unless you can prove you’re a real person. Being logged in to a google account which is actively used for ‘real things’ (and does not attempt too many captchas) is a really hard-to-forge signal.

                                                                                          2. 1

                                                                                            What’s an example of a site that loads fast in Chrome but slowly in Firefox?

                                                                                            1. 8

                                                                                              Well, given what’s happening here…any site Google decides.

                                                                                              To be less pithy: this could be used as an embrace-extend-extinguish cycle. Sure, right now it’s all just general best-practices but what if later it’s “well, this site isn’t as fast as it could be because it’s not using Google’s extensions that only work in Chrome that would make it 0.49% faster, so we’ll flag it.”

                                                                                              I’m not saying Google is definitely going to do that, but…I don’t like them making this sort of determination for me.

                                                                                              It gives soft pressure to conform to Google’s “standards” whatever they may be. No website is going to want to have a big popup before it loads saying “This site is slow!” so they’ll do whatever it takes to have that not happen, including neglecting support for non-Chrome browsers.

                                                                                              1. 4

                                                                                                I don’t know if it’s still the case, but at one point YouTube deployed a site redesign that was based on a draft standard that Chrome implemented and Firefox did not (Firefox only supported the final, approved standard). As a result, the page loaded quickly in Chrome, but on Firefox it downloaded an additional JS polyfill, making the page noticeably slower to render.

                                                                                                1. 1

                                                                                                  How about Slack video calls? Those are a “loads never” in my book. (Still annoyed about that.)

                                                                                              1. 11

                                                                                                More people than you realize will take a URL, go to their favorite search engine, and type the URL into the search engine’s search field, never realizing they can actually edit the contents of the address bar above, [snip]

                                                                                                I paint this bleak picture primarily for the benefit of Internet veterans [snip]

                                                                                                If my description of “normal” users above surprised, shocked or disappointed you, you’re the target audience.

                                                                                                Hmm. I don’t see this as bleak at all. I see this as a great advancement in tech, as now even non-technical users have no problem navigating to any site they wish. What would they have done before search engines? I suspect they would have been locked out because it was too hard to use at the time.

                                                                                                1. 12

                                                                                                  now even non-technical users have no problem navigating to any site they wish.

                                                                                                  That doesn’t sound like the situation described. The situation described in the text you quoted says that non-technical users are only capable of visiting sites their search engine allows them to visit. It’s adding an additional layer of tracking and another opportunity for censorship, but only to non-technical users.

                                                                                                  1. 4

                                                                                                    says that non-technical users are only capable of visiting sites their search engine allows them to visit.

                                                                                                    Sure. What would those users have done before search engines?

                                                                                                    1. 7

                                                                                                      In the described case, and many I’ve seen over others’ shoulders, they are typing the URL they want to visit into a search engine. Without a search engine, they would do the same thing I do when I don’t have a bookmark to a site I want to visit that I haven’t visited in the past: type it in the URL bar or browser start page. They might get a character wrong, in which case they are no more susceptible to phishing and other problems than if they were doing the same in a search engine.

                                                                                                      I don’t think this in itself is much worse, but it does teach non-technical users to ignore that search engines are web sites, and instead they think it’s their browser. But it seems inevitable.

                                                                                                      1. 2

                                                                                                        I don’t think they’re typing the URL into the search engine. They’re typing the name of what they want into the search engine, like “facebook”. Google trends comparison. Edit: another common variation is “[service_name] login”.

                                                                                                        they would … type it in the URL bar or browser start page.

                                                                                                        Hmm, that’s exactly what the article’s author is complaining about - people not doing this because they don’t realize it’s a feature. I think the root cause of this is that users don’t understand what URLs are.

                                                                                                        And why should they? They can just type “facebook” into whatever input field is focused when the browser opens and eventually end up where they want. Had the user typed 4 more characters (“.com”) and into the browser’s URL bar instead, they’d end up at the same place and save the step of clicking a search result. Even an average person understands saving time and effort. So why don’t they do it?

                                                                                                        I think we’re assuming the average user is way more savvy than they really are.

                                                                                                        1. 1

                                                                                                          I think users will do the least that they have to. So, without requiring them to know it, the difference between a website and their computer should not be confusable!

                                                                                                          1. 1

                                                                                                            The article is saying they are going to a search page and entering the URL into the search. While a start page or address bar may also go to a search page, the browser will first attempt to treat the text as a URL! (And in my case, neither is enabled to search; I have a search bar for that.)

                                                                                                    2. 4

                                                                                                      Maybe I’m being dense but I don’t see any difference between

                                                                                                      “type the exact digits into the phone application on your mobile”

                                                                                                      “type the exact website address into the browser address bar”

                                                                                                      from a usability standpoint. Imagine browsers had never added the omnibar.

                                                                                                      1. 3

                                                                                                        I would guess it’s something between the learning curve and level of standardization. Phones give you clear feedback that you did something wrong but provide zero help when you dial incorrectly, besides telling you that you did so. Omnibars provide suggestions that, with today’s very smart search engines, are almost always what you wanted. Browsers come in a variety of shapes and sizes, but phone dial pads are always the same. Phone numbers (at least, when dialing domestically) are always the same length and format. Web addresses have much more variation.

                                                                                                      2. 4

                                                                                                        What would they have done before search engines?

                                                                                                        What my parents and my grandparents eventually had to do: they would have learnt.

                                                                                                        That being said, as an internet “veteran” of 25 years, I still find it more convenient to type the name of a company into the nav bar and have my search engine of choice display a series of links, the first of which is usually what I am after, rather than type in the whole URL.

                                                                                                        All I can say is my teenage self would have been very disappointed if they saw how I navigate the internet today. What can I say? The lowest common denominator won out; in technology you either adapt or you die.