There is a very large robot in the room (I’d say elephant but I really like elephants), but it’s been there for so long that people on both sides of this robot just think of it as furniture, and don’t realize how much it’s colouring everyone’s reaction, and that robot’s name is Google.
I sympathise with the Go team’s position here. They have a product that’s extraordinarily widely deployed. It’s powering many enterprise codebases, where the deprecation of any feature is greeted with boos and jeers. And they need to overcome the sampling bias of surveys where of course everyone who’s still on High Sierra or Windows 7 or Novell Netware or whatever will show up to say they’re still using it, which is also understandable.
But their mothership has poisoned that well long ago and of course people go bananas. All the assurances about how reports aren’t associated with any identifying information are a lot more lukewarm when they’re coming from one of the companies that made shadow profiles a thing.
Plus, realistically, this isn’t any more “transparent” than most telemetry efforts made in good faith. A further step, which would warrant the “transparent” label in 2023, rather than 2013, would be, say:
Publishing the code that Google runs on the report collection server, so that we know all it does is, indeed, generate the reports being published – because Google has a history of swearing they’re only collecting non-identifying data, then using it to build identifiable profiles.
Publishing details about how the reports are used internally, so that we know they’re only used by the Go team for decisions about the golang toolchain – because this is 2023 and there are probably lots of people out there who would be fine with this data going to the golang team to improve their toolchain, but not with having it go to a dev experience team that’s building a Copilot clone.
Honestly I’m pretty frustrated that a valuable source of development data isn’t so easily accessible anymore – I doubt that the Go team is in cahoots with evil schemers at Alphabet. I’m also probably not going to opt out: there’s not much value in withholding the additional Go-related telemetry data when Google likely already knows what porn I watch and which early ’90s demos I liked, so they can identify me well enough. But this is a hole that Google has dug, for everyone, not just for them.
Thank you for the level-headed and nuanced comment.
The posts are pretty long, so some stuff is bound to get lost, but the points you suggest are already part of the design. It goes even a little further, in fact: it’s not just the reports that are published, but all raw data that’s uploaded. In that sense, they can’t promise the data is going only to the Go team, but that’s because the data is public and anyone can download it.
The full raw data as collected is made public, so that project maintainers have no proprietary advantage or insights in their role as the direct data collector.
The server that collects the data will be open source, but the only thing that shows is that IP addresses are not collected.
The server would necessarily observe the source IP address in the TCP session uploading the report, but the server would not record that address with the data, a fact that can be confirmed by inspecting the reporting server source code (the server would be open source like the rest of Go) or by reference to a stated privacy policy like the one for the Go module mirror, depending on whether you lean more toward trusting software engineers or lawyers.
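Purely as illustration of the quoted design (names like `record` and `storedReport` are hypothetical, not the actual server’s code), the write path it describes could look like this in Go: the address is present at the socket level but never reaches storage.

```go
package main

import "fmt"

// storedReport is what the hypothetical collection server persists:
// the report body only. There is deliberately no field for the
// sender's address.
type storedReport struct {
	Body string
}

// record models the server's write path: it receives both the report
// and the TCP-level source address, but persists only the report.
// Auditing the open-source server would amount to checking that no
// code path carries addr into storage or logs.
func record(report, addr string) storedReport {
	_ = addr // observed on the socket, deliberately never stored
	return storedReport{Body: report}
}

func main() {
	r := record(`{"counter":"go/build","n":3}`, "198.51.100.7:41892")
	fmt.Printf("%+v\n", r) // the stored record carries no address field
}
```

A sketch like this only demonstrates intent, of course; nothing in the source alone proves that the deployed binary matches the published code.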
Yeah, that second one totally got lost, I should not be allowed near computers before I’ve had my coffee :-D. Thanks for pointing that out way more nicely than my drowsy post would’ve warranted!
I quite like open telemetry data in open source. One other instance I can think of is Debian’s opt-in popularity contest (popcon), which records number of users for each package: https://popcon.debian.org/
If you can be confident the data is safe to publish, and you actually do publish the data, that would personally make me feel much better about telemetry.
To warrant the “transparent” label in 2023 it’d need …
publishing …
reports …
swearing …
assurances …
pinky swears …
Problem is, do we even believe any of those? Should we? How about actually doing reproducible-build-style confirmation that we can actually check? Otherwise I’d say, especially in 2023 (i.e. in light of all the awful stuff that’s still getting relentlessly discovered 10 years after 2013 (/Snowden)), I’m not going to believe pretty much any of it, from any of them.
Google wants to keep track of people developing, for example, encryption software, so they can report those people to governments which will punish them. That’s why they’re doing this.
That’s plausible, but lots of things are plausible. Do you have any concrete reason to suspect this, of all plausible reasons, is the real one?
More likely, the reason this is being done is that Google has a love affair with telemetry, analytics, and data. Their corporate motto should be “All your data are belong to us.” The most important tool in their toolbox is surveillance.
Google makes its money by optimizing for the most ads delivered to the most people, and the tool they use to accomplish that is surveillance. When you work in an environment that sees “gather more data” as the solution to everything, it’s going to rub off on you.
I’m doubtful that this is overt malice, and I think it far more likely that it’s just due to a bug in Google / Silicon Valley culture.
The reason Google can’t do opt-in telemetry is that Google wants the Go toolchain to report on politically sensitive software being developed with it and tell that information to governments which will punish the developers. This is, at this point, a fait accompli and our protests won’t change it, so the only moral thing to do is to poison the telemetry as much as possible.
This is a loaded claim; what kind of politically sensitive software would be impacted? What metrics would be used to determine whether something is “politically sensitive”? Is there any evidence for this?
People yell at you when you break their favourite feature, and people yell at you when you try to determine whether a feature is still in use. How many of the commenters in that thread have read the linked blogpost about usecases and made a serious attempt at a counterproposal?
Simple solution: Make it opt-in. Sure you get less data but those who care will enable it. Could even ask on first launch.
I’ve enabled all the KDE telemetry since it was opt-in.
Care about what? Sending data? That’s not really a thing any user cares to do in general. How do I know in advance which of the features I use are being considered for the axe, so I can enable telemetry to “put in my vote”?
There are no good solutions here. I assume distros will patch away whatever Go adds to the toolchain anyway.
If this is opt-in you’ll have orders of magnitude fewer people turn it on, leading to less accurate stats. They’re already going to be doing a lot of sampling, but these sub-samples of the sample of people who opt in may not be representative of the larger population.
Asking on first launch sounds like it’s between opt-in & opt-out assuming this is some blocking prompt. Did you enable KDE telemetry because it was opt-in, or do you think your data is “safe” with KDE developers?
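The representativeness worry above can be made concrete with a toy calculation (all numbers invented for illustration): if users of an old OS opt in at half the rate of everyone else, an opt-in sample undercounts them by nearly half.

```go
package main

import "fmt"

// biasedEstimate returns the share of group A seen in an opt-in sample
// when group A opts in at rate optInA and everyone else at rate optInB.
func biasedEstimate(trueShare, optInA, optInB float64) float64 {
	sampledA := trueShare * optInA
	sampledB := (1 - trueShare) * optInB
	return sampledA / (sampledA + sampledB)
}

func main() {
	// Invented numbers: 10% of users run an old OS, but they opt in
	// to telemetry at half the rate of everyone else.
	est := biasedEstimate(0.10, 0.01, 0.02)
	fmt.Printf("true share 10.0%%, opt-in sample suggests %.1f%%\n", est*100)
}
```

Here the estimate comes out around 5.3% against a true 10%, and no amount of extra sampling within the opted-in population fixes that.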
People yell at you when you break their favourite feature, and people yell at you when you try to determine whether a feature is still in use.
You don’t need to use telemetry for that. It’s enough to ask. You can announce a removal, you can ask people to speak up on a mailing list, the Go team can use the surveys they already run. The bonus is that you can even find out why people use a feature: whether they really intend to keep using it, whether they misunderstand something, whether it’s just widespread in old code bases, etc.
It’s often pretty hard, sometimes outright impossible, to infer the whys via telemetry.
Take part in a survey, then read the interpretation. You’ll find misinterpretations.
One of my favorite examples is the absence of a “none” option. Or interpreting the majority of people voting for something on “What would you like to see improved?” as a demand, especially when it’s multiple choice with no “none” option. A lot of people tick what they imagine might be good for their use case, even if it’s no big deal, or not a problem at all, in real life; you’ll find people ticking things like performance, because more performance is always better, right?
So in the end you will always have to actually talk to people. You can’t do these things easily correctly large scale. Too bad, but that’s how it is.
For people using X you never have the slightest clue whether it’s forced on them, whether they don’t know better, whether there’s a famous tutorial doing it that way, whether they learned it that way, etc.
Telemetry only tells you things like counts. And you can see if the counts change, if something else changes, but even then there is a huge potential for misinterpretation. Did the change make people aware? Did something completely unrelated happen? Is this actually a long term trend?
Even with specialists working on these things there’s a lot of guessing involved. You’d kinda have to rule out any other reason. At this point why not stick to more direct approaches, even if they are not so large scale?
Also I don’t think features, adding or removing them are a great thing for telemetry. If few people use something and that’s why you remove something they will still yell. I’d ask the community and see if they have other options.
And one last thing related to Open Source. It of course depends on the tool, but there is a good chance that the thing that most people want will be worked on the most, given that it’s reasonably easy enough to contribute. Of course it depends on how hard the feature implementation is.
You don’t need to use telemetry for that. It’s enough to ask.
If you already have the data, then you don’t need to rely on people’s responses. They should probably be doing both when they remove features, but responses in a mailing list won’t give you as big a picture as telemetry would.
So in the end you will always have to actually talk to people. You can’t do these things easily correctly large scale. Too bad, but that’s how it is.
I need to talk to people in order to collect stats, but talking to them won’t scale, so I’m SOL? Take this question from the post on use cases:
What fraction of Go installations run on a particular major version of an operating system such as Windows 7, Windows 8, or WSL?
This is a totally normal thing to want to know, and it will lead to better-informed actions later on, whether that’s adding/removing features, prioritizing bugs, simplifying installation steps, understanding the scope of a security vuln, etc.
Talking to people won’t give you an accurate picture, and as you said it certainly doesn’t scale.
Google wants to keep track of people developing, for example, encryption software, so they can report those people to governments which will punish them. That’s why they’re doing this.
sure, and it will probably be a configuration file that will also require some specially defined value so it doesn’t end up somewhere like $HOME/go
It’s just another nightmare of nightmares.
To make it more clear, it just sounds like my Go block will get bigger to force these various anti-user changes.
Example of how I tame npm below:
# BEGIN js #
if hash npm 2>/dev/null; then
  # zsh: point npm at XDG paths instead of ~/.npmrc and ~/.npm
  export NPM_CONFIG_USERCONFIG="$XDG_CONFIG_HOME/npm/config"
  if [[ ! -f $NPM_CONFIG_USERCONFIG ]]; then
    mkdir -p "$XDG_CONFIG_HOME/npm" "$XDG_DATA_HOME/npm"
    {
      print -- "prefix=$XDG_DATA_HOME/npm"
      print -- "cache=$XDG_CACHE_HOME/npm"
    } > "$NPM_CONFIG_USERCONFIG"
  fi
  path=("$XDG_DATA_HOME/npm/bin" $path)
fi
GOPATH still defaults to ~/go, but all the other Go settings live under https://pkg.go.dev/os#UserConfigDir, which honours $XDG_CONFIG_HOME on Unix machines. The distro-specific go.env would be saved in GOROOT along with the executables, so probably near /usr/bin/go or something.
I’m all for telemetry, but it should be opt-in. Even if the toolchain asks you for permission the first time you use it, this shouldn’t be a silent feature that is left as opt-out.
What if it just warns you the first time? It would be annoying if the first time you try to use the Go tool in Docker you have to pass a --agree flag to keep it from breaking.
Uploaded reports do not include user IDs, machine IDs, or any other kind of ID.
IP addresses exposed by the HTTP session that uploads the report are not recorded with the reports.
That’s good. I don’t want to be identified in any way. But there’s a problem: there’s no way for me to check whether the second point actually holds. I can in principle audit my logs and my tool’s source code and my network traffic… but I can’t know what the server actually does with my IP when it receives my weekly report.
Another problem, which is likely to incentivise Google to record those IP addresses anyway, is security: how will they deal with adversarial inputs if they can’t identify the sender? I guess they could ban bad IPs, but choosing which IPs to ban still requires careful analysis of the data over some period of time… which likely means recording those IPs during that time. I can totally see them relaxing the “never record IPs” rule to “erase IPs after some amount of time, and maintain a blacklist.”
In fact, the easiest way to deal with bad reports is likely to assign everyone a different key, which would then be used to sign the reports. Now reports are identified by keys instead of IP (more reliable), and we ban keys, not IPs. But now even the “no ID in reports” rule is broken.
I don’t know how to solve this to be honest. If it even can be solved, short of “trussst in meee”. And it’s a bit hard for me to trust Google.
The author gives two reasons why they think that their proposal will be opposed:
The first is the significant privacy cost to users of collecting and storing detailed activity traces. The second is the fact that access to this data must be restricted, which would make the project less open than most strive to be.
There are more! The third is the complication of all workflows; tools which previously worked offline now sporadically enter codepaths which can break and require additional capabilities to run. The fourth is the possibility of users sending garbage data simply because they can. The fifth is the massive sampling bias introduced in the second post by the fact that many distros will unconditionally disable this telemetry for all prebuilt Go packages. The sixth is that removing features based on percentage of users enjoying the feature will marginalize users who don’t use tools in ways approved by the authors, removing choice and flexibility from supposedly-open-source toolchains.
Not the most ridiculous proposal from the author, but clearly not fully baked. An enormous number of words is spent explaining in technical detail how the author plans to violate their users’ privacy without overtly irritating them. Consider this gem from the third post:
The Go build cache is a critical part of the user experience, but we don’t know how well it works in practice.
How do they know that it’s critical, then? So much ego and so little understanding of people as individuals.
Finally, I hope it is clear that none of this is terribly specific to Go. Any open source project with more than a few users has the problem of understanding how the software is used and how well it’s working.
The author is explicitly telling us that they should not be trusted to publish Free Software.
The author doesn’t understand the problem, and intervention may be required in order to teach them a lesson.
A) Russ clearly understands that users can submit garbage data.
B) It’s pretty anti-social to be so opposed to telemetry that, instead of merely opting out or boycotting a product, you actively send junk data. It’s fine to take direct action against something that is harming you. For example, removing the speakers that play ads at gas station pumps in the US is a positive good, because those ads violate the social contract and provide no benefit to consumers who just want to pump their gas in peace. But this telemetry is meant to benefit Go as an open source tool, not line Google’s pockets. You can disagree about whether it’s too intrusive if you want, but taking active measures against it is an uncalled-for level of counter-aggression.
We must have different life experiences. Quoting from the Calvin and Hobbes strip published on August 23, 1995:
Calvin: I’m filling out a reader survey for Chewing magazine.
Calvin: See, they ask me how much money I spend on gum each week, so I wrote, “$500.” For my age, I put “43,” and when they asked what my favorite flavor is, I wrote “garlic/curry.”
Hobbes: This magazine should have some amusing ads soon.
Calvin: I love messing with data.
We don’t need to send data to an advertiser so that the advertiser can improve their particular implementation of a programming language. If this sort of telemetry is truly necessary, then let’s find or establish a reputable data steward who will serve the language’s community without preference to any one particular toolchain.
Twice you’ve used the phrase “anti-social”. We aren’t in the UK and your meme doesn’t work here. If you want to complain about a collective action being taken against a corporation, then find a better word, because the sheer existence of Google is damaging to the fabric of society.
Point (B) deserves to be addressed too. You seem to think that an advertiser only causes harm when they advertise. However, at Google’s size, we generally recognize that advertising causes harm merely by collecting structured data and correlating it.
If the problem is that Google exists at all, then get people in SF to throw eggs at the Google buses or bribe some Senators into letting the million and half pending anti-trust lawsuits go through. I don’t see how hurting the telemetry data for Go has any connection whatsoever to the goal of a world without Google. All it does is harm people who would benefit from a better Go compiler/tools. You can hate Google all day, and you probably should. I deliberately have never applied to work at Google because I don’t believe in their corporate mission. However, none of that has anything to do with the issue at hand, which is adding telemetry to Go. If you can show that they’re going to secretly use the telemetry to send Big Mac ads to developers, then you can be mad at them, but you haven’t shown that.
I don’t see how hurting the telemetry data for Go has any connection whatsoever to the goal of a world without Google.
I’d argue that collecting data via ET-phone-home telemetry without explicit consent is a grievous breach of the social contract. While I myself am a non-belligerent person, you can bet that there are people who share my outlook about consent and say that one bad turn deserves another. Poisoning telemetry data is a tactic in line with the traditions of Luddism and sabotage. It’s throwing a monkey-wrench into the grinding gears of the machine. While I’m not gonna sit here and encourage it (I’m not stupid), I do understand the mindset.
“Rub some telemetry on it” is definitely in line with Google culture. Poisoning the stream won’t stop that.
Yeah, kids, be good and don’t poison the datastream. The ethical approach is to stop using Go if they add opt-out telemetry to their toolchain.
A marketing company that calls your home phone number at dinner time is scummy, but you don’t get to yell at the person on the other end of the phone because they’re just taking the best low paying job they could find. Instead you have to get the https://en.wikipedia.org/wiki/National_Do_Not_Call_Registry enacted and get funding to enforce the law.
A marketing company that calls your home phone number at dinner time is scummy, but you don’t get to yell at the person on the other end of the phone because they’re just taking the best low paying job they could find.
Legally, I sure do. If somebody invades my personal space, I’m well within my rights to yell at them for it, even though they’re just some poor schlub doing a job. Morally / ethically, it’s a different story. The compassionate person will refrain from yelling at the aforementioned poor schlub. Even then, it’s worth noting that even the most saintly of us has bad days and doesn’t always do the moral / compassionate thing. My point? People who script-read for telemarketing are going to be verbally abused by people whose right to be let alone has been violated, and we cannot pretend otherwise.
At what point does “I’m just doing my job” go from being a reason for compassion to an excuse? I don’t have the answer to that, but I do try to be a compassionate person.
These two scenarios aren’t equivalent anyway. The telemarketing equivalent of poisoning the telemetry stream isn’t yelling at some unfortunate script-reader. The equivalent is lying to the marketing company, wasting its resources, or possibly defrauding it. Think: trolling the Jehovah’s Witnesses and Mormons who invite themselves to your door to share their religion. This can be lots of fun, and it wastes the other person’s cycles while simultaneously preventing them from harassing somebody else.
In 2000, someone called me up from one of those multi-level marketing scams, after a friend referred them to me. I spent three glorious hours on the phone on a Sunday night trolling some scammer. It was better than whatever was on TV at the time, no doubt.
I’m sure there are people who do this sort of thing to telemarketers. It isn’t abuse, but it does cause them to waste cycles while getting paid and possibly entertained, yet it prevents them from doing active harm to others.
For lack of a better term, I’ll call this array of tactics “psychological warfare countermeasures”, to contrast them with electronic warfare countermeasures. Of course, poisoning an opt-out telemetry stream is an electronic warfare countermeasure: the equivalent of deploying chaff to confuse enemy radar.
And I’ll end with the “Kids, don’t poison a telemetry stream” disclaimer I used yesterday.
No. You aren’t correctly reading what I wrote. I’m saying that somebody will send them junk data in the future, and I anticipate that nothing short of that future situation will make them understand their mistake.
The plain reading of “intervention may be required in order to teach them a lesson” is that you’re rallying people to intervene, i.e. spam the system. If that’s not what you mean, then I’m sorry for misinterpreting you, but alyx apparently read it the same way.
Maybe my final comment in the GitHub discussion will make it clear to you and @alyx that, although I am thinking adversarially, I am still a whitehat in this discussion. I made similar comments during Audacity’s telemetry proposal, and did not develop any software to attack Muse Group.
Perhaps what irritates you is that I find the whole situation amusing. I don’t really respect the author’s understanding of society, and I think that often the correct thing to do in case of disaster is to learn a lesson. The author may not learn a lesson until somebody interferes with their plans, and I expect that the fallout will be hilarious, but I am not necessarily that somebody.
Nor, frankly, do I have thousands of spare USD/mo to waste on cloud instances just for the purpose of distracting Google. What do I look like, the government?
The sixth is that removing features based on percentage of users enjoying the feature will marginalize users who don’t use tools in ways approved by the authors, removing choice and flexibility from supposedly-open-source toolchains.
This is good. The tool chains are still open source, that is a fact. This will:
Make the code base more maintainable, thus over time making it less buggy
Focus the thinking of those involved on use cases that matter the most, making more people happier
Make the system overall simpler, and easier to understand for all users.
The world cannot be built around power users/tech elite. That would be bad.
The fifth is the massive sampling bias introduced in the second post by the fact that many distros will unconditionally disable this telemetry for all prebuilt Go packages
The fourth is the possibility of users sending garbage data simply because they can.
The data can be interpreted with its limitations in mind. That seems better than simply having no data whatsoever about the world.
So much ego and so little understanding of people as individuals.
I don’t see how this follows from Russ’ belief the build cache is critical to Go’s UX.
Focus the thinking of those involved on use cases that matter the most, making more people happier
I don’t think it’s valid to conclude that the use cases that occur the most are the use cases that matter the most. Blind and vision-impaired people make up a small portion of all computer users; do you think it would be valid to ignore their needs in favor of making experiences better for users without vision impairments?
True, I agree with you. But I think it’s a safe default to decline to support use cases with few users. There are specific cases where other concerns reasonably overrule this principle, but I think the bar for spending effort on rare uses should be high.
Every project has limited resources to spend. This by definition means that some work will not be done. If you have two bugs with same severity it might be useful to know how many users each is impacting and factor that in decisions about what to work on and in what order.
Vision impaired users are an interesting comparison. There are lots of vision impaired people in the world. A neighbor of mine happens to work at an association for the blind/vision impaired. Many people wear glasses. Many people become blind or near blind in old age. All of us are vision impaired when it is dark, or when we use our eyes to look at something else while trying to use a computer simultaneously. Even if the absolute percentage of blind people is not that high, it ends up being a lot of people, especially factoring in situational blindness.
One can imagine a lot of other conditions that might also be good to accommodate for, but which are less common and more difficult to adapt to. In the end, I think the ADA’s “reasonable accommodation” standard is broadly correct. There has to be a balance to try to include people when the costs are bearable without making things so expensive that it’s not possible for the majority to use it either.
Google wants to keep track of people developing, for example, encryption software, so they can report those people to governments which will punish them. That’s why they’re doing this.
There is a very large robot in the room (I’d say elephant but I really like elephants), but it’s been there for so long that people on both sides of this robot just think of it as furniture, and don’t realize how much it’s colouring everyone’s reaction, and that robot’s name is Google.
I sympathise with the Go team’s position here. They have a product that’s extraordinarily widely deployed. It’s powering many enterprise codebases, where the deprecation of any feature is greeted with boos and jeers. And they need to overcome the sampling bias of surveys where of course everyone who’s still on High Sierra or Windows 7 or Novell Netware or whatever will show up to say they’re still using it, which is also understandable.
But their mothership has poisoned that well long ago and of course people go bananas. All the assurances about how reports aren’t associated with any identifying information are a lot more lukewarm when they’re coming from one of the companies that made shadow profiles a thing.
Plus, realistically, this isn’t any more “transparent” than most telemetry efforts made in good faith. A further step, which would warrant the “transparent” label in 2023, rather than 2013, would be, say:
Honestly I’m pretty frustrated that a valuable source of development data isn’t so easily accessible anymore – I doubt both that the Go team is in cahoots with evil schemers at Alphabet. I’m also probably not going to opt out, given that there’s not much value in the additional Go-related telemetry data, given that Google likely already knows what porn I watch and what early ’90s demos I liked so they can identify me well enough. But this is a hole that Google has dug, for everyone, not just for them.
Thank you for the level-headed and nuanced comment.
The posts are pretty long, so some stuff is bound to get lost, but the points you suggest are already part of the design. It goes even a little further, in fact: it’s not just the reports that are published, but all raw data that’s uploaded. In that sense, they can’t promise the data is going only to the Go team, but that’s because the data is public and anyone can download it.
The server that collects the data will be open source, but the only thing that shows is that IP addresses are not collected.
Yeah, that second one totally got lost, I should not be allowed near computers before I’ve had my coffee :-D. Thanks for pointing that out way more nicely than my drowsy post would’ve warranted!
I quite like open telemetry data in open source. One other instance I can think of is Debian’s opt-in popularity contest (popcon), which records number of users for each package: https://popcon.debian.org/
If you can be confident the data is safe to publish, and you actually do publish the data, that would personally make me feel much better about telemetry.
Problem is, do we even believe any of those? Should we? How about actually doing reproducible-build-style confirmation that we can actually check? Otherwise I’d say, especially in 2023 (i.e. in light of all the awful stuff that’s still getting relentlessly discovered 10 years after 2013 (/Snowden)), I’m not going to believe pretty much any of it, from any of them.
Google wants to keep track of people developing, for example, encryption software, so they can report those people to governments which will punish them. That’s why they’re doing this.
That’s plausible, but lots of things are plausible. Do you have any concrete reason to suspect this, of all plausible reasons, is the real one?
More likely, the reason this is being done is that Google has a love affair with telemetry, analytics, and data. Their corporate motto should be “All your data are belong to us.” The most important tool in their toolbox is surveillance.
Google makes its money by optimizing for the most ads delivered to the most people, and the tool they use to accomplish that is surveillance.
When you work in an environment that sees “gather more data” as the solution to everything, it’s going to rub off on you.
I’m doubtful that this is overt malice, and I think it far more likely that it’s just due to a bug in Google / Silicon Valley culture.
The reason Google can’t do opt-in telemetry is that Google wants the Go toolchain to report on politically sensitive software being developed with it and tell that information to governments which will punish the developers. This is, at this point, a fait accompli and our protests won’t change it, so the only moral thing to do is to poison the telemetry as much as possible.
This is a loaded claim; what kind of politically sensitive software would be impacted? What metrics would it use to determine if something if “politically sensitive?” Is there any evidence for this?
People yell at you when you break their favourite feature, and people yell at you when you try to determine whether a feature is still in use. How many of the commenters in that thread have read the linked blogpost about usecases and made a serious attempt at a counterproposal?
Simple solution: Make it opt-in. Sure you get less data but those who care will enable it. Could even ask on first launch.
I’ve enabled all the KDE telemetry since it was opt-in.
Care about what? Sending data? That’s not really a thing any user cares to do in general. How do I know in advance that the features I use are considered to be axed, so i can enable telemetry to “put in my vote”?
There are no good solutions here. I assume that whatever Go adds to the toolchain, distros will patch it away anyway.
If this is opt-in you’ll have orders of magnitude fewer people turn it on, leading to less accurate stats. They’re already going to be doing a lot of sampling, but sub-samples of the people who opt in may not be representative of the larger population.
Asking on first launch sounds like it’s somewhere between opt-in and opt-out, assuming it’s some blocking prompt. Did you enable KDE telemetry because it was opt-in, or do you think your data is “safe” with KDE developers?
You don’t need telemetry for that. It’s enough to ask. You can announce a removal and ask people to speak up on a mailing list; Go can use the surveys they already run. The bonus is that you can even find out why people use a feature, whether they really intend to keep using it, whether they misunderstand something, whether it’s just widespread in old code bases, etc.
It’s often pretty hard, sometimes outright impossible, to infer the whys via telemetry.
Take part in a survey, then read the interpretation. You’ll find misinterpretations.
One of my favorite examples is the absence of a “none” option. Or interpreting the majority of people voting on “What would you like to see improved?” as a demand, especially when it’s multiple choice with no “none” option. A lot of people imagine what might potentially be good for their use case, even if it’s no big deal or not a problem at all in real life. You’ll find people ticking things like performance, because performance is always better, right?
So in the end you will always have to actually talk to people. You can’t easily do these things correctly at scale. Too bad, but that’s how it is.
For people using X you never have the slightest clue whether it’s forced, whether they don’t know better, whether there’s a famous tutorial doing it that way, whether they learned it that way, etc.
Telemetry only tells you things like counts. And you can see if the counts change, if something else changes, but even then there is a huge potential for misinterpretation. Did the change make people aware? Did something completely unrelated happen? Is this actually a long term trend?
Even with specialists working on these things there’s a lot of guessing involved. You’d kinda have to rule out any other reason. At this point why not stick to more direct approaches, even if they are not so large scale?
Also, I don’t think adding or removing features is a great use of telemetry. If few people use something and that’s why you remove it, they will still yell. I’d ask the community and see if they have other options.
And one last thing related to open source: it depends on the tool, of course, but there’s a good chance that the thing most people want will be worked on the most, provided it’s reasonably easy to contribute. It also depends on how hard the feature is to implement.
All of this is addressed in the other linked blog post.
If you already have the data, then you don’t need to rely on people’s responses. They should probably be doing both when they remove features, but responses in a mailing list won’t give you as big a picture as telemetry would.
I need to talk to people in order to collect stats, but talking to them won’t scale, so I’m SOL? Take this question from the post on use cases:
This is a totally normal thing to want to know, and will lead to better informed actions later on. Whether it’s adding or removing a feature, prioritizing bugs, simplifying installation steps, understanding the scope of a security vuln, etc.
Talking to people won’t give you an accurate picture, and as you said it certainly doesn’t scale.
Google wants to keep track of people developing, for example, encryption software, so they can report those people to governments which will punish them. That’s why they’re doing this.
That is absolutely insane and I also don’t appreciate the necroposting
These stories should be merged: https://lobste.rs/s/bhbqkb/transparent_telemetry_for_open_source
Thanks!
For those who want to opt out:
export GOTELEMETRY=off
(I think this is the environment flag, not sure)
Recent versions of Go respect an env var if it is set, but also let you set config with go env -w whatever.
Oh yay, another environment variable in my list of hundreds.
There are ways to set it besides using an ENV var. They’re going to add a thing so that Linux distros can disable it by default, for example.
Sure, and it will probably be a configuration file that also requires some specially defined value so it doesn’t end up somewhere like $HOME/go. It’s just another nightmare of nightmares.
To be clearer: it sounds like my Go env block will just keep growing to counter these various anti-user changes.
Example of how I tame npm below:
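A sketch of the kind of block that can tame npm this way, using npm’s documented NPM_CONFIG_* environment overrides (the specific paths are just one possible XDG-style layout, not necessarily the commenter’s):

```shell
# Keep npm's files out of $HOME by pointing its user config, cache,
# and global prefix at XDG-style locations. The NPM_CONFIG_* names
# are npm's real environment overrides; the paths are one choice.
export NPM_CONFIG_USERCONFIG="${XDG_CONFIG_HOME:-$HOME/.config}/npm/npmrc"
export NPM_CONFIG_CACHE="${XDG_CACHE_HOME:-$HOME/.cache}/npm"
export NPM_CONFIG_PREFIX="${XDG_DATA_HOME:-$HOME/.local/share}/npm"
```

Put that in your shell profile and npm stops scattering dotfiles around your home directory.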
GOPATH still defaults to ~/go, but all the other Go settings live in https://pkg.go.dev/os#UserConfigDir which is $XDG_CONFIG_HOME on Unix machines. The distro specific go.env would be saved in GOROOT along with the executables, so probably /usr/bin/go or something.
All the articles in the series: https://research.swtch.com/telemetry.
I’m all for telemetry, but it should be opt-in. Even if the toolchain asks you for permission the first time you use it, this shouldn’t be a silent feature left as opt-out.
What if it just warns you the first time? It would be annoying if the first time you try to use the Go tool in Docker you have to pass an --agree flag to keep it from breaking.
Is not sending telemetry to Go developers what you are characterizing as “breakage” here?
No, the breakage is prompting for Y/N. It’s why every apt command in a Dockerfile has to include a pointless -y flag.
That’s good. I don’t want to be identified in any way. But there’s a problem: there’s no way for me to check whether the second point actually holds. I can in principle audit my logs and my tool’s source code and my network traffic… but I can’t know what the server actually does with my IP when it receives my weekly report.
Another problem, which is likely to incentivise Google to record those IP addresses anyway, is security: how will they deal with adversarial inputs if they can’t identify the sender? I guess they could ban bad IPs, but choosing which IP to ban still requires careful analysis of the data over some period of time… which likely means recording those IPs during that time. I can totally see them relaxing the “never record IP” rule to “erase IP after some amount of time, and maintain some blacklist”.
In fact, the easiest way to deal with bad reports is likely to assign everyone a different key, which would then be used to sign the reports. Now reports are identified by keys instead of IP (more reliable), and we ban keys, not IPs. But now even the “no ID in reports” rule is broken.
I don’t know how to solve this to be honest. If it even can be solved, short of “trussst in meee”. And it’s a bit hard for me to trust Google.
The author gives two reasons why they think that their proposal will be opposed:
There are more! The third is the complication of all workflows; tools which previously worked offline now sporadically enter codepaths which can break and require additional capabilities to run. The fourth is the possibility of users sending garbage data simply because they can. The fifth is the massive sampling bias introduced in the second post by the fact that many distros will unconditionally disable this telemetry for all prebuilt Go packages. The sixth is that removing features based on percentage of users enjoying the feature will marginalize users who don’t use tools in ways approved by the authors, removing choice and flexibility from supposedly-open-source toolchains.
Not the most ridiculous proposal from the author, but clearly not fully baked. An enormous number of words is spent explaining in technical detail how the author plans to violate their users’ privacy without overt irritation. Consider this gem from the third post:
How do they know that it’s critical, then? So much ego and so little understanding of people as individuals.
The author is explicitly telling us that they should not be trusted to publish Free Software.
Edit: I explicitly asked the author about the fourth reason. The author doesn’t understand the problem, and intervention may be required in order to teach them a lesson.
A) Russ clearly understands that users can submit garbage data.
B) It’s pretty anti-social to be so opposed to telemetry that instead of merely opting out or boycotting a product you actively send junk data. It’s fine to take direct action against something that is harming you. For example, removing the speakers that play ads at gas station pumps in the US now is a positive good, because those ads violate the social contract and provide no benefit to consumers who just want to pump their gas in peace. But this telemetry is meant to benefit Go as an open source tool, not line Google’s pockets. You can disagree about whether it’s too intrusive if you want, but taking active measures against it is an uncalled-for level of counter-aggression.
We must have different life experiences. Quoting from the Calvin and Hobbes strip published on August 23, 1995:
We don’t need to send data to an advertiser so that the advertiser can improve their particular implementation of a programming language. If this sort of telemetry is truly necessary, then let’s find or establish a reputable data steward who will serve the language’s community without preference to any one particular toolchain.
A) Calvin is meant to be an anti-social twerp, not a role model.
B) Do you think the Go tool is going to start serving DoubleClick ads? There’s no analogy here.
Twice you’ve used the phrase “anti-social”. We aren’t in the UK and your meme doesn’t work here. If you want to complain about a collective action being taken against a corporation, then find a better word, because the sheer existence of Google is damaging to the fabric of society.
Point (B) deserves to be addressed too. You seem to think that an advertiser only causes harm when they advertise. However, at Google’s size, we generally recognize that advertising causes harm merely by collecting structured data and correlating it.
If the problem is that Google exists at all, then get people in SF to throw eggs at the Google buses or bribe some Senators into letting the million and a half pending anti-trust lawsuits go through. I don’t see how hurting the telemetry data for Go has any connection whatsoever to the goal of a world without Google. All it does is harm people who would benefit from a better Go compiler and tools. You can hate Google all day, and you probably should. I deliberately have never applied to work at Google because I don’t believe in their corporate mission. However, none of that has anything to do with the issue at hand, which is adding telemetry to Go. If you can show that they’re going to secretly use the telemetry to send Big Mac ads to developers, then you can be mad at them, but you haven’t shown that.
I’d argue that collecting data via ET Phone Home telemetry without explicit consent is a grievous breach of the Social Contract. While I myself am a non-belligerent person, you can bet that there are people who share my outlook about consent and say that one bad turn deserves another. Poisoning telemetry data is a tactic in line with the traditions of Luddism and sabotage. It’s throwing a monkey-wrench into the grinding gears of the machine. While I’m not gonna sit here and encourage it (I’m not stupid), I do understand the mindset.
“Rub some telemetry on it” is definitely in line with Google culture. Poisoning the stream won’t stop that.
Yeah, kids, be good and don’t poison the datastream. The ethical approach is to stop using Go if they add opt-out telemetry to their toolchain.
A marketing company that calls your home phone number at dinner time is scummy, but you don’t get to yell at the person on the other end of the phone because they’re just taking the best low paying job they could find. Instead you have to get the https://en.wikipedia.org/wiki/National_Do_Not_Call_Registry enacted and get funding to enforce the law.
Legally, I sure do. If somebody invades my personal space, I’m well within my rights to yell at them for it, even though they’re just some poor schlub doing a job. Morally and ethically, it’s a different story. The compassionate person will refrain from yelling at the aforementioned poor schlub. Even then, it’s worth noting that even the most saintly of us has bad days and doesn’t always do the moral, compassionate thing. My point? People who script-read for telemarketing are going to be verbally abused by people whose right to be let alone has been violated, and we cannot pretend otherwise.
At what point does “I’m just doing my job” go from being a reason for compassion to an excuse? I don’t have the answer to that, but I do try to be a compassionate person.
These two scenarios aren’t equivalent anyway. The telemarketing equivalent of poisoning the telemetry stream isn’t yelling at some unfortunate script-reader. The equivalent is lying to the marketing company, wasting its resources, or possibly defrauding it. Think: trolling the Jehovah’s Witnesses and Mormons who invite themselves to your door to share their religion. This can be lots of fun, and it wastes the other person’s cycles while simultaneously preventing them from harassing somebody else.
In 2000, someone called me up from one of those multi-level marketing scams, after a friend referred them to me. I spent three glorious hours on the phone on a Sunday night trolling some scammer. It was better than whatever was on TV at the time, no doubt.
I’m sure there are people who do this sort of thing to telemarketers. It isn’t abuse, but it does cause them to waste cycles while getting paid and possibly entertained, yet preventing them from doing active harm to others.
For lack of a better term, I’ll call this array of tactics “psychological warfare countermeasures”, to contrast them with electronic warfare countermeasures.
Of course, poisoning an opt-out telemetry stream is an electronic warfare countermeasure: the equivalent of deploying chaff to confuse enemy radar.
And I’ll end with the “Kids, don’t poison a telemetry stream” disclaimer I used yesterday.
Are you really trying to rally lobsters into sending junk data to the Go telemetry?
No. You aren’t correctly reading what I wrote. I’m saying that somebody will send them junk data in the future, and I anticipate that nothing short of that future situation will make them understand their mistake.
The plain reading of “intervention may be required in order to teach them a lesson” is that you’re rallying people to intervene, i.e. spam the system. If that’s not what you mean, then I’m sorry for misinterpreting you, but alyx apparently has the same misinterpretation.
Maybe my final comment in the GitHub discussion will make it clear to you and @alyx that, although I am thinking adversarially, I am still a whitehat in this discussion. I made similar comments during Audacity’s telemetry proposal, and did not develop any software to attack Muse Group.
Perhaps what irritates you is that I find the whole situation amusing. I don’t really respect the author’s understanding of society, and I think that often the correct thing to do in case of disaster is to learn a lesson. The author may not learn a lesson until somebody interferes with their plans, and I expect that the fallout will be hilarious, but I am not necessarily that somebody.
Nor, frankly, do I have thousands of spare USD/mo to waste on cloud instances just for the purpose of distracting Google. What do I look like, the government?
This is good. The toolchains are still open source; that is a fact. This will:
The world cannot be built around power users/tech elite. That would be bad.
The data can be interpreted with its limitations in mind. That seems better than simply having no data whatsoever about the world.
I don’t see how this follows from Russ’s belief that the build cache is critical to Go’s UX.
I don’t think it’s valid to conclude that the use cases that occur the most are the use cases that matter the most. Blind and vision-impaired people make up a small portion of all computer users; do you think it would be valid to ignore their needs in favor of making experiences better for users without vision impairments?
True, I agree with you. But I think it’s a safe default to reject supporting use cases with few users. There are specific concerns that can reasonably overrule this principle, but I think the bar should be high for spending effort on rare uses.
Mandatory XKCD? https://xkcd.com/1172 :-)
Every project has limited resources to spend, which by definition means that some work will not be done. If you have two bugs of the same severity, it might be useful to know how many users each one impacts and factor that into decisions about what to work on and in what order.
Vision impaired users are an interesting comparison. There are lots of vision impaired people in the world. A neighbor of mine happens to work at an association for the blind/vision impaired. Many people wear glasses. Many people become blind or near blind in old age. All of us are vision impaired when it is dark, or when we use our eyes to look at something else while trying to use a computer simultaneously. Even if the absolute percentage of blind people is not that high, it ends up being a lot of people, especially factoring in situational blindness.
One can imagine a lot of other conditions that might also be good to accommodate for, but which are less common and more difficult to adapt to. In the end, I think the ADA’s “reasonable accommodation” standard is broadly correct. There has to be a balance to try to include people when the costs are bearable without making things so expensive that it’s not possible for the majority to use it either.