1. 32

  2. 12

    Incidentally, archive.codeplex.com (still owned by Microsoft!) has been marked as containing harmful programs by Google Safe Browsing. As in, all of it. This is mildly entertaining to me. If inactive/archived code repositories are now getting flagged, how come code.google.com/archive isn’t?

    And finally, I am also providing my binaries on my Discord server in a special #releases channel so that there’s a method of obtaining the binaries outside of web browsers where pages and files can be blocked.

    Infosec Twitter has been trying to convince Discord to actually scan executables for malware. I wonder if this won’t end up with Discord going down the code signing route, too.

    1. 12

      Infosec Twitter has been trying to convince Discord to actually scan executables for malware. I wonder if this won’t end up with Discord going down the code signing route, too.

      Article author here: I think that’s a great idea to scan binaries for malware. Google’s Safe Browsing flags binaries as “harmful content” without scanning them at all, solely on the basis that it hasn’t seen those particular binaries ‘much’ before.

      If it were to run a Virus Total scan like this on my file before flagging it, it would have seen the file was safe in 70 of 70 different scanners. If it had considered that my domain was 14 years old and never once hosted anything harmful, that would have also been great.

      Unfortunately, Safe Browsing takes a shoot-first approach and never asks questions, let alone contacts you =/

    2. 10

      Well, to begin with, don’t use Google Chrome. And do yourself a bigger favor and get off all Google products as far as reasonably possible.

      Secondly, on the legal side, this sure seems like AI-automated libel. It would be nice if, say, the EFF could take a shot at them for libel.

      1. 2

        Wikipedia says that the Google Safe Browsing API is used by Safari, Firefox, Vivaldi, and GNOME Web in addition to Chrome. That doesn’t leave a lot of alternative browsers.

        1. 3

          I’m not sure, but I believe at least Firefox uses it but is a lot less aggressive with how it deals with different classes of “threats” reported by the GSB API.

          Firefox will block things that are clearly malware, but it will not block niche unsigned binaries etc.

        2. 2

          Firefox also uses Google’s safe browsing to block sites though, and probably so do a lot of other browsers.

          1. 1

            Libel is definitely a thought that crossed my mind when they claimed my site contained “harmful content.” But their lawyers are infinitely more expensive than mine =/

          2. 10

            This article is a follow-up to “Google’s Monopoly is Stifling Free Software”, discussed a week ago.

            1. 8

              Google’s Safe Browsing technology, which in an effort to combat malware, flags perfectly safe new releases of software as “harmful content” until they have been downloaded a secretive number of times (it is well in excess of 1,000 times from personal experience.)

              Then Google must know what files people are downloading, even when it’s not downloaded from their servers. It is so strange to me that this is supposed to be normal.

              1. 15

                Let me repeat:

                Most people don’t even know Chrome is reading their entire hard drive, thanks to software_reporter_tool.exe

                My only question is, do Lobsters not generally know about this? Because I didn’t, until I did.

                [0] https://news.ycombinator.com/item?id=19653881

                [1] https://imgur.com/QtcSXY9

                1. 10

                  At least some of us assume any statement of the form “google does [evil thing here]” to be overblown to the point of untruth, unless accompanied by accurate evidence. There’s just so much bullshit floating around.

                  I see a bunch of DLL file names in that screenshot. What was the DLL loading path set to, I wonder? IIRC that’s the %PATH% environment variable in Windows, and when I last used Windows, applications had an awful habit of adding their internal directories to the system-global %PATH%. I see an assertion that Steam has nothing to do with Chrome, but I don’t see anything about what directories are and aren’t included in %PATH%.

                  1. 2

                    FWIW, since XP SP2 Windows has not used PATH for dll searching unless a certain system setting is changed in the registry for legacy reasons.

                    Even if it was changed, that wouldn’t account for the app-specific DLL names, the executables, and the CHM. For the DLL loader to hit all of those would imply that Chrome’s executable is attempting to load them all.

                    1. 1

                      Or that something else is attempting to load things from that directory into Chrome (and other processes), and Chrome notices. Does Steam try to do such things, I wonder?

                  2. 1

                    I’m using Chrome (v. 79.0.3945.88) on Windows 10 Enterprise and I cannot find an instance of “software_reporter_tool.exe” in Resource Monitor (as per the screenshot).

                    Given the intense scrutiny Google is under, especially with regards to privacy concerns, I would imagine this would be more widely reported if it was actually an issue.

                    1. 1

                      Interesting. That would mean that once Chrome becomes a less popular browser, Safe Browsing becomes useless, as it will flag more and more stuff as malware.

                      1. 1

                        AIUI that API is fed by Googlebot more than by Chrome. Chrome does no analysis, its use of the API is read-only. The analysis and assessment are based on data fetched by Googlebot.

                        1. 1

                          My understanding from the message I replied to is that having Google Chrome installed means software_reporter_tool.exe and Chrome would send the information to Google about downloaded files. My interpretation was that once the use of Chrome drops, the number of downloads per file will be quite inaccurate, and if used as a metric of validity, would cause many more legitimate files to be flagged as malware, rendering the whole service useless. If Googlebot does it, then it shouldn’t be a problem, but if Google uses number of downloads as a metric, it can’t come from Googlebot.

                          1. 1

                            Oh, I see. Sorry for the misunderstanding.

                            I agree that the download count must really be a count of API lookups. But if Chrome’s usage declines, then the browsers most people are likely to switch to are Firefox and Safari, and they also use the same API.

                    2. 2

                      Google doesn’t know exactly what files people are downloading. As I understand it, browsers download a database of abbreviated hashes of bad URLs; most URLs a browser visits are not in the database, so they won’t be sent to Google at all.

                      If a URL’s abbreviated hash does appear in the database, the browser asks Google for all the full hashes matching the problematic abbreviated hash… and also some randomly-generated abbreviated hashes, so it’s not obvious to Google which abbreviated hash the user actually visited. Once the browser gets the full hashes of problematic URLs, it can compare them with the full hash of the URL it’s loading to find out whether the URL is unsafe.
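The lookup flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Google's implementation: it assumes SHA-256 hashes with 4-byte prefixes, skips URL canonicalization entirely, and the names `LOCAL_PREFIXES`, `fake_server_full_hashes`, and `is_unsafe` are hypothetical stand-ins for the downloaded database and the network request.

```python
import hashlib
import os
import random

PREFIX_LEN = 4  # assumed length of an "abbreviated hash", in bytes


def full_hash(url: str) -> bytes:
    """Full SHA-256 hash of a URL (no canonicalization in this sketch)."""
    return hashlib.sha256(url.encode("utf-8")).digest()


# Hypothetical server-side list of bad URLs and their full hashes.
BAD_URLS = ["http://malware.example/payload.exe"]
SERVER_FULL_HASHES = {full_hash(u) for u in BAD_URLS}

# What the browser keeps locally: only the abbreviated hashes.
LOCAL_PREFIXES = {h[:PREFIX_LEN] for h in SERVER_FULL_HASHES}


def fake_server_full_hashes(prefixes):
    """Stand-in for the network call: full hashes matching any given prefix."""
    wanted = set(prefixes)
    return {h for h in SERVER_FULL_HASHES if h[:PREFIX_LEN] in wanted}


def is_unsafe(url, request_log=None, decoys=3):
    """Return True if the URL's full hash is on the (fake) server's list."""
    h = full_hash(url)
    prefix = h[:PREFIX_LEN]
    if prefix not in LOCAL_PREFIXES:
        return False  # no local match, so nothing is sent to the server
    # Mix the real prefix with random decoys so the server cannot tell
    # which prefix corresponds to the URL actually being visited.
    query = [prefix] + [os.urandom(PREFIX_LEN) for _ in range(decoys)]
    random.shuffle(query)
    if request_log is not None:
        request_log.append(query)
    return h in fake_server_full_hashes(query)
```

In this sketch a URL whose prefix is absent from the local set never triggers a request, which is the privacy property the comment describes.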

                      1. 2

                        Not exactly; IIRC it was something like this: Google finds out which domains host malware by crawling them itself, and then your browser sends the hash of the domain to Google to figure out whether it’s probably on the list before sending the domain.

                        1. 3

                          They know somehow. Their Googlebot crawler is the one locating the ZIP archives, opening them up and scanning them, and seeing an EXE. Instead of running it through a scanner, they just treat it as dangerous right away.

                          They couldn’t tell you that a file was “uncommonly downloaded” unless they’re keeping a counter that increments each time it’s downloaded, which means Safe Browsing sends your download history to Google.

                          My guess is they record a hash, and when users download files, it submits it to some online database to get information about that file. But since Google is also logging where the files were from for Search Console, they know what file you are downloading.

                          My domain has never in its history hosted anything harmful, so it’s not related to that.

                          1. 6

                            You don’t need to guess, the safe browsing APIs are documented. Strictly speaking, what Google counts isn’t downloads, but the number of times Chrome runs through the lines of code just prior to the first attempt at downloading something.

                            Google also provides another API to do the same job, which sends less data to Google at the cost of having to download a largish file often. Anyone who uses this API will generally not be counted. I don’t know anything about how much the other API is used; “download large file to end-user device often” sounds like a bad tradeoff to my ears.

                            1. 1

                              Wikipedia says the download-a-largish-file API is the API used by Chrome, Firefox and Safari. It supports differential updating, so the update size should be proportional to the number of suspicious URLs added or removed… which might still be awkwardly large, I have no idea.
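As a rough illustration of why differential updates scale with the number of changes rather than with the size of the whole list, here is a minimal sketch. It is only loosely modeled on the idea of "removals plus additions plus a checksum"; the function names `apply_update` and `checksum`, and the exact update format, are assumptions for this example and not the documented wire format.

```python
import hashlib


def apply_update(prefixes, removal_indices, additions):
    """Apply a differential update to a local list of hash prefixes.

    removal_indices refer to positions in the sorted current list;
    additions are new prefixes to merge in. The result stays sorted.
    """
    current = sorted(prefixes)
    drop = set(removal_indices)
    kept = [p for i, p in enumerate(current) if i not in drop]
    return sorted(set(kept) | set(additions))


def checksum(prefixes):
    """Digest of the sorted list, so client and server can compare state."""
    return hashlib.sha256(b"".join(sorted(prefixes))).hexdigest()


local = [b"aaaa", b"bbbb", b"cccc"]
# The update carries only one removal and one addition, not the whole list.
updated = apply_update(local, removal_indices=[1], additions=[b"dddd"])
```

If the checksum of the updated list does not match the one the server expects, a client following this scheme would presumably discard its state and fetch the full list again.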

                        2. 1

                          Isn’t this from users enabling opt-in telemetry in Chrome?

                        3. 4

                          It doesn’t help that code signing tools from Microsoft are awful. Their documentation is scattered all over the place, but the important parts haven’t been touched since XP. Default options of the tools are deprecated or invalid, and you have to know to use a bunch of switches in the right order to get a usable signature. And that’s the experience after paying for a certificate and fighting with junk smartcard drivers to read it.