1. 3

    The note on Python’s garbage collector is interesting. I had been wondering recently if any languages performed limited* lifetime analysis to deallocate objects that were obviously dead meat. Does anyone know of others?

    *Limited compared to Rust’s lifetime analysis.

    1. 3

      Go doesn’t have a generational GC and leans a bit on escape analysis to turn heap allocations into stack allocations with a defined lifetime. There are limitations: dynamic calls (anything through an interface in Go, for instance) are opaque to the compiler, so it can’t analyze across them. I think variably-sized allocations don’t benefit from escape analysis today, because a stack allocation needs a fixed-size slot and the runtime wants to be able to build simple stack maps.

      I think Python reclaims most garbage through refcounting; you just need the GC to catch reference cycles, where two objects point at each other and so on.

      1. 3

        Fun thing: you can turn the GC off in CPython 2.x (and I think in 3.x too, but I never bothered to check). In that mode, data structures which don’t have any cycles are still managed correctly by refcounting, but cyclic structures will cause the objects they refer to to hang around forever.
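        A quick sketch of that mode (this runs the same way in CPython 3): with the cycle collector disabled, refcounting still frees acyclic objects immediately, while a reference cycle leaks until a collection actually runs.

```python
import gc
import weakref

class Node:
    pass

gc.disable()  # turn off the cycle collector; refcounting still runs

# Acyclic object: freed the moment its last reference disappears.
plain = Node()
plain_ref = weakref.ref(plain)
del plain
print(plain_ref() is None)  # True: refcount hit zero

# Cyclic pair: refcounts never reach zero, so both objects linger.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
leaked = cycle_ref() is not None
print(leaked)  # True: the cycle keeps both alive

gc.enable()
collected = gc.collect()  # the cycle collector reclaims them now
print(cycle_ref() is None)  # True again
```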

        This isn’t usually recommended to anyone because it’s likely that almost none of the libraries you use have been tested for leaks in this mode.

        There was a story here a year ago about Instagram doing this, for slightly esoteric reasons. https://lobste.rs/s/59rwdm/dismissing_python_garbage_collection_at

        And there was a follow up a while later which is now at the top of my to-read pile. Not linking the lobsters thread on that one because it sucked. https://engineering.instagram.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf

        1. 3

          *Limited compared to Rust’s lifetime analysis.

          In Rust, object lifetimes are tied to their scope, unless they’re heap-allocated in which case they’re cleaned up via RAII (also scope-based) or dynamically (reference count goes to 0). “Lifetime analysis” in Rust probably can only mean lifetime constraint checking which doesn’t have anything to do with deallocating objects.

          1. 2

            I believe that you are correct. I was not familiar with the term escape analysis before my post, but it looks like that was the idea I was thinking of. I’m reading through the Rust book right now and assumed that a weaker version of lifetime constraints could be applied to languages which do not have such rigorous deallocation rules.

            To further clarify what I meant by “limited”, I was thinking of how lifetime annotations are sometimes required in Rust functions. I was wondering how far you could apply lifetime constraint rules without changing the semantics of a language.

            1. 2

              I was not familiar with the term escape analysis before my post, but it looks like that was the idea I was thinking of

              Right, that makes sense. Escape analysis can certainly lead to tightening up the lifetime of objects.

              I was wondering how far you could apply lifetime constraint rules without changing the semantics of a language.

              It’s possible that I don’t understand precisely what you mean here, but I want to point out that lifetime annotations don’t really change the semantics even of Rust - that is, if you could switch off the borrow checker in the Rust compiler, so to speak, it should still be able to generate exactly the same code (it would just lose the ability to catch certain errors, and lose memory safety as a consequence, though a correct Rust program would still run without memory-safety issues).

              That “lifetime of objects is tied to scope” in Rust comes from objects being stack-allocated. Languages that heap-allocate all objects by default have much more room, theoretically, to perform static analyses that determine when an object’s lifetime really ends - possibly after the stack frame is released, possibly at the same time, possibly earlier; in the latter two cases, a compiler (including a JIT) could elect to allocate the object on the stack instead (but it could also just insert a heap deallocation at the appropriate juncture).

          2. 2

            For several versions now, HotSpot’s JIT compiler has done escape analysis and used it to stack-allocate objects that will not escape.

          1. 3

            pcwalton noted wasm imposes structure which would make existing exploits more difficult. But it’s true (and acknowledged there) that it’s not truly safe.

            One kinda neat thing about wasm from a security perspective is that it can be relatively cheap to run multiple sandboxes, even within one webpage. So if, say, your messenger app wants to use wasm to accelerate both some fancy parser (polyfill to decode some fancy future image format, say?) and some security-critical crypto thing, the two needn’t have access to each others’ memory. That seems like it could be a powerful tool for limiting the blast radius of problems.

            1. 1

              Do any existing distributed databases provide distinct read/write ordering methods, similar to the ordering methods available for atomic memory read/write?

              1. 1

                I’m afraid this answers a general question when you meant a more specific one, but you can definitely ask for different consistency levels within one database for different sessions, commands within a transaction, etc. For example, in Amazon’s DynamoDB you can do a consistent read (which gets responses from two of three replicas, so any write accepted by the majority will show up) or not; in MongoDB you can set read and write concerns for transactions and other operations.

                In various SQL databases you can set transaction isolation levels (do I want to see data from other in-progress transactions that may never be committed? is it OK if some data changes out from under me after I read it, and if so, do I only permit certain changes or just anything?), and you have things like “select for update” to place locks to make select=>munge=>write-result-back a safe operation in some situations where it wouldn’t be otherwise, not because the DB itself is distributed but because the larger system of the DB server + its many clients is.
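                As a tiny single-machine illustration of the same kind of knob (a toy sketch using SQLite with a made-up table): one connection’s uncommitted write stays invisible to a second connection until commit, which is the basic isolation behavior the levels above tune.

```python
import os
import sqlite3
import tempfile

# Toy sketch, hypothetical schema: a second connection only sees
# committed data -- the read-committed behavior described above.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")
writer.commit()

writer.execute("INSERT INTO accounts VALUES (1, 100)")  # transaction open, not committed
before = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

writer.commit()
after = reader.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]

print(before, after)  # 0 1
```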

              1. 2

                This does leave me wary of having automated systems make a fast decision to fail over to a secondary (or do other big, special mitigations). Waiting for a human means you’ll have a higher time-to-recovery for routine failures, but the win is lower risk that the machine causes epic problems because of a transient situation.

                1. 7

                  Is he right about keep-alives being wrongly implemented at the application layer? Aren’t TCP keep-alives inadequate because they only verify proxy to proxy, are off by default, and, if turned on, default to a two-hour period? See https://stackoverflow.com/a/23240725/4283659

                  1. 1

                    Another upside to doing stuff over the encrypted channel is that it’s less open for things on the network to mess with it. Still some possibility, since you can always have a corporate middlebox with its own root cert installed on the client computer, etc. But generally the hope with putting stuff in the encrypted channel seemed to be “hopefully now the network won’t completely break it.”

                  1. 4

                    Surely I’m not going to be the only one expecting a comparison here with Go’s GC. I’m not really well versed in GC, but this appears to mirror Go’s quite heavily.

                    1. 12

                      It’s compacting and generational, so that’s a pair of very large differences.

                      1. 1

                        My understanding, and I can’t find a link handy, is that the Go team is on a long-term path to change their internals to allow for a compacting and generational GC. There was something about the Azul guys advising them a year+ ago, iirc.

                        Edit: I’m not sure what the current status is, haven’t been following, but see this from 2012, and look for Gil Tene’s comments:

                        https://groups.google.com/forum/#!topic/golang-dev/GvA0DaCI2BU

                        1. 4

                          This presentation from this July suggests they’re averse to taking almost any regressions now, even if they get good GC throughput out of it. rlh tried freeing garbage at thread (goroutine) exit if the memory wasn’t reachable from another thread at any point, which seemed promising to me but didn’t pan out. aclements did some very clever experiments with fast cryptographic hashing of pointers to allow new tradeoffs, but even rlh seemed doubtful about the prospects of that approach in the long term.

                          Compacting is a yet harder sell because they don’t want a read barrier and objects moving might make life harder for cgo users.

                          Does seem likely we’ll see more work on more reliably meeting folks’ current expectations, like by fixing situations where it’s hard to stop a thread in a tight loop, and we’ll probably see work on reducing garbage through escape analysis, either directly or by doing better at other stuff like inlining. I said more in my long comment, but I suspect Java and Go have gone on sufficiently different paths they might not come back that close together. I could be wrong; things are interesting that way!

                          1. 1

                            Might be. I’m just going on what I know about the collector’s current state.

                        2. 10

                          Other comments get at it, but the two are very different internally. Java GCs have been generational, meaning they can collect common short-lived garbage without looking at every live pointer in the heap, and compacting, meaning they pack together live data, which helps them achieve quick allocation and locality that can help processor caches work effectively.

                          ZGC is trying to maintain all of that and not pause the app much. Concurrent compacting GCs are hard because you can’t normally atomically update all the pointers to an object at once. To deal with that you need a read barrier or load barrier, something that happens when the app reads a pointer to make sure that it ends up reading the object from the right place. Sometimes (like in Azul C4 I think) this is done with memory-mapping tricks; in ZGC it looks like they do it by checking a few bits in each pointer they read. Anyway, keeping an app running while you move its data out from under it, without slowing it down a lot, is no easier than it sounds. (To the side, generational collectors don’t have to be compacting, but most are. WebKit’s Riptide is an interesting example of the tradeoffs of non-compacting generational.)

                          In Go all collections are full collections (not generational) and no heap compaction happens. So Go’s average GC cycle will do more work than a typical Java collector’s average cycle would in an app that allocates equally heavily and has short-lived garbage. Go is by all accounts good at keeping that work in the background. While not tackling generational, they’ve reduced the GC pauses to more or less synchronization points, under 1ms if all the threads of your app can be paused promptly (and they’re interested in making it possible to pause currently-uncooperative threads).

                          What Go does have going for it throughput-wise is that the language and tooling make it easier to allocate less, similar to what Coda’s comment said. Java is heavy on references to heap-allocated objects, and it uses indirect calls (virtual method calls) all over the place that make cross-function escape analysis hard (though JVMs still manage to do some, because the JIT can watch the app running and notice that an indirect call’s destination is predictable). Go’s defaults are flipped from that, and existing perf-sensitive Go code is already written with the assumption that allocations are kind of expensive. The presentation ngrilly linked to from one of the Go GC people suggests at a minimum the Go team really doesn’t want to accept any regressions for low-garbage code to get generational-type throughput improvements. I suspect the languages and communities have gone down sufficiently divergent paths about memory and GC that they’re not that likely to come together now, but I could be surprised.

                          1. 1

                            One question that I don’t have a good feeling for is: could Go offer something like what the JVM has, where there are several distinct garbage collectors with different performance characteristics (high throughput vs. low latency)? I know simplicity has been a selling point, but like Coda said, the abundance of options is fine if you have a really solid default.

                            1. 1

                              Doubtful they’ll have the user choose; they talk pretty proudly about not offering many knobs.

                              One thing Rick Hudson noted in the presentation (worth reading if you’re this deep in) is that if Austin’s clever pointer-hashing-at-GC-time trick works for some programs, the runtime could choose between using it or not based on how well it’s working out on the current workload. (Which it couldn’t easily do if, like, changing GCs meant compiling in different barrier code.) He doesn’t exactly suggest that they’re going to do it, just notes they could.

                            2. 1

                              This is fantastic! Exactly what I was hoping for!

                            3. 4

                              There are decades of research and engineering effort separating Go’s GC from HotSpot.

                              Go’s GC is a nice introductory project; HotSpot is the real deal.

                              1. 4

                                Go’s GC designers are not newbies either and have decades of experience: https://blog.golang.org/ismmkeynote

                                1. 2

                                  Google sometimes seems like a nursing home for people who had one lucky idea 20 years ago and are content riding that fame until retirement, so “famous person X works on it” doesn’t mean much when the company is Google.

                                  The Train GC was quite interesting in its time, but the “invention” of stack maps is just like the “invention” of UTF-8 … if it hadn’t been “invented” by random person A, it would have been invented by random person B a few weeks or months later.

                                  Taking everything together, I’m rather unconvinced that Go’s GC will even remotely approach G1’s, ZGC’s, or Shenandoah’s level of sophistication any time soon.

                                2. 3

                                  For me it is kind of amusing that huge amounts of research and development went into the HotSpot GC, yet there seem to be no sensible defaults: its parameters often need hand-tuning. In Go I don’t have to jump through those hoops (and am advised not to), but I still get very good performance characteristics, at least comparable to, and in my humble opinion often better than, a lot of Java applications.

                                  1. 13

                                    On the contrary, most Java applications don’t need to be tuned and the default GC ergonomics are just fine. For the G1 collector (introduced in 2009 a few months before Go and made the default a year ago), setting the JVM’s heap size is enough for pretty much all workloads except for those which have always been challenging for garbage collected languages—large, dense reference graphs.

                                    The advantages Go has for those workloads are non-scalar value types and excellent tooling for optimizing memory allocation, not a magic garbage collector.

                                    (Also, to clarify — HotSpot is generally used to refer to Oracle’s JIT VM, not its garbage collection architecture.)

                                    1. 1

                                      Thank you for the clarification.

                                3. 2

                                  I had the same impression while reading the article, although I also don’t know that much about GC.

                                1. 2

                                  Babytimes, seeing some folks, and cleaning.

                                  1. 6

                                    Nice article. Also interesting for reasoning about DNS-over-HTTPS.

                                    As far as I can say from my experience in Kenya, it should also be noted that Africans have a very different way of perceiving time. And security. And… everything! :-D

                                    How would I address this issue?

                                     I think I would basically create a reverse proxy serving over HTTP those sites that benefit most from caching (e.g. Wikipedia), probably on a custom domain such as wikipedia.cached.local so that people couldn’t be fooled into mistaking the proxied site for the original. Rewriting URIs in hypertext shouldn’t be an issue, though it could be harder for Ajax-heavy pages. I would probably also create a control page from which a page could be prefetched or updated. With a custom protocol and a server in Europe, one could also prefetch several contents at once and send them back together, maximizing bandwidth usage.

                                     Obviously it wouldn’t be safe, but it would be visibly unsafe, and limited to those websites that can take advantage of such caches without creating serious threats.

                                    As for service workers, I do not think they would improve the user experience at all, since they are local to the browser and the browser has a cache anyway. The problem is to share such cache between different machines.

                                    1. 2

                                       A local reverse proxy is a clever idea, and a proxy that you explicitly set clients up to trust, a la corporate middleboxes (see Lanny’s comment), seems like it can work in some environments too. Sympathetic to the problem of existing solutions no longer working; sort of surprised the original blog post wasn’t more about how to improve things now.

                                      1. 5

                                         The point of the machinery I described was to make users explicitly choose between security and access time.

                                         You can make everything smoother (and easier to implement) with a local CA, or by installing suitable fake certificates in the clients plus a transparent proxy, but then people cannot easily opt out.
                                         Worse: they might be trusting the wrong people without any benefit, as with sensitive pages that cannot be cached (shopping carts, online banking and similar…)

                                         That’s why using the reverse proxy should be opt-in, not the default, and trivial to opt out of: there’s no need for a proxy if you want to edit a Wikipedia page!

                                        Sympathetic to the problem of existing solutions no longer working, sort of surprised the original blog post wasn’t more about how to improve things now.

                                        Eric Meyer is a legend of HTML, CSS and Web accessibility. A legend, beyond any doubt.
                                        Before HTML5 I used to read his website daily. He teached me a lot.

                                        But he is a client-side guy.
                                        I think his reference to service workers is an attempt to improve things now.

                                        1. 2

                                          Nit: the past tense of teach is taught, not teached.

                                          1. 1

                                            I’m sorry… I can’t edit it anymore, but thanks!

                                    1. 8

                                      Neat to see the Lobsters codebase get used this way. And that explains some strange and useless issues I recently got. I had thought they were from some new, half-baked commercial product or service that wanted to contribute to Lobsters as a marketing effort (this has happened a couple of times and so far been a complete waste of time). This is the only contact I’ve had with the study authors (I have no idea if they contacted jcs), though I see they’re also local to Chicago.

                                      Looks like this analysis is at least a few months old; when I added rubocop it caught all the opportunities to use find_by.

                                      1. 3

                                        :/ about the useless issues opened.

                                        There was a long history of automated analysis at Google and a takeaway was, more or less, “you don’t really have a useful analyzer ’til it mostly finds fixes coders think are useful.”

                                        I wonder what folks would come up with given incentives like that; currently, there’s probably more pressure to maximize your claimed number of bugs to get published.

                                      1. 1

                                        This analogy is a stretch, but ORMs and DBs face a problem a little like what dynamic language runtimes face: with more understanding of how the code really runs, you could make it run faster (x is almost always a float in this JS function, or we almost always/never retrieve obj.related_thing after this ORM query retrieves some objs), but that info isn’t readily available when the code is first run.

                                        JITs deal with this by recording specifics of what happens at each callsite, then making an optimized path for what usually happens. You could imagine ORMs tagging query results with the callsites they came from, and figuring out things like “this bulk query should probably retrieve this related object” or “looks like this query is usually just an existence check” from how the result was actually used.
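                                        A toy sketch of the callsite-tagging idea (all names hypothetical): record which line of code issued each query, so accumulated per-callsite stats could later drive decisions like eager-loading a related object for that site.

```python
import inspect
from collections import Counter

# Hypothetical mini-"ORM" hook: count queries per callsite. A real system
# would also record which columns or related objects each result actually
# had touched, then rewrite future queries from that callsite accordingly.
callsite_profile = Counter()

def run_query(sql):
    caller = inspect.stack()[1]  # who called us?
    site = (caller.filename, caller.lineno)
    callsite_profile[site] += 1
    return []  # placeholder result set

for _ in range(3):
    run_query("SELECT id FROM things")

# One callsite issued all three queries:
print(len(callsite_profile), sum(callsite_profile.values()))  # 1 3
```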

                                        An inherent challenge with this kind of thing is that you need to make sure that tracking isn’t so expensive it eats up any advantage it brings. We take for granted that JVMs, V8, etc. do magic with our code, but that’s with big teams of experts working on them over years. Perhaps a more achievable thing is more like profile-guided optimization, where you do test runs in a slow stat-tracking mode and some changes get suggested to the code.

                                        That’s sort of a big dream and there is much lower hanging fruit; pushcx notes rubocop was able to find a chunk of things with static analysis, and stuff like “this query scans a table” or just “profiling shows this line is empirically one of our slowest” probably does a lot for big hotspots out there with a lot less trickiness.

                                        1. 12

                                          You don’t have to use the golden ratio; multiplying by any constant with ones in the top and bottom bits and about half those in between will mix a lot of input bits into the top output bits. One gotcha is that it only mixes less-significant bits towards more-significant ones, so the 2nd bit from the top is never affected by the top bit, 3rd bit from the top isn’t affected by top two, etc. You can do other steps to add the missing dependencies if it matters, like a rotate and another multiply for instance. (The post touches on a lot of this.)
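                                          A minimal sketch of that multiply-and-take-the-top-bits scheme (“Fibonacci hashing” with the golden-ratio constant, though as noted any suitably mixed odd constant works):

```python
# Multiplicative ("Fibonacci") hashing: multiply by an odd 64-bit constant
# close to 2^64 / phi, then keep the TOP bits as the bucket index. Only
# less-significant bits influence more-significant ones, which is why the
# top bits are the well-mixed ones.
GOLDEN64 = 0x9E3779B97F4A7C15  # ~ 2^64 / golden ratio, odd
MASK64 = (1 << 64) - 1

def fib_hash(x: int, bucket_bits: int) -> int:
    return ((x * GOLDEN64) & MASK64) >> (64 - bucket_bits)

# Sequential keys, which a "mask the low bits" power-of-two table handles
# terribly, get spread across most of the buckets:
buckets = [fib_hash(i, 10) for i in range(1024)]
print(len(set(buckets)))
```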

                                          FNV hashing, mentioned in the article, is an old multiplicative hash used in DNS, and the rolling Rabin-Karp hash is multiplicative. Today Yann Collet’s xxHash and LZ4 use multiplication in hashing. There have got to be a bajillion other uses of multiplication for non-cryptographic hashing that I can’t name, since it’s such a cheap way to mix bits.

                                          It is, as the author says, kind of interesting that something like a multiplicative hash isn’t the default cheap function everyone’s taught. Integer division to calculate a modulus is maybe the most expensive arithmetic operation we commonly do when the modulus isn’t a power of two.

                                          1. 1

                                            Nice! About the leftward bit propagation: can you do multiplication modulo a compile time constant fast? If you compute (((x * constant1) % constant2) % (1<<32)) where constant1 is the aforementioned constant with lots of ones, and constant2 is a prime number quite close to 1<<32 then that would get information from the upper bits to propagate into the lower bits too, right? Assuming you’re okay with having just slightly fewer than 1<<32 hash outputs.

                                            (Replace 1<<32 with 1<<64 above if appropriate of course.)

                                            1. 1

                                              You still have to do the divide for the modulus at runtime and you’ll wait 26 cycles for a 32-bit divide on Intel Skylake. You’ll only wait 3 cycles for a 32-bit multiply, and you can start one every cycle. That’s if I’m reading the tables right. Non-cryptographic hashes often do multiply-rotate-multiply to get bits influencing each other faster than a multiply and a modulus would. xxHash arranges them so your CPU can be working on more than one at once.

                                              (But worrying about all bits influencing each other is just one possible tradeoff, and, e.g. the cheap functions in hashtable-based LZ compressors or Rabin-Karp string search don’t really bother.)
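                                              The multiply-plus-shift rounds mentioned above look like MurmurHash3’s 64-bit finalizer, sketched here (the constants are Murmur3’s fmix64; a shape illustration, not a recommendation):

```python
MASK64 = (1 << 64) - 1

def fmix64(x: int) -> int:
    # MurmurHash3's 64-bit finalizer: xor-shift steps move information from
    # high bits back toward low bits, multiplies move it the other way, so a
    # few alternating rounds get every input bit influencing every output bit.
    x ^= x >> 33
    x = (x * 0xFF51AFD7ED558CCD) & MASK64
    x ^= x >> 33
    x = (x * 0xC4CEB9FE1A85EC53) & MASK64
    x ^= x >> 33
    return x

# Flipping a single (high) input bit flips roughly half the output bits:
diff = fmix64(1) ^ fmix64(1 | (1 << 63))
print(bin(diff).count("1"))
```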

                                              1. 1

                                                you’ll wait 26 cycles for a 32-bit divide on Intel Skylake

                                                And looking at that table, 35-88 cycles for a 64-bit divide. Wow, that’s so many cycles; I didn’t realize. But I should have: on a 2.4 GHz processor, 26 cycles is 10.83 ns per op, which is roughly consistent with the author’s measurement of ~9 ns per op.

                                                1. 1

                                                  That’s not what I asked. I asked a specific question.

                                                  can you do multiplication modulo a compile time constant fast?

                                                  similarly to how you can do division by a constant fast by implementing it as multiplication by the divisor’s multiplicative inverse in the group of integers modulo 2^(word size). clang and gcc perform this optimisation out of the box already for division by a constant. What I was asking is whether there’s a similar trick for modulo by a constant. You obviously can do (divide by divisor, multiply by divisor, subtract from original number), but I’m wondering if there’s something quicker with a shorter dependency chain.

                                                  1. 1

                                                    OK, I get it. Although I knew about the inverse trick for avoiding DIVs for constant divisions, I didn’t know or think of extending that to modulus even in the more obvious way. Mea culpa for replying without getting it.

                                                    I don’t know the concrete answer about the best way to do n*c1%(2^32-5) or the like. It does at least intuitively seem like it should be possible to get some win from using the high bits of the multiply result, as the divide-by-multiplying tricks do.
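                                                    For what it’s worth, here is a sketch of one answer: precompute m = ceil(2^64 / d) once, get the quotient with one multiply and shift, then the remainder with one more multiply and subtract, with no divide at runtime. The exactness argument is the standard divide-by-invariant-integers one and holds for 32-bit x and d here; the divisor is just an example.

```python
# Modulus by a fixed divisor via a precomputed reciprocal. For 32-bit x
# and d, q = (x * m) >> 64 with m = ceil(2^64 / d) is the exact quotient,
# so the remainder follows with one multiply and subtract. The division
# below happens once, at "compile time".
D = (1 << 32) - 5            # example modulus just below 2^32
M = (1 << 64) // D + 1       # ceil(2^64 / D), precomputed

def mod_d(x: int) -> int:
    q = (x * M) >> 64        # exact x // D for 0 <= x < 2^32
    return x - q * D

print(mod_d(123456789) == 123456789 % D)  # True
```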

                                              2. 1

                                                So does that mean that when the author says Dinkumware’s FNV1-based strategy is too expensive, it’s only more expensive because FNV1 is byte-by-byte and fibonacci hashing multiplying by 2^64 / Φ works on 8 bytes at a time?

                                                 Does that mean you could beat all these implementations by finding a multiplier that produces an even distribution when used as a hash function working on 8-byte words at a time? That is, he says the Fibonacci hash doesn’t produce a great distribution, whereas multipliers like the FNV1 prime are chosen to produce good, even distributions. So if you found an even-distribution-producing number for an 8-byte-word multiplicative hash, would that work just as well as whatever-hash-then-fibonacci-hash, but be faster because it’s one step, not two?

                                                1. 1

                                                  I think you’re right about FNV and byte- vs. word-wise multiplies.

                                                  Re: 32 vs. 64, it does look like Intel’s latest big cores can crunch through 64-bit multiplies pretty quickly. Things like Murmur and xxHash don’t use them; I don’t know if that’s because perf on current chips is for some reason not as good as it looks to me or if it’s mainly for the sake of older or smaller platforms. The folks that work on this kind of thing surely know.

                                                  Re: getting a good distribution, the limitations on the output quality you’ll get from a single multiply aren’t ones you can address through choice of constant. If you want better performance on the traditional statistical tests, rotates and multiplies like xxHash or MurmurHash are one approach. (Or go straight to SipHash, which prevents hash flooding.) Correct choice depends on what you’re trying to do.

                                                  1. 2

                                                    That makes me wonder what hash algorithm ska::unordered_map uses that was faster than FNV1 in dinkumware, but doesn’t have the desirable property of evenly mixing high bits without multiplying the output by 2^64 / φ. Skimming his code it looks like std::hash.

                                                    On my MacOS system, running Apple LLVM version 9.1.0 (clang-902.0.39.2), std::hash for primitive integers is the identity function (i.e. no hash), and for strings murmur2 on 32 bit systems and cityhash64 on 64 bit systems.

                                                    // We use murmur2 when size_t is 32 bits, and cityhash64 when size_t
                                                    // is 64 bits.  This is because cityhash64 uses 64bit x 64bit
                                                    // multiplication, which can be very slow on 32-bit systems.
                                                    

                                                    Looking at CityHash, it also multiplies by large primes (with the first and last bits set of course).

                                                    Assuming then that multiplying by his constant does nothing for string keys—plausible since his benchmarks are only for integer keys—does that mean his benchmark just proves that dinkumware using FNV1 for integer keys is better than no hash, and that multiplying an 8 byte word by a constant is faster than multiplying each integer byte by a constant?

                                                2. 1

                                                  A fair point that came up over on HN is that people mean really different things by “hash” even in non-cryptographic contexts; I mostly just meant “that thing you use to pick hashtable buckets.”

                                                  In a trivial sense a fixed-size multiply clearly isn’t a drop-in for hashes that take arbitrary-length inputs, though you can use multiplies as a key part of variable-length hashing like xxHash etc. And if you’re judging your hash by checking that outputs look random-ish in a large statistical test suite, not just by how well it works in your hashtable, a multiply also won’t pass muster. A genre of popular non-cryptographic hashes is like popular non-cryptographic PRNGs in that way: traditionally judged by running a bunch of statistical tests.

                                                  That said, these “how random-looking is your not-cryptographically-random function” games annoy me a bit in both cases. Crypto-primitive-based functions (SipHash for hashing, cipher-based PRNGs) are pretty cheap now and are immune not just to common statistical tests, but any practically relevant method for creating pathological input or detecting nonrandomness; if they weren’t, the underlying functions would be broken as crypto primitives. They’re a smart choice more often than you might think given that hashtable-flooding attacks are a thing.

                                                  If you don’t need insurance against all bad inputs, and you’re tuning hard enough that SipHash is intolerable, I’d argue it’s reasonable to look at cheap simple functions that empirically work for your use case. Failing statistical tests doesn’t make your choice wrong if the cheaper hashing saves you more time than any maldistribution in your hashtable costs. You don’t see LZ packers using MurmurHash, for example.

                                                1. 5

                                                  In case it’s still tagged crypto, note that these aren’t generators for cryptography; they’re for producing noise with good statistical properties faster than cryptographic generators would. That’s useful for, for example, Monte Carlo simulation.

                                                  It is interesting to note that some cryptographic primitives are fast enough to be in the comparison table, though. ChaCha20/8 and AES-128 counter mode are both under one cycle/byte on new x64 hardware and really well studied. If you find nonrandomness in them of any sort that would break your Monte Carlo simulation, you can probably get a paper published about it at least.
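                                                  A minimal sketch of the trade-off, using Python’s standard library as a stand-in (Random is the Mersenne Twister, a statistical PRNG; SystemRandom draws from the OS CSPRNG rather than ChaCha or AES directly, but plays the same role here):

```python
import random

def estimate_pi(rng, n=100_000):
    """Monte Carlo estimate of pi: the fraction of random points falling
    inside the unit quarter-circle, times 4."""
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                 for _ in range(n))
    return 4 * inside / n

fast = estimate_pi(random.Random(42))        # statistical PRNG, seedable
crypto = estimate_pi(random.SystemRandom())  # OS crypto generator, slower
```

                                                  Both converge the same way; the question is only whether the statistical generator’s extra speed is worth giving up the crypto generator’s guarantee against any detectable structure.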

                                                  1. 4

                                                    chromium-browser is scrutinized closely enough that this would be noticed on ubuntu, right?

                                                    1. 5

                                                      The sandbox engine downloading and running ESET actually appears to be in Chromium: https://cs.chromium.org/chromium/src/chrome/browser/safe_browsing/chrome_cleaner/ so developers are free to review it and remove any reference to it. If my memory serves me well, Chrome Cleaner is not special and should appear in chrome://components/ alongside other optional closed-source components, although I don’t have a Windows machine to validate right now. It should be (or at least used to be) disabled for builds other than Google Chrome.

                                                      1. 2

                                                        Thanks. It doesn’t appear in chrome://components for me, at any rate.

                                                        1. 1

                                                          If I look at it on windows I can see the entry: Software Reporter Tool - Version: 27.147.200

                                                          1. 1

                                                            Excellent, a positive control.

                                                      2. 2

                                                        isra17’s reply implies there’s no scanner in Chromium, only Chrome. [I wrote this referring to his separate comment–now he has another reply here.] It probably wouldn’t make sense to have this on Linux anyway, just because there isn’t the same size of malware ecosystem there.

                                                        (And I think the reporting/story would be different if the scanner were open source–we’d have an analysis based on the source code, people working on patched Chromium to remove it, and so on.)

                                                        1. 1

                                                          I’m curious about MacOS. I don’t run Chrome usually, but I have to in some cases, e.g. to use Google Meets for work.

                                                          1. 2

                                                            I don’t have an authoritative answer, but https://www.blog.google/products/chrome/cleaner-safer-web-chrome-cleanup/ only talks about Windows.

                                                            1. 2

                                                              I don’t see it in chrome://components on my Mac, if that is indeed where it is supposed to appear.

                                                        1. 3

                                                          We’re a small shop (~15 folks, ~10 eng), but old (think early 2000s, using mod_perl at the time). Not really a startup, but we match the description otherwise, so:

                                                          It’s a Python/Django app, https://actionk.it, which some lefty groups online use to collect donations, run their in-person event campaigns and mailing lists and petition sites, etc. We build AMIs using Ansible/Packer; they pull our latest code from git on startup and pip install deps from an internal pip repo. We have internal servers for tests, collecting errors, monitoring, etc.

                                                          We have no staff focused on ops/tools. Many folks pitch in some, but we’d like to have a bit more capacity for that kind of internal-facing work. (Related: hiring! Jobs at wawd dot com. We work for neat organizations and we’re all remote!)

                                                          We’ve got home-rolled scripts to manage restarting our frontend cluster by having the ASG start new web servers and tear the old ones down. We’ve scripted hotfixes and semi-automated releases–semi-automated meaning someone like me still starts each major step of the release and watches that nothing fishy seems to be happening. We do still touch the AWS console sometimes.

                                                          Curious what prompts the question; sounds like market research for potential product or something. FWIW, many of the things that would change our day-to-day with AWS don’t necessarily qualify as Solving Hard Problems at our scale (or 5x our scale); a lot of it is just little pain points and time-sucks it would be great to smooth out.

                                                          1. 6

                                                            FYI, I get a “Your connection is not private” when going to https://actionk.it. Error is NET::ERR_CERT_COMMON_NAME_INVALID, I got this on Chrome 66 and 65.

                                                            1. 2

                                                              Same here on Safari.

                                                              1. 1

                                                                Sorry, https://actionkit.com has a more boring domain but works :) . Should have checked before I posted, and we should get the marketing site a cert covering both domains.

                                                              2. 1

                                                                Firefox here as well.

                                                                1. 1

                                                                  Sorry, I should have posted https://actionkit.com, reason noted by the other comments here.

                                                                2. 1

                                                                  https://actionk.it

                                                                  This happens because the served certificate is for https://actionkit.com/

                                                                  1. 1

                                                                    D’oh, thanks. Go to https://actionkit.com instead – I just blindly changed the http://actionk.it URL to https://, but our cert only covers the boring .com domain not the vanity .it. We ought to get a cert that covers both. (Our production sites for clients have an automated Let’s Encrypt setup without this problem, for the record :) )

                                                                1. 9

                                                                  The main topic discussed here is now known as the Han Unification, for those curious to catch up with what’s happened since this was written.

                                                                  1. 2

                                                                    A consequence that hadn’t soaked in for me is that you need to know the language of a piece of text to reliably display it correctly. I’m guessing Twitter and Facebook and such are assigning languages to comments, etc. based on content, the poster’s language preferences/location/Accept-Language, and who knows what else.

                                                                    (And then if a user switches into not-their-native-language for a post, or mixes languages by quoting text in another language or whatever, there’s a whole other level of difficulty.)

                                                                  1. 7

                                                                    Compared to, say, the ARM whitepaper, Intel’s still reads to me as remarkably defensive, especially the section on “Intel Security Features and Technologies.” Like, we know Intel has no-execute pages, as other vendors do, and we know they aren’t enough to solve this problem. And reducing the space where attackers can find speculative gadgets isn’t solving the root problem.

                                                                    Paper does raise the interesting question of exactly how expensive the bounds-check bypass mitigation will be for JS interpreters, etc. To foil the original attack, you don’t have to stop every out-of-bounds load, you just have to keep the results from getting leaked back from speculation-land to the “real world.” So you only need a fence between a potentially out-of-bounds load and a potentially leaky operation (like loading an array index that depends on the loaded value). You might even be able to reorder some instructions to amortize one fence across several loads from arrays. And I’m sure every JIT team has looked at whether they can improve their bounds check elimination. There’s no lack of clever folks working on JITs now, so I’m sure they’ll do everything that can be done.

                                                                    The other, scarier thing is bounds checks aren’t the only security-relevant checks that can get speculated past, just an easy one to exploit. What next–speculative type confusion? And what if other side-channels besides the cache get exploited? Lot of work ahead.

                                                                    1. 5

                                                                      FWIW: a possible hint at Intel’s future directions (or maybe just a temp mitigation?) are in the IBRS patchset to Linux at https://lkml.org/lkml/2018/1/4/615: one mode helps keep userspace from messing with kernel indirect call speculation and another helps erase any history the kernel left in the BTB. I bet both of these are blunt hammers on current CPUs (‘cause a microcode update can only do so much–turn a feature off or overwrite the BTB or whatever), but they’re defining an interface they want to make work more cheaply on future CPUs. It also seems to be enabled in Windows under the name “speculation control” (https://twitter.com/aionescu/status/948753795105697793)

                                                                      ARM says in their whitepaper that most ARM implementations have some way to turn off branch prediction or invalidate branch predictor state in kernel/exception handler code, which sounds about in line with what Intel’s talking about. The whitepaper also talks about barriers to stop out-of-bounds reads. The language is a bit vague, but I think they’re saying an existing conditional move/select works on current chips, and a new barrier instruction, CSDB, will give future chips just the minimum you need to avoid the cache side channel attack.

                                                                      1. 25

                                                                        Zeynep Tufekci said, “[T]oo many worry about what AI—as if some independent entity—will do to us. Too few people worry what power will do with AI.” (The thread starts with a result that, in some circumstances, face recognition can effectively identify protesters despite their efforts to cover distinguishing features with scarves, etc.)

                                                                        And, without really having to have tech resembling AI, what they can do already looks like superhuman ability if you look at it right. A big corporation doesn’t have boundless intelligence, but it can hire a lot of smart lawyers, lobbyists, and PR and ad people, folks who are among the best in their fields, to try and shift policy and public opinion in ways that favor the sale of lots of guns, paying a lot for pharmaceuticals, use of fossil fuels, or whatever. They seem especially successful shifting policy in the U.S. recently, with the help of recent court decisions that free some people up to spend money to influence elections (and the decisions further back that established corps as legal people :/).

                                                                        With recent/existing tech, companies have shown they can do new things. It’s cheap to test lots of possibilities to see what gets users to do what you want, to model someone’s individual behavior to keep them engaged. (Here’s Zeynep again in a talk on those themes that I haven’t watched.) The tech giants managed to shift a very large chunk of ad spending from news outlets and other publishers by being great at holding user attention (Facebook) or being smarter about matching people to ads than anyone else (Google), or shift other spending by gatekeeping what apps you can put on a device and taking a cut of what users spend (Apple with iOS), or reshape market after market with digitally-enabled logistics, capital, and smart strategy (Amazon). You can certainly look forward N years to when they have even more data and tools and can do more. But you don’t really even have to project out to see a pretty remarkable show of force.

                                                                        This is not, mostly, about the details of my politics, nor is it to suggest silly things like that we should roll back the clock on tech; we obviously can’t. But, like, if you want to think about entities with incredible power that continue to amass more and how to respond to them, you don’t have to imagine; we have them right here and now!

                                                                        1. 3

                                                                          Too few people worry what power will do with AI.

                                                                          More specifically, the increasingly police statey government.

                                                                        1. 42

                                                                          Reminds me of a quote:

                                                                          Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

                                                                          • Brian W. Kernighan
                                                                          1. 13

                                                                            :) Came here to post that.

                                                                            The blog is good but I’m not convinced by his argument. It seems too worried about what other people think. I agree that we have to be considerate in how we code but forgoing, say, closures because people aren’t familiar with them or because we’re concerned about how we look will just hold back the industry. Higher level constructs that allow us to simplify and clarify our expression are a win in most cases. People can get used to them. It’s learning.

                                                                            1. 8

                                                              I think he may not disagree with you as much as it sounds like. I don’t think that sentence says “don’t use closures,” just that they’re not for impressing colleagues. (It was: “You might impress your peers with your fancy use of closures… but this no longer works so well on people who have known for decades what closures are.”)

                                                                              Like, at work we need closures routinely for callbacks–event handlers, functions passed to map/filter/sort, etc. But they’re just the concise/idiomatic/etc. way to get the job done; no one would look at the code and say “wow, clever use of a closure.” If someone does, it might even signal we should refactor it!
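                                                              For instance, the everyday, unremarkable kind of closure use in question might look like this (a hypothetical example, not from the blog):

```python
def sort_by_field(records, field):
    # the lambda closes over `field`; routine, not clever
    return sorted(records, key=lambda r: r[field])

people = [{"name": "B", "age": 30}, {"name": "A", "age": 25}]
by_age = sort_by_field(people, "age")  # A (25) first, then B (30)
```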

                                                                              1. 5

                                                                                It seems too worried about what other people think.

                                                                                I agree with your considerations on learning.
                                                                                Still, just like we all agree that good code must be readable, we should agree that it should be simple too. If nothing else, for security reasons.

                                                                              2. 2

                                                                                On the other hand, sometimes concepts at the limit of my understanding (like advanced formal verification techniques) allow me to write better code than if I had stayed well within my mental comfort zone.

                                                                              1. 3

                                                                                Here are some things:

                                                                                 These are, again, perf but not profiling, but I really want to write (and have started on) a super beginner-focused post about memory use. I also wish someone would write a balanced intro to concurrency: from what I see on StackOverflow, a couple of common mistakes are dividing work into very tiny tasks (with all the overhead that causes) when NumCPU workers and large chunks would do better, and using channels when you really just want a lock or some kind of shared data structure.
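                                                                                 The “NumCPU workers, large chunks” point can be sketched like this (a hypothetical example in Python rather than Go: one chunk per core instead of one tiny task per item):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # stand-in for real per-item work
    return sum(x * x for x in chunk)

def parallel_sum_squares(items):
    workers = os.cpu_count() or 1
    size = max(1, -(-len(items) // workers))  # ceil division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # a few large tasks amortize scheduling overhead that
        # one-task-per-item would pay thousands of times over
        return sum(pool.map(process_chunk, chunks))
```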