1. 7

    Compared to, say, the ARM whitepaper, Intel’s still reads to me as remarkably defensive, especially the section on “Intel Security Features and Technologies.” Like, we know Intel has no-execute pages, as other vendors do, and we know they aren’t enough to solve this problem. And reducing the space where attackers can find speculative gadgets isn’t solving the root problem.

    Paper does raise the interesting question of exactly how expensive the bounds-check bypass mitigation will be for JS interpreters, etc. To foil the original attack, you don’t have to stop every out-of-bounds load, you just have to keep the results from getting leaked back from speculation-land to the “real world.” So you only need a fence between a potentially out-of-bounds load and a potentially leaky operation (like loading an array index that depends on the loaded value). You might even be able to reorder some instructions to amortize one fence across several loads from arrays. And I’m sure every JIT team has looked at whether they can improve their bounds check elimination. There’s no lack of clever folks working on JITs now, so I’m sure they’ll do everything that can be done.

    The other, scarier thing is bounds checks aren’t the only security-relevant checks that can get speculated past, just an easy one to exploit. What next–speculative type confusion? And what if other side-channels besides the cache get exploited? Lot of work ahead.

    1. 5

      FWIW: a possible hint at Intel’s future directions (or maybe just a temp mitigation?) are in the IBRS patchset to Linux at https://lkml.org/lkml/2018/1/4/615: one mode helps keep userspace from messing with kernel indirect call speculation and another helps erase any history the kernel left in the BTB. I bet both of these are blunt hammers on current CPUs (‘cause a microcode update can only do so much–turn a feature off or overwrite the BTB or whatever), but they’re defining an interface they want to make work more cheaply on future CPUs. It also seems to be enabled in Windows under the name “speculation control” (https://twitter.com/aionescu/status/948753795105697793)

      ARM says in their whitepaper that most ARM implementations have some way to turn off branch prediction or invalidate branch predictor state in kernel/exception handler code, which sounds about in line with what Intel’s talking about. The whitepaper also talks about barriers to stop out-of-bounds reads. The language is a bit vague, but I think they’re saying an existing conditional move/select works on current chips, and a new barrier instruction, CSDB, provides future chips with just the minimum you need to avoid the cache side-channel attack.

      1. 25

        Zeynep Tufekci said, “[T]oo many worry about what AI—as if some independent entity—will do to us. Too few people worry what power will do with AI.” (The thread starts with a result that, in some circumstances, face recognition can effectively identify protesters despite their efforts to cover distinguishing features with scarves, etc.)

        And, without really having to have tech resembling AI, what they can do can look like superhuman abilities if you look at it right. A big corporation doesn’t have boundless intelligence, but it can hire a lot of smart lawyers, lobbyists, and PR and ad people, folks who are among the best in their fields, to try and shift policy and public opinion in ways that favor the sale of lots of guns, paying a lot for pharmaceuticals, use of fossil fuels, or whatever. They seem especially successful shifting policy in the U.S. recently, with the help of recent court decisions that free some people up to spend money to influence elections (and the decisions further back that established corps as legal people :/).

        With recent/existing tech, companies have shown they can do new things. It’s cheap to test lots of possibilities to see what gets users to do what you want, to model someone’s individual behavior to keep them engaged. (Here’s Zeynep again in a talk on those themes that I haven’t watched.) The tech giants managed to shift a very large chunk of ad spending from news outlets and other publishers by being great at holding user attention (Facebook) or being smarter about matching people to ads than anyone else (Google), or shift other spending by gatekeeping what apps you can put on a device and taking a cut of what users spend (Apple with iOS), or reshape market after market with digitally-enabled logistics, capital, and smart strategy (Amazon). You can certainly look forward N years to when they have even more data and tools and can do more. But you don’t really even have to project out to see a pretty remarkable show of force.

        This is not, mostly, about the details of my politics, nor is it to suggest silly things like that we should roll back the clock on tech; we obviously can’t. But, like, if you want to think about entities with incredible power that continue to amass more and how to respond to them, you don’t have to imagine; we have them right here and now!

        1. 3

          Too few people worry what power will do with AI.

          More specifically, the increasingly police statey government.

        1. 42

          Reminds me of a quote:

          Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

          • Brian W. Kernighan
          1. 13

            :) Came here to post that.

            The blog is good but I’m not convinced by his argument. It seems too worried about what other people think. I agree that we have to be considerate in how we code but forgoing, say, closures because people aren’t familiar with them or because we’re concerned about how we look will just hold back the industry. Higher level constructs that allow us to simplify and clarify our expression are a win in most cases. People can get used to them. It’s learning.

            1. 8

              I think he may not disagree with you as much as it sounds like. I don’t think that sentence says “don’t use closures,” just that they’re not for impressing colleagues. (It was: “You might impress your peers with your fancy use of closures… but this no longer works so well on people who have known for decades what closures are.”)

              Like, at work we need closures routinely for callbacks–event handlers, functions passed to map/filter/sort, etc. But they’re just the concise/idiomatic/etc. way to get the job done; no one would look at the code and say “wow, clever use of a closure.” If someone does, it might even signal we should refactor it!

              1. 5

                It seems too worried about what other people think.

                I agree with your considerations on learning.
                Still, just like we all agree that good code must be readable, we should agree that it should be simple too. If nothing else, for security reasons.

              2. 2

                On the other hand, sometimes concepts at the limit of my understanding (like advanced formal verification techniques) allow me to write better code than if I had stayed well within my mental comfort zone.

              1. 3

                Here are some things:

                These are, again, perf but not profiling. I really want to write (and have started on) a super beginner-focused post about memory use, and I wish someone would write a balanced intro to concurrency–from StackOverflow, a couple of common mistakes are dividing work into very tiny tasks (with all the overhead that causes) when NumCPU workers and large chunks would do better, and using channels when you really do just want a lock or some kind of shared datastructure.
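
                That chunking point in code form (a toy sketch, not from any linked code; the names are made up): one goroutine per CPU, each handed one big contiguous slice, instead of a goroutine or channel send per element.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumChunked splits work into one large chunk per worker instead of
// one tiny task per element, avoiding per-task scheduling overhead.
func sumChunked(nums []int64) int64 {
	workers := runtime.NumCPU()
	chunk := (len(nums) + workers - 1) / workers
	var wg sync.WaitGroup
	sums := make([]int64, workers) // one slot per worker: no lock needed
	for w := 0; w < workers; w++ {
		lo := w * chunk
		if lo >= len(nums) {
			break
		}
		hi := lo + chunk
		if hi > len(nums) {
			hi = len(nums)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			for _, n := range nums[lo:hi] {
				sums[w] += n
			}
		}(w, lo, hi)
	}
	wg.Wait()
	var total int64
	for _, s := range sums {
		total += s
	}
	return total
}

func main() {
	nums := make([]int64, 100000)
	for i := range nums {
		nums[i] = int64(i)
	}
	fmt.Println(sumChunked(nums)) // prints 4999950000
}
```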

                1. 11

                  When you mentioned channel costs, I wondered if there was communication via unbuffered channels, which can lead to traffic jams since the sender can’t proceed ‘til each recipient is ready. Looking at the old chat_handler.go that doesn’t seem to be the case, though. The three goroutines per connection thing isn’t without precedent either; I think at least the prototypes of HTTP/2 support for the stdlib were written that way.
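
                  The unbuffered-channel point is easy to demonstrate in isolation (a toy sketch, nothing to do with the actual chat code): an unbuffered send can’t complete until a receiver is ready, while a buffered send returns immediately as long as there’s room.

```go
package main

import "fmt"

// trySend attempts a non-blocking send and reports whether it went through.
func trySend(ch chan int) bool {
	select {
	case ch <- 1:
		return true
	default:
		return false
	}
}

func main() {
	// Unbuffered: the send needs a ready receiver, so it would block.
	fmt.Println(trySend(make(chan int))) // prints false
	// Buffered: the send completes while the buffer has room.
	fmt.Println(trySend(make(chan int, 1))) // prints true
}
```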

                  It looks like maybe the socketReaderLoop could be tied in with ChatHandler.Loop(): where socketReaderLoop communicates with Loop on a channel, just inline the code that Loop currently runs in response, then call socketReaderLoop at the end of Loop instead of starting it asynchronously. You lose the 32-message buffer, but the end-user-facing behavior ought to be tolerable. (If a user fills your TCP buffers, seems like their problem and they can resend their packets.) However, saving one goroutine per connection probably isn’t a make-or-break change.

                  Since you talk about memory/thrashing at the end, one of the more promising possibilities would be(/have been) to do a memprofile to see where those allocs come from. A related thing is Go is bad about respecting a strict memory limit and its defaults lean towards using RAM to save GC CPU: the steady state with GOGC=100 is around 50% of the peak heap size being live data. So you could start thrashing with 512MB RAM once you pass 256MB live data. (And really you want to keep your heap goal well under 512MB to leave room for kernel stuff, other processes, the off-heap mmapped data from BoltDB, and heap fragmentation.) If you’re thrashing, GC’ing more often might be a net win, e.g. GOGC=50 to be twice as eager as the default. Finally, and not unrelated, Go’s collector isn’t generational, so most other collectors should outdo it on throughput tests.
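
                  The steady-state arithmetic above can be written out (this is just the definition of the GOGC knob, not a runtime API):

```go
package main

import "fmt"

// heapGoalMB is the heap size at which the next collection triggers:
// live data plus GOGC percent of it.
func heapGoalMB(liveMB, gogc int) int {
	return liveMB * (100 + gogc) / 100
}

func main() {
	fmt.Println(heapGoalMB(256, 100)) // 512: default GOGC, the thrashing scenario above
	fmt.Println(heapGoalMB(256, 50))  // 384: GOGC=50 trades GC CPU for headroom
}
```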

                  Maybe I’m showing myself not to be a true perf fanatic, but 1.5K connections on a Pi also doesn’t sound awful to me, even if you can do better. :) It’s a Pi!

                  1. 2

                    Thank you for such a detailed analysis, and for looking into the code before you commented :) positive and constructive feedback really helps. I have received a great amount of feedback and will definitely try your tips. BoltDB comes up every time, and I think it contributes to memory usage as well. Other suggestions include using a fixed number of workers and channels, watching for backlog building up, and fixing my serialization. I will definitely update my benchmark code and test it with the new fixes; and if I feel the code is clean enough I would definitely love to move back.

                    1. 3

                      Though publicity like this is fickle, you might get a second hit after trying a few things and then explicitly being like “hey, here’s my load test, here are the improvements I’ve done already; can you guys help me go further?” If you don’t get the orange-website firehose, you at least might hear something if you post to golang-nuts after the Thanksgiving holiday ends or such.

                      Looking around more, I think groupInfo.GetUsers is allocating a string for each name each time it’s called, and then when you use the string to get the object out there’s a conversion back to []byte (if escape analysis doesn’t catch it), so that’s a couple allocs per user per message. Just being O(users*messages) suggests it could be a hotspot. You could ‘downgrade’ from the Ctrie to a RWLocked map (joins/leaves may wait often, but reads should be fastish), sync.Map, or (shouldn’t be needed but if you were pushing scalability) sharded RWLocked map. But before you put in time trying stuff like that, memprofile is the principled/right way to approach alloc stuff (and profile for CPU stuff)–figure out what’s actually dragging you down.
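
                      The RWLocked-map “downgrade” might look something like this (a sketch with made-up names, not the actual chat code): writes only on join/leave, many concurrent reads per message.

```go
package main

import (
	"fmt"
	"sync"
)

// userSet is a read-mostly registry: sync.RWMutex lets many readers
// proceed in parallel while joins/leaves take the exclusive lock.
type userSet struct {
	mu    sync.RWMutex
	users map[string]struct{}
}

func newUserSet() *userSet {
	return &userSet{users: make(map[string]struct{})}
}

func (s *userSet) Join(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.users[name] = struct{}{}
}

func (s *userSet) Leave(name string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.users, name)
}

// Names copies out the membership under a shared (read) lock,
// so concurrent readers don't block each other.
func (s *userSet) Names() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]string, 0, len(s.users))
	for name := range s.users {
		out = append(out, name)
	}
	return out
}

func main() {
	s := newUserSet()
	s.Join("alice")
	s.Join("bob")
	s.Leave("alice")
	fmt.Println(s.Names()) // prints [bob]
}
```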

                      True that there are likely lighter ways to do the message log than Bolt. Just files of newline-separated JSON messages may get you far, though don’t know what other functionality you support.

                      FWIW, I also agree with the commenter on HN saying that Node/TypeScript is a sane approach. (I’m curious about someday using TS in work’s frontend stuff.) Not telling you what to use; trying to get Go to do things is just a hobby of mine, haha. :)

                  1. 7

                    I’ve been using a Samsung ARM Chromebook (1st generation) as my daily driver for the past 4 years. It’s a lowend, underpowered machine with nothing to write home about, but it can support a full Arch Linux ARM installation, run a web browser just fine, and have an adequate number of terminals. I love it. The battery life hasn’t changed at all since I bought it; it’s still consistently getting >7 hours. I have other friends with ARM laptops from other manufacturers, and the battery life story is one I hear consistently.

                    1. 2

                      How did you install Arch? What I’m wondering is if you can replace ChromeOS or only install another OS inside it like Crouton.

                      1. 6

                        dz, I wrote up a blog post about this: http://blog.jamesluck.com/installing-real-linux-on-arm-chromebook I completely replaced ChromeOS with Archlinux ARM on the internal SSD. The gist of the process is that you make a live USB, boot to that, and then follow the same procedure for making the bootable USB onto the SSD. You just have to untar the root filesystem, edit /etc/fstab, and correct the networking config!

                        1. 2

                          Thanks so much for writing and sharing!

                        2. 1

                          If it’s anything like my Samsung ARM Chromebook, you can boot a different os off external storage (i.e. an SD card), or possibly replace Chrome OS on the internal solid-state storage.

                          1. 1

                            You can replace ChromeOS. Here’s the Arch wiki on the 1st gen Samsung ARM Chromebook and the Samsung Chromebook Plus.

                        1. 6

                          Not sure how responsive this is to the specific q, but I have an ARM Samsung Chromebook Plus. Initially I used developer mode and crouton, which provides an Ubuntu userspace in a chroot hitting Chrome OS’s Linux kernel, but more recently I use a setup like Kenn White describes, where you’re not in dev mode and you use Termux, an Android app that provides a terminal and apt-based distro inside the app’s sandbox. (Termux has a surprising number of apt packages covered and you can build and run executables in there, etc. You can also just use it on any Android phone or tablet.) In both cases I just use terminal stuff, though Crouton does support running an X session. You can apparently boot Linux on the Chromebook Plus full-stop; I don’t know about compatibility, etc. Amazon’s offering it for $350 now.

                          I’ve long been more indifferent to CPU speed than a lot of folks (used netbooks, etc.), but for what it’s worth I find the Chromebook Plus handles my needs fine. The chip is the RK3399, with two out-of-order and four in-order ARMv8 cores and an ARM Mali GPU. I even futz with Go on it, and did previously on an even slower ARM Chromebook (Acer Chromebook 13). The most noticeable downsides seem to be memory-related: it has 4GB RAM but…Chrome. If I have too much open, the whole thing stalls a bit while it tries (I think) compressing RAM and then stopping processes. Also, there’s a low level of general glitchiness: sometimes I’ll open it up and it’ll boot fresh when it should be resuming from sleep. I had to do some tweaking to do what I wanted inside Chrome OS’s bounds, notably to connect to work’s VPN (I had to make a special JSON file, and the OS doesn’t readily give useful diagnostics if you get anything wrong), but it’s just a matter of round tuits.

                          From the ARM vs. Intel angle, Samsung actually made a Chromebook Pro with the low-power but large-core Core m3. Reviewers seemed to agree the Pro was faster, and I think I saw it was like 2x on JavaScript benchmarks, but Engadget wasn’t fond of its battery life and The Verge complained its Android app support was limited.

                          1. 3

                            Since we’re talking about the Plus, a few notes on my experience. I have an HP chromebook with an m3 as well, for comparison. None of the distinguishing features of the Plus were positives. Chrome touch support is just meh, and the keyboard was annoyingly gimped by very small backspace, etc. Touchpad was noticeably worse. Android support was basically useless for me.

                            More on topic, the ARM cpu was capable enough, but the system wasn’t practically lighter or longer lasting than the m3 system. It was cheaper, perhaps 50% so, but that’s still only a difference of a few hundred dollars. (I mean, that’s not nothing, but if this is a machine you’ll be using for hours per day over several years… Price isn’t everything.)

                            (I would not pick the Samsung Pro, either. Not a great form factor imo.)

                            More generally, if you don’t have ideological reasons to pick arm, the intel m3 is a pretty remarkable cpu.

                          1. 5

                            The technical description doesn’t fully convey some of the, uh, magic that’s been done with it. One of the coolest things is Arcadia, which gives you access to game-engine fun with a Clojure REPL. The MAGIC stuff is apparently key to cutting allocations so your games aren’t all janked up by collection pauses.

                            The author, Ramsey Nasser, also posts about a bunch of fun stuff on Twitter–other recent stuff he’s messed with includes a spline drawing tool and (related, I think) Arabic typography.

                            1. 6

                              Posted this in the orange place, figure that it’s worth saying here:

                              FWIW, I think this mostly shows that the GOGC defaults will surprise you when your program keeps a small amount of live data but allocates quickly and is CPU bound. By default, it’s trying to keep allocated heap size under 2x live data. If you have a tiny crypto bench program that keeps very little live data around, it won’t take many allocations to trigger a GC. So the same code benchmarked here would trigger fewer GCs in the context of a program that had more data live. For example, it would have gone faster if the author had allocated a 100MB []byte in a global variable.
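
                              That global-[]byte trick at the end looks like this (a sketch of the hack, not a recommendation; whether it helps depends on the program):

```go
package main

import (
	"fmt"
	"runtime"
)

// ballast counts as live data, so with GOGC=100 it pushes the
// collector's heap goal up to roughly 2x its size: GCs that a small
// fast-allocating benchmark would trigger constantly become rare.
var ballast = make([]byte, 100<<20)

// gcGoal forces a collection so the goal reflects current live data,
// then reports the heap size the next GC will trigger at.
func gcGoal() uint64 {
	runtime.GC()
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	runtime.KeepAlive(ballast) // make sure the ballast really stays live
	return ms.NextGC
}

func main() {
	// With ~100MB of live ballast and default GOGC, the goal should sit
	// around 200MB instead of the few MB a tiny program would get.
	fmt.Println(gcGoal() > 150<<20)
}
```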

                              If your program is far from exhausting the RAM but is fully utilizing the CPU, you might want it to save CPU by using more RAM, equivalent to increasing GOGC here. The rub is that it’s hard for the runtime ever to be sure what the humans want without your input: maybe this specific Go program is a little utility that you’d really like to use no more RAM than it must, to avoid interference with more important procs. Or maybe it’s supposed to be the only large process on the machine, and should use a large chunk of all available RAM. Or it’s in between, if there are five or six servers (or instances of a service) sharing a box. You can imagine heuristic controls that override GOGC in corner cases (e.g., assume it’s always OK to use 1% of system memory), or even a separate knob for max RAM use you could use in place of GOGC. But the Go folks tend to want to keep things simple, so right now you sometimes have to play with GOGC values to get the behavior you want.

                              1. 6

                                As best I can reconstruct, the thing the Chrome team was dealing with was that Chrome had to delay starting to scroll until they ran touch event handlers, because the handler could always preventDefault() to disable scrolling, but 1) 80% of touch event handlers weren’t actually calling preventDefault(), 2) this added ~150ms latency before scroll started in a sample case, and doubled 95th-percentile latency in their field tests, and 3) even before adding their passive option, there was a CSS alternative for preventing scroll-on-touch without JS required at all. (They also argue that browsers already have timeouts for these handlers, so they weren’t entirely reliable in the first place.) So the post author’s app probably used touchstart to prevent scrolling and then Chrome broke that.

                                If the Chrome folks wanted to do this and were willing to go to a lot more effort to mitigate breakage, maybe they could have effectively tried to patch up the damage after a passive touchstart handler called preventDefault: jump back to the old scroll position, log a complaint to the console, and treat this handler (or all handlers) as active for subsequent events. It would be very complicated (both for Chrome hackers to code up and for Web devs to understand–consider that arbitrary DOM changes can happen between the scroll starting and the preventDefault!), lead to a janky scroll-and-snap-back sequence for users, and still break for more complex cases, but would likely avoid breaking some previously-usable pages entirely.

                                I get the value of working pretty hard to maintain backwards compatibility, and it’s possible the Chrome folks moved a little fast here; they did actually take the time to gather data (which they use sometimes: a GitHub comment provides some general background on when they’ve backed out interventions), but the ‘scrolling intervention’ page reports only 80% of listeners were passive, and 20% breakage seems high.

                                For what it’s worth, I’m not absolutist about targeted changes that might break some standards-compliant pages; as a dev I see compat impact from changes that are outside the scope of standards (sunsetting Flash, autofill changes, tracking protection, things popular extensions do) and bug fixes; it seems like there’s just always a cost-benefit balance and you need to see how things are used out on the Web, versus being able to reason from first principles what changes are OK or not. And the APIs being as old and quirky as they are, there are places where you really could subtly tweak some API and break almost nobody and improve things for almost everyone; this seems like the common category where we’d like to allow something to be async but that can break some uses of it.

                                It’s tricky to figure out if a change is worth it, and again, they might have gotten it wrong here, but I wouldn’t want “you can never break anything” to be the lesson from this.

                                1. 6

                                  A bias in SHA256 doesn’t look like the best explanation for what they saw.

                                  It sounds like they’re talking about a bias observed in live Bitcoin blocks, not anything they’ve tested and seen in raw SHA256 themselves, because 2^80 hashes is more like what the entire Bitcoin network has done than anything they can do themselves.

                                  I’m sure someone has said this better elsewhere, but it sounds like that just means miners aren’t using entirely random nonces in their search–the same thing as in the difficulty-less-than-1,000 blocks, but in a subtler way. Maybe some miner’s code/chips just don’t scan all nonces. Maybe in some miner’s cluster, they use a node ID in some bits of the nonce to partition nonce space, and some sections of nonce space are effectively left aside for future capacity they might add. Maybe the miner tries every possible nonce eventually, but there’s an ordering thing where they’ll tend to try some parts of nonce space last, the ones corresponding to the nonces you less often see in blocks. I don’t know a ton about the block format, but if you can break out which blocks were generated by specific software or mining companies, you might be able to spot whose blocks have the strongest bias.

                                  If there were a SHA256 bias that strong (a couple of bits set 55% of the time instead of the expected 50%!) and that simple (basically linear, and it shows up even without a sophisticated search for linear characteristics), and visible through 2xSHA256’s 128 rounds, not just SHA256’s 64, then SHA256 itself should have profound biases you’d expect to show up when cryptographers search for linear characteristics. And to test this directly, you wouldn’t need 2^80 effort – find a few blocks at difficulty >1000 searching through nonces randomly, and you should be able to see the 55%-versus-50% difference with much less effort than the Bitcoin miners have expended. Dunno whether that would be near try-this-at-home levels, but it would be a lot less than 2^80.
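
                                  A try-this-at-home version of that check might look like the sketch below (real mining double-hashes an 80-byte header against a difficulty target; here it’s just Bitcoin-style double SHA-256 over a counter, checking one output bit):

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// bitOneFraction double-SHA256es n distinct inputs and returns the
// fraction of digests with the given bit set. For an unbiased hash this
// stays very close to 0.5; a 55% bias would be unmissable at this n.
func bitOneFraction(n int, bit uint) float64 {
	ones := 0
	var buf [8]byte
	for i := 0; i < n; i++ {
		binary.LittleEndian.PutUint64(buf[:], uint64(i))
		first := sha256.Sum256(buf[:])
		second := sha256.Sum256(first[:]) // Bitcoin-style double SHA-256
		if second[bit/8]&(1<<(bit%8)) != 0 {
			ones++
		}
	}
	return float64(ones) / float64(n)
}

func main() {
	f := bitOneFraction(100000, 0)
	// At n=100000 the sampling noise is ~0.16%, so a real 5% bias
	// would stand out by dozens of standard deviations.
	fmt.Println(f > 0.49 && f < 0.51)
}
```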

                                  I can see how if it all looks like a black box to you a problem in SHA256 would be an enticing theory, but the least surprising explanation by far is that it’s an artifact of miners searching nonrandomly, not of the hash being nonrandom in such a strong way.

                                  1. 1

                                    Thanks a lot for the explanations.

                                    I don’t have any feeling for the inner workings of cryptographic hash functions, only for reductions. I assumed Bitcoin does enough abuse in the way it applies hashing that a mild bias in this case wouldn’t contradict any well-studied properties of SHA256; I was wrong.

                                    I was obviously wrong to write a summary only slightly less enthusiastic than the title, when it should have been much less enthusiastic.

                                    I didn’t believe that they found anything that would be an actual problem with SHA256 as a hash, but could we please have something non-catastrophic but noticeable (5% bias in Bitcoin mining would be nice, but not this time) to make people stop using random primitives as if they were perfect random oracles? I guess not this time. Sigh.

                                  1. 3

                                    Seems plausible Google didn’t go over Buganizer as carefully as they do many public-facing products because it seemed like an internal product, and the external audience that gets access (folks who successfully reported a security bug) seemed semi-trusted. I bet there are a lot of security issues in code that’s meant for semi-trusted audiences like this: a company’s vendors, say, or a small number of special high-value clients, or employees (when employees have access to a tool but shouldn’t all have ‘root’ on it). Potentially a place to look for issues in our own stuff.

                                    1. 1

                                      Agreed—this is a prime example of security through obscurity. Tons of small shops do this sort of thing every day, employing only what security is necessary to keep out the passerby who isn’t too tech-savvy.

                                      One would hope a place like Google wouldn’t lean on obscurity, but here they are doing it.

                                      1. 2

                                        Agreed—this is a prime example of security through obscurity. Tons of small shops do this sort of thing every day, employing only what security is necessary to keep out the passerby who isn’t too tech-savvy.

                                        Oh, huh, you’re right and it’s different from what I originally thought. I had thought they had proper access control so that only bug reporters could access the tool, but weren’t securing products for that audience as well as they would secure apps that just anyone can trivially browse to. But, in fact, it looks like it’s public (at least, I can just browse to issuetracker.google.com), so it really was just a more obscure product being less thoroughly scrubbed for issues.

                                    1. 0

                                      Another great aspect of concurrency in Go is the race detector. This makes it easy [emphasis mine] to figure out if there are any race conditions within your asynchronous code.

                                      Am I reading correctly? How exactly does this magical decision procedure establish the presence or absence of data races in your concurrent code?

                                      1. 4

                                        Yeah–it is easy, but isn’t comprehensive. If racy accesses don’t actually occur in whatever you run under the detector nothing is reported, and racy accesses that have some totally unrelated sync operation happen between them (like there happened to be some lock taken but it doesn’t guard the racily-accessed data) aren’t found.

                                        A lot of people seem to say that an emphasis on CSP-style concurrency combined with the race detector catching some stuff gets them far. For what it’s worth, one dissenting opinion on that comes from Dropbox, who say clever engineers looking for races is the best they’ve got: https://about.sourcegraph.com/go/go-reliability-and-durability-at-dropbox-tammy-butow/
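
                                        To make the “catches some stuff” part concrete (a toy example, names made up): delete the Lock/Unlock pair below and `go run -race` reports the write-write race on n; with the mutex it’s quiet. Either way, the detector only checks accesses that actually execute in that run.

```go
package main

import (
	"fmt"
	"sync"
)

// count increments a shared counter from several goroutines. The mutex
// makes it correct; without it, n++ is a classic data race that the
// race detector flags -- but only because the racy accesses really run.
func count(workers, perWorker int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	n := 0
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				mu.Lock()
				n++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return n
}

func main() {
	fmt.Println(count(8, 1000)) // prints 8000
}
```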

                                        1. 1

                                          Yeah–it is easy, but isn’t comprehensive.

                                          Then at best it is easy to find (some) data races, not to establish their presence or absence. The former might be good enough for your purposes, but the latter is how I interpret “figure out if there are any race conditions”. But, then, I’m not a native English speaker.

                                          If racy accesses don’t actually occur in whatever you run under the detector nothing is reported,

                                          Which is no different from languages in which concurrency is supposedly harder than in Go.

                                          and racy accesses that have some totally unrelated sync operation happen between them (like there happened to be some lock taken but it doesn’t guard the racily-accessed data) aren’t found.

                                          In a heavily concurrent program, this could be pretty much any racy access.

                                          A lot of people seem to say that an emphasis on CSP-style concurrency combined with the race detector catching some stuff gets them far.

                                          Yeah, I totally understand the feeling of empowerment when you learn something that seemed out of reach until not too long ago (in this case, writing concurrent programs), but IMO it’s not really justified unless you can reliably do it right. “Not too wrong” isn’t good enough.

                                          1. 4

                                            I think the OP is just being imprecise with their wording. They were technically wrong as soon as they started talking about race conditions instead of data races in the context of the race detector. (And many, many, many people get this wrong. It happens.)

                                            1. 1

                                              I think I’d get it wrong too?

                                              I guess my understanding is that a race condition is a general term (something with two actors, where the sequencing of their operations matters), and a data race (where the actors both happen to be modifying some data) is a specific instance of a race condition.

                                              Is that about right, or am I missing some nuance in the terms?

                                              1. 6

                                                They are actually completely orthogonal concepts. :-) You can have a race condition without a data race.

                                                John Regehr explains it far better than I could, with examples: https://blog.regehr.org/archives/490

                                                Also, you’re in good company. The blog post introducing the race detector even uses the term “race condition”: https://blog.golang.org/race-detector The official documentation of the tool, however, does not: https://golang.org/doc/articles/race_detector.html

                                                (And btw I completely agree with your “perfect is the enemy of the good” remarks.)

                                                (Please take a look at John’s blog post. It even shows an example of a data race that isn’t a race condition!)

                                                1. 3

                                                  Thanks for that.

                                                  Edit: I briefly posted saying I still thought data races were a subset of race conditions - edited after digesting blog.

                                                  The blog post makes the distinction that (their definition of) race conditions is about correctness, and not all data races violate correctness, so not all data races are race conditions.

                                                  That’s a subtle distinction and I’m not entirely sure I agree, but I understand better - so thanks :-)

                                                  1. 6

                                                    Dmitry Vyukov, who works on the Go race detector, is a big advocate for presuming data races are incorrect rather than trying to sort out whether they’re safe, because things that look benign in the code can bite you. Sometimes the problems only arise with the help of compiler optimizations that assume there are no racy accesses and therefore, say, compute on a copy of some value in a register rather than the original on the heap for some duration. He writes about this at https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong

                                                    The io.Discard example partway down https://blog.golang.org/race-detector is good too: some memory that everyone thought of as a write-only ‘black hole’ turned out not to be in one specific situation, and a race that some experienced coders thought was safe caused a bug.

                                                    Nothing there is inconsistent with what Regehr says, I don’t think: Vyukov isn’t saying a benign data race is an invalid concept, just saying it’s dangerous to try to write code with them. I mention it because examples like these made me much less inclined to try to figure out when data races were or weren’t benign.

                                                    1. 1

                                                      Hmm, interesting :) I’m going to read up and correct this.

                                                    2. 1

                                                      I don’t find that example of a data race that isn’t a race condition terribly convincing. The racy program doesn’t have a well-defined meaning in terms of the language it’s written in. Its behavior depends on the whims of the language implementors, i.e., they have ways to break your program without neglecting their duty to conform to the language specification.

                                                      1. 1

                                                        I’m not convinced by your rebuttal. That doesn’t imply it is a race condition. It just implies that it is UB. UB could cause a race condition, but not necessarily so.

                                                        1. 1

                                                          The language specification is a contract between language implementors and users. To prove a language implementation incorrect, it suffices to show one conforming program whose behavior under the implementation doesn’t agree with the language specification. Conversely, to prove a program incorrect, it suffices to exhibit one conforming language implementation in which the program deviates from its intended behavior.

                                                          In other words: Who cares that, by fortunate accident, there are no race conditions under one specific language implementation?

                                                          1. 1

                                                            In other words: Who cares that, by fortunate accident, there are no race conditions under one specific language implementation?

                                                            Everyone that uses it?

                                                            This is my last reply in this thread. I don’t find your objection worth discussing. Moreover, you didn’t actually refute my previous comment. You just lectured me about what UB means.

                                                            1. 2

                                                              The essence of my objection is that “there are no race conditions under this specific implementation” isn’t a very interesting property when implementation details are outside of your control. But maybe that’s not true. Maybe people genuinely don’t mind when changes to a language implementation suddenly break their programs. What do I know.

                                                2. 3

                                                  I think it sounded like I was disagreeing when I was trying to agree: the Go race detector is basically LLVM’s ThreadSanitizer (Dmitry Vyukov did work on both), and as such can help but can’t prove the absence of races. Agree it’s nothing like Rust’s static checking or the isolated heaps some languages use.

                                                  1. 2

                                                    “Not too wrong” isn’t good enough.

                                                    That’s a pretty good opposite to “the perfect is the enemy of the good”.

                                                    In most software domains, the impact of a race is related to how often you hit it. e.g. a race which only shows up under your typical load once a year is a low impact problem - you have other, more relevant bugs.

                                                    (I agree that in some scenarios (medical, aviation, etc) formal proofs of correctness and other high assurance techniques may be required to establish that even low-likelihood bugs (of all kinds, not just races) can’t occur.)

                                                    The (imperfect) race detector lets you most easily find the most frequent races (and all of the ones you can provoke in your automated tests).

                                                    No, it’s not perfect, but it is a very useful tool.

                                                    1. 1

                                                      This comment made me think; I think another thing that lets people get by is that a lot of practical concurrency is the simpler stuff: a worker pool splitting up a large chunk of work, or a thread-per-request application server where sharing in-memory data isn’t part of the design and most of the interesting sync bits occur off in database code somewhere.

                                                      Not that you’re guaranteed to get those right, or that advanced tools can’t help. (Globals in a simple net/http server can be dangerous!) Just, most of us aren’t writing databases or boundary-pushing high-perf apps where you have to be extra clever to minimize locking; that’s why what you might expect to become a complete, constant tire fire often works in practice.

                                              1. 1

                                                The error handling section was a little confusing to me: are there any examples of what’s meant by proper error handling?

                                                I think when you have multiple return values, you get many of the advantages of exceptions, but I don’t really know what the author of this blog is arguing for, compared to Python, besides exceptions.

                                                1. 1

                                                  Well you’d at least expect this to be baked into the language: https://github.com/pkg/errors

                                                  1. 3

                                                    But by not being baked in, you have alternatives: https://pocketgophers.com/error-handling-packages/ , all of which conform to the error interface

                                                    1. 1

                                                      Cool site!

                                                  2. 1

                                                    I think he means the kind of stuff he now gets from third-party tools: showing where errors occurred, and making it hard to accidentally discard errors.

                                                    (Also: big plug for https://github.com/gordonklaus/ineffassign. It finds assignments in your code that make no difference to how it runs, a generalization of the ‘no unused variables’ rule that turns up some bugs and, when it doesn’t, often points to somewhere you could make your code tighter. Better yet, you can use something like https://github.com/alecthomas/gometalinter with a subset of checks enabled.)

                                                  1. 1

                                                    Here are some scattered thoughts from when I spent a Saturday a few months back just seeing how Dart worked and throwing a toy screenful of widgets together on my phone: https://news.ycombinator.com/item?id=14433308

                                                    Short version: I liked hot reload, and the IDE-ish luxuries were great, especially for learning it. It had some surprising odd bits and bumps setting it up, detailed in the link. I’m not a mobile dev, just playing around, so don’t give my reactions too much weight.

                                                    1. 2

                                                      Flutter / Dart’s hot reload functionality is kind of amazing.

                                                      1. 1

                                                        Could you elaborate a bit on why (when compared to literally any other framework that supports it, such as Ember, React, Vue, or Mithril)? This is something I’ve honestly just come to expect at this point from a decent SPA framework.

                                                        1. 1

                                                          I’m not aware of JS engines being able to hot-patch live objects with new versions of methods. With Dart’s hot reload, as you modify and add methods, the new code is deployed to object instances that keep their data. It makes for a pretty smooth iteration flow.

                                                          1. 1

                                                            I think it’s that it provides it in a mobile context. React Native does too, if I’m reading right, but default iOS and Android toolchains don’t, so mobile devs are a lot less used to seeing their code live in seconds on-device than Web devs.

                                                            1. 3

                                                              Android has supported this since Android Studio 2.3; it’s called Instant Run.


                                                        1. 5

                                                          There’s something creepy, unpleasant, and inhuman in all these late Google technologies. Dart is just a Frankensteinian mix of the two most popular languages, Java and JavaScript. Even the name is stupid: “Dart”. It has no community outside Google and isn’t supposed to have one. This Flutter looks like it will always be alpha. All these docs websites with Material Design (which looks like ultra-modernist plastic prisons, at least on the desktop web), built from internal wiki pages, look cold and dead. Everything is like Google+ or modern Google Photos.

                                                          1. 3

                                                            FWIW, I’ve only played with Flutter and haven’t used Dart beyond the learn-how-it-works phase, but that seems a little harsh. Dart is an evolution of JavaScript because, way back when they first put it out there, it targeted the browser and needed to appeal to JS devs. It reminds me less of Java than of Kotlin or especially TypeScript, with more sugar and fewer typed-out type annotations than Java, and a little more light functional style (like JS, not like Lisp).

                                                            It has that lack of community outside Google because the industry decided to evolve JS instead, while Google already had enough internal investment (from the AdWords area mainly, I think) to justify continuing to evolve Dart. If you’re saying Dart was “not supposed to have” a community, that seems inaccurate, since it was launched as a potential successor to JS. They do still put effort into making it adoptable: the site has good docs and tutorials, there’s a solid playground with code completion (https://dartpad.dartlang.org/), they give talks at all the Google events, and folks have helped me when I had setup issues trying it out. I think Flutter is actually used for the AdWords mobile apps, and the “alpha” may have a bit of the long GMail-beta thing going on.

                                                            There are certainly complaints to be had about Google, but to me, Dart just seems like one of the many perfectly likable technologies out there that hasn’t had the hockey-stick growth its creators might have hoped for (outside Google, so far). Could be sort of cool if it does.

                                                          1. 3

                                                            For speculation on why they might want this: some folks think Google’s next flagship Chrome OS device might host VMs. There are other recent public Chrome commits related to virtualization; one says, “Stop disabling VM extensions on Eve. VMs will be used to host containers soon.” And Eve is suspected to be the “Pixelbook” Google’s announcing on October 4, which, oddly for a Chromebook, is supposed to be available with gobs of flash (512GB).

                                                            They could just use virtualization as a more secure sandbox for Android apps than they have now. Or they could go big and let you run Windows or Ubuntu in VMs, which would explain offering configurations with that much SSD space.

                                                            There’s not too much point to speculating, since we’re going to find out on the 4th anyway, but why not?

                                                            1. 8

                                                              Bigger reason is we don’t need their algorithms. What we need are more solid, OSS implementations of protocols and algorithms we have plus tools for automating as much analysis as possible. Despite their BS, I do thank the NSA’s defensive side (I.A.D.) for funding Galois’ work on Cryptol and other stuff that can help with this goal.


                                                              That’s open source, which is always a good default for security-related things, but especially if the NSA funds it. ;) Another thing they did for FOSS that had some benefit was SELinux, which was Secure Computing Corp and MITRE work. They also give a lot of contracts to companies such as Rockwell Collins that make secure CPUs, compilers, code generators, and so on. They do good stuff, but not enough FOSS for the public. Thanks to Snowden, we know the motivation for that. (sighs)

                                                              1. 2

                                                                Bigger reason is we don’t need their algorithms.

                                                                Yes. We certainly have ARX and simple non-ARX algos already. The only distinction is that these ciphers “scale down” to terribly insecure block sizes, key sizes, and round counts (32-bit block, 64-bit key, 22 single-ARX rounds), and I haven’t exactly heard the clamor for that from civilians.

                                                                1. 1

                                                                  Well, I was thinking more along the lines of all the cryptographers we have working on algorithms when I said that. We certainly have some specific algorithms already, and we’ll have more, or just better versions of what we have; we don’t need theirs if it comes down to it. (Even Schneier liked them, so they’re always on the table.) I think, if we got desperate about it, we could take the most efficient ciphers available, put them on an advanced node as an ASIC, and just combine that with whatever needs it using tech that mixes dies from different nodes. The complicated part, the crypto, is already done, with the cost getting pretty marginal. The rest of the ASIC is the IoT stuff on the same or a cheaper node.

                                                                  I know this sounds like work upfront if we look at the whole of the silicon ecosystem. It just seems to me it’s an easier investment of governments just throwing grant money at enabling that tech using existing methods (esp fabs and packagers) instead of inventing better algorithms. I mean, do both across many organizations but we got options before “trust the NSA or have no crypto in IoT” is the option. Hope the elaboration makes more sense.