1. 1

    It looks like a particle accelerator experiment in a cloud chamber married a cluster of galaxies.

    1. 12

      Kind of an aside, but I’m pleased by the lack of vitriol in this.

      1. 13

        Almost all of Theo’s communications are straightforward and polite. It’s just that people cherry-picked and publicized the few occasions where he really let loose, so he got an undeserved reputation for being vitriolic.

        1. 2

          Pleasantly surprised, even.

        1. 1

          Diagnosing and fixing the issue was good. Coming up with and publishing the uptime faker was great.

          1. 8

            Is it possible there’s no information to recover? Like if I tried hard enough, I think I could rip a waffle, but I’m not sure the result would be sensible.

            1. 15

              I’m going to go ahead and postulate that if you tried to rip a waffle three hundred times, at some point during that process the bit stream would start to diverge wildly as mould grows on it, pieces get flung away by rotation and eventually it rots.

              1. 8

                The fact that he was able to get 13 copies (even out of 300) that have the same CRC suggests that there’s something real there. That wouldn’t happen by chance.
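                Back-of-envelope, assuming these are CRC-32s over effectively random data: a birthday-bound sketch says even one collision among 300 random rips would be wildly unlikely, let alone 13 matching checksums.

                ```python
                import math

                # Birthday bound: probability that ANY two of 300 uniformly random
                # 32-bit checksums collide (CRC-32 assumed; rip data treated as random).
                n, buckets = 300, 2 ** 32
                p_any_collision = 1 - math.exp(-n * (n - 1) / (2 * buckets))
                print(p_any_collision)  # on the order of 1e-5
                ```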

                1. 1

                  And thinking about it, no, you couldn’t rip a waffle. For a CD drive to read anything at all, it has to see pit/land transitions every 3 spaces (at least) at something approximating the correct rate. A random physical object would just have no signal at all, and you’d know there was no signal, like trying to receive digital radio when there’s no carrier wave.

                1. 1

                  I’m wondering if you could, instead of using eval to make functions with the right names, allocate some function objects and assign strings to their name (or is it func_name?) attribute? Or is that one read-only? I remember some of the attributes of function objects are read-only but not all of them are.

                  1. 2

                    In Python 2.7, you can do this by constructing a new code object and assigning it to a function’s func_code attribute. (The function’s func_name is ignored by traceback generation; the name shown comes from the code object’s co_name.) Still, no eval necessary!

                    Running this:

                    from types import CodeType
                    
                    def make_raiser(name, lineno, filename, type_to_raise):
                        def raises():
                            raise type_to_raise()
                        co = raises.func_code
                        raises.func_name = name
                        raises.func_code = CodeType(
                            co.co_argcount,
                            co.co_nlocals,
                            co.co_stacksize,
                            co.co_flags,
                            co.co_code,
                            co.co_consts,
                            co.co_names,
                            co.co_varnames,
                            filename,
                            name,
                            lineno - 1,
                            co.co_lnotab,
                            co.co_freevars,
                            co.co_cellvars,
                        )
                        return raises
                    
                    foo = make_raiser('foo', 3, 'foo.rs', lambda: IndexError(0))
                    bar = make_raiser('bar', 4, 'bar.rs', foo)
                    bar()
                    

                    with python2.7 gives me this output:

                    $ python2 maketraceback.py 
                    Traceback (most recent call last):
                      File "maketraceback.py", line 27, in <module>
                        bar()
                      File "bar.rs", line 4, in bar
                      File "foo.rs", line 3, in foo
                    IndexError: 0
                    

                    For Python3, there’s one extra argument to the CodeType constructor, so it’s just:

                    from types import CodeType
                    
                    def make_raiser(name, lineno, filename, type_to_raise):
                        def raises():
                            raise type_to_raise()
                        co = raises.__code__
                        raises.__name__ = name
                        raises.__code__ = CodeType(
                            co.co_argcount,
                            co.co_kwonlyargcount,
                            co.co_nlocals,
                            co.co_stacksize,
                            co.co_flags,
                            co.co_code,
                            co.co_consts,
                            co.co_names,
                            co.co_varnames,
                            filename,
                            name,
                            lineno - 1,
                            co.co_lnotab,
                            co.co_freevars,
                            co.co_cellvars,
                        )
                        return raises
                    
                    foo = make_raiser('foo', 3, 'foo.rs', lambda: IndexError(0))
                    bar = make_raiser('bar', 4, 'bar.rs', foo)
                    bar()
                    

                    and output looks the same:

                    $ python3 maketraceback3.py 
                    Traceback (most recent call last):
                      File "maketraceback3.py", line 29, in <module>
                        bar()
                      File "bar.rs", line 4, in bar
                      File "foo.rs", line 3, in foo
                    IndexError: 0
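
                    Incidentally, if you’re on Python 3.8 or newer, the positional CodeType(...) call above breaks (3.8 inserted co_posonlyargcount into the signature), but 3.8 also added CodeType.replace(), which should make the same trick shorter. An untested-on-every-version sketch of the same idea:

                    ```python
                    def make_raiser(name, lineno, filename, type_to_raise):
                        def raises():
                            raise type_to_raise()
                        raises.__name__ = name
                        # replace() copies the code object, overriding only these fields
                        raises.__code__ = raises.__code__.replace(
                            co_filename=filename,
                            co_name=name,
                            co_firstlineno=lineno - 1,
                        )
                        return raises
                    ```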
                    
                    1. 1

                      Yes, this is what I’m planning for the next post :)

                    1. 30

                      Another important political aspect of Material Design (and of some other UI/web styles popular now) is “minimalism”. Your UI should have few buttons. The user should have no choices. The user should be a consumer of content, not a producer. Having play and pause buttons is enough. The user should have few choices about how and what to consume; a recommender system (“algorithmic timeline”, “AI”) should tell them what to consume. This rhetoric is repeated over and over in web and mobile dev blogs.

                      Imagine a graphics editor or a DAW with “material design”. It’s nearly impossible. The style is suitable only for scroll-feed consumption and “personal information sharing” applications.

                      Also, it’s “mobile-first”, because Google controls mobile (80% market share or something like that). Some pages on Google itself (e.g. account settings) look on desktop like I’m viewing them on a giant handset.

                      P.S. Compared with the “hipster” modernist things of ~2010, which were often nice and “warm”, Material Design looks really creepy to me, even considering only its visual appearance.

                      1. 10

                        A potentially interesting challenge: What does a design language for maker-first applications look like?

                        1. 17

                          Not sure if such design languages exist, but from what I’ve seen, I have the feeling that every “industry” has its own conventions and guidelines, and everything is very inconsistent.

                          • Word processors: lots of toolbar buttons (still lots of them now, but in “ribbons”, which are just tabbed widgets). Use of ancient features like the Scroll Lock key. Other types of apps usually put actions in menus or in searchable “run” dialogs, not in a toolbar button for each feature.
                          • Graphics editors: narrow toolbars with very small buttons (popularized by both Adobe and Macromedia, I think). Various non-modal dialogs have widgets of nonstandard, small size. Dark themes.
                          • DAWs: lots of insane skeuomorphism! Everything has to look like real synths and effects, with lots of knobs and skinning. Dark themes. Nonstandard widgets everywhere. A single program may have many different styles of widgets (e.g. Reason, Fruity Loops).
                          • 3D: complicated window splits, use of all 3 mouse buttons, also dark themes. Nonstandard widgets, again. The UIs inherit from Silicon Graphics workstations and maybe the Amiga.

                          I thought UI guidelines for desktop systems (as opposed to cellphone systems) would have lots of recommendations for such data-editing programs, but it seems not; they mostly describe how to place standard widgets in dialogs. The macOS guidelines are based on the programs included with macOS, which are mostly for regular consumers or “casual office” use. The Windows and GNOME guidelines even try to combine desktop and mobile into one thing.

                          Most “editing” programs ignore these guidelines and have a non-native look and feel (often the same look and feel across OSes).

                          1. 3

                            3D: complicated window splits, use of all 3 mouse buttons, also dark themes. Nonstandard widgets, again. The UIs inherit from Silicon Graphics workstations and maybe the Amiga.

                            Try Lisp machines. 3D was a strong market for Symbolics.

                          2. 9

                            I’d suggest, from time spent dealing with CAD, programming, and design tools, that the biggest thing is having common options right there, and not having an overly spiffy UI. Ugly Java Swing and MFC apps have shipped more content than pretty interfaces with notions of UX (notable exceptions tend to be music tools and DAW stuff, for reasons incomprehensible to me). A serious tool-user will learn their tooling, and will extend it if necessary if the tool is powerful enough.

                            1. 0

                              (notable exceptions tend to be music tools and DAW stuff, for reasons incomprehensible to me)

                              Because artists demand an artsy-looking interface!

                            2. 6

                              We had a great post about two months back on pie menus. After that, my mind goes to how the Android app Podcast Addict does it: everything is configurable. You can change everything from the buttons it shows to the tabs it has to what happens when you double-click your headset mic. All the good maker applications I’ve used give me as much customization as possible.

                              1. 2

                                It’s identical to the material design guidelines but with a section on hotkeys, scripts, and macros.

                              2. 5

                                P.S. compared with “hipster” modernist things of ~2010

                                What do you mean by this

                                1. 4

                                  Stuff like the Bootstrap mentioned there, early Instagram, GitHub. The look-and-feels commonly associated with Silicon Valley startups (even today).

                                  These things usually have the same intentions and sins mentioned in this article, but at least they don’t look as cold and dead as Material Design.

                                  1. 3

                                    Isn’t this like… today? My understanding was: web apps got the material design feel, while landing pages and blogs got bootstrappy.

                                    I may be totally misinterpreting what went on though

                                  2. 3

                                    Bootstrap lookalikes?

                                1. 3

                                  The linked interview with José Valim talking about how Elixir came about is really good! Alas no transcript but oh well.

                                  1. 2

                                    I can’t install pocketsphinx on the system I’m typing this on, but this should work. Well, it may not be perfectly accurate, but it’s a transcript :D

                                    wget https://cdn.changelog.com/uploads/podcast/194/the-changelog-194.mp3 -O /tmp/file.mp3 && ffmpeg -i /tmp/file.mp3 -ar 16000 -ac 1 /tmp/file.wav && pocketsphinx_continuous -infile /tmp/file.wav 2> pocketsphinx.log > ~/result.txt

                                    And then just read ~/result.txt for the transcript.

                                  1. 1

                                    Glad to hear that a replacement for -background: -webkit-canvas(...) is making it onto a standards track. https://webkit.org/blog/176/css-canvas-drawing/

                                    1. 3

                                      I really hate this as well. I still drive a 5spd … in America. I feel like that’s becoming incredibly rare. There is something that feels really good about using every appendage and having driving be a fully engaging experience.

                                      My favorite in-dash units are probably the Pioneers. They have decent navigation and they have the most useful functions (volume, switch track) as physical buttons. A friend of mine had a Cherokee and her stock in-dash unit was awful. There is no fucking reason air-con and the heated seats should be in a touch screen interface. Those should be physical buttons you can reach for without fumbling.

                                      I also hate the Audi climate controls … buttons to move the temperature up and down?! .. and you have to look down to watch it? Compare that to the Subaru’s, where it’s a physical dial you can adjust super fast once you’re used to it.

                                      1. 9

                                        AC seems like an application that would really benefit from always-listening voice control.

                                        Instead of switching on and off with commands like “AC on” and “AC off”, it should be programmed to respond to either “fuck me it’s hot” or “aaaaa the day-star it burnsssss” to switch on, and to “brrrrr” to deactivate.

                                      1. 1

                                        Nice article. I’ve had two Model M’s (the m key on the first one stopped responding) and I like the sturdy feel. However recently I’ve been finding the buckling springs a bit too loud and heavy. The keyboard is also pretty big so it’s hard to reach for the mouse. Maybe it’s time to go for a Spacesaver 104 :)

                                        1. 2

                                          it’s hard to reach for the mouse

                                          For this reason I have vi keybindings in all the programs I use most. It’s not a productivity thing, I just really dislike taking my paws off home row to reach for a pointing device.

                                          1. 2

                                             For the same reason, I have a mouse layer with movement on WASD and mouse buttons on F/R, for the more unruly programs.

                                            1. 1

                                               Same, but I’m also considering a TKL keyboard (CM MasterKeys S to start) or a 60%, which may take a while to get used to.

                                          1. 2

                                             The USR1 signal is supported on Linux for making dd report progress. Implementing something similar for cp and other commands, and then making ctrl-t send USR1, can’t be too hard. Surely it isn’t blocked by the Linux kernel itself?

                                            1. 8

                                              SIGUSR1 has a nasty disadvantage relative to SIGINFO: by default it kills the receiving process if no handler is installed. 🙁 The behavior you really want is what SIGINFO has: defaulting to a no-op when no handler is installed.

                                              • I don’t want to risk killing a long-running complicated pipeline that I was monitoring by accidentally sending SIGUSR1 to some process that doesn’t have a handler for it
                                              • there’s always a brief period between process start and the call to signal() or sigaction() during which SIGUSR1 will be lethal
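
                                              So every tool that wants to be pollable this way has to opt in explicitly. A minimal sketch of the opt-in (the reports list stands in for printing progress to stderr, which is what dd does):

                                              ```python
                                              import os
                                              import signal

                                              reports = []  # stand-in for printing bytes-copied to stderr, dd-style

                                              def on_status_request(signum, frame):
                                                  reports.append("status requested")

                                              # Without this line, an unhandled SIGUSR1 terminates the process.
                                              signal.signal(signal.SIGUSR1, on_status_request)

                                              os.kill(os.getpid(), signal.SIGUSR1)  # simulate a ctrl-t style poke
                                              ```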
                                              1. 1

                                                That’s interesting. The hacky solution would be to have a whitelist of processes that could receive SIGUSR1 when ctrl-t was pressed, and just ignore the possibility of someone pressing ctrl-t at the very start of a process.

                                                A whitelist shouldn’t be too hard to maintain. The only tool I know of that handles SIGUSR1 is dd.

                                              2. 5

                                                On BSD it’s part of the TTY layer, where ^T is the default value of the STATUS special character. The line printed is actually generated by the kernel itself, before sending SIGINFO to the foreground process group. SIGINFO defaults to ignored, but an explicit handler can be installed to print some extra info.

                                                I’m not sure how equivalent functionality could be done in userspace.

                                                1. 1

                                                  It would be a bit hacky, but the terminal emulator could send USR1 to the last started child process of the terminal, when ctrl-t is pressed. The BSD way sounds like the proper way to do it, though.

                                                  1. 4

                                                    I have a small script and a tmux binding for linux to do this:

                                                    #!/bin/sh
                                                    # tmux-signal pid [signal] - send signal to running processes in pid's session
                                                    # bind ^T run-shell -b "tmux-signal #{pane_pid} USR1"
                                                    
                                                    [ "$#" -lt 1 ] && exit 1
                                                    sid=$(cut -d' ' -f6 "/proc/$1/stat")
                                                    sig=${2:-USR1}
                                                    ps -ho state,pid --sid "$sid" | \
                                                    while read -r state pid; do
                                                            case "$state" in
                                                            R) kill -s "$sig" "$pid" ;;
                                                            esac
                                                    done
                                                    
                                                    1. 4

                                                      Perfect, now we only need to make more programs support USR1 and lobby for this to become the default for all Linux terminal emulators and multiplexers. :)

                                              1. 12

                                                You don’t have to use the golden ratio; multiplying by any constant with ones in the top and bottom bits and about half those in between will mix a lot of input bits into the top output bits. One gotcha is that it only mixes less-significant bits towards more-significant ones, so the 2nd bit from the top is never affected by the top bit, 3rd bit from the top isn’t affected by top two, etc. You can do other steps to add the missing dependencies if it matters, like a rotate and another multiply for instance. (The post touches on a lot of this.)

                                                FNV hashing, mentioned in the article, is an old multiplicative hash used in DNS, and the rolling Rabin-Karp hash is multiplicative. Today Yann Collet’s xxHash and LZ4 use multiplication in hashing. There have got to be a bajillion other uses of multiplication for non-cryptographic hashing that I can’t name, since it’s such a cheap way to mix bits.

                                                 It is, as the author says, kind of interesting that something like a multiplicative hash isn’t the default cheap function everyone’s taught. Integer division to calculate a modulus is maybe the most expensive arithmetic operation we commonly do when the modulus isn’t a power of two.
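
                                                 For reference, the whole scheme fits in a few lines. A sketch of a 64-bit Fibonacci/multiplicative hash (the constant is floor(2^64 / φ), which is odd and has its bits spread throughout):

                                                 ```python
                                                 GOLDEN = 0x9E3779B97F4A7C15  # floor(2**64 / phi); odd, bits well spread
                                                 MASK64 = (1 << 64) - 1

                                                 def fib_hash(x, table_bits):
                                                     # Multiply mod 2**64, then keep the TOP bits: low input bits
                                                     # influence them, but high input bits never reach lower output bits.
                                                     return ((x * GOLDEN) & MASK64) >> (64 - table_bits)

                                                 print(hex(fib_hash(1, 8)))  # 0x9e: the top byte of the constant
                                                 ```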

                                                1. 1

                                                  Nice! About the leftward bit propagation: can you do multiplication modulo a compile time constant fast? If you compute (((x * constant1) % constant2) % (1<<32)) where constant1 is the aforementioned constant with lots of ones, and constant2 is a prime number quite close to 1<<32 then that would get information from the upper bits to propagate into the lower bits too, right? Assuming you’re okay with having just slightly fewer than 1<<32 hash outputs.

                                                  (Replace 1<<32 with 1<<64 above if appropriate of course.)

                                                  1. 1

                                                    You still have to do the divide for the modulus at runtime and you’ll wait 26 cycles for a 32-bit divide on Intel Skylake. You’ll only wait 3 cycles for a 32-bit multiply, and you can start one every cycle. That’s if I’m reading the tables right. Non-cryptographic hashes often do multiply-rotate-multiply to get bits influencing each other faster than a multiply and a modulus would. xxHash arranges them so your CPU can be working on more than one at once.

                                                    (But worrying about all bits influencing each other is just one possible tradeoff, and, e.g. the cheap functions in hashtable-based LZ compressors or Rabin-Karp string search don’t really bother.)

                                                    1. 1

                                                      you’ll wait 26 cycles for a 32-bit divide on Intel Skylake

                                                      And looking at that table, 35-88 cycles for a 64-bit divide. Wow. That’s so many cycles, I didn’t realize. But I should have: on a 2.4 GHz processor 26 cycles is 10.83 ns per op, which is roughly consistent with the author’s measurement of ~9 ns per op.

                                                      1. 1

                                                        That’s not what I asked. I asked a specific question.

                                                        can you do multiplication modulo a compile time constant fast?

                                                        similarly to how you can do division by a constant fast by implementing it as multiplication by the divisor’s multiplicative inverse modulo 2^(word size). clang and gcc perform this optimisation out of the box already for division by a constant. What I was asking is whether there’s a similar trick for modulo by a constant. You obviously can do (divide by divisor, multiply by divisor, subtract from original number), but I’m wondering if there’s something quicker with a shorter dependency chain.

                                                        1. 1

                                                          OK, I get it. Although I knew about the inverse trick for avoiding DIVs for constant divisions, I didn’t know or think of extending that to modulus even in the more obvious way. Mea culpa for replying without getting it.

                                                          I don’t know the concrete answer for the best way to do n*c1%(2^32-5) or such. It does at least intuitively seem like it should be possible to get some win from using the high bits of the multiply result, as the divide-by-multiplying tricks do.
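
                                                          For what it’s worth, such a trick has been published (Lemire et al.’s “fastmod”): precompute M = floor(2^64 / d) + 1 for the constant divisor d, and the remainder falls out of two multiplies with no divide. A sketch in Python, whose big ints stand in for a 64×64→128-bit multiply:

                                                          ```python
                                                          MASK64 = (1 << 64) - 1

                                                          def fastmod_u32(n, d, M):
                                                              # M = (2**64 // d) + 1, precomputed once per constant divisor d.
                                                              lowbits = (M * n) & MASK64   # ~ the fractional part of n/d, scaled up
                                                              return (lowbits * d) >> 64   # scale back down to recover n mod d

                                                          d = (1 << 32) - 5                # divisor just below 2**32
                                                          M = (1 << 64) // d + 1
                                                          ```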

                                                    2. 1

                                                      So does that mean that when the author says Dinkumware’s FNV1-based strategy is too expensive, it’s only more expensive because FNV1 works byte-by-byte, while Fibonacci hashing (multiplying by 2^64 / Φ) works on 8 bytes at a time?

                                                      Does that mean you could beat all these implementations by finding a multiplier that produces an even distribution when used as a hash function working on 8-byte words? That is, he says Fibonacci hashing doesn’t produce a great distribution, whereas multipliers like the FNV1 prime are chosen to produce good, even distributions. So if you found an even-distribution-producing constant for an 8-byte-word multiplicative hash, would that work just as well as whatever-hash-then-Fibonacci-hash, but be faster because it’s one step, not two?

                                                      1. 1

                                                        I think you’re right about FNV and byte- vs. word-wise multiplies.

                                                        Re: 32 vs. 64, it does look like Intel’s latest big cores can crunch through 64-bit multiplies pretty quickly. Things like Murmur and xxHash don’t use them; I don’t know if that’s because perf on current chips is for some reason not as good as it looks to me or if it’s mainly for the sake of older or smaller platforms. The folks that work on this kind of thing surely know.

                                                        Re: getting a good distribution, the limitations on the output quality you’ll get from a single multiply aren’t ones you can address through choice of constant. If you want better performance on the traditional statistical tests, rotates and multiplies like xxHash or MurmurHash are one approach. (Or go straight to SipHash, which prevents hash flooding.) Correct choice depends on what you’re trying to do.

                                                        1. 2

                                                          That makes me wonder what hash algorithm ska::unordered_map uses that was faster than FNV1 in Dinkumware but doesn’t have the desirable property of evenly mixing high bits without multiplying the output by 2^64 / φ. Skimming his code, it looks like std::hash.

                                                          On my macOS system, running Apple LLVM version 9.1.0 (clang-902.0.39.2), std::hash for primitive integers is the identity function (i.e. no hash at all), and for strings it’s murmur2 on 32-bit systems and cityhash64 on 64-bit systems.

                                                          // We use murmur2 when size_t is 32 bits, and cityhash64 when size_t
                                                          // is 64 bits.  This is because cityhash64 uses 64bit x 64bit
                                                          // multiplication, which can be very slow on 32-bit systems.
                                                          

                                                          Looking at CityHash, it also multiplies by large primes (with the first and last bits set of course).

                                                          Assuming then that multiplying by his constant does nothing for string keys (plausible, since his benchmarks are only for integer keys), does that mean his benchmark just proves that Dinkumware using FNV1 for integer keys is better than no hash, and that multiplying an 8-byte word by a constant is faster than multiplying each integer byte by a constant?

                                                      2. 1

                                                        A fair point that came up over on HN is that people mean really different things by “hash” even in non-cryptographic contexts; I mostly just meant “that thing you use to pick hashtable buckets.”

                                                        In a trivial sense a fixed-size multiply clearly isn’t a drop-in for hashes that take arbitrary-length inputs, though you can use multiplies as a key part of variable-length hashing like xxHash etc. And if you’re judging your hash by checking that outputs look random-ish in a large statistical test suite, not just how well it works in your hashtable, a multiply also won’t pass muster. A genre of popular non-cryptographic hashes are like popular non-cryptographic PRNGs in that way: traditionally judged by running a bunch of statistical tests.

                                                        That said, these “how random-looking is your not-cryptographically-random function” games annoy me a bit in both cases. Crypto-primitive-based functions (SipHash for hashing, cipher-based PRNGs) are pretty cheap now and are immune not just to common statistical tests but to any practically relevant method for creating pathological input or detecting nonrandomness; if they weren’t, the underlying functions would be broken as crypto primitives. They’re a smart choice more often than you might think, given that hashtable-flooding attacks are a thing.

                                                        If you don’t need insurance against all bad inputs, and you’re tuning hard enough that SipHash is intolerable, I’d argue it’s reasonable to look at cheap simple functions that empirically work for your use case. Failing statistical tests doesn’t make your choice wrong if the cheaper hashing saves you more time than any maldistribution in your hashtable costs. You don’t see LZ packers using MurmurHash, for example.

                                                      1. 0

                                                        A list of beliefs about programming that I maintain are misconceptions.

                                                        1. 3

                                                          Small suggestion: use a darker, bigger font. There are likely guidelines somewhere, but I don’t think you can go wrong using #000 for text people are supposed to read for longer than a couple of seconds.

                                                          1. 3

                                                            Current web design seems allergic to any sort of contrast. Even hyper-minimalist web design calls for less contrast, for reasons I can’t figure out. Admittedly, I’m a sucker for contrast; I find most programming color schemes hugely distasteful for their lack of contrast.

                                                            1. 6

                                                              I think a lot of people find the maximum contrast ratios their screens can produce physically unpleasant to look at when reading text.

                                                              I believe that people with dyslexia in particular find reading easier with contrast ratios lower than #000-on-#fff. Research on this is a bit of a mixed bag but offhand I think a whole bunch of people report that contrast ratios around 10:1 are more comfortable for them to read.

                                                              As well as personal preference, I think it’s also quite situational? IME, bright screens in dark rooms make black-on-white headache inducing but charcoal-on-silver or grey-on-black really nice to look at.

                                                              WCAG AAA asks for a contrast ratio of 7:1 or higher in body text, which does leave a nice amount of leeway for producing something that doesn’t feel like looking into a laser pointer in the dark every time you hit the edge of a glyph. :)

                                                              As for the people putting, like, #777-on-#999 on the web, I assume they’re just assholes or something, I dunno.

                                                              Lobsters is #333-on-#fefefe which is a 12.5:1 contrast ratio and IMHO quite nice with these fairly narrow glyphs.
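                                                              Out of curiosity, those numbers can be reproduced from WCAG's relative-luminance formula. A quick sketch (C++; the function names are mine, not from any library):

                                                              ```cpp
                                                              #include <cassert>
                                                              #include <cmath>
                                                              #include <utility>

                                                              // WCAG 2.x relative luminance of an sRGB color given as 0xRRGGBB.
                                                              double luminance(unsigned rgb) {
                                                                  double chan[3];
                                                                  for (int i = 0; i < 3; ++i) {
                                                                      double c = ((rgb >> (16 - 8 * i)) & 0xff) / 255.0;
                                                                      // Linearize the gamma-encoded sRGB channel value.
                                                                      chan[i] = (c <= 0.03928) ? c / 12.92
                                                                                               : std::pow((c + 0.055) / 1.055, 2.4);
                                                                  }
                                                                  return 0.2126 * chan[0] + 0.7152 * chan[1] + 0.0722 * chan[2];
                                                              }

                                                              // WCAG contrast ratio between two colors, always >= 1.
                                                              double contrast(unsigned fg, unsigned bg) {
                                                                  double l1 = luminance(fg), l2 = luminance(bg);
                                                                  if (l1 < l2) std::swap(l1, l2);
                                                                  return (l1 + 0.05) / (l2 + 0.05);
                                                              }
                                                              ```

                                                              contrast(0x333333, 0xfefefe) comes out around 12.5, and #000-on-#fff gives exactly 21:1.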

                                                              (FWIW, I configure most of my software for contrast ratios around 8:1.)

                                                              1. 2

                                                                Very informative, thank you!

                                                          2. 3

                                                            I think the byte-order argument doesn’t hold up when you mention ntohs and htons, which are exactly where byte order needs to be accounted for…

                                                            1. 2

                                                              If you read the byte stream as a byte stream and shift them into position, there’s no need to check endianness of your machine (just need to know endianness of the stream) - the shifts will always do the right thing. That’s the point he was trying to make there.
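                                                              For instance (a sketch, helper names mine): pulling a big-endian value out of a byte stream with shifts gives the right answer on any host, because the shifts operate on values rather than on memory layout:

                                                              ```cpp
                                                              #include <cassert>
                                                              #include <cstdint>

                                                              // Assemble a 16-bit value from a big-endian byte stream.
                                                              // Identical result on little- and big-endian hosts.
                                                              uint16_t read_u16_be(const uint8_t *p) {
                                                                  return static_cast<uint16_t>((p[0] << 8) | p[1]);
                                                              }

                                                              // Same idea for 32 bits.
                                                              uint32_t read_u32_be(const uint8_t *p) {
                                                                  return (static_cast<uint32_t>(p[0]) << 24) |
                                                                         (static_cast<uint32_t>(p[1]) << 16) |
                                                                         (static_cast<uint32_t>(p[2]) << 8)  |
                                                                          static_cast<uint32_t>(p[3]);
                                                              }
                                                              ```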

                                                              1. 2

                                                                ntohs and htons do that exact thing, and you don’t need to check the endianness of your machine, so the comment about not understanding why they exist makes me feel like the author is not quite grokking it. Those functions/macros can be implemented to do exactly the thing linked to in the blog post.

                                                          1. 3

                                                            One problem with std::optional, at least at the moment, while it’s relatively new, is that std is opinionated, so you often won’t find library functions that work with a std::optional-based codebase.

                                                            For example, parsing an integer from a string is a classic example of a function which might not succeed. So it would make sense to use std::optional to store the result. However, the standard library provides int stoi(const std::string& str, std::size_t* pos = 0, int base = 10) and friends, which signal failure by throwing exceptions.

                                                            So, in theory, std::optional provides an alternative way to handle failure, somewhat like some Haskell or Rust code might, making the possibility of failure explicit in the type, and thus forcing you to explicitly handle it or pass it on. However (unless a library exists which I’m not aware of?), you may need to reimplement large parts of the standard library to make them fit.
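                                                            To illustrate (parse_int is a hypothetical name, not a standard function): a std::optional-returning wrapper over stoi might look like this, requiring C++17:

                                                            ```cpp
                                                            #include <cassert>
                                                            #include <optional>
                                                            #include <stdexcept>
                                                            #include <string>

                                                            // Hypothetical helper: parse an int, signalling failure in the
                                                            // return type rather than by throwing.
                                                            std::optional<int> parse_int(const std::string &s, int base = 10) {
                                                                try {
                                                                    std::size_t pos = 0;
                                                                    int value = std::stoi(s, &pos, base);
                                                                    if (pos != s.size()) return std::nullopt; // trailing junk
                                                                    return value;
                                                                } catch (const std::invalid_argument &) {
                                                                    return std::nullopt;
                                                                } catch (const std::out_of_range &) {
                                                                    return std::nullopt;
                                                                }
                                                            }
                                                            ```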

                                                            1. 3

                                                              Right. “This is a feature of the standard library!” means something entirely different in C++ than in other programming languages.

                                                              1. 2

                                                                Can you make it much less of a headache by defining a generic function that takes a lambda, calls it in a try/catch, returns the successful value from the try branch, returns nullopt from the catch?
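                                                                 Something like this, perhaps (a sketch; the name to_optional is mine, and note it swallows every exception type indiscriminately):

                                                                 ```cpp
                                                                 #include <cassert>
                                                                 #include <optional>
                                                                 #include <string>
                                                                 #include <utility>

                                                                 // Generic adapter: call f(args...), returning nullopt if it
                                                                 // throws. All information about *which* exception is lost.
                                                                 template <typename F, typename... Args>
                                                                 auto to_optional(F &&f, Args &&...args)
                                                                     -> std::optional<decltype(f(std::forward<Args>(args)...))> {
                                                                     try {
                                                                         return f(std::forward<Args>(args)...);
                                                                     } catch (...) {
                                                                         return std::nullopt;
                                                                     }
                                                                 }
                                                                 ```

                                                                 Used as e.g. auto n = to_optional([] { return std::stoi("123"); });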

                                                                1. 1

                                                                  There are any number of workarounds, that obfuscate the code to varying degrees. This same situation arose with Optional in java 8, it’s there, but not really, so a lot of places you’d like to use it you have to go through similar contortions. The other problem is if interacting with different teams writing different parts of the app; everyone has to be on the same page or you’ll end up wrapping/unwrapping optional all over. And libraries. In the end I found optionals were a lot of trouble for very little gain.

                                                                  1. 1

                                                                    I did wonder about that. However, blindly catching all different exceptions and effectively discarding the information about which exception it was seems unwise. Of course you could keep the information while still using sum types, but then you don’t really want std::optional, you want an either type which can hold either a valid value or an error code. I’m not sure whether the standard library has one of these or whether you’d have to roll your own.

                                                                    1. 1

                                                                      blindly catching all different exceptions and effectively discarding the information about which exception it was seems unwise

                                                                      Sure, I wouldn’t be very happy with a blind try/catch around something like a database access or RPC call. Just if the thing you’re wrapping is something really boring like (say) parsing a string into an integer, the exception if it goes wrong isn’t going to be very interesting anyway.

                                                                1. 1

                                                                  On the window vs global thing: for the love of compatibility, please put both in, both pointing to exactly the same object.

                                                                  1. 23

                                                                    This is a bit disappointing. It feels a bit like we are walking into the situation OpenGL was built to avoid.

                                                                    1. 7

                                                                      To be honest we are already in that situation.

                                                                      You can’t really use GL on mac, it’s been stuck at D3D10 feature level for years and runs 2-3x slower than the same code under Linux on the same hardware.

                                                                      It always seemed like a weird decision from Apple to have terrible GL support, like if I was going to write a second render backend I’d probably pick DX over Metal.

                                                                      1. 6

                                                                        I remain convinced that nobody really uses a Mac on macOS for anything serious.

                                                                        And why pick DX over Metal when you can pick Vulkan over Metal?

                                                                        1. 3

                                                                          Virtually no gaming or VR is done on a mac. I assume the only devs to use Metal would be making video editors.

                                                                          1. 1

                                                                            This is a bit pedantic, but I play a lot of games on mac (mainly indie stuff built in Unity, since the “porting” is relatively easy), and several coworkers are also mac-only (or mac + console).

                                                                            Granted, none of us are very interested in the AAA stuff, except a couple of games. But there’s definitely a (granted, small) market for this stuff. Luckily stuff like Unity means that even if the game only sells like 1k copies it’ll still be a good amount of money for “provide one extra binary from the engine exporter.”

                                                                            The biggest issue is that Mac hardware isn’t shipping with anything powerful enough to run most games properly, even when you’re willing to spend a huge amount of money. So games like Hitman got ported but you can only run it on the most expensive MBPs or iMac Pros. Meanwhile you have sub-$1k windows laptops which can run the game (albeit not super well)

                                                                          2. 2

                                                                          I think Vulkan might not have been ready when Metal was first sketched out – and Apple does not usually like to compromise on technology ;)

                                                                            1. 2

                                                                              My recollection is that Metal appeared first (about June 2014), Mantle shipped shortly after (by a couple of months?), DX12 shows up mid-2015 and then Vulkan shows up in February 2016.

                                                                              I get a vague impression that Mantle never made tremendous headway (because who wants to rewrite their renderer for a super fast graphics API that only works on the less popular GPU?) and DX12 seems to have made surprisingly little (because targeting an API that doesn’t work on Win7 probably doesn’t seem like a great investment right now, I guess? Current Steam survey shows Win10 at ~56% and Win7+8 at about 40% market share among people playing videogames.)

                                                                              1. 2

                                                                                Mantle got heavily retooled into Vulkan, IIRC.

                                                                                1. 1

                                                                                  And there was much rejoicing. ♥

                                                                      1. 5

                                                                        Congratulations to Lua, Zig, and Rust on being in C’s territory. Lua actually beat it. Nim and D are nearly where C++ is but not quite. Hope Nim closes that gap and any others given its benefits over C++, esp readability and compiling to C.

                                                                        1. 1

                                                                          To be clear, and a little pedantic, Lua =/= LuaJIT.

                                                                          1. 1

                                                                            The only thing I know about Lua is it’s a small, embeddable, JIT’d, scripting language. So, what did you mean by that? Do Lua the language and LuaJIT have separate FFI’s or something?

                                                                            1. 5

                                                                              I think just that there are two implementations. One is just called “Lua”, is an interpreter written in C, supposedly runs pretty fast for a bytecode interpreter. The other is LuaJIT and runs much faster (and is the one benchmarked here).

                                                                              1. 1

                                                                                I didn’t even know that. Things I read on it made me think LuaJIT was the default version everyone was using. Thanks!

                                                                                  1. 2

                                                                                    I waited till I was having a cup of coffee. Wow, this is some impressive stuff. More than I had assumed. There’s a lot of reuse/blending of structures and space. I’m bookmarking the links in case I can use these techniques later.

                                                                                  2. 2

                                                                                    I think people when doing comparative benchmarks very often skip over the C Lua implementation because it isn’t so interesting to them.

                                                                                1. 4

                                                                                  Extra context: LuaJIT isn’t up to date with the latest Lua either, so they’re almost different things, sorta.

                                                                                  LuaJIT is extremely impressive.

                                                                            1. 1

                                                                              I’m a little surprised the x87 is even involved here - doesn’t targeting “modern” x86 usually involve using the scalar SSE instructions, since they behave more predictably than x87 does?

                                                                              1. 3

                                                                                Even if your compiler emits exclusively SSE instructions for actual arithmetic, the de-facto-standard calling conventions on x86 (but not x86-64), cdecl and stdcall, return floating-point values from functions by sticking them onto the x87 FPU stack. So there will still be a handful of x87 instructions emitted solely to push/pop the FPU stack, even if no other x87 features are used, which seems to be what’s happening here. That convention was set ages ago and changing it would break ABI compatibility.

                                                                                1. 1

                                                                                  Interesting, thank you!

                                                                              1. 25

                                                                                This seems a good time to promote a paper our team published last year (sorry to blow my own trumpet :P ): http://soft-dev.org/pubs/html/barrett_bolz-tereick_killick_mount_tratt__virtual_machine_warmup_blows_hot_and_cold_v6/

                                                                                We measured not only the warmup, but also the startup of lots of contemporary JIT compilers.

                                                                                On a quad-core i7-4790 @ 3.6GHz with 32GB of RAM, running Debian 8:

                                                                                • C was the fastest to start up at 0.00075 secs (+/- 0.000029) – surprise!
                                                                                • LuaJIT was the next fastest to start up at 0.00389 secs (+/- 0.000442).
                                                                                • V8 was in 3rd at 0.08727 secs (+/- 0.000239).
                                                                                • The second slowest to start up was HHVM at 0.75270 secs (+/- 0.002056).
                                                                                • The slowest overall to start up was JRubyTruffle (now called TruffleRuby) at 2.66179 sec (+/- 0.011864). This is a Ruby implementation built on GraalVM (plain Java on GraalVM did much better in terms of startup).

                                                                                Table 3 in the linked paper has a full breakdown.

                                                                                The main outcome of the paper was that few of the VMs we benchmarked reliably achieved a steady state of peak performance after 2000 benchmark iterations, and some slowed down over time.

                                                                                1. 1

                                                                                  I saw a talk about this. Very cool stuff! It is a good antidote to the thrall of benchmarks.

                                                                                  1. 1

                                                                                    Cool work! You should make that a submission on its own in the morning in case someone misses it due to a filter. For instance, people who don’t care about Python specifically like main post is tagged with. Just programming, performance, and compiler tags should do. Good news is a lot of people still saw and enjoyed it per the votes. You definitely deserve an “authored by” submission, though. :)

                                                                                    1. 3

                                                                                      It was on the lobsters front page about six months ago. https://lobste.rs/s/njsxtv/virtual_machine_warmup_blows_hot_cold

                                                                                      It was a very good paper and I personally wouldn’t mind seeing it reposted, but I don’t actually know what the etiquette for that is here.

                                                                                      1. 1

                                                                                        I forgot. My bad. I should probably do a search next time.

                                                                                  1. 6

                                                                                    Elegant! The simplicity is really impressive, a real 80/20 kind of solution.

                                                                                    Maybe you could solve the pipefail thing by having a tts utility that invokes the target program as a subprocess, capturing its stdout+err and then when it stops, wait4() it and return with the same error code the child did.
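                                                                                     A rough POSIX sketch of the exit-status half of that idea (names mine; the part that captures and processes the child’s output is elided):

                                                                                     ```cpp
                                                                                     #include <cassert>
                                                                                     #include <sys/wait.h>
                                                                                     #include <unistd.h>

                                                                                     // Run argv as a child process and return its exit code,
                                                                                     // so the wrapper can exit with the same status the child did.
                                                                                     int run_and_propagate(char *const argv[]) {
                                                                                         pid_t pid = fork();
                                                                                         if (pid == 0) {
                                                                                             execvp(argv[0], argv);
                                                                                             _exit(127); // exec itself failed
                                                                                         }
                                                                                         int status = 0;
                                                                                         waitpid(pid, &status, 0);
                                                                                         if (WIFEXITED(status)) return WEXITSTATUS(status);
                                                                                         if (WIFSIGNALED(status)) return 128 + WTERMSIG(status); // shell convention
                                                                                         return 1;
                                                                                     }
                                                                                     ```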

                                                                                    1. 4

                                                                                      Added to rtss - stderr of the child still goes to stderr, so redirection works as you’d expect.

                                                                                      1. 2

                                                                                        Nice. ❤