1. 12

    Yes, cat | grep is a “useless use of cat”, but it allows me to quickly change the grep regex, which is much more annoying if I have to navigate past the filename first.

    1. 8

      I agree, and I think it’s kind of stupid to try to avoid cat outside of scripts, but that said, there is a way to solve this specific problem:

      <file grep regex
      
      1. 1

        That’s cool. I think I’ve read about this shell syntax before in man pages but never saw the use for it. Now I do!

        1. 1

          I’ve always liked that syntax, especially if you have a redirection at the end of a follow-on pipeline (<foo | bar >baz ) but it can break in bash with some shell constructs in the pipeline (some uses of while)
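
          If I remember right, the classic breakage is that each stage of a bash pipeline runs in a subshell, so variables set inside a while loop that sits in the pipeline are lost afterwards. A rough sketch (foo is the input file from the example above):

          count=0
          <foo grep -c . | while read -r count; do :; done
          echo "$count"   # still 0 in bash: the while loop ran in a subshell

          count=0
          while read -r count; do :; done < <(<foo grep -c .)
          echo "$count"   # works: the loop runs in the current shell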

          1. 2

            I think you mean

            <foo bar | baz >qux
            

            I too find this pattern useful, specifically for troff pipelines.

        2. 4

          “Useless use of cat” is usually bad advice, anyhow.

          As far as I can tell, it only matters for performance-sensitive scripts. If someone’s just hacking around on the command line, cat ... | ... is extremely useful since it allows quick switching out of inputs. Let’s say I’ve got some_expensive_script.sh | ... – better to capture the output to a file, run cat output.txt | ... and build up the pipeline until it works, then swap back in the expensive script call when I’m ready.
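
          For instance, a sketch of that workflow (the grep/sort stages are just stand-ins for whatever pipeline I happen to be building up):

          # run the expensive thing once and capture its output
          ./some_expensive_script.sh > output.txt

          # iterate on the cheap part of the pipeline against the capture
          cat output.txt | grep -v DEBUG | sort | uniq -c

          # once the pipeline works, swap the real producer back in
          ./some_expensive_script.sh | grep -v DEBUG | sort | uniq -c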

          (Also, I very often end up accidentally writing < output.txt | grep ... because I forget to delete the pipe, and so grep gets nothing! This leads to a great deal of frustrated debugging until I notice it. Best to just keep it a pipeline while developing rather than adhering to some random internet person’s notions of terminal purity.)

        1. 4

          lib/pq: An early Postgres frontrunner in the Go ecosystem. It was good for its time and place, but has fallen behind, and is no longer actively maintained.

          Latest release was 6 days ago: https://github.com/lib/pq/releases/tag/v1.10.3, so it seems that it is still maintained.

          1. 6

            The README explicitly says:

            This package is currently in maintenance mode. For users that require new features or reliable resolution of reported bugs, we recommend using pgx which is under active development.

            1. 8

              “In maintenance mode” does not mean that it is not actively maintained.

              1. 5

                I would argue that the statement “Maintainers usually do not resolve reported issues” does mean it’s not actively maintained.

                That being said, this is probably getting into the semantics of what “actively maintained” means. For me, it means there’s active development and that reported issues will be resolved, neither of which seem to be the case for lib/pq at the moment.

          1. 16

            I am a bit reassured that I am not the only one who finds the numerous claims of “blazingly fast” Rust libs/apps a bit annoying. ^^

            1. 27

              93°C on all cores is certainly blazing.

              1. 2

                They might run fast, but compiling is painfully slow. My 2017 (I think) dual core XPS takes ages to compile anything with GUI elements or anything even moderately complicated. On my desktop machine (5900X) it’s bearable.

              1. 7

                I think one lesson is “Don’t trust any performance comparison that doesn’t accept pull requests.”

                1. 9

                  “Don’t trust a statistic which you haven’t manipulated yourself” is a (paraphrased) quote attributed to Churchill (decidedly not German, as I thought before).

                  Doing benchmarks correctly is so hard that I would default to “suspicious until proven otherwise” when looking at literally any result.

                1. 2

                  First idea that popped into my head is that this is only applicable if the thing in question is constructible. Non-existence is not constructible by its nature: I can prove to you that no decision procedure for the halting problem exists, but I cannot show this constructively.

                  1. 1

                    I would really like to learn more about this; aren’t Coq proofs constructive by nature?

                    Here is a proof that the halting problem for turing machines is undecidable: https://github.com/uds-psl/coq-library-undecidability/blob/30d773c57f79e1c5868fd369cd87c9e194902dee/theories/TM/TM_undec.v#L8-L12

                    There are other proofs in that repo that show other problems as being undecidable.

                    1. 2

                      Huh, TIL. Indeed, it seems that there exist proofs of the undecidability of the halting problem that are constructive.

                      ¬ (A ∧ ¬ A) is apparently provable in intuitionistic logic, which suffices for diagonalization arguments:

                      lemma "¬ (A ∧ ¬ A)"
                      proof (rule notI)
                        assume 1: "A ∧ ¬ A"
                        from 1 have "A" by (rule conjE)
                        from 1 have "¬ A" by (rule conjE)
                        from `¬ A` `A` show False by (rule notE)
                      qed
                      

                      The general point still stands though, as there are other examples, such as non-constructive existence proofs.


                      In general Coq proofs are not necessarily constructive, since one can define the law of the excluded middle as an axiom in Coq (and this is done in e.g. Coq.Classical). I can’t say anything about the proofs you linked though, as my Coq-fu is very limited.

                      1. 1

                        I think, the way the definitions expand in that repo, that undecidable (HaltTM 1) is decidable (HaltTM 1) -> decidable (HaltTM 1), which is trivially true. That is, it’s taken as an axiom that Turing machines are undecidable. (I think? I might be misreading)

                      2. 2

                        While Coq proofs are constructive, proving “~ exists t, t determines halting” does not construct any Turing machines; what it constructs is something that takes a hypothetical existence proof and builds a proof of false from it.

                        I.e. in constructive math, ~ P is a function P -> False. This function can never be invoked, as you can never build a P object.
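
                        A tiny standalone Coq sketch of that reading (not taken from the linked repo):

                        (* ~ P is notation for P -> False: a proof of ~ P is a function that
                           would turn a (hypothetical) proof of P into a proof of False. *)
                        Lemma neg_is_function (P : Prop) : ~ P -> (P -> False).
                        Proof. intros np p. exact (np p). Qed.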

                      3. 1

                        As this post was tagged practices (not math) and the article addresses the difficulty of convincing people, I suspect you’ve misunderstood the author’s use of the phrase “constructive proof.”

                        Mathematical (constructive proof) vs rhetorical (constructive criticism).

                        1. 1

                          I don’t feel like the author talks about constructive criticism at all, how did you come to that conclusion?

                          1. 1

                            As this post was tagged practices (not math) and the article addresses the difficulty of convincing people

                            ☝🏾

                            1. 2

                              Just for future reference, “constructive proof” is a term from math and I meant my usage of the term to be an analogy to math. Math also involves convincing people. But maybe I should have tagged it computer science or math, sorry.

                      1. 5

                        Some thoughts on horizontal overflow:

                        If you ever have trouble tracking down what element is causing overflow, add this rule in the browser inspector:

                        * { outline: thin solid red }
                        

                        This makes the bounding boxes of every single element visible. Very helpful.

                        The author mentioned using word-break: break-word. I have an open question on that rule: is there any reason not to set it on the body on most sites? I’ve gone on quests to remove horizontal scrolling caused by user-entered content, and I end up setting word-break in a ton of different places. I’m not sure of the trade-offs of just setting it globally.
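
                        Concretely, the global version I’m wondering about would be something like this (word-break: break-word is, as far as I know, deprecated in favour of overflow-wrap, so presumably one would set that instead or as well):

                        body {
                          word-break: break-word;    /* the rule mentioned in the article */
                          overflow-wrap: break-word; /* the non-deprecated property, as far as I know */
                        }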

                        1. 2

                          Some googling reveals that break-word is not CJK-aware, but I’m no expert myself. I only publish English content, so this is fine for me :)

                        1. 15

                          This is exactly what I was thinking when more and more stuff was pushed into the USB-C stack.

                            Previously it was kinda easy to explain that “no, you can’t put the USB cable in the HDMI slot, that won’t work”. Now you have cables that look identical, but one can charge with 90W and the other can’t, even though both fit. It’s going to be confusing for everyone, having to be careful about which cable can be plugged in where.

                          1. 9

                            Everything about the official naming/branding used in USB 3 onward seems purposely designed to be confusing.

                            1. 5

                              It seems like for some reason the overriding priority was making the physical connector the same, but it’s fine to run all kinds of incompatible power and signals through it. I preferred the old way of giving different signals different connectors so you knew what was going on!

                              1. 2

                                  The downside to that is, I guess, that each type of device you might want to connect to a small form factor device such as a phone or a slim laptop would need to come with every type of connector you might want, or alternatively you would need dongles left and right.

                                I can now charge my phone with my laptop charger, that has not been the case in previous generations.

                                  I believe we are moving into PoE-enabled network cable territory on some conceptual level; data + power (either optional) is the level of abstraction the connector is made common on.

                                1. 4

                                  I’m surprised the business laptop manufacturers haven’t tried getting into PoE based chargers, considering most offices have not just Ethernet, but PoE, and it’d solve two cables at once.

                                  1. 1

                                      I think (a hunch more than data-backed) that the last few meters of networking are increasingly moving towards wireless, at least if the end-user equipment is a laptop. Monitors on the other hand are still cable-connected, and that is one of the singled-out use cases for USB-C now that we have it.

                                      Looking back to before USB-C, I’d agree power and networking would be a neat thing to combine, but it would have to have a different connector than regular Ethernet; those plastic tab spring lock things would not last long.

                                    1. 1

                                      I mean, I would like an Ethernet AAM for Type C…

                                    2. 1

                                      I’d love to know what you’re basing this thesis on because as far as I know I’ve never worked in a German office with PoE in the last 20 years. (Actually it was a big deal in my last company because we had PoE-powered devices, so there was kind of a “where and in which room do we put PoE switches for the specialty hardware”)

                                      1. 1

                                        Most offices nowadays have PoE if only for deskphones.

                                  2. 1

                                      It’s fine if you have a manufacturer that you can trust to make devices that work with everything (USB, DP, TB, PD, etc.) the cable can throw at it. (Like, my laptop will do anything a Type C cable can do, so there’s no confusion.) The problem is once you get less scrupulous manufacturers of the JKDSUYU variety on Amazon et al, the plan blows up spectacularly.

                                  3. 1

                                    When the industry uses a bunch of mutually-incompatible connectors for different types of cables, tech sites complain “Ugh, why do I need all these different types of cables! It’s purposely designed to be overcomplex and confusing!”

                                    When the industry settles on one connector for all cable types, tech sites complain “Ugh, how am I supposed to tell which cables do which things! It’s purposely designed to be overcomplex and confusing!”

                                    1. 5

                                        Having the same connector but incompatible cables is much worse than the alternative.

                                      1. 3

                                        The alternative is that every use case develops its own incompatible connector to distinguish its particular power/data rates and feature set. At which point you need either a dozen ports on every device, or a dozen dongles to connect them all to each other.

                                        This is why there have already been well-intentioned-but-bad-idea-in-practice laws trying to force standardization onto particular industries (like mobile phones). And the cost of standardization is that not every cable which has the connector will have every feature of every other cable that currently or might in the future exist.

                                      2. 1

                                          They could’ve avoided this by either making it obvious when a cable or connector doesn’t support the full feature set, or by simply disallowing any non-full-featured cables and connectors. Have fun buying laptops and PCs while figuring out how many of their only 3 USB-C connections are actually able to handle what you need, which of them you can use in parallel for the stuff you want to use in parallel, and which of them is the only port that can actually do everything but is also reserved for charging your laptop. It’s a god damn nightmare and makes many laptops unusable outside some hipster coffee machine.

                                          Meanwhile I’m going to buy something that has a visible HDMI, DP, LAN and USB-A connector, so I’m not stuck choosing between charging, a mouse connection, an external display, and connecting my USB3 drive. It’s enraging.

                                        1. 1

                                          or simply disallowing any non full-featured cables and connectors

                                          OK, now we’re back on the treadmill, because the instant someone works out a way to do a cable that can push more power or data through, we need a new connector to distinguish from already-manufactured cables which will no longer be “full-featured”. And now we’re back to everything having different and incompatible connectors so that you either need a dozen cables or a dozen dongles to do things.

                                          Or we have to declare an absolute end to any improvements in cable features, so that a cable manufactured today will still be “full-featured” ten years from now.

                                          There is no third option here that magically lets us have the convenience of a universal connector and always knowing the cable’s full capabilities just from a glance at the connector shape, and ongoing improvements in power and data transmission. In fact for some combinations it likely isn’t possible to have even two of those simultaneously.

                                          It’s a god damn nightmare and makes many laptops unusable outside some hipster coffee machine.

                                          Ah yes, it is an absolute verifiable objective fact that laptops with USB-C ports are completely unsuitable and unusable by any person, for any use case, under any circumstance, in any logically-possible universe, ever, absolutely and without exception.

                                          Which was news to me as I write this on such a laptop. Good to know I’m just some sort of “hipster coffee” person you can gratuitously insult when you find you’re lacking in arguments worthy of the name.

                                          1. 1

                                            Ah yes, it is an absolute verifiable objective fact that laptops with USB-C ports are completely unsuitable and unusable by any person, for any use case, under any circumstance, in any logically-possible universe, ever, absolutely and without exception.

                                              You really do want to make this about yourself, don’t you? I never said you’re not allowed to have fun with them, I’m just saying that for many purposes those machines are pretty bad. And there are far too many systems produced now with said specs, so it’s becoming a problem for people with a different use case than yours: more connections, fewer dongles or hubs, and the requirement that you know about the specific capabilities before buying. Have fun explaining to your family why model X doesn’t actually do what they thought, because its USB-C is just USB 2.0; why their USB-C cable doesn’t work even though it looks the same; why there are multiple versions of the same connector with different specs; why one USB-C port doesn’t mean it can do everything the port right beside it can do; why there is no way to figure out whether a USB-C cable can actually handle a 4k60 display before trying it out. Even for 1000+€ models that you might want to use with an external display, mouse, keyboard, headset, charging and a YubiKey, you get 3 USB-C connections these days. USB-C could’ve been something great, but now it’s an RNG for what you actually get. Some colored cables and requirements for labeling the capabilities would already have helped a lot.

                                              Yes, I’m sorry for calling it hipster in my rage against the reality of USB-C; let’s call it “people who do not need many connections (2+ in real models) / like dongles or hubs / do everything wireless”. Which is someone commuting by train, going to lectures or whatnot. But not me when I’m at home or at work.

                                            This is where I’m gonna mute this thread, you do not seem to want a relevant conversation.

                                    2. 1

                                      Yeah, I was thinking this too. Though even then we were already starting to get into it with HDMI versions.

                                    1. 9

                                        The author seems highly adversarial towards Meow hash at the start of the essay, though the amount of research work that must have gone into it is worthy of admiration.

                                        For people who might not know: the initial use Casey wanted fulfilled when he wrote v1 of Meow hash was basically a hash function for building maps of assets in a game engine. For a long time he has been very vocal that it is not suitable for cryptography, because it was designed for speed, not safety. I guess the subsequent progress in the next versions made them comfortable with assuming Level 3. I see that they have currently taken it down to Level 1 following this article.

                                      1. 18

                                        The author seems highly adversarial towards Meow hash at the start of the essay

                                        How so?

                                          The author noted, multiple times, that Meow hash was not advertised as a strong cryptographic hash. I took the author’s tone as polite and collegial.

                                        I ask because I often write documents with similar tone and caveats. I don’t want to seem adversarial!

                                        1. 11

                                            The article is still written in a professional manner, and backs up the author’s claims with outstanding cryptographic analysis. I sympathise with the author here: the Meow hash function claims security where none is to be found. The developers warn that their level 3 classification might be unsound and nobody has proven that it is secure, but they still upheld that claim, which is questionable at best and malicious at worst. It was reclassified after these exploits were made public, but IMO it should never have claimed to be secure without proof.

                                            I don’t mean to detract from the value of the Meow hash function: 50GB/s is an absolutely outstanding hash rate, and in situations where cryptographic security isn’t required (such as game asset hashing) it is certainly one of the fastest out there. Just don’t claim what you haven’t verified yourself.

                                          1. 10

                                            I think that, in general, humans should not be boastful about hash functions. We have yet to show that one-way functions even exist; there is a lot of hubris involved in claiming that a hash function is one-way or even slightly hard to invert.

                                            1. 1

                                                We have not shown that one-way functions exist? This seems… odd. Speaking of which, how is isOdd not a one-way function?

                                              1. 5

                                                The Wikipedia article @Corbin linked explains this. It is not, because a one-way function is one where for a given output, it is hard to find any input, it doesn’t have to be the input originally provided.

                                                1. 1

                                                  Interesting, cheers. I will let this stew in the ol’ noggin.

                                            2. 2

                                              [Meow hash] was designed for speed, not safety

                                              My understanding is that speed and safety are basically the same thing:

                                              • Either you want maximum security for given speed constraints,
                                              • or you want maximum speed for a given security margin.

                                              Thus, the quality of a symmetric primitive is basically measured by how many rounds it needs to thwart all known attacks (assuming the community made a concerted effort to break it, mind you), and how much time each round takes.

                                                Now if we can increase the rounds of MeowHash until its speed comes down to that of some modern hash like BLAKE2, BLAKE3, or SHA-3, and we notice that it has a similar security margin, then it’s probably pretty good. If however the margin is smaller, or it stays broken even with that many rounds, then it’s just crap…

                                              …which may be a worthy trade-off anyway if it was never meant to be attacked in the first place of course, especially if the first rounds take little computation.

                                            1. 2

                                              The mailing list post has a lot of positive feedback and constructive questions, I’m excited to see where this goes.

                                              1. 45

                                                PDFs are not mobile friendly. That’s already a strong reason for HTML.

                                                I know the author claims that PDF reflow allegedly solves this issue, but I’ve yet to find a reader that supports it.

                                                1. 21

                                                  Strongly agree. And what about accessibility? PDFs are awkward for people who need large text accommodations. And I don’t know how amenable they are to screen readers nowadays: originally it was quite difficult to recover the text from a PDF because each line of text was an independent element with no clear indication of how they connect to each other.

                                                  1. 2

                                                    These days you can at least add accessibility annotations to PDFs that support devices like screen readers. Not sure how many readers support this stuff though.

                                                  2. 2

                                                    KOReader does a halfway decent job, but I agree it’s not nearly good enough to obviate HTML.

                                                  1. 14

                                                    In metric units, 170F is about 77C, which is literally sauna temperature. Wow.

                                                    1. 1

                                                      A sauna is much more dangerous. Just like the air in a 150 Celsius oven is pretty hot, but 100 degree steam coming out of a pot is going to hurt way more. Temperature isn’t the only relevant factor when it comes to cooking…

                                                      1. 4

                                                        You make it sound like a sauna is dangerous. It’s not, because you get out when you get too hot.

                                                        In Finland there were fewer than 2 deaths in saunas per 100,000 inhabitants per year in the 1990s. That was a time when, on average, Finns spent 9 minutes in a sauna twice a week, i.e. roughly 15.6 sauna-hours per person per year, or about 1.56 million sauna-hours per year per 100,000 people. That works out to one death per 780,000 hours spent in the sauna. And half of that is because people binge drink and go to the sauna.

                                                        I don’t have statistics for deaths per hours a child spends in a hot car, but it cannot be very high considering reasonable people don’t leave children in a hot car at all yet there are dozens of deaths every year.

                                                        1. 2

                                                          In these comparisons I think the deciding factor is the ability/inability to leave the hostile environment…

                                                        2. 1

                                                          Saunas are typically dry air (although you can sometimes pour water onto hot stones). There are 100 °C saunas which you can sit in for several minutes because the heat transfer is so low. But a 100 °C steam room would instantly burn you (and so doesn’t exist).

                                                          1. 1

                                                            Yeah humidity plays an important role as well. Sadly the post doesn’t show the record high with humidity info, but maybe @JeremyMorgan can enlighten us :)

                                                            1. 2

                                                              It looks like the min [1] humidity was 7.8% from one of the pictures.

                                                              [1]: Relative humidity and temperature are inversely related (for a fixed amount of water vapor): as the temperature rises, relative humidity decreases, since more water vapor can be “stored” in the air. Similarly, when the temperature drops, relative humidity increases, leading to dew in the early morning or fog.

                                                              1. 1

                                                                @bfielder

                                                                Humidity levels:

                                                                Outside

                                                                • Min: 13.6
                                                                • Max: 89.3

                                                                In the car

                                                                • Min: 6.8
                                                                • Max: 55.3

                                                                Seems like some wild fluctuations. It was an unusual weather event for sure.

                                                          1. 2

                                                            Based on my experiences with Typescript, I’m guessing compiling it is NP-extrahard.

                                                            1. 1

                                                              It’s undecidable (and its type system is actually unsound).
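
                                                              One well-known example of the unsoundness (a deliberate trade-off in TypeScript) is array covariance; a small sketch:

                                                              const dogs: { bark(): void }[] = [];
                                                              const animals: object[] = dogs; // allowed: arrays are treated covariantly
                                                              animals.push({});               // also allowed, but now dogs[0] has no bark
                                                              dogs[0].bark();                 // type-checks, throws at runtime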

                                                              1. 1

                                                                Yep it was a bad joke, I saw your very useful link. Anecdotally, I see this occur most frequently when a function has one or more type parameters and is overloaded. I actually found a case yesterday when I couldn’t actually get it to select the right overload, even with explicit type parameters and argument types.

                                                                1. 1

                                                                  and its type system is actually unsound

                                                                  Doesn’t this make it very decidable, like O(1) decidable.

                                                                  1. 1

                                                                    In what way?

                                                                    1. 1

                                                                      If it’s unsound, you should be able to prove anything, so just always return “this doesn’t typecheck” or always return “this is an int” and you’re not wrong.

                                                                      (I may be thinking of types as a proof system slightly more than they actually are)

                                                              1. 2

                                                                This paper gives a proof that an adversarial Tetris AI can always force you to lose the game, no matter what you do. Interestingly, it is enough to consider only left and right kinks (the S- and Z-shaped pieces).

                                                                It does not give a proof that this must happen during a regular Tetris game in which the pieces are chosen randomly, because the losing sequence depends on the player’s piece placement. It is not immediately clear whether a loss is forced even for a randomly chosen piece sequence.

                                                                Related is the Hatetris game, a Tetris game that tries to select the worst possible piece at each move. The default Hatetris AI does not implement the above algorithm by Brzustowski, but it seems to be effective nonetheless, as nobody has found a loop yet.

                                                                1. 2

                                                                  qntm actually implemented the above algorithm in Hatetris recently! You can toggle the piece-choosing AI.

                                                                1. 16

                                                                  Compiling a lot of languages is actually (at least) NP-hard, simply because of the type systems involved. I created a little overview a few months back, when it turned out that Swift’s type checking was undecidable.

                                                                  1. 5

                                                                    That’s a wonderful overview, thank you! It’s easy to read a lot of type theory papers and then say “okay but how does this apply to the real world”, and a reference like that really makes it easier to connect theory to more concrete ideas.

                                                                    1. 1

                                                                      Lovely overview. Thanks!

                                                                    1. 3

                                                                      There’s also a different project with similar goals called cogent by Data61.

                                                                      1. 16

                                                                        The short code snippets that Copilot reproduces from training data are unlikely to reach the threshold of originality.

                                                                        The thing is that the core logic of a tricky problem can very well be very little code. Take quicksort, for instance: it is a super-clever algorithm, yet not much code. Luckily quicksort is not under some license that hinders its use in any setting, but it could very well be. Just because something is only 10 lines does not mean it is not an invention that is copyrightable. Code is very different from written language in that regard.

                                                                        1. 16

                                                                          Yeah the title is ignoring this important bit. The claim is precisely that Github Copilot can suggest code that DOES exceed the threshold of originality.

                                                                          Based on machine learning systems I’ve used, it seems very plausible. There is no guarantee that individual pieces of the training data don’t get output verbatim.

                                                                          And in fact I say that’s likely, not just plausible. Several years ago, I worked on a paper called Deep Learning with Differential Privacy that addresses the leakage from the training data -> model -> inferences (which seems to have nearly 2000 citations now). If such things were impossible then there’s no reason to do such research.

                                                                          1. 2

                                                                            That was still a concern at least as of about two years ago (and presumably still is), because it was one of the topics I was contemplating for my bachelor’s thesis. The hypothesis I was presented with was that there is (allegedly) a smooth transition: the better your model, the more training data it leaks, and vice versa. Unfortunately, I chose a different topic, so I can’t go into detail here.

                                                                            1. 1

                                                                              There is no guarantee that individual pieces of the training data don’t get output verbatim.

                                                                              The funny thing is that humans do that too. They read something and unknowingly reproduce the same thing in writing as their own, without bad intent. I think there is a name for that effect, but I fail to find it atm.

                                                                              1. 5

                                                                                I’d say it depends on the scale. Sure in theory it’s possible for a human to type out 1000 lines of code that are identical to what they saw elsewhere, without realizing it, but vanishingly unlikely. They might do that for 10 lines, but not 1000 lines, and honestly not even 100.

                                                                                On the other hand it’s pretty easy to imagine a machine learning system doing that. Computers are fundamentally different than humans in that they can remember stuff exactly … It’s actually harder to make them remember stuff approximately, which is what humans do :)

                                                                            2. 8

                                                                              Quicksort is an algorithm, and isn’t covered under copyright in the first place. A specific implementation might be, but there’s a very limited number of ways you can implement quicksort and usually there’s just one “obvious” way, and that’s usually not covered under copyright either – and neither should it IMO. It would be severely debilitating as it would mean that the first person to come up with a piece of code to make a HTTP request or other trivial stuff would hold copyright over that. Open source code as we know it today would be nigh impossible.

                                                                              1. 6

                                                                                So that means the fast square root example from the Quake engine source code can be copied by anyone w/o adhering to the GPL? It is just an algorithm after all. If that is truly the case, then the GPL is completely useless, since I can start copy & pasting GPL code w/o any repercussions, since it “just an algorithm”.

                                                                                1. 9

                                                                                  If your friend looks at the Quake example and describes it to you without actually telling you the code – by describing the algorithm – and you write it in a different language, you are definitely safe.

                                                                                  If your friend copies the Quake engine code into a chat message and sends it to you, and you copy it into your codebase and change the variable names to match what you were doing, you are very probably in the wrong.

                                                                                  Somewhere in between those two it gets fuzzy.

                                                                                  1. 11

                                                                                    It looks like my friend copilot is willing to paste the quake version of it directly into my editor verbatim, comments included, without telling me its provenance. If a human did that, it would be problematic.

                                                                                  2. 4

                                                                                    In the Quake example it copied the (fairly minor) comments too, which is perhaps a bit iffy but a minor detail. There is just one way to express this algorithm: if anyone were to implement this by just having the algorithm described to them but without actually having seen the code then the code would be pretty much identical.

                                                                                    I’m not sure you’re appreciating the implications of it working any differently. Patents on this sort of thing are already plenty controversial. Copyright would mean writing any software would carry a huge potential for copyright suits and trolls; there’s loads of “only one obvious implementation” code like this, both low-level and high-level. What you’re arguing for would be much, much worse than software patents.

                                                                                    People seem awfully focused on this Copilot thing at the moment. The Open Source/Free Software movement has spent decades fighting against expansion of copyright and other IP on these kind of things. The main beneficiaries of such an expansion wouldn’t be authors of free software but Microsoft, copyright trolls, and other corporations.

                                                                                    1. 2

                                                                                      Semi-related, Carmack sorta regrets using the GPL license

                                                                                      https://twitter.com/ID_AA_Carmack/status/1412271091393994754

                                                                                      1. 5

                                                                                        Semi-related, Carmack sorta regrets using the GPL license

                                                                                        …in this specific instance.

                                                                                        Quote from Carmack in that thread: “I’m still supportive of lots of GPL work, but I don’t think the restrictions helped in this particular case.”

                                                                                        I’m not implying you meant to imply otherwise, I’m just adding some context so it’s clearer.

                                                                                        1. 1

                                                                                          That is fine, I am not the biggest GPL fan myself, yet if I use GPL code/software, I try to adhere to its license

                                                                                    2. 5

                                                                                      quicksort is not under some license that hinders its use in any setting, but it could very well be

                                                                                      Please do not confuse patent law and copyright law. By referring to an algorithm you seem to be alluding to patent law, and yet you mixed the terms “invention” and “copy-rightable”. Please explain what you mean further, because as far as I know this is nonsense. As far as I know, “programs for computers” can’t be regarded as inventions and therefore can’t be patented under the European legal system; this is definitely the case in Poland. This is a separate concept from source code copyright.

                                                                                      Maybe your comment is somehow relevant in the USA but I am suspicious.

                                                                                    1. 2

                                                                                      There are also the classic tools Isabelle and Coq. Isabelle was used to prove the correctness of seL4, and Coq was used to prove the four-color theorem and to verify CompCert. Both of these can also produce executable code, which is nice since there is less chance of divergence between code and model.

                                                                                      Ada was already mentioned in a different comment, but it falls in the same category as the above.

                                                                                      Then there’s the thing that drives the autonomous Paris metro, written in Event-B.

                                                                                      Then there’s also Lean, although it is aimed more in the direction of mathematical work, if I am not mistaken.

                                                                                      I bet there’s a lot of other tools out there that I’ve forgotten, but these are also worth a look if you are interested in formal verification IMO.

                                                                                      1. 3

                                                                                        Also worth looking at F*, which is used in Project Everest (a formally verified TLS stack used in Firefox and Azure).

                                                                                        1. 8

                                                                                          Perfect place for me to shill Let’s Prove Leftpad

                                                                                      1. 3

                                                                                        Always funny to see things pop up from random people one knows from way back.

                                                                                        And this is a great idea as well, I love it! Formalizing some sudoku logic is a great exercise in writing tactics.

                                                                                        1. 3

                                                                                          This feature already exists on at least Lichess, as they introduced it in this blog post. But DGT boards and pieces are expensive, so this should be a cheaper alternative.

                                                                                          1. 1

                                                                                            AFAIK all the big chess sites and all the major desktop chess software packages have supported DGT boards for years. For example, here’s chess.com’s help page for DGT.

                                                                                            I would be extremely wary of trusting someone’s GitHub computer-vision project with any rated game, though.

                                                                                            And for anyone wondering what “expensive” means in this context: a decent base-level DGT setup (board, pieces, and clock/computer) will run typically USD$500. Which sounds expensive, but compared to what people often pay for other chess-related materials isn’t that bad. For example, ChessBase (software + games database + subscription to their other materials) is more or less a necessity for serious players, and is a couple hundred euro to get started with, plus recurring subscription fees.

                                                                                            1. 1

                                                                                              $500 is not a sum I’d spend lightly, but that varies on a person-by-person basis. I bought two foldable boards with pieces and a new digital clock for about CHF 150, which is a lot less than the DGT digital boards.

                                                                                              1. 2

                                                                                                See my edit – for people who are really into chess and getting serious about playing, $500 is not that much money to be spending on something that might improve their game (and many people do report that they need to practice with a real board/pieces because they struggle coming back to a physical board if they play online too much).

                                                                                          1. 4

                                                                                            In a 64-bit address space, why wouldn’t the OS just space the thread stacks really far apart, like 100MB, and map those pages as needed? Then it could budget RAM for the process’s total stack space, instead of a fixed amount per thread.

                                                                                            1. 1

                                                                                              Isn’t that just like setting the thread stack size to 100MB? Paging is an independent concept and still happens behind the scenes AFAIK.

                                                                                              1. 4

                                                                                                Not exactly: today’s 8MB stack means that the pages are mapped eagerly, when the thread is created. That is, while we don’t allocate physical pages to back the stack space immediately, we do insert 8MB/4KB = 2048 entries into the page table in the kernel.

                                                                                                An alternative would be to do this mapping lazily: rather than mapping the whole stack, you’d map just a small fraction immediately, with a guard page after that. When a guard page is hit and you get a signal, you’d just map&allocate more of the space.
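
                                                                                                A rough sketch of that idea with mmap/mprotect (error handling elided, and not how any real kernel or libc actually implements thread stacks): reserve a large region with PROT_NONE, then commit pages on demand; a real runtime would trigger the commit from a fault/signal handler when the guard region is touched.

                                                                                                #define _DEFAULT_SOURCE
                                                                                                #include <stddef.h>
                                                                                                #include <sys/mman.h>

                                                                                                #define RESERVE (100u * 1024 * 1024)  /* 100MB of address space, nothing usable yet */
                                                                                                #define CHUNK   (64u * 1024)          /* commit 64kB at a time (arbitrary) */

                                                                                                /* reserve address space only; touching it faults until committed */
                                                                                                static char *stack_reserve(void) {
                                                                                                    return mmap(NULL, RESERVE, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                                                                                                }

                                                                                                /* grow the stack downward from the top by one more chunk */
                                                                                                static int stack_grow(char *base, size_t committed) {
                                                                                                    return mprotect(base + RESERVE - committed - CHUNK, CHUNK, PROT_READ | PROT_WRITE);
                                                                                                }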

                                                                                                1. 1

                                                                                                  Not exactly: today’s 8mb stack means that the pages are mapped eagerly,

                                                                                                  What makes you think that?

                                                                                                  1. 2

                                                                                                    My theoretical understanding of virtual memory subsystem plus looking at process memory maps in practice.

                                                                                                    I think there might be some terminological confusion here (and I very well might be using the wrong terminology), so let me clarify. When a page of memory is “used” for the stack, this is a two-step process. First, the page is “mapped”, that is, an entry is created in the page tables for this page describing that it is writable and readable, and doesn’t yet have a physical page backing it. Second, an actual physical page of memory is allocated and recorded in the page table entry previously created.

                                                                                                    The first step, creating page table entries, is done eagerly. The second step, actually allocating the backing memory, is done on demand. If you don’t do the first step eagerly, you need to do something else to describe 8mb of memory as being occupied by a stack.

                                                                                                    1. 1

                                                                                                      Yep, there was a gap in terminology.

                                                                                                      Still, if you just scatter the stacks far apart without reserving the space, you run the risk of something else mapping the memory, which would prevent you from growing the stack later. And if you reserve a smaller region of virtual memory just for stacks, you’ve invented a scarce resource.

                                                                                                  2. 1

                                                                                                    An alternative would be to do this mapping lazily

                                                                                                    Go does this. My knowledge may be out of date but as far as I know Go doesn’t use guard pages—the compiler / runtime explicitly manages Goroutine stacks. I think the default stack size is 8kb. Which makes sense when Goroutines are supposed to be small and cheap to spawn.

                                                                                                    1. 1

                                                                                                      Not really, go doesn’t rely heavily on virtual memory to manage stacks and just manually memmoves the data between two allocated chunks of memory:

                                                                                                      https://github.com/golang/go/blob/c95464f0ea3f87232b1f3937d1b37da6f335f336/src/runtime/stack.go#L899

                                                                                                      This is a different approach from dedicating a big chunk of address space to the stack, and then lazily mapping & allocating the pages. In this latter approach, stack is never moved and is always continuous, it requires neither scanning the stack for pointers nor support for segmented stacks.

                                                                                                      1. 1

                                                                                                        Go defaults to well below a single page of memory for the thread stack.

                                                                                                        1. 1

                                                                                                          “Well below” is 2kb of memory, half a page.

                                                                                                  3. 1

                                                                                                    Not if the pager treats stacks differently than other allocations. For example, it could reserve a particular amount of swap space for stacks and segfault if it’s exceeded, and it could unmap pages from a stack that isn’t using them anymore. This would make it more practical to have large numbers of threads with small stacks (as in Go.)

                                                                                                    1. 2

                                                                                                      The tricky bit is determining when a stack isn’t using them anymore. In theory, when a thread does a syscall then any pages between the bottom of the red zone and the end of the stack could be reclaimed, but in practice there is nothing in the ABI that says that the kernel may do this.

                                                                                                      1. 1

                                                                                                        Does the kernel need to necessary be involved here? Can the userspace MADV_FREE the unused stack from time to time?

                                                                                                        1. 1

                                                                                                          There are a few problems with a pure userspace implementation but the biggest one is policy, not mechanism. The kernel has a global view of memory and knows when memory is constrained. When there’s loads of spare memory, this is pure overhead for no gain and so even a fairly infrequent madvise would probably be measurable overhead. When memory is constrained, you want to reclaim more aggressively, without going through the MADV_FREE dance (which is fast, but not free: it still needs to lock some VM data structures and collect the dirty bit from every page being marked). Userspace doesn’t have visibility into the global state of memory pressure in the system and so can’t make this decision very easily.

                                                                                                          In particular, the best candidates for this are threads that have been parked in the kernel for a long time. Consider a process like devd: it does a read on a device node and typically blocks there for ages. Ideally, the kernel would come along periodically and note that its stack pages are a good candidate for reclaim. You might have a pass that notices that a thread has been blocked on a system call for a while and does the MADV_FREE thing to its stack pages, so that they can be reclaimed immediately if you hit memory pressure and then a different mechanism for when you actually hit memory pressure.