1. 3

    This article takes a different perspective from a lot of what I read regarding (memory) safer languages. I hope it doesn’t grate with people too much. I’m regularly trying to fix things set up by other people, whether it’s systems at work or old programs and games for fun, so I probably have different priorities and views than many other authors.

    1. 7

      As a programming language developer, “time safety” to me really does refer to certain sorts of time-isolating techniques akin to memory safety. This blog post about the paper Time Protection is a good introduction to the topic.

      What you claim as C’s time-tested durability, I see as a powerful series of network effects around C, starting from the fact that C was designed to be part of UNIX. Other languages with the same trajectory, notably AWK and sh, have the same staying power over the decades. It was possible to ride along this memetic current if one designed their language appropriately; Perl is probably the best example.

      I dislike these sorts of appeals to positive usage. It is relatively trivial to get code that has positive functionality; shove a bunch of combinators into a genetic evolution chamber, apply trials and fitness, and evolve programs that positively satisfy some given property or specification. The problem, of course, is that the evolved programs have no constraints on what they may not do. (If you dislike evolution, then you may instead imagine a feature factory where junior developers continually adjoin new modules and let unused modules rot.) When we design languages these days, we are extremely preoccupied not just with what one may build, but with what one may not build.

      I, too, am a maintenance programmer; when I am fixing code, I want to first know what it does. I need to know how code behaves apart from its specification, and this usually leads to considering the breadth of the range of the program’s effects. I want to know both positive and negative properties.

    1. 4

      We need to stop writing drivers in such ad-hoc and low-level ways. This is the second large effort to reverse-engineer Mali, after Lima, and the primary artifacts produced by these projects are piles of C. Long-term maintainability can only come from higher-level descriptions of hardware capabilities.

      I’m a little frustrated at the tone of the article. I was required to write this sort of drivel when I was applying for grants, but by and large, the story of how I wrote a GPU driver is simple: I was angry because there wasn’t a production-quality driver, so I integrated many chunks of code and documentation from existing hard-working-but-exhausted hackers to make an iterative improvement. The ingredients here must have been quite similar. The deliberate sidelining of the Lima effort, in particular, seems rather rude; Panfrost doesn’t mark the first time that people have been upset with this ARM vendor refusing to release datasheets, and I think that most folks in the GPU-driver-authorship world are well aware of how the continual downplaying of Lima is part of the downward spiral that led to their project imploding.

      I don’t think I ever hid that r300, r500, and indeed radeonhd, from the same author as Lima, were all big influences on my work, and that honest acknowledgement of past work is the only way to avoid losing contributors in the future.

      1. 15

        I’m a little frustrated at the tone of the article. I was required to write this sort of drivel

        Is it not possible that the tone is a true, unforced reflection of the author’s enthusiasm? That’s how I read it. Maybe that’s just naive of me.

        1. 9

          Long-term maintainability can only come from higher-level descriptions of hardware capabilities.

          Is there any source you can provide indicating that this would actually work? From my understanding, creating meaningful abstractions over hardware is an extraordinarily tough problem to solve. For example, device trees work as a general-purpose description of hardware, but they still need a lot of kernel-space driver fluff to get anything talking correctly. What kind of higher-level description do you think would work in this space?

          FYI: I have zero experience in graphics driver land so I don’t actually know anything about this domain. ¯\_(ツ)_/¯

          1. 1

            I read it not as “assembling a driver from common code components and interfaces in c”, but as “write a high-level description of the hardware and api, from which implementations in C or Rust or whatever can be generated”.

            But maybe we’re both reading it wrong :)

          2. 4

            Isn’t the primary artefact produced a set of instructions for using a GPU? I would think that comes first, before the piles of C.

            Long-term maintainability can only come from higher-level descriptions of hardware capabilities.

            This seems like an extraordinary claim. “Can only come from” is a very strong statement.

            1. 4

              GPU drivers are not required to have a single shape. Indeed, they usually have whatever shape is big and complex enough to fit their demanded API. The high-level understanding of GPUs is what allowed the entire Linux GPU driver tree to be refactored around a unified memory manager, and what allowed the VGA arbiter to be implemented. High-level descriptions, in particular datasheets, are already extremely valuable pieces of information which are essential for understanding what a driver is doing. At the same time, the modern GPU driver contains shader compilers, and those compilers are oriented around declarative APIs which deal with hardware features using high-level descriptions of capabilities.

              Let me show you some C. This module does PCI ID analysis and looks up capabilities in a table, but it’s done imperatively. This module does very basic Gallium-to-r300 translation for state constants, but rather than a table or a relation, it is disgustingly open-coded. (I am allowed to insult my own code from a decade ago.) I won’t lie, Panfrost has tables, but this is, to me, only the slightest of signs of progress.
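              To illustrate the contrast in miniature, here is a hypothetical sketch in Python rather than C; the device IDs and capability names are invented, not taken from any real driver:

              ```python
              # Open-coded, imperative style: every new chip means another branch,
              # and the knowledge is buried in control flow.
              def caps_imperative(pci_id):
                  if pci_id == 0x5144:
                      return {"tex_units": 2, "has_tcl": False}
                  elif pci_id == 0x5960:
                      return {"tex_units": 4, "has_tcl": True}
                  else:
                      raise KeyError(pci_id)

              # Declarative style: the same knowledge as data, easy to audit,
              # diff, and extend without touching the logic.
              CAP_TABLE = {
                  0x5144: {"tex_units": 2, "has_tcl": False},
                  0x5960: {"tex_units": 4, "has_tcl": True},
              }

              def caps_declarative(pci_id):
                  return CAP_TABLE[pci_id]
              ```

              The two are behaviorally identical; the difference is that the table is a description one can read, check against a datasheet, or generate from one.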

              1. 3

                Ah, I see what you mean.

                A higher-level description means genericity, and that can lead to bloated code that tries to handle the future at the expense of the present. Keeping the proper balance between a high-enough description and a low-enough, efficient one is a challenge.

                Helping the maintenance effort by lending hands to refactor with a fresh mindset is my naive view of how to fight this, but I know this is falling prey to the rewrite fallacy.

          1. 8

            Show them that they actually do have something to hide. Take them on a Socratic journey to get them to a position where they defensively refuse to talk candidly about some part of their personal life. Then, when they refuse to talk about it, explain how they are hiding, and build their empathy for others.

            1. 10

              This can backfire: taking people on such a journey may never reach a point where they find something to hide, or they may expose more than you wanted to hear. And if you do get them there, you will probably have lost most of your goodwill by that point in the conversation. From a theoretical point of view, I like this idea, but I don’t think it factors in how little patience some people have with nerds.

              1. 2

                Indeed, my use of this approach has never resulted in a changed mind. Spend your conversation on different topics.

            1. 8

              …when will the rediscovery of plaintext and the applied Unix philosophy make waves in the “tech industry”, and how will it be repackaged to sell back to us?

              1. 2

                It’s already happened. We’re on like, Wave 3. One word: “Services”.

                1. 1

                  To apply the author’s argument: Text should be queryable, too; specifically, it should be readable. But readability is not something that we can measure in the absolute or the objective. You say that text should be “plain”, but even plain old text files are actually built out of arbitrarily chosen alphabets. Do we use Unicode or ASCII for text?

                  If we consider the human-computer-interface point-of-view, then text should also be writeable, and it is here that we see the programs-as-prisons metaphor on full display: We can only write using the alphabets available for text encoding!

                  1. 2

                    Every ASCII file is a UTF-8 file. Just use UTF-8 :)

                    https://research.swtch.com/utf8
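                    A quick way to see this, in plain Python:

                    ```python
                    ascii_bytes = "hello, world".encode("ascii")

                    # The same byte sequence decodes identically under both encodings,
                    # because UTF-8 was designed so that bytes 0x00-0x7F mean exactly
                    # what they mean in ASCII.
                    assert ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")

                    # The converse fails: UTF-8 uses multi-byte sequences ASCII lacks.
                    snowman = "\u2603".encode("utf-8")  # b'\xe2\x98\x83'
                    try:
                        snowman.decode("ascii")
                    except UnicodeDecodeError:
                        pass  # not valid ASCII
                    ```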

                1. 19

                  As somebody who first learned Python, and then Haskell, and then left both, this review reminds me of many of my own experiences.

                  I was reading through some of Hillel Wayne’s blog posts, and one of them discussed the Curry-Howard correspondence, which somehow proves a 1:1 association between aspects of a mathematical proof and aspects of a type system. You can convert a proof into a type, if your type system supports it. I think Haskell’s type system respects this. I don’t think Python’s type system does.

                  Yes, but undefined happens to be a value of every type, in Haskell; this corresponds to being able to prove any statement, including the false ones. So while it is indeed possible to turn proofs into programs, it’s not possible to turn Python or Haskell programs into useful valid proofs.

                  (Somebody may pedantically note that every Python or Haskell program gives a proof, just not of the fact that we might hope that they prove; they usually have Python-shaped or Haskell-shaped holes in them.)

                  I don’t think you can do anything like [fmap] with Python or any other practitioner’s language that I’m familiar with.

                  I think that this is part of the roots of the Haskeller memetic complex. Because Haskell has the lightweight syntax of an ML, and its prelude is rich in algebraic tools, suddenly there is a blindness as to the fact that the tools have been there all along. For example, in Python 3:

                  >>> plusOne = lambda x: x + 1
                  >>> lmap = lambda f: lambda xs: [f(x) for x in xs]
                  >>> nada = object()
                  >>> mmap = lambda f: lambda x: nada if x is nada else f(x)
                  >>> dot = lambda f: lambda g: lambda x: f(g(x))
                  >>> zf = dot(lmap)(mmap)(plusOne)
                  >>> zf([1, 2, 3])
                  [2, 3, 4]
                  >>> zf([1, nada, 3])
                  [2, <object object at 0x7ff65c5db0d0>, 4]
                  

                  Even the base error type, bottom or ⊥, is defined as an evaluation that never completes successfully, either because a computation failed, or because there’s an infinite loop. … I don’t think Python’s BaseException has a built-in notion of infinite loops evaluating to an error condition. This appears super powerful to me, because the big error type introduced by going over network connections are network timeouts, which appear to most programs as infinitely long computations.

                  There are many opinions on this, but I am of the opinion that the sort of incompleteness generated by an unresponsive network peer, known as FLP impossibility or perhaps FLP “incompleteness” of network requests, is not the same as the incompleteness that Turing, Gödel, Tarski, Lawvere, and others showed via diagonalization. They are connected, in that the FLP result implies that a Turing machine which simulates many computers on a network may encounter a Turing-complete subproblem which prevents the simulated network peers from responding.

                  In Alice-and-Bob framing, suppose Alice is waiting for Bob to reply over the network. The FLP impossibility of Alice having to wait forever for Bob’s reply is connected to the Turing undecidability of whether a simulation of Alice and Bob will hang forever while simulating Bob’s preparation of their reply.

                  Haskell has Test.QuickCheck, a property-based testing framework that generates values to check a declared property. Python has hypothesis, but having used it a tiny bit for an open-source contribution, I don’t think it’s the same without the Haskell type system.

                  I think that this is another memetic root; specifically, the root is type-driven dispatch. To compare apples to apples, a better analogue of QuickCheck in Python might be PayCheck, while Hedgehog is closer to Hypothesis in Haskell. As soon as the one-dimensional comparison becomes two-dimensional, the flexibility of Hypothesis, and crucially the ability to select arbitrary strategies for generating test cases, far outweighs the convenience of simple type-checking. This makes sense, in a certain way; if Haskell’s type system were sound in the first place, then QuickCheck would be little more than a totality checker. Hypothesis’ own author recommends Hedgehog over QuickCheck, and in my language, Monte, I implemented both a classic unit-test runner and also a Hypothesis-style property test runner.
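                  To sketch what “strategies” mean, here is a toy property-test runner in plain Python; it omits the shrinking and edge-case heuristics that make Hypothesis actually good, and the helper names are invented:

                  ```python
                  import random

                  def ints(lo=-100, hi=100):
                      """A strategy: a zero-argument callable producing random inputs."""
                      return lambda: random.randint(lo, hi)

                  def lists_of(strategy, max_len=10):
                      """Strategies compose: lists built from any element strategy."""
                      return lambda: [strategy() for _ in range(random.randint(0, max_len))]

                  def check(prop, strategy, runs=200):
                      """Run prop against generated cases; return a counterexample or None."""
                      for _ in range(runs):
                          case = strategy()
                          if not prop(case):
                              return case
                      return None

                  # A true property passes...
                  assert check(lambda xs: list(reversed(list(reversed(xs)))) == xs,
                               lists_of(ints())) is None
                  # ...and a false one ("every list is sorted") yields a counterexample.
                  counterexample = check(lambda xs: xs == sorted(xs), lists_of(ints()), runs=1000)
                  ```

                  The point is that strategies are ordinary values picked by the test author, not derived from types, which is what makes Hypothesis and Hedgehog flexible in a way type-driven QuickCheck selection is not.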

                  Ultimately, I don’t think [indentation] is a big issue. Haskell has code auto-formatters like haskell-formatter, much like Python’s black. Use it and move on. Bikeshed about something else, you know?

                  Since you didn’t mention it, and none of your code samples seem to use it, and to give an example of something to bikeshed over, it’s worth pointing out that Haskell supports curly-brace-oriented syntax as well as indentation. This is a recurring topic in programming language design: We want code to be readable, and that leads to caring about whitespace.

                  1. 8

                    This makes sense, in a certain way; if Haskell’s type system were sound in the first place, then QuickCheck would be little more than a totality checker.

                    I don’t think that follows unless you’re using incredibly specific types. There are many total functions of type Ord a => [a] -> [a], but not all of them would pass the tests of a sorting function.
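                    To make that concrete, a small Python sketch (the helper names are invented): three total, “type-correct” list-to-list functions, of which only one survives a behavioral check:

                    ```python
                    def is_sorted_permutation(f, xs):
                        """The behavioral spec of sorting: ordered output, same elements."""
                        ys = f(xs)
                        return ys == sorted(ys) and sorted(ys) == sorted(xs)

                    candidates = {
                        "identity": lambda xs: list(xs),  # total, wrong
                        "reverse": lambda xs: xs[::-1],   # total, wrong
                        "sort": lambda xs: sorted(xs),    # total, right
                    }

                    sample = [3, 1, 2]
                    results = {name: is_sorted_permutation(f, sample)
                               for name, f in candidates.items()}
                    # Only "sort" passes; the type alone cannot distinguish them.
                    ```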

                    1. 3

                      This is a fair observation. QuickCheck test selection indeed is driven not just by types, but by typeclasses and associated behaviors, and I was remiss to dismiss it so glibly.

                      The main desire is that one might express algebraic laws, and ultimately all of these testing tools are only approximations to actual equational reasoning. That doesn’t preclude a type system from giving useful annotations.

                    2. 4

                      Wow, amazing comment! I understand about half of this. I wish I had more than one upvote to give you.

                      I am really curious as to whether Python’s NoneType would map to Haskell’s bottom in that regards. Would that be the Python-shaped hole you’re talking about?

                      I did see a tutorial by David Beazley on implementing a lambda calculus in Python. I didn’t think it was all that practical tbh, and I think he said the same in the video: https://www.youtube.com/watch?v=pkCLMl0e_0k

                      Could you give a brief summary of how FLP impossibility and Turing incompleteness are different? I know of “Turing completeness” and that it’s a thing, but my knowledge ends there. I haven’t heard of FLP impossibility before. Bookmarked that website.

                      Huh, the author of hypothesis says QuickCheck doesn’t have integrated shrinking. From his statement on generators rather than type classes, I understand your point on how Hedgehog and Hypothesis are more similar.

                      Hmm, there weren’t a whole lot of examples of curly-brace oriented syntax. Most examples used indentation. The Haskell wiki does mention it though: https://en.wikibooks.org/wiki/Haskell/Indentation#Explicit_characters_in_place_of_indentation

                      (I do care about whitespace and clean code; I read through “Clean Code” my second year out of college, but in my experience as a practitioner so far I’d rather use a tool and automated process to have a consistent way of doing things with multiple people, in order to spend political capital elsewhere.)

                      1. 9

                        (Phew this got long, time to turn it into a blog post ;) )

                        I am really curious as to whether Python’s NoneType would map to Haskell’s bottom in that regards.

                        It’s important to distinguish between the empty type and the unit type. In (type) theory there are no values of the empty type, and there is one value of the unit type. In a language without _|_, like Agda, we might define them something like this:

                        data Empty : Type
                        
                        data Unit : Type where
                            unit : Unit
                        

                        This says that Empty is a Type, that Unit is a Type, that unit is a Unit. Since we’ve given no constructors (or “introduction forms”) for Empty, there’s no way to make one. Likewise we can write function types like Int -> Empty which have no values (since there’s no way for any function to return a value of type Empty), we can write tuple types like Pair Int Empty that have no values, etc. We can also write function types like Empty -> Int which do have values (e.g. const 42), but which can never be called (since there are no Empty values to give as an argument). Incidentally, the fact that we can have types with no values is one of the reasons we can’t have a value with type forall a. a (in Haskell) or (a : Type) -> a (in Agda): they claim to return a value of any type, but we might ask for an Empty.

                        The unit type is trivial, literally. Since there’s only one value of this type (unit), whenever we know that a value will have type Unit we can infer that the value must be unit, and hence (a) if it’s a return value, we can always optimise the function to simply unit and (b) if it’s an argument, it has no effect on the return value (there’s nothing to branch on). In this sense, the unit type contains 0 bits of information (compared to Bool which contains 1 bit).

                        As an aside, Either is called a “sum type” because it contains all the values from its first argument plus those of its second: Either Unit Unit is equivalent to Bool (with values Left unit and Right unit). Pairs are called “product types” since they contain every combination of their first and second arguments, which is the number of values in each multiplied together. In this sense Empty acts like zero: the type Either Empty a doesn’t contain any Left values, so it’s equivalent to a, just like 0 + a = a for numbers. Likewise Pair Empty a contains no values, since there’s nothing to put in the first position, just like 0 * a = 0 for numbers. Either Unit a acts like 1 + a, since we have all of the values of a (wrapped in Right) plus the extra value Left unit; note that this is also the same as Maybe a, where Right acts like Just and Left unit acts like Nothing. Pair Unit a is the same as a since each a value can only be paired with one thing, and we know that thing is always unit; this is just like 1 * a = a with numbers.
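                        This counting argument can be checked mechanically; a sketch in Python, using finite lists as stand-ins for types:

                        ```python
                        from itertools import product

                        Unit = [None]           # one inhabitant
                        Bool = [False, True]    # two inhabitants
                        Empty = []              # no inhabitants

                        def either(a, b):
                            """Sum type: a tagged union of a and b."""
                            return [("Left", x) for x in a] + [("Right", y) for y in b]

                        def pair(a, b):
                            """Product type: every combination of a and b."""
                            return list(product(a, b))

                        assert len(either(Unit, Unit)) == len(Bool)   # 1 + 1 = 2
                        assert len(either(Empty, Bool)) == len(Bool)  # 0 + a = a
                        assert len(pair(Empty, Bool)) == 0            # 0 * a = 0
                        assert len(pair(Unit, Bool)) == len(Bool)     # 1 * a = a
                        ```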

                        Back to your question: things get confusing when we try to apply this to “real” languages, especially since a lot of the terminology is overloaded. In Haskell the type Unit is written () and the value unit is also written ().

                        In Haskell the type Empty is called Void, but that is not the same as void in languages like Java! In particular, remember that we cannot have a function which returns Empty, so void being Empty would mean we could not implement methods like public static void putStrLn(String s). Such methods certainly do exist, since they’re all over the place in Java, but what happens if we call them? They will run, and return a value back to us (hence it can’t be Empty). What is the value we get back? We know, without even running the method, that we’ll get back null. That’s just like the Unit type (where we know the value will be unit without having to run the code). Hence Java’s void acts like Unit, and null acts like unit.

                        If we follow a similar line of reasoning in Python, we find that None acts like unit (or Java’s null) and hence NoneType is like Unit (or Java’s void). AFAIK Python doesn’t have anything which acts like Empty (Haskell’s Void) since, lacking any values, Empty is only useful for type-level calculations (of the sort which get erased during compilation), which Python doesn’t tend to do.

                        That just leaves “bottom”. There are a couple of ways to think about it: we can be “fast and loose” where we ignore bottom as a nuisance, which mostly works since bottom isn’t a “proper” value: in particular, we can’t branch on it; hence any time we do a calculation involving bottoms which results in a “proper” value, we know that the bottoms were irrelevant to the result, and hence can be ignored. Any time a bottom is relevant, we have to abandon our nice, logical purity in any case, since catching them requires IO, so why bother complicating our pure logic by trying to include them?

                        Alternatively we can treat bottom as an extra value of every type, in a similar way to null inhabiting every type in Java. From this perspective Haskell’s Void type does contain a value (bottom), and the () type contains two values (() and bottom). In this sense we might think of Java’s void corresponding to Haskell’s Void, but I find this line of thinking difficult to justify. In particular, we can branch on null in Java (in fact we should be doing such “null checks” all the time!), whereas (pure) Haskell doesn’t even let us tell whether we have () or bottom.

                        As a final thought, whenever comparing dynamic and static languages, it’s important to not conflate different concepts which just-so-happen to have similar names. In particular I find it useful to think of dynamically typed languages as being “uni-typed”: that is, every value has the same type, which we might write in Haskell as a sum like data DynamicValue = S String | I Int | L List | O Object | E Exception | .... Python is actually even simpler than this, since “everything is an object” (not quite as much as in Smalltalk, but certainly more than e.g. Java), so Python values are more like a pair of a class (which itself is a Python value) and a collection of instance properties.

                        This is important because “dynamic types”, like in Python, aren’t directly comparable to “static types” like those of Haskell; they’re more comparable to “tags” (e.g. constructors). In particular, think about a Python function branching based on its argument’s “dynamic type”; this is the same as a Haskell function branching on its argument’s constructor. What does this tell us about “bottom” in Python? From this perspective, there isn’t one (at least not reified; an infinite loop might be comparable, but it’s not something we can e.g. assign to a variable). Python’s None is just a normal value like any other, in the same way that Haskell’s Nothing is a normal value (we might sometimes use it to stand for an error, but that’s not inherent to the system); likewise Python’s exceptions are also normal values (we can assign them to variables, return them from functions, etc.); the idea of “throwing and catching” (or raise/except for Python) is actually a perfectly normal method of control flow (it’s actually a limited form of delimited continuation), and this is orthogonal to error representation and handling.
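                        A small Python sketch of that analogy: branching on a value’s dynamic type has the same shape as branching on a constructor tag in a unityped encoding (the tag names here are invented):

                        ```python
                        # Branching on the "dynamic type" of an argument...
                        def describe(x):
                            if isinstance(x, int):
                                return "number"
                            elif isinstance(x, str):
                                return "text"
                            return "something else"

                        # ...is the same shape as branching on a constructor tag, here
                        # simulated with (tag, payload) pairs as in a unityped encoding.
                        def describe_tagged(value):
                            tag, payload = value
                            if tag == "I":
                                return "number"
                            elif tag == "S":
                                return "text"
                            return "something else"

                        assert describe(42) == describe_tagged(("I", 42)) == "number"
                        ```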

                        This makes raising an exception in Python very different to triggering a bottom in Haskell, since Haskell provides no way to branch on a bottom (or whether we even have one). In Python we can raise a value (with the exception “tag”) as an alternative to using return, and we can catch those values, inspect their tag and value, etc. to determine what our overall return value will be, with none of this being visible by the caller. To do anything like that in Haskell we need some “proper” data to inspect and branch on, hence why I say None, exceptions, etc. are just normal values (even though they’re used to represent errors, and we treat them specially because of that; but the same could be said about Either String values in Haskell, for example). Consider that in Haskell the only way to even check if a bottom exists is to use IO, but the idea behind IO is that it’s like State RealWorld, i.e. every IO a value is a pair of (RealWorld, a), so conceptually we never “actually” see a bottom; it’s more like triggering a bottom changes the RealWorld, and we’re branching on that.

                        1. 4

                          Thank you for the clarification on None vs bottom! I think it definitely helps me understand how much more different Python is from Haskell.

                          I like the idea of saying data DynamicValue = S String | I Int | L List | O Object | E Exception for dynamic languages, and I think from a type theory perspective I finally understand what people say when they say “In {Ruby, Python}, everything is an object”.

                          I hate try/catch logic in Python. It works, but I much rather prefer the functorial structure espoused by Haskell.

                          I agree on the description on IO! The book does mention IO is technically without effects, since it is merely a description of what IO actions should be taking place instead of imperatively executing IO.

                          1. 4

                            I like the idea of saying data DynamicValue = S String | I Int | L List | O Object | E Exception for dynamic languages

                            That’s a good intuitive starting point, but it’s not the whole story. The “unitype framework” is a way of understanding dynamic types from a static typing perspective, but we can also start from a “dynamic typing perspective” and see where it gets us. A dynamically-typed language is one where you can manipulate type definitions at runtime, and test for type membership at runtime. So saying DynamicValue = S String | I Int ... will always be incomplete, because we might not know something’s type even after creating it!

                            An intuition I like here is that dynamic types are boolean predicates: x is of type T iff T(x). This means we can define the type of, say, all functions with format as an optional keyword, or the type of all iterables that aren’t strings, or the type of all types that appear as return values in functions.

                            (The downside of this is it gets hellishly confusing very quickly, and allows for a lot of spooky action at a distance. Also, there’s no guarantee that checking type membership is total. That’s one reason even highly dynamic languages shy away from going deep in dynamic types.)
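                            For instance, both of those example types can be written as ordinary Python predicates (the function names are invented for illustration):

                            ```python
                            import inspect

                            # A "dynamic type" as a boolean predicate: x is a T iff T(x).
                            def non_string_iterable(x):
                                return hasattr(x, "__iter__") and not isinstance(x, str)

                            def has_format_keyword(f):
                                """Membership test: callables accepting a `format` keyword."""
                                try:
                                    params = inspect.signature(f).parameters
                                except (TypeError, ValueError):
                                    return False
                                return "format" in params

                            assert non_string_iterable([1, 2, 3])
                            assert not non_string_iterable("abc")

                            def render(data, format="json"):
                                return format

                            assert has_format_keyword(render)
                            assert not has_format_keyword(lambda x: x)
                            ```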

                            I think from a type theory perspective I finally understand what people say when they say “In {Ruby, Python}, everything is an object”.

                            Everything can also be an object in a static system, too. My rule of thumb for what “everything is an object” means: are object methods also objects? Is the call stack an object? If you are using message passing, are the messages objects? Can you send messages to other messages?

                            1. 1

                              RE: “Everything is an object”, I usually put it in quotes since the meaning is usually context-dependent. A colleague once said that to me regarding Java, and I replied that Java classes aren’t objects yet Python classes are; and Python’s if branches aren’t objects yet Smalltalk’s are (at least, the closures are); and so on.

                              RE: Unitypes, etc. yes the static/dynamic distinction can be built up from the dynamic side too (I tend not to think that way since fixed points and things trip me up). The important thing is that people aren’t talking past each other, e.g. it’s fine to argue about whether “foo can/can’t do bar”, but not so useful when each person is using “foo” and “bar” to mean different things ;)

                          2. 3

                              The overloaded word “void” across languages is confusing. In Rust and TypeScript (and others?) it’s called “never”, which I think makes it really clear: values of this type never actually exist. And “this function returns never” is like a pun: “this function never returns” and “this function’s return type is ‘never’”.

                            1. 3

                              This was wonderfully clear. Thank you.

                        1. 2

                          Let us first imagine a hypothetical scenario, before going back to system design consequences. First, suppose, hypothetically, that I take some E. coli, genetically engineer it to have a gene that both encodes some basic gentle antibiotic resistance and also a complete copy of Bee Movie, and release my engineered strain into the wild.

                          Imagine the cease-and-desist letter that I might receive; what might it demand? I cannot wipe out a single E. coli strain. I have no continuing action which I might stop carrying out. Regardless of what I do, the gene will propagate, and as it naturally mutates, all of the consequences of the mutation are beyond my control. Certainly, I might face more dire legal consequences, but censorship of the information is effectively impossible.

                          While I want to therefore recommend that we design systems which have this self-replicating, uncensorable sort of property, I am also wary of the way in which this feels like we are trying to change the world permanently, or speed up a natural process.

                          1. 2

                            While the Bee Movie is a fantastic one, I’m pretty sure the correct movie would be Osmosis Jones. I think, optimistically, this hopes to direct litigation at the few large businesses rather than at every single host of content, which is probably intractable. If the public wills it, information once released is probably not able to be hidden again. Data mules exist in 2020, as do data dead drops.

                          1. 1

                            Girard overlaps somewhat with Baudrillard here in rejecting the usefulness of chains of symbols and signs as meaningful logic. For Baudrillard, the main problem with these chains is that they can loop back onto themselves and fail to reach a terminus; Girard here points out that the terminus not only need not exist, but need not be Boolean!

                            Girard’s criticism of categorical logic here is reasonable, but omits the answers provided by homotopy type theory. Indeed, a categorical logic concludes all results at once, an incredibly transparent statement which cannot be supported without an underlying computation. While two arrows may be equivalent, this equivalence nonetheless must be computed. We need to weaken equality somehow. Homotopy type theory provides the answer; “equality is equivalent to equivalence” as the saying goes, and so we can replace all equalities with a notion of equivalence which respects the laws but is weak enough to account for computation and construction.

                            I don’t quite understand the criticism of the natural numbers. The natural numbers are “categorical”, in an older sense of the word, in second-order logic; in modern category theory, we rephrase this by saying that, when there is an object representing the natural numbers, it is isomorphic (equivalent!) to any other object which might also represent the natural numbers. This is a great success; the real numbers do not possess such a universal property. I suppose that Girard may be irked that category theory implies a great many towers of infinite objects of truly remarkable size, but we should be well-used by now to using symbols to study the infinite indirectly.

                            I feel like Girard comes so close to understanding the usefulness of topos theory for building customized logics, but stops short because of some generic complaints about topology. These problems have been overcome for specific instances of real-world programming language development, and so a general barrier doesn’t seem to pose an issue. Indeed, from the other direction, the only reason I can see to be upset about this is the admission that the interplay between continuity and discrete computation seems to be extremely beneficial to us: We can use our discrete-logic computers to symbolically compute facts about infinite and continuous objects.

                            Girard ignores a critical behavior of CPOs in his thoughts on Scott. To be informal, in Scott’s setting it is possible to learn new facts without forgetting old facts. However, this is a non-linear property. It is also strange that Girard uses the word “degenerate” for skeletal categories, when he probably knows that categorical logic yields skeletal categories via the common technique of forgetting how one learned something, and that the powerset construction he discusses is intimately tied to building a topos via building a category of sheaves.

                            At some point, I wish that Girard could have had the fortune to know about Chu spaces. I feel like there is a much simpler way to express these geometric ideas, and that the self-duality of Chu spaces is a richer setting to formalize and study them. The crossword of data underneath the typical infinite Chu space is hidden behind the same veil of transparency as the other objects discussed, but I think that categorically that is not a problem.

                            1. 3

                              I wish that Python teaching materials would teach the with statement for context management; it has been with us since Python 2.5, over a decade ago. Context managers are easily the best way to ensure that file handles are cleaned up. Everything works even on PyPy and Jython, although the PyPy documentation does not say so explicitly.
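
                              A minimal sketch of the pattern (the file name is illustrative):

                              ```python
                              # The with statement guarantees cleanup: open() returns a context
                              # manager whose __exit__ closes the handle even if the body raises.
                              def count_lines(path):
                                  with open(path) as f:
                                      return sum(1 for _ in f)

                              # Writing works the same way; the handle is flushed and closed on exit.
                              with open("example.txt", "w") as f:
                                  f.write("hello\n")

                              assert count_lines("example.txt") == 1
                              ```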

                              I find the comparison to Rust interesting. The author says:

                              Rust is the only popular language that models [file-descriptor management] right, reducing the chances of messing up.

                              I wonder to what degree a higher-level API would have removed the problem entirely. In Monte, file descriptors cannot be accessed at all, which might have prevented the problem, but all I/O is asynchronous, which might have exacerbated the problem.

                              1. 1

                                Completely agree with regard to context managers; they are very elegant!

                                In Monte, file descriptors cannot be accessed at all, which might have prevented the problem

                                I’m not familiar with Monte, but not exposing the backing fd is a good solution to the problem. If we can’t call close(2) on arbitrary file descriptors and the file object implementation does not expose the backing fd, we should be good. The main issue is that Python’s API can’t change at this point, some low-level APIs require file descriptors to be passed around, and Python’s semantics can’t capture this in a way that prevents misuse.
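
                                A contrived sketch of the hazard (not from the article): once fileno() exposes the backing descriptor, close(2) can be called behind the file object’s back, leaving the object referring to a descriptor number the OS is free to reuse.

                                ```python
                                import os

                                f = open("/dev/null", "rb")
                                fd = f.fileno()   # the backing descriptor escapes the object here
                                os.close(fd)      # raw close(2), bypassing the file object

                                # f still believes it owns fd; if another open() had reused that
                                # number in the meantime, f.close() would close the wrong resource.
                                # Here the OS simply rejects the stale descriptor instead.
                                try:
                                    f.read(1)
                                except OSError:   # EBADF: the object and the OS now disagree
                                    pass

                                try:
                                    f.close()
                                except OSError:
                                    pass
                                ```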

                                I brought up Rust because I really like how elegantly it solves this: there is no close wrapper in the standard library, ownership ensures the descriptor is closed exactly once, and the crates that do wrap close(2) mark the function as unsafe.

                              1. -3

                                This article is obviously wrong in its conclusion. To see how, first recall that while Haskell’s types don’t form a category, we can imagine a “platonic” Hask whose objects are types, whose arrows are functions, and where undefined and friends have been removed.

                                Now, consider that platonic Hask is but one object of Cat. From size issues, it is immediate that Cat cannot be a subcategory of Hask; that is, that Hask cannot describe all of Cat’s objects. It follows that Haskell typeclasses like Functor are not arrows in Cat, but endofunctors on Hask, and that Control.Category does not capture objects in Cat, but the internal category objects in Hask.

                                Finally, pick just about any 2-category, indeed say Cat, and then ask whether Hask can represent it faithfully: The answer is a clear, resounding, and obvious “no”. Going further, pick any ∞-category, say Tomb, and then ask whether Hask can even represent a portion of any object; an ∞-object is like a row of objects, one per level, but Haskell’s type system could only see one single level of types at a time. (This is not just theoretical; I have tried to embed Tomb into Haskell, Idris, and Coq, and each time I am limited by the relatively weak type system’s upper limits.)

                                I wonder why the author believes otherwise.

                                1. 16

                                  This article is obviously wrong in its conclusion.

                                  I think the word “obviously” is relative to the reader’s familiarity with category theory.

                                  For the purposes of the misconception she is addressing, the author’s conclusion — to me — is obviously correct.

                                  You appear to be refuting her argument in some different context. I’m interested to hear your argument (although it would probably be a long time before I learn the CT necessary to properly understand your argument), but switching out the context the argument was made in to refute the entire original argument makes your own argument (to me, at least) appear as an attack against a straw-man.

                                  1. -1

                                    My argument ought to follow readily for any ML, and we can see the scars it causes in the design of many MLs. Idris, for example, uses a hierarchy of universes to avoid universe-inconsistency paradoxes as it climbs this tower that I’m talking about. Haskell and Elm don’t bother trying to climb the tower at all. SML and OCaml have exactly one tier, adding on the module system, and strict rules governing the maps between modules and values.

                                    I’m not removing the word “obviously”. Cat obviously contains Hask, Set, and many other common type systems as objects; the size issues around Cat are usually one of the first things mentioned about it. (Third paragraph in WP and nCat, for example.) And Cat is one of the first categories taught to neophytes, too; for example, in the recent series of programmer-oriented lectures on category theory, Programming with Categories, Cat is the second category defined, after Set.

                                    My refutation is of the article’s title: Yes indeed, dynamic type systems are more open, simply because there are certain sorts of infinite objects that, when we represent them symbolically, still have infinite components. Haskell can represent any finite row of components with multi-parameter typeclasses but that is not sufficient for an ∞-category. By contrast, when we use dynamic type systems, especially object-based systems, our main concern is not about the representation of data, since that is pretty easy, but the representation of structures. For categories, for example, there are many different ways to give the data of a category, depending on what the category should do; we can emphasize the graph-theoretic parts, or the set-theoretic parts, or even transform the category into something like a Chu space.

                                    Finally, if static type systems are so great, why isn’t your metatheory, the one you use for metaphysics and navigating the world, a static type system? Probably because you have some sort of open-world assumption built into the logic that you use for day-to-day reasoning, I imagine. This assumption is the “open” that we are talking about when we talk about how “open” a type system is! Just like how we want a metatheory in our daily lives that is open, we all too often want to represent this same sort of open reasoning in our programming languages, and in order to do that, we have to have ways to either subvert and ignore, or entirely remove, limited static types.

                                    1. 5

                                      My argument ought to follow readily for any ML, and we can see the scars it causes in the design of many MLs. Idris, for example, uses a hierarchy of universes to avoid universe-inconsistency paradoxes as it climbs this tower that I’m talking about.

                                      Could you give examples of useful programs that are inexpressible in a typed way without a hierarchy of universes? Even when doing pure mathematics (which demands much stronger logical foundations than programming), most of the time I can fix a single universe and work with (a tiny part of) what lives in it.

                                      When programming in ML, the feature that I want most badly is the ability to “carve out” subsets of existing types (e.g., to specify that a list must contain a given element). This would be actually useful for specifying preconditions and postconditions of algorithms (which is ultimately the point of programming, i.e., implementing algorithms). But it does not require hierarchical type universes.

                                      Yes indeed, dynamic type systems are more open, simply because there are certain sorts of infinite objects that, when we represent them symbolically, still have infinite components.

                                      You seem to be confusing symbols with their denotation. Symbols are finite out of necessity, but you can use them to denote infinite objects just fine, whether you use a type system or not.

                                      Haskell can represent any finite row of components with multi-parameter typeclasses but that is not sufficient for an ∞-category.

                                      The arity of a multiparameter type class has absolutely nothing to do with n-categories. But, in any case, why is Haskell supposed to represent ∞-categories in its type system? It is a general-purpose programming language, not a foundation of mathematics.

                                      Finally, if static type systems are so great, why isn’t your metatheory, the one you use for metaphysics and navigating the world, a static type system? Probably because you have some sort of open-world assumption built into the logic that you use for day-to-day reasoning, I imagine.

                                      Every nominal type definition literally brings a new type of thing into existence. What exactly is this, if not dealing with an open world?

                                      And, by the way, my metatheory is ML.

                                      1. 3

                                        Can any programming language usefully represent these infinite objects? Is that ever useful?

                                        Surely you can just build something with opaque objects within Haskell if the type system is too restrictive?

                                    2. 9

                                      I wonder why the author believes otherwise.

                                      Probably because the author isn’t comparing Hask to all of category theory. They’re comparing it to the unitype, which cannot faithfully represent anything at all.

                                      1. -5

                                        As long as we are using “probably” to speak for the author, I think that they probably are not familiar enough with type theory to understand that there are size issues inherent to formalizing type systems.

                                        Please reread the original article; they do not talk about “unityping” or Bob Harper’s view on type theory of languages which don’t know the types of every value.

                                        1. 26

                                          The author is Alexis King, who is a PLT researcher, an expert in both Haskell and Racket and has discussed category theory in depth on Twitter. I’d be shocked if she didn’t understand the ramifications here and was intentionally simplifying things for her target audience.

                                          1. -1

                                            Sure, and I am just a musician. Obviously, therefore, the author is right.

                                            Anyway, they didn’t talk about size issues, nor did they talk about “unitype” ideas, in the article. I am not really fond of guessing what people are talking about. I am happy to throw my entire “probably” paragraph into the trash, as I do not particularly value it.

                                      2. 4

                                        I don’t know enough category theory to follow your argument precisely, but I’d argue that the category-theoretic perspective isn’t relevant in this discussion. How much of category theory you can model using Haskell’s type system is totally unrelated to how much you can model with a program written in Haskell. I guess I don’t even need to make this argument, but still: whatever code you were planning to write in JavaScript can be mechanically translated by a Haskell beginner, line by line, to a Haskell program that simply uses JSON.Value everywhere.

                                        I believe the parts of category theory you can’t model in Haskell’s types corresponds to the kinds of relationships you can’t get the type checker to enforce for you. And you go into the language knowing you can’t model everything in types, so that’s no news. What’s relevant is how much you can model, and whether that stuff helps you write code that doesn’t ruin people’s lives and put bread on the table. As a full time Haskeller for a long time, my opinion is that the answer is “yes”.

                                        I think the friction comes from the desire to view the language as some sort of deity to which you can describe your most intricate thoughts, and it will start telling you the meaning of life. For me, once I stopped treating GHC (Haskell’s flagship compiler) as such and started viewing it as a toolbox for writing ad-hoc support structures to strengthen my architecture here and there, it all fell into place.

                                        1. 2

                                          I’m going to quote some folks anonymously from IRC, as I think that they are more eloquent than I am about this. I will say, in my own words, that your post could have “Haskell” replaced with basically any other language with a type system, and the same argument would go through. This suggests that the discussion is not at all about Haskell in particular, but about any language with a type system. I would encourage you to reconsider my argument with that framing.

                                          (All quoted are experts in both Python and Haskell. Lightly edited for readability.)

                                          Maybe another way of making the point is to say that the job of a type system is to reduce the number of programs you can write, and proponents of a type system will argue that enough of the reduction comes from losing stupid/useless/broken programs that it’s worth it.

                                          The problem with this argument and the statement [IRC user] just made is the same, I think. It depends. Specifically, it depends on whether one is trying to use the type system as a mathematical object, or as a practical programming tool. And further, on how good your particular group of programmers is with their practical programming tools on the particular day they write your particular program. With a mathematical system, you can produce something correct and prove it; with a practical programming tool, you can produce something correct and run it.

                                      1. 2

                                        I don’t understand why people would want 10x the code. More code is not good; code is a liability that needs interest paid on it.

                                        1. 4

                                          I think that when people talk about 10x engineers (let’s not worry about whether they exist for now) they don’t mean people who write 10x the code, but deliver more value with the same amount of code. They write “the right stuff”. See for example https://www.google.nl/amp/s/www.newyorker.com/magazine/2018/12/10/the-friendship-that-made-google-huge/amp

                                        1. 2

                                          Why do people put up with Conda instead of using mature, community-supported tools like Nix? It seems like all of this trouble with using Conda here stems from poor ergonomics inside Dockerfiles. Compare and contrast:

                                          FROM nixos/nix
                                          RUN nix-env -iA python27Packages.flask
                                          

                                          For serious work, one might want to use a shell.nix file, but otherwise this is about it. As far as I can tell, Conda only makes things more complex and confusing.

                                          1. 1

                                            If you’re installing Flask, no reason to use Conda. If you’re doing scientific computing or data science, Conda Forge has a massive set of pre-compiled libraries, and is well worth using.

                                            That is, the use case isn’t the Conda packaging tool, the use case is the Conda Forge package channel.

                                          1. 4

                                            Your side point about seL4 is a very interesting portal to the object-capability world. We have long sought a formal definition of plan interference, and that has itself rested upon what makes a plan distinct from an algorithm.

                                            Hyperproperties provide a nice way out of this problem. We can trivially (and perhaps incorrectly) suggest that an algorithm is distinct from a plan in that a plan is the actual source code of a program while an algorithm is the sequence of instructions executed by a machine running the program. We then immediately see that plan interference is a hyperproperty, since perhaps not every run of two interleaved plans will lead to interference.

                                            1. 3

                                              If you’re putting binary files into git you’re doing it wrong. One could argue about small files, but compiled code/executables, photos or “gifs for the readme” are definitely misplaced in a git repository.

                                              1. 12

                                                I do find that having image files in a resources/ directory for something like a website is often simpler than separating the two. Even then making sure that images are compressed and generally not bloating repo size / git history is essential.

                                                1. 18

                                                  I do find that having image files in a resources/ directory for something like a website is often simpler than separating the two.

                                                  Yeah, this is exactly the use case here. Mercurial (and git) aren’t designed for handling large binary files, but if you’re checking in static assets/resources that rarely change, it still tends to work fine. This repo was fine on Bitbucket for many years, and is working fine on an hgweb instance I’ve spun up in the meantime.

                                                  I specifically asked about limits because if it’s just the size of the repo being a technical problem for their infrastructure, I can understand. But they would not specify any limits, and just reiterated several times that Mercurial wasn’t designed for this. So I don’t know which of these was the actual problem:

                                                  1. The repo is so taxing on their infrastructure it’s causing issues for other users.
                                                  2. The repo is so large it’s costing more to store than some portion of the $100/year account price can cover.
                                                  3. They are morally opposed to me using Mercurial in a way that it wasn’t designed for (but which still works fine in practice).

                                                  Cases 1 and 2 are understandable. Setting some kind of limit would prevent those problems (you can still choose to “look the other way” for certain repos, or if it’s only code that’s being stored). Case 3 is something no limit would solve.

                                                  1. 3

                                                    If you want to store large files and you want to pay an amount proportional to the file sizes, perhaps AWS S3 or Backblaze B2 would be more appropriate than a code hosting website? I don’t mean to be obtuse, but the site is literally called source hut. Playing rules lawyer on it reads like saying “Am I under arrest? So I’m free to go? Am I under arrest? So I’m free to go?” to a police officer.

                                                    1. 5

                                                      B2 or S3 would make things more complicated than necessary for this simple repo. I’ve spun up a $5/month Linode to run hgweb and it’s been working great. I’m all set.

                                                2. 6

                                                  This case was hg, but the same limitations are present. Hg has a special extension for supporting this:

                                                  https://www.mercurial-scm.org/wiki/LargefilesExtension

                                                  And it’s considered “a feature of last resort”. It’s not designed to deal with these use-cases.

                                                  LFS support requires dedicated engineering and operations efforts, which SourceHut has planned, but is not ready yet.

                                                  1. 5

                                                    I have a repository with mostly PNG files. Each PNG file is source code; a chunk of data inside each PNG file is machine-readable code for the graph visually encoded in that PNG’s pixels. What would you have me do?

                                                    I suspect that you would rather see my repository as a tree of text files. While this would be just as machine-readable, it would be less person-readable, and a motivating goal for this project is to have source files be visually readable in the way that they currently are, if not more so.

                                                    git would not support binary files if its authors did not think that binary-file support was useful; that is the kind of people that they are and the kind of attitude that they have towards software design.

                                                    With all that said, I know how git works, and I deliberately attempt to avoid checking in PNGs which I think that I will have to change in a later revision. It would be quite nice if git were able to bridge this gap itself, and allow me to check in plaintext files which are automatically presented as PNGs, but this is not what git was designed to do, and we all can imagine the Makefile which I’d end up writing instead.

                                                    1. 1

                                                      I like the project, but pardon my ignorance - aren’t the PNG files still binary assets produced by the “real” source code, which is the textual expression parsed to generate both the embedded bitstring and the dot graph? If they’re machine readable, that places them in the same category as compiled object files.

                                                      1. 3

                                                        The real source code is non-textual; it is the diagram (WP, nLab) which is being given as a poset (WP, nLab). To achieve optimal space usage, each poset is stored as a single integer which codes for the adjacency matrix. However, this compressed format is completely unreadable. There are several layers around it, but each layer is meant to do one thing and add a minimum of overhead: JSON (in the future, BSON or Cap’n Proto) for versioning and framing, and PNG for display and transport. There isn’t really source code; there’s just a couple of Python and Monte scripts that I use to do data entry, and I want them eventually automated away in favor of API-driven development.

                                                        For example, the raw integer for this “big” poset is (at the time of writing) 119057104012801988044616452068625828640327332805380025526437835877423434638755429828266326799795317811303459626900558691401745578051640794516644938301199082495464489003936003625363750982368265025274722872195025876418664463440271896393960084356141213421725952572801003498502627104606075520827813791168916410299669062572699417822031483474354463194521106501504378198881835689538017105566685179272698190498260697546396352180015191217900800702991246813813910739056632149188342283771705138656813357180390720149429257347634471776957047265055082326775652079078088473610885335191906287685039351014500784364400788835706676136213773991906159901386417898678256327382329933065244744756867312630459766408921728411124922368378265249369912734931744932522777947191947246247888008545404251579656784921799582935924435024819217182937595986486278238491170260078527481455363019695413290105595765561673457932741464647437073776230526145064116103036735384415008570280823270942528385252833616941077475010604520832967790713291089520969819323291548086581344613528369629656807825470271116760342123814630015321080350242676173777880409314306946695543051504162699356992509452966494979102888561608129775777824208753496551108243674673823382226373443092848812619363504796601599746698273000033356523403042206994500564110680250622093680140809627702210046262001690736151235584584803501166681150186803724802869491481294888174760186200258663044091042775501067909307398258431295572809316405817425806572436591973207743524817393103373004533348327662946836180324593153772066560693844746264887941238158302982303492502613084844224768029517993922819593979027614562737598067131576661087926758866343971413288880983057473544651036992439376085474045204803058313934057187051819429632221234635602680317901551091261152138660486933915169592190005608783372193246222301462269603464697693715253381276043079537861125168105090
19551617885907067412613823285538493443834790453576561810785102306389953804151473860800342221969666874213156376831068606096772785272984102609049257833898258081466729520326827598704376424140779421965233471588921765110820238036094910936640446304632443760482611408445010230964335747094869968021425396439555206085281953007985784739643408074475440039274314217788647485602069097474262381690379456154426900896918268563062231294937080146199930562645748389040251871291840481739518244706752426504146889097315360662429293711705265772337748378759001582638301784557163848933046038798381667545043026975297902178839764134784634179453671000024868722179355800776002690855305662785522771116635997791339179517016284742206819482196944663461005128697584753594559406283638837841370287286682993990297923202976404261911087739188860505577427942276773287168600954693735964671046522557013031834557159173262849132567983767216098382093390056878765856939614383049277441.
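
                                                        The exact bit convention isn’t spelled out above, but a minimal sketch of “a single integer codes for the adjacency matrix” might read the integer’s bits in row-major order. The helper names and the LSB-first convention are assumptions for illustration, not the project’s actual encoding.

                                                        ```python
                                                        def int_to_adjacency(n, size):
                                                            """Decode an integer into a size x size 0/1 adjacency matrix,
                                                            reading bits in row-major order, least significant bit first
                                                            (an assumed convention for illustration)."""
                                                            bits = [(n >> i) & 1 for i in range(size * size)]
                                                            return [bits[r * size:(r + 1) * size] for r in range(size)]

                                                        def adjacency_to_int(matrix):
                                                            """Inverse: pack a 0/1 matrix back into a single integer."""
                                                            n = 0
                                                            for i, bit in enumerate(b for row in matrix for b in row):
                                                                n |= bit << i
                                                            return n

                                                        # A three-element chain a <= b <= c, as a strict adjacency matrix.
                                                        chain = [[0, 1, 0],
                                                                 [0, 0, 1],
                                                                 [0, 0, 0]]
                                                        code = adjacency_to_int(chain)
                                                        assert code == 34                       # bits 1 and 5 set
                                                        assert int_to_adjacency(code, 3) == chain
                                                        ```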

                                                        1. 1

                                                          Ah, okay, I see. Makes sense, thank you for explaining!

                                                    2. 4

                                                      I’ve seen this argument quite a number of times, and almost always without a coherent explanation of why is that wrong. What’s the rationale behind this argument?

                                                      1. 4

                                                        Shameless plug, I contributed heavily to this help topic back when I was the PM for Microsoft’s Git server: https://docs.microsoft.com/en-us/azure/devops/repos/git/manage-large-files?view=azure-devops

                                                        FWIW I disagree with the comment up-thread which says that GIFs for READMEs don’t belong. If you’re going to check in a logo or meme or whatever, that’s perfectly fine. Just don’t do 1000 of them and don’t churn it every week.

                                                        1. 2

                                                          I think a big part is also “are my tools there for me, or am I a slave to my tools?”

                                                          If I have a website and most content is under version control, it’s annoying and complicated to keep (big) assets outside. Most people simply want one repo with everything inside, and changes are mostly additive, often once per week; it simply doesn’t matter if it’s the wrong tool.

                                                    1. 59

                                                      Update: I’ve been working on a better approach to these problems, one that leaves affected users feeling less put out. I’ll be starting with a better email template in the future:

                                                      https://paste.sr.ht/~sircmpwn/3d32eb7bbc564170c3d30f041e5e8dc71aa5a1c6

                                                      In the future I’ll be working on better automated soft limits, so that users aren’t surprised by this.

                                                      @sjl: after thinking it over more, I was unprofessional and sarcastic with you. I apologise.

                                                      1. 41

                                                        I think it would be beneficial for you to take on the mindset that your users’ use cases are, by definition, always valid as a premise. Whether or not your service can handle their use cases is another matter, but this idea that you know better than your users what they should be doing is not going to accomplish your goals.

                                                        As another example, I happen to need 2-3 more GiB of RAM than the sr.ht FreeBSD build service offers at the moment, and have offered to up my monthly donation to account for the resource usage, and you’ve turned me down on the grounds that I’m fundamentally abusing computer hardware in some moral way. As a result, Zig FreeBSD builds have many of the tests disabled, the ones where the bootstrapping compiler is a bit memory-hungry. Zig’s FreeBSD users suffer because of this. And as a result, when someone else offers me a different FreeBSD CI service with more RAM, I’m ready to accept it, because my use case is valid.

                                                        1. 6

                                                          Could linking with the Boehm conservative GC work as a stopgap? I think it won’t require any code changes.

                                                          1. 4

                                                            Something Andrew doesn’t mention here is why he needs 2-3 GiB more RAM: because, by design, his compiler never frees memory. Nearly all of that RAM is dead memory. In order to accommodate this use-case, I’d have to provision dedicated hardware just for Zig. Sometimes, use-cases are wrong, and you need to correct the problem at the source. Just because someone is willing to throw unspecified sums of money at you to get their “use-case” dealt with doesn’t mean it’s worth dealing with. I have finite time and resources, and maybe I feel like my time is better spent implementing features which are on-topic for everyone else, even at the expense of losing some user with more money than sense.

                                                            1. 44

                                                              even at the expense of losing some user with more money than sense.

                                                              I really hope you change your tune here. Insulting users is pretty much the worst thing you could do.

                                                              Another thread recently talked about the fact that compilers don’t free memory, because the goal of a compiler is to be as fast as possible, so they treat the heap as an arena that the OS frees. Compilers have done this for 50+ years—zig isn’t special here.
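                                                              The heap-as-arena idea can be sketched in a few lines; this is an illustrative bump allocator, not Zig’s (or any real compiler’s) actual allocator:

```python
class Arena:
    """Bump allocator: a single growing region, released all at once."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.top = 0

    def alloc(self, n):
        # Allocation is just a pointer bump -- no per-object bookkeeping,
        # and no free() until the whole arena is released.
        if self.top + n > len(self.buf):
            raise MemoryError("arena exhausted")
        view = memoryview(self.buf)[self.top:self.top + n]
        self.top += n
        return view

    def release(self):
        # One bulk free at the end -- what a batch compiler gets implicitly
        # by exiting and letting the OS reclaim its address space.
        self.top = 0
```

                                                              The tradeoff is exactly the one under discussion: allocation is extremely cheap, but peak memory is the sum of everything ever allocated.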

                                                              1. 4

                                                                I didn’t mean to imply that Andrew doesn’t have sense, but that the hypothetical customer-thats-always-right might not.

                                                                As for compilers never freeing to be fast, utter bollocks. So the compiler should OOM if I have <8G of RAM to spare? Absolutely nuts. Freeing memory is not a performance bottleneck.

                                                                1. 38

                                                                  Your reasoning is sound; your wording and phrasing choices are not. In what I’ve read, you don’t come off as witty when you’re dealing with a paying customer and telling them they can’t do something (a refusal I also think is unreasonable); you come off as a dick. That’s how it appears. I don’t have any problems with you or your services, and I think you working on this stuff is awesome… but I wouldn’t pay for insults in addition to whatever else you might provide.

                                                                  1. 5

                                                                    As long as I’ve known him Drew has pretty consistently been like this. It’s not a bad thing. It’s quite refreshing actually.

                                                                    1. 35

                                                                      It’s refreshing to have a business make fun of you?

                                                                      1. 9

                                                                        It’s quite refreshing to see someone willing to say ‘no you’re wrong’ instead of the typical corporate ‘the customer is always right’ bullshit so many people here have obviously come to expect.

                                                                        Sometimes the customer is wrong.

                                                                        1. 34

                                                                          It’s OK for both people to be right, and the customer to stop paying for the service and walk away. It’s then OK for the customer to go tell people about how they were treated. Hopefully that happens more.

                                                                      2. 24

                                                                        As a former moderator in prior communities, I politely disagree. Folks that are never not toxic are a serious liability and require special effort to handle well. I recall one memorable day when Drew dared me to ban him; I should have let the emotions flow through me and removed him from the community.

                                                                        Also, as a former business owner, I politely disagree that this is good business practice.

                                                                        1. 16

                                                                          I agree it’s good, now I know to avoid this business!

                                                                  2. 14

                                                                    CPU speed vs memory usage is a fundamental resource tradeoff that occurs all the time in computing. Just because you disagree with where on the spectrum someone has chosen to aim their design doesn’t mean they’re stupid. Especially when they too are a mostly-one-person project operating on limited resources.

                                                                    It’s PERFECTLY VALID to say “I don’t have time to accommodate this one special case, sorry”. It is NOT perfectly valid to say “you are stupid for needing this special case, go away”. Money vs. person-time is another fundamental resource tradeoff where different people have different priorities.

                                                                    1. 21

                                                                      Regardless of the use case, I’d really rather not have my SCM platform making discretionary decisions about what I’m working on. The users aren’t paying for you to audit them; they’re paying for the services provided by the software. If you want your service to come with the condition that you get to unilaterally decide whose content is allowed and whose isn’t, you’re free to do that. Just expect the community to respond nearly unanimously with “we’ll go elsewhere.”

                                                                      1. 7

                                                                        He’s not making ‘discretionary decisions about what [you’re] working on’. I don’t see Drew saying ‘you can’t use this service because I don’t like the way your compiler is designed’. He’s saying ‘provisioning dedicated hardware for specific projects is a lot of time and effort that I don’t have, so I’d need to have a really really good reason to do it, no matter how much money you’re willing to throw at me, and you haven’t given me one’.

                                                                        Every service out there gets to decide what is allowed and what isn’t. Look at the terms of service of any file or repository hosting service anywhere. GitHub, GitLab, Bitbucket, imgur, pastebin services… ALL of them make it clear in their terms of service that it’s entirely up to their whim whether they want to host your files or not.

                                                                        1. 32

                                                                          Drew is literally commenting on a particular user’s project and how its design is a problem, so I have no idea what you’re talking about:

                                                                          Something Andrew doesn’t mention here is why he needs 2-3 GiB more RAM: because, by design, his compiler never frees memory. Nearly all of that RAM is dead memory.

                                                                          As for compilers never freeing to be fast, utter bollocks.

                                                                          @andrewrk can hopefully clarify, but I thought his offer to up monthly donations was to improve sr.ht’s FreeBSD offering, in general, not necessarily to only improve Zig builds (Zig builds would improve as a byproduct of improving the FreeBSD infrastructure). If the donations were only to be used to improve Zig-specific experiences, then I understand the argument that Drew doesn’t want to commit to that.

                                                                      2. 12

                                                                        It just seems weird to me that one of your criteria for whether or not to give a customer resources is based on a personal audit of their code. Are you going to do this for every customer?

                                                                        1. 3

                                                                          I completely understand the concern here, and take it very seriously. I usually don’t dig into the repo at all and just reach out to the user to clarify its purpose. In this case, though, the repo was someone’s personal website, and named as such, and connecting the dots did not require much.

                                                                          1. 2

                                                                            As explained downthread, it’s “Alert fires -> look for what’s caused the alert -> contact customer whose repo tripped the alert”.

                                                                        2. -2

                                                                          ‘The customer is always right’ is nonsense.

                                                                          1. 10

                                                                            Nobody’s suggesting otherwise.

                                                                            1. -8

                                                                              Literally everyone else in this thread is acting like the customer is always right.

                                                                        3. 18

                                                                          You handled this very professionally and courteously, and I plan to continue to use sr.ht for many happy years to come.

                                                                          1. 6

                                                                            You are under no obligation to explain or justify your business model to anyone, nor, on a personal level, your own definitions of self-sustainability, peace of mind, well-being, or meaningful, sustainable work.

                                                                            There is a particular mode of doing business these days, and people inside that paradigm often do not understand that they are inside it, and therefore apply force to get others to conform.

                                                                            You’re breaking old paradigms and inventing new ways of running organisations, and that is brave, groundbreaking, commendable, and politically powerful.

                                                                            I hope issues like this do not deter you one bit from blazing your own trail through the fucked-up world that is tech organisations in late-stage capitalism, and I hope you share as much as you can about how you’re doing, personally and in ‘business’.

                                                                            1. 2

                                                                              git-lfs implementations often don’t allow reclaiming unreachable blobs: once you push a binary blob, even on a branch you later delete, it will take up space forever.

                                                                              Maybe it is worth investigating git-annex while you’re on this topic.

                                                                              1. 5

                                                                                Yeah, git annex is also something I intend to study. I’m only just setting up large file servers for blob storage; figuring out how to apply them is the next step.

                                                                            1. 1

                                                                              I enjoyed this read.

                                                                              I wonder whether objects might be the simplest things that can maintain an internal state and update themselves in response to messages. That is, an object might be a function from a pair of state and messages to messages. This would be nice, because then objects would be Mealy machines.

                                                                              And, of course, I’m thinking categorically. There’s a category Aut of Moore machines; if we add state transformations to simulations, then we get a category Mealy whose objects are Mealy machines. This category’s abilities depend on its underlying state-transition formalism; if we put the category on top of set theory, then I think that we get a Cartesian closed category.

                                                                              In simpler words, we should be able to compile object-based languages to networks of Mealy machines.
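                                                                              As a small illustration of the idea (the `counter` machine and `run` driver here are invented for the example), an object is just a step function from (state, message) to (new state, output):

```python
# A Mealy machine's step function: (state, input message) -> (new state, output).
def counter(state, msg):
    if msg == "inc":
        return state + 1, state + 1
    if msg == "get":
        return state, state
    return state, None  # ignore unknown messages

def run(machine, state, msgs):
    """Drive a machine over a message stream, collecting its outputs."""
    outputs = []
    for m in msgs:
        state, out = machine(state, m)
        outputs.append(out)
    return state, outputs
```

                                                                              The hidden state never escapes except through outputs, which is the encapsulation objects are usually credited with.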

                                                                              1. 1

                                                                                an object might be a function from a pair of state and messages to messages

                                                                                Reminds me of Objects as Closures:

                                                                                We have described the semantics of object-oriented languages by treating objects as closures. Specifically, we interpret an object as a message environment (binding messages to methods) with a hidden local environment (binding instance variables to values or locations).
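                                                                                A sketch of that interpretation in Python (`make_counter` and its messages are invented for the example): the returned dictionary is the message environment, and the enclosed variable is the hidden local environment.

```python
def make_counter(start=0):
    # Hidden local environment: the instance variables.
    state = {"n": start}

    def inc():
        state["n"] += 1
        return state["n"]

    def get():
        return state["n"]

    # Message environment: binds messages to methods.
    return {"inc": inc, "get": get}
```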

                                                                              1. 2

                                                                                Very cool independent derivation of this trick. I first learned about it via factorial bases; I think that you’ve found Lehmer coding.
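                                                                                For readers following along, a small sketch of the Lehmer code and the factorial-base rank it induces (the function names are mine, not from any particular reference):

```python
import math

def lehmer_code(perm):
    # code[i] counts the later entries smaller than perm[i].
    return [sum(1 for y in perm[i + 1:] if y < x) for i, x in enumerate(perm)]

def lehmer_rank(perm):
    # Read the code as digits in the factorial number system:
    # the result is the permutation's position in lexicographic order.
    n = len(perm)
    return sum(c * math.factorial(n - 1 - i)
               for i, c in enumerate(lehmer_code(perm)))
```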

                                                                                1. 1

                                                                                  Thank you for linking these! I didn’t think that I came up with this first, but I couldn’t figure out how to search for preexisting solutions. The Lehmer code looks great. Cheers.

                                                                                1. 2

                                                                                  Haskell memes are eating themselves, it would seem.

                                                                                  There are as many opinions on what constitutes Simple Haskell as there are people who have them. We don’t want to endorse any one view, but promote the general idea.

                                                                                  Following that are three bland paragraphs about the corporate benefits of accessibility, maturity, and (not) leaking complexity.

                                                                                  Does anybody really want for their code to be used by corporate hackers who aren’t going to contribute any money or time back to their library? Does anybody really want to bend over backwards and ensure compatibility with a corporation-approved list of ancient standards and extensions? Not really, I bet. But everybody loves “simple”.

                                                                                  1. 6

                                                                                    For my open source projects I’m not too concerned with it. But I’m someone who’s dealt with one bait-and-switch after another in the corporate world: taking Haskell jobs only to make it 6 or 12 months in before management starts getting irrationally twitchy about having the business successfully running on code that’s written in Haskell. So I hope that empty gestures like this might be the start of the spin we need to stop Haskell from getting constantly evicted from the corporate world.

                                                                                  1. 3

                                                                                    Sounds important, but the article is too long and complex for me to understand :/ Also, at some point the author mentions that in the Python example, the missing drain in write seems an oversight; so, if the bug was fixed, would the article still have any reason to be written? Is it basically an article about a bug in a Python library’s API? I’m confused.

                                                                                    1. 7

                                                                                      The article mentions this introductory article on backpressure. The WP article is not bad either.

                                                                                      The main concept is that, in a data-processing pipeline, we do not want data to pool or buffer. We want data to flow. It turns out that a key concept for good flow is letting each component of the pipeline understand how fast the data is coming in and how fast it is being read away. Without backpressure, data can linger arbitrarily in memory, creating effects like bufferbloat.
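                                                                                      A toy sketch of the principle using a bounded `asyncio.Queue` (the producer/consumer names are illustrative): when the buffer is full, `put` suspends the producer, and that suspension is the backpressure signal.

```python
import asyncio

async def producer(q):
    for i in range(10):
        await q.put(i)   # suspends while the queue is full: backpressure
    await q.put(None)    # sentinel: end of stream

async def consumer(q, out):
    while (item := await q.get()) is not None:
        out.append(item)

async def main():
    q = asyncio.Queue(maxsize=2)  # bounded buffer: producer can't run ahead
    out = []
    await asyncio.gather(producer(q), consumer(q, out))
    return out
```

                                                                                      With an unbounded queue the producer would finish immediately and all ten items would sit in memory; the `maxsize` is what forces flow instead of pooling.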

                                                                                      1. 2

                                                                                        Hmh, so I see I wasn’t precise enough: in fact, I kinda mostly understood the general idea of backpressure as it was explained in the introductory part of the original article; but I got lost immediately afterwards once the article started trying to explain how it applies to async and how it leads to an issue in Python. Sorry for not being clear about that; is there a chance you could help me understand this part as well? Do some of the links you provided above already explain it in terms of async? Thanks!!

                                                                                        1. 2

                                                                                          Part of the issue is not using await, which causes async actions to be effectively invisible since they are not waited for. async/await sugar is not magical, though, and it’s the same as any future-based or promise-based system.

                                                                                          The other part of the issue is not being able to explicitly manage backpressure. The author uses an ad-hoc token bucket to quantify how much backpressure is currently manifest.
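                                                                                          A minimal token bucket along those lines (a generic sketch, not the article author’s actual code): senders spend tokens, tokens refill at a fixed rate, and an empty bucket is the signal to slow down.

```python
import time

class TokenBucket:
    """Writes spend tokens; tokens refill at `rate` per second up to
    `capacity`. A failed acquire is the backpressure signal."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def try_acquire(self, n=1):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```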

                                                                                    1. 5

                                                                                      To repeat myself again: This ought to be only for CPython, and not for interpreters like PyPy which are not following the Python Software Foundation’s deprecation schedule. Quoting again from an old PyPy blog post:

                                                                                      PyPy’s own position is that PyPy will support Python 2.7 forever—the RPython language in which PyPy is written is a subset of 2.7, and we have no plan to upgrade that.

                                                                                      I continue to be deeply disappointed in the ethics of the Python Software Foundation for their decades-long snub of PyPy, Jython, and the rest of the alternative interpreter ecosystem.

                                                                                      1. 8

                                                                                        I continue to be deeply disappointed with PyPy’s, and the rest of the bloody ecosystem’s, perverse and unhealthy obsession with sticking to Python 2 forever.

                                                                                        1. 5

                                                                                          Then use PyPy for Python 3.6. Additionally, you could roll up your sleeves and help maintain PyPy. The problem is not the deprecation of Python 2 as much as the continued shunning of PyPy.

                                                                                          In a prior life, I recall demonstrating that PyPy could keep pace with your handwritten C++ and C♯ code, in the context of homebrewed Minecraft servers. I wonder how much of your disappointment is sour grapes.

                                                                                      1. 2

                                                                                        From the linked reference: Defunctionalization is thus a whole-program transformation where function types are replaced by an enumeration of the function abstractions in this program [1]

                                                                                        I’m not sure what they mean by enumerate. Suppose you have a program that begins with a few functions and creates more using function-transformations. So the total count of functions is not finite and it seems these couldn’t be finitely enumerated.

                                                                                        But maybe I’m not quite getting the idea.

                                                                                        [1] https://www.brics.dk/RS/01/23/BRICS-RS-01-23.pdf

                                                                                        1. 1

                                                                                          Perhaps there is a finite basis. If so, then we could build all relevant functions from a small set of combinators. It happens to be the case that the essential structure of Cartesian closed categories is captured by such a finite basis. Indeed, the And constructor from the article would be part of such a basis.

                                                                                          To explicitly construct the enumeration, sort all of the basis’s constructors and assign an index to each. When the index grows greater than the number of constructors, wrap around with a modulus operation and continue on, à la ASCII. Discard any ill-typed expressions.

                                                                                          1. 1

                                                                                            Hi, OP here.

                                                                                            A “function abstraction” literally refers to a lambda expression (/ function decl). The term comes from the lambda calculus. So these are indeed finite.
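                                                                                            A tiny sketch of that point (constructors `AddK` and `Compose` are invented for the example): the enumeration ranges over the lambda abstractions in the program text, which are finite, while the unboundedly many runtime function values become nested data built from those finitely many constructors.

```python
from dataclasses import dataclass

# One constructor per lambda abstraction appearing in the source program.
@dataclass
class AddK:            # stands for: lambda x: x + k
    k: int

@dataclass
class Compose:         # stands for: lambda x: f(g(x))
    f: object
    g: object

def apply(fn, x):
    # First-class functions are gone; a single dispatcher interprets the tags.
    if isinstance(fn, AddK):
        return x + fn.k
    if isinstance(fn, Compose):
        return apply(fn.f, apply(fn.g, x))
    raise TypeError(fn)
```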