Threads for justinpombrio

  1. 2

    This has an interesting correspondence to one of my favorite software design talks of all time, by Daniel Jackson - Rethinking Software Design

    The main premise of the talk is that each concept in a codebase should be in 1:1 correspondence with a purpose, and he identifies the same mismatches between the two (“4 bad smells”, timestamped link).

    The purposes/concepts split is very similar to the abstract/concrete state split defined in the post, but it’s more general in that it doesn’t refer only to state.

    1. 3

      That’s a great talk, I’m watching it now. Thanks for the link!

    1. 2

      With a prompt that clearly indicates that the task includes distinguishing sense from nonsense, GPT-3 is able to reliably distinguish Hofstadter and Bender’s nonsense questions from sensical ones.

      https://www.lesswrong.com/posts/ADwayvunaJqBLzawa/contra-hofstadter-on-gpt-3-nonsense

      1. 4

        Even if you clearly indicate that, GPT-3 still messes up:

        I’ll ask a series of questions. If the questions are nonsense, answer “yo be real”, if they’re a question that has an answer, answer them.

        Q: Any word with exactly two vowels is an “alpha” word. Any word with exactly two consonants is a “bravo” word. What are some words that are both “alpha” and “bravo” words?

        A: Yo be real.

        Q: What’s the first word of this question?

        A: yo

        1. 1

Sure, you can up the difficulty and it starts getting confused. The point is that GPT-3 can tell that the questions in the article are nonsense, if you ask it nicely.

          1. 1

            Or the “yo be real” response is just an exception thrown if the model doesn’t find enough “hits” for the type of question asked.

            1. 1

              You can find plenty of examples of it combining novel concepts seamlessly.

              Anyways, the point is merely that if Hofstadter is going to criticize GPT-3, he should use examples that it fails on even when given a helpful starting prompt. It’s not trying to answer your trick logic question, it’s trying to complete the prompt with average words from the internet. I don’t know if you noticed, but the internet is full of muddled incorrect garbage, so GPT-3 is very happy to spit muddled incorrect garbage back out, unless you convince it in the prompt that we’re in a nice part of the internet.

              1. 2

                Hofstadter’s point is that GPT-3 isn’t conscious. It’s a vast, sophisticated rule-matching engine. It will be of great help for companies who want to get rid of customer service representatives, and people who want to try to convince other people online that their opinions are more widely shared than they really are. But it’s a tool, not a conscious being.

                1. 1

I’m not disagreeing with that position. The only point I’m trying to make is that the examples he gives are bunk, because it can, in fact, distinguish them as being nonsensical questions if you ask it nicely.

                  1. 1

                    You’re assuming that Hofstadter didn’t know that you can use natural language to tell the algorithm to reply with a signifier that it cannot find a good match in the corpus for the question, rather than replying with nonsense, Eliza-like, for such situations.

                    Maybe he didn’t, maybe he did, and didn’t want to waste valuable Economist screen space with qualifications. His point still stands.

      1. 13

        This is a nice advertisement for deriving standard methods like comparisons, in languages that support that.

        1. 3

          In generic Go, you can use < on any of the numeric types, but it’s not defined on structs, so it wouldn’t be defined for the example.

        1. 3

          But don’t you have to come up with a seed number … at random?

          There ought to be a way to generate a seed by observing your environment … maybe adding up all the digits you can see at that moment, or counting the number of something…

          1. 5

            Well, given how bad humans are at generating sequences of random numbers, perhaps even arbitrarily picking one seed is a lot better than arbitrarily picking 59 numbers without an algorithm. So you’re right that some human choice is required, but hopefully its impact is diminished.

            maybe adding up all the digits you can see at that moment, or counting the number of something…

            Counting the number of doors and windows, perhaps? Or just checking the time?

            1. 2

              If you have a method (or multiple methods) of generating random numbers that you don’t quite trust, you can add them up modulo the base number. This will take numbers sampled from biased distributions, and produce a much more uniform distribution.
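
A quick sketch of that trick (my illustration, reusing the 59 from the lottery-style example above): sum the draws modulo the base. If even one source is uniform and independent of the rest, the result is uniform, and mixing several biased sources still flattens the bias considerably.

fn combine(draws: &[u32], base: u32) -> u32 {
    // Fold the draws together, reducing modulo `base` at each step.
    draws.iter().fold(0, |acc, &d| (acc + d) % base)
}

fn main() {
    let doors_and_windows = 17; // "count the number of something"
    let minutes_past_hour = 43; // "or just check the time"
    println!("{}", combine(&[doors_and_windows, minutes_past_hour], 59));
}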

          1. 6

Good observation. I like to generalize this to NVI (non-virtual interface) – try not to expose traits as the interface for the user. Wrap traits in concrete types and base interfaces on those types. Traits are great as an extension point, but are rather unergonomic as a primary interface.

Another observation along similar lines. Let’s say you have a network and a database crate, and you want to plug one into the other, without a direct dependency. Say we want the net to use the db. We could put trait Db in the db, but then net would have to depend on db, even if it uses only the interface. We could put trait Db in the net, but then, to write impl net::Db for db::MyDb, we’d have db depend on net due to coherence. So the obvious solution is to pull trait Db into some third crate, core_interfaces, and make both db and net depend on that. This kind of works, but doesn’t scale – core_interfaces would soon include a laundry list of poorly defined interfaces.

A more verbose solution, which I nonetheless think works better for large projects, is to define trait Db in the net and, instead of implementing it directly for db::MyDb, write the implementation for a new-type wrapper in the leaf binary crate which ties net and db together. This leads to better factoring: if you have not only the net but also, say, email, there can be separate trait net::Db and trait email::Db, each specifying exactly what it needs.
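
A compilable sketch of that last arrangement, with modules standing in for the crates (all names made up):

mod net {
    // The interface net needs, declared where it is consumed.
    pub trait Db {
        fn get(&self, key: &str) -> Option<String>;
    }

    pub fn serve(db: &impl Db) {
        let _ = db.get("session");
    }
}

mod db {
    pub struct MyDb;

    impl MyDb {
        pub fn get(&self, _key: &str) -> Option<String> {
            None
        }
    }
}

// The leaf binary ties the two together. The new-type wrapper satisfies
// coherence: both the trait (net::Db) and the type (DbForNet) are local here,
// and neither net nor db depends on the other.
struct DbForNet(db::MyDb);

impl net::Db for DbForNet {
    fn get(&self, key: &str) -> Option<String> {
        self.0.get(key)
    }
}

fn main() {
    net::serve(&DbForNet(db::MyDb));
}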

            1. 3

              Are there recommended open-source projects to look at that use this pattern?

              I’d love more blog posts on patterns like this.

              1. 1

I like to generalize this to NVI (non-virtual interface) – try not to expose traits as the interface for the user. Wrap traits in concrete types and base interfaces on those types. Traits are great as an extension point, but are rather unergonomic as a primary interface.

                So the article is talking about the specific situation in which the interface to the trait is considered private, so it doesn’t make sense for the user to implement the trait themselves. In that case, yes, you certainly shouldn’t expose that trait to the user. It also means that the trait is closed instead of open – it cannot be extended outside of the library. Thus you could use an enum instead of a trait (and you might want to wrap the enum in a newtype for similar reasons).

                With your NVI advice here, are you referring to the same situation — don’t expose traits as the interface to the user when the trait is meant to be private and there’s actually only a finite set of implementations the user should be using that you have already provided them? If so, I agree; if not could you say more?

                For example, I’m writing a simple TUI that can render nested panes including colored output to a “screen”. The “screen” might just be a string (in which case colors will get ignored), or it might be a wrapper around printing to a terminal screen, or it might be a variety of other things, and I want the user to be able to write their own. So the interface is something like:

fn pane_print<S: Screen>(doc: Document, screen: S) -> Result<(), S::Error>
                

                Are you suggesting I not do this, and if so what’s the alternative?

                EDIT: maybe this is just a matter of not knowing what you mean by a primary interface vs extension point?

                1. 2

                  (I really should blog about this one day)

                  Yours is a good example! Consider

                  pub trait Screen {
                      fn put_char(&mut self, ch: char);
                      fn put_str(&mut self, s: &str) {
                          s.chars().for_each(|ch| self.put_char(ch))
                      }
                  }
                  
                  fn pane_print<S: Screen>(doc: Document, screen: S) { .. }
                  

I argue that this single trait embodies two interfaces. One is the interface for the implementors of screens, who need to implement put_char. Another is the interface for users of screens, who call put_str. What I am saying is that:

                  • It’s important to internalize the notion of two-sided abstractions, of implementor and user interface.
                  • Sometimes, especially when programming in the large, it is useful to make the separation between the two explicit.

                  In the above example, the separation would look like this:

                  pub trait ScreenBackend {
                      fn put_char(&mut self, ch: char);
                  }
                  
                  pub struct Screen {
                      backend: Box<dyn ScreenBackend>,
                  }
                  
                  impl Screen {
pub fn new<B: ScreenBackend + 'static>(backend: B) -> Screen {
                          Screen {
                              backend: Box::new(backend),
                          }
                      }
                  
                      pub fn put_char(&mut self, ch: char) {
                          self.backend.put_char(ch);
                      }
                      pub fn put_str(&mut self, s: &str) {
                          s.chars().for_each(|ch| self.put_char(ch))
                      }
                  }
                  
                  fn pane_print(doc: Document, screen: Screen) { .. }
                  

The ScreenBackend is the extension point, it’s the interface for implementors. The Screen is the API, it’s the interface for the users. Note, in particular, how put_str now is non-overridable.

                  I am not saying that this pattern is always a good idea (just look at the verbosity!). But it often is, and it’s not very intuitive. The main benefits here are:

                  • Often, the best interface for the user and the best interface for the implementor are similar, but slightly different. If you don’t clearly separate them out, they might become entangled.
                  • For the user, it’s much better to deal with a concrete type, rather than with a high-order type parameter. Also, compile times!
                  • Explicit separation allows for easier changes: the Box can be replaced with an Arc or enum without affecting call sites.
                  1. 1

                    Ah, I agree 100% with this pattern! Screen is a better interface for users, and ScreenBackend better for implementers.

There’s another pattern that achieves similar goals, where you have a huge trait with only a small number of required methods, like std::iter::Iterator. Though that feels… leakier? And I feel like any given situation calls for one or the other, though I couldn’t give you a rule for which.
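
For concreteness, that shape in miniature (my sketch, not std’s):

trait Counter {
    // The one required method, analogous to Iterator::next.
    fn next_count(&mut self) -> u32;

    // A provided method built on the required one. Unlike put_str in the
    // wrapper version above, an implementor can still override this, which
    // is exactly the leakier part.
    fn next_even(&mut self) -> u32 {
        loop {
            let n = self.next_count();
            if n % 2 == 0 {
                return n;
            }
        }
    }
}

struct Upward(u32);

impl Counter for Upward {
    fn next_count(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

fn main() {
    let mut c = Upward(0);
    assert_eq!(c.next_even(), 2);
}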

                    Amusingly, I think your pattern is not a good fit for my code because of the way that Screen happens to be used. I’d like to use it as a case study for when your suggested pattern is essential, and when it’s not. The differentiating factors seem to be that in my case (i) there is exactly one user of Screen, which is the pretty printer itself, and (ii) there is no crate boundary between the definition of the Screen trait, and its use site.

So my pretty printer is the only user of screens. It does put a wrapper called Pane around Screen, for its own personal use, with a type signature like struct Pane<S: Screen>. This achieves or avoids your goals:

                    • The interfaces for the implementer (Screen, supplied by the person using the pretty printing library) and for the user (which is the wrapper type Pane<S>, used by the pretty printer itself) are different.
                    • You mean that it’s easier for the end user to deal with a concrete type. Agreed. Not relevant in my exact case because the user of the Screen is not the end user.
                    • “Explicit separation allows for easier changes” – the type variable S: Screen has leaked into my wrapper type Pane<S>. But it’s all internal with a single use site, so it would be easy to change anyways.

Crate boundaries (or maybe package boundaries) feel very important for this, because you can refactor atomically within a crate, but not across a crate boundary. And also because you don’t know the arity of things that are happening across the boundary.

Within a crate, if you have a trait T you can count all of the implementations A, B, and C, and replace the trait with an enum: enum T { A, B, C }. Across a package boundary, you can’t, because you don’t know how many implementers there are. Across a crate but not package boundary is more grey. In a sufficiently large project, you probably want to split your package into crates, and treat crate boundaries as if they were package boundaries and go for as clean a separation as you can manage.
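
Sketching that refactor with made-up names:

// Before: an open trait with known implementors A, B, and C:
//     trait Speak { fn speak(&self) -> &'static str; }
// After: close it into an enum, and the trait method becomes a match:
enum Speaker {
    A,
    B,
    C,
}

impl Speaker {
    fn speak(&self) -> &'static str {
        match self {
            Speaker::A => "a",
            Speaker::B => "b",
            Speaker::C => "c",
        }
    }
}

fn main() {
    println!("{}", Speaker::B.speak());
}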

                    Perhaps the rules are something like:

                    • If you require a user outside of your package to implement a trait, have them implement the smallest thing possible. (This is what you’re calling the implementer of the trait.)
                    • If you provide a user outside of your package with a thing that implements a trait, don’t expose the trait to them. Wrap it instead. (This is what you’re calling the user of the trait.)
                    • If you have a large project, consider enforcing these sorts of boundaries internally too.

                    Anyhow, that was kind of rambly. I do hope you write that blog post!

              1. 3

                Are articles on society or politics no longer off-topic?

                1. 6

                  Sshh, don’t poke the bear. We’ve been tiptoeing around the bear so that we could read and discuss this primarily technical article about a case study in the resilience of the internet.

                  1. 2

                    Perhaps the trick is to explore the solutions but avoid the problems.

                  2. 4

It’s a technical article about how RIPE observes the network during a “human made disruption” in the area, why that is the case, and how you can compare this with your country. (It’s more or less the bus factor of ISPs.) And I’m actually impressed by how well it holds up.

                  1. 33

                    More broadly, Rust’s complexity makes Rust harder to learn, which is an unnecessary burden placed on those learning it.

                    My experience learning Rust has near-universally been positive in this regard. Most complexity exists for a reason, and in the end I’m happy it’s there because it makes the Rust experience dramatically better than Go. Of course Rust does have some things that are just awkward and obnoxious for little benefit, but many of those have gone away as Rust continues to mature. Picking up Rust 5 years ago was much harder than it is today.

                    If in 2050, programmers are sitting around waiting for their Rust code to compile just to be told that we failed to sacrifice enough to appease the borrow-checker gods,

                    cargo check?

                    1. 19

                      Most complexity exists for a reason

                      It is extremely comforting to assume this, but I believe most complexity exists because someone couldn’t (for time, brains or money) make it less complex.

                      If you have the time, it is usually worth thinking about how to reduce complexity, instead of accepting it and assuming it “exists for a reason”, because if you can find a way to reduce complexity, you will free your brains and money to do work on other things.

                      I am aware there are people who just like being busy and like working hard. Those people love complexity, and there’s little you can do about that except not believe them and talk about other things besides programming.

And listen, I don’t mean that this is necessarily the case with rust[1], only that you (and others) shouldn’t be so accepting of complexity thrust upon you because there are things you can do about it.

                      [1]: …although I do happen to think so, I’m also aware this is just my opinion. If rust makes you happy, be happy!

                      1. 21

                        only that you (and others) shouldn’t be so accepting of complexity thrust upon you because there are things you can do about it.

                        I think it’s pretty clear I was talking about complexity in Rust, and I resent the implication that I blindly accept complexity without any sort of critical thinking.

                        Rust the language does have a lot of up-front complexity. But that complexity comes with up-front safety and correctness. It’s compiler errors vs core dumps, it’s borrow checking vs pointer aliasing bugs.

I once spent 2 weeks chasing a pointer aliasing bug. After poring over thousands of lines of code, customer-provided core dumps, and live debugging sessions once I could reproduce the issue, I finally noticed that two pointers had the same hex value when they shouldn’t have. It never caused a crash, never corrupted data, only reduced throughput in a certain edge case. And it would have been a compiler error in Rust.

                        Unsafe languages aren’t actually less complex, they just push the complexity down the road.

                        1. 7

                          Unsafe languages aren’t actually less complex, they just push the complexity down the road.

                          I don’t think I can agree that all things are equally complex. I hope this is not what you mean, but if it is not what you mean, I cannot understand why you would say this.

We have to choose where we want our complexity, for sure, and like entropy it cannot be avoided completely if we’re to get any work done, but some things are more complex than others, for all kinds of reasons.

                          I think it’s pretty clear I was talking about complexity in Rust, and I resent the implication that I blindly accept complexity without any sort of critical thinking.

I am very sorry that I have offended you in any way, however if you think that you have never accepted complexity in your life without (as you say) “any sort of” critical thinking, then you should probably thank whoever changed your diapers when you were a child. We all do it, and there’s nothing wrong with it sometimes.

I just think it’s something we programmers should watch out for – like our own invisible biases, or getting value alignment before asking someone to review our work. If we think we don’t do it – or we get offended at the implication that we’ve done it – we can be blind to something that, had we discovered it in other circumstances (or, as they say, with different priors), we would find very important.

                          It is awful hard for me to convince myself that rust represents the best we can do at anything in particular, only perhaps the best we have done so far at some things, but so what? I feel that way about a lot of languages. If rust makes you happy, I think you should be happy as well!

                          1. 8

                            PLT person here. Outside of async (which I haven’t used much but hear is very complex), Rust has very little accidental complexity. If you make a language with borrow checking, ADTs, traits (interfaces), and emphasis on zero-cost abstractions, it will necessarily have most of the complexity of Rust.

                            1. 2

                              Yes and no. There’s a bunch of choices in Rust to make things implicit, which could have been explicit. This somewhat reduces the syntax one needs to learn at the cost of making it much harder to understand the text of any given program.

                              1. 3

                                That’s funny, I think of Rust as being an unusually explicit language. Which things are you thinking of? My list would be:

                                • Automatic dereferencing and automatic ‘ref’ insertion in pattern matching and automatic ‘.drop()’ insertion. Though the language would get very verbose without these.
                                • A couple auto-derived traits. Though again, it might get really old really fast if you had to write ‘#[derive(Sized)]’ for all of your types.
                                • Functions that return a type not dependent on its argument, such as .into() and .parse().
                                • Operator overloading
                                • There are a couple “magic” fat-pointer types like &str

                                Things that are explicit in Rust but often implicit in other languages:

                                • No implicit type coercions like in C or JS
                                • VTables are marked with ‘dyn’
                                • Semantically distinct values have distinct types more often than in most languages. E.g. String vs. OSString vs. PathBuf.
                                • The “default” import style is to name all of your imports instead of using globs.
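
A few points from both lists as one-liners (my examples, not the commenter’s):

fn main() {
    // .into() and .parse(): the return type comes from the annotation,
    // not from the argument.
    let n: u64 = 5u32.into();
    let x: i32 = "42".parse().unwrap();

    // The vtable is marked explicitly with `dyn` at the use site.
    let d: &dyn std::fmt::Display = &n;

    println!("{n} {x} {d}");
}
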
                        2. 8

                          It is extremely comforting to assume this, but I believe most complexity exists because someone couldn’t (for time, brains or money) make it less complex.

                          I think this is often true. But also, a ton of complexity totally has reasons for existing. It’s just that the linear benefit of simplifying one use case is generally outweighed by the exponential downside of combinatorial complexity of the language.

                          Go didn’t manage complexity by coming up with a smarter simpler language design - they managed it by giving things up and saying no. The end result is a lot of quality-of-life things are missing in the small, but the standard of living in a large Go code base is, on average, better.

                          1. 5

couldn’t (for time, brains or money) make it less complex

shouldn’t be so accepting of complexity thrust upon you because there are things you can do about it

                            I think we can all hope that future rust or a successor may come up with a language that provides the same flexibility, performance and safety without the possible* hassle you have when learning rust. But currently there is none (without a required GC, with native compilation, with thread safety, with C-interop that doesn’t suck, with [upcoming] embedded support, with async, with …).

*Also, I think that many people simply underestimate how much they’re used to the C/C++/Java style of programming, where you either have a GC, have OO, or can simply write the code assuming “the compiler will get it” (speaking of stuff the borrow checker currently can’t prove, and thus won’t accept). Or it wasn’t actually defined behavior in the first place - something I’ve seen from people trying to port their wacky C code over to Rust, only to find out they’d simply been leaking memory. Which was fine for the one-shot CLI, but was never actually defined or good behavior. On the other side, I’ve had students who learned programming the Racket way and then stumbled upon a myriad of complexity when learning Java (this vs this@.., static, field init vs constructor init and the order of them, ArrayList vs array vs List, primitives vs Objects vs autoboxing).

I think it’s a little dismissive to say others didn’t make it as good as they could have; it may be true, but it’s harsh to assume they didn’t try, and that people are just worshiping the result.

                            1. 3

                              But currently there is none (without a required GC, with native compilation, with thread safety, with C-interop that doesn’t suck, with [upcoming] embedded support, with async, with …).

                              Quite possible indeed. Zig maybe. Zeta-C. D. I think some of those things are pretty subjective, so maybe others? But in any event, not everybody needs all of those things, and even when people do, they don’t necessarily need them all at once. I think it’s important to keep that in mind.

                              I think it’s a little dismissive to say others didn’t make it as good as they could have, it may be true, but it’s harsh to assume they didn’t try and people are worshiping that.

                              I think they did make it as good as they could have (with the time, money and brains that they had), only that some of those constraints aren’t anything “reasoning” can do anything about at all. Where do you think I said the opposite?

                              1. 2

                                they don’t necessarily need them all at once

That’s maybe true. But on the flip side, people program everything in C++ (embedded, kernels, web browsers, servers, games), everything in JS (embedded, Electron, WebGL, CLI, ..), everything in Java (embedded processors, DVDs, Android, and the whole CRUD stack), and everything in C. So if you can get rust high-level enough for (web) applications, and opt-in low-level enough for embedded (core/alloc) - why not? I don’t think there are actually quirks in rust that originate from the broad range of supported use cases. Obviously you can always make a better DSL for specific use cases, in the way Ruby on Rails is nothing more than a DSL for this. (And then go back to C interfaces when you actually need ML/CompSci/..)

                                Regarding “I said the opposite”: I think it’s a combination of these lines

I believe most complexity exists because someone couldn’t (for time, brains or money) make it less complex

you (and others) shouldn’t be so accepting of complexity thrust upon you because there are things you can do about it

Though I just wanted to push back on the idea that “I” can just do something about it, or that it’s only because people were lazy, because that’s one way I can read the above sentence.

                                1. 1

But on the flip side, people program everything in …

                                  I can’t really speak for people who think learning a new language for a specific domain is beyond the brains, money and time they have; I mean, I get there’s an appeal of having one language that can do everything, but there’s also an appeal in having many languages which each do certain things best.

                                  I don’t really know which is better overall, but (more to the point) I don’t think anyone does, so I urge vigilance.

I just wanted to push back on the idea that “I” can just do something about it, or that it’s only because people were lazy, because that’s one way I can read the above sentence.

I hope you can read it another way now; of course there is something you can do about it. You can do whatever you want!

                            2. 3

                              It is extremely comforting to assume this, but I believe most complexity exists because someone couldn’t (for time, brains or money) make it less complex.

                              I think coloured functions are a perfect example of this phenomenon. For the sake of my argument I’ll assume that everyone agrees with the premise of this article.

                              If we compare two approaches to asynchronous code, Rust’s traditional async/await, and Zig’s “colorblind async/await”, I would argue that Andrew Kelley applied the effort to make Zig’s approach less complex, while Rust did not.

                              1. 2

                                Honestly I think zig’s async/await is a bit of a cheat (in a good way) because it’s not actually async but doing a different, low level thing that can map to “some concept of async”. It’s low level, so you can ab-use it for something else (but please don’t).

                                The nice thing about it is that you can really get a solid mental model of what the CPU is doing under the hood.

                                1. 1

Yes. And that low level thing is a theory of our software science, and a compelling one, because it gives you different ideas than the other; our languages “look” different and can produce many of the same results, but some kinds of changes will simply be easier to make in one language than in another.

                                  Software can be so beautiful!

                              2. 2

This just depends on your priors. Perhaps in general it’s a good assumption that reasonable work will eliminate complexity, but in many situations (for instance, almost any situation where people are routinely paying the cost of extant complexity) it may make sense to adopt a prior which leans more towards “this complexity exists for a reason.”

Rust would be a great example of the latter case. It’s a programming language that was intentionally designed to reduce the complexity of programming in domains where C/C++ are used. Since both C and C++ are very complex, it’s reasonable to assume that design process was complexity-focused. And since Rust was designed in a commercial context, it’s also reasonable to assume that some skin was in the game.

These facts suggest that it’s more reasonable to accept that the complexity in Rust is there for a reason.

                                In my career, when I have discovered some apparently complex artifact I’ve almost always found that there were compelling (sometimes non-technical, but nevertheless real) reasons that complexity hadn’t been eliminated.

                                1. 2

                                  In my career, when I have discovered some apparently complex artifact I’ve almost always found that there were compelling (sometimes non-technical, but nevertheless real) reasons that complexity hadn’t been eliminated.

                                  I think this manages to confuse what I mean by trying to distinguish between “complexity exists for a reason” and “complexity is usually an estimated trade-off between {brains,money,time}” – because for sure the real limitations of our minds and wallets and health are as good a “reason” as any, but I don’t think it’s useful to think about things like this because it suggests thinking of complexity as inevitable.

                                  These facts suggest that its more reasonable to accept that the complexity in Rust is there for a reason.

                                  They don’t though. These “facts” are really just your opinions, and many of these opinions are opinions about other peoples’ opinions. They’re fine opinions, and I think if rust makes you happy you should be happy.

                                  But I for example, don’t think C is very complex.

                                  1. 4

                                    C is exceptionally complex, in a couple of different ways*. It is one of the most complex programming languages that has ever existed. I am baffled whenever I see someone claim it is not very complex. Rust is massively less complex than C.

*: The most relevant one being that in terms of memory management you have actually even more difficulty and complexity in C than the Rust borrow checker, as you have to solve the issues the borrow checker solves but without a borrow checker to do it, and also with much weaker guarantees about the behavior of e.g. pointers and memory.

                                    1. 2

                                      That’s fantastic you have a different opinion than me.

                                      1. 1

                                        you have actually even more difficulty and complexity in C than the Rust borrow checker, as you have to solve the issues the borrow checker solves but without a borrow checker to do it

                                        While it is arguably true that you have to solve the same issues with or without a borrow checker (and I could argue with that, but won’t), it’s definitely not true that the borrow-checker is a magical tool that helps you to solve problems without creating any of its own. You had a memory management problem. Now you have both a memory management problem and a proof burden. It may be that the proof framework (burden, tooling, and language for expressing the proof) helps you in solving the memory management problem, but to pretend that there is no tradeoff is unreasonable.

                              1. 10

                                Looking back at the original sources, it seems like the meaning of the Robustness Principle changed over time.

Postel’s original quote, in the Internet Protocol spec in 1979: https://www.postel.org/ien/txt/ien111.txt

                                The implementation of a protocol must be robust. Each implementation must expect to interoperate with others created by different individuals. While the goal of this specification is to be explicit about the protocol there is the possibility of differing interpretations. In general, an implementation should be conservative in its sending behavior, and liberal in its receiving behavior. That is, it should be careful to send well-formed datagrams, but should accept any datagram that it can interpret (e.g., not object to technical errors where the meaning is still clear).

                                I read this as “by all means, throw errors if someone sends you something that unambiguously violates the spec. But where the spec is ambiguous (when there is the possibility of ‘differing interpretations’), then you should be liberal in what you accept”.

                                Then later in 1989, “Requirements for Internet Hosts – Communication Layers” (RFC 1122): https://datatracker.ietf.org/doc/html/rfc1122

                                “Be liberal in what you accept, and conservative in what you send”

                                Software should be written to deal with every conceivable error, no matter how unlikely; sooner or later a packet will come in with that particular combination of errors and attributes, and unless the software is prepared, chaos can ensue. In general, it is best to assume that the network is filled with malevolent entities that will send in packets designed to have the worst possible effect. This assumption will lead to suitable protective design, although the most serious problems in the Internet have been caused by unenvisaged mechanisms triggered by low-probability events; mere human malice would never have taken so devious a course!

                                Adaptability to change must be designed into all levels of Internet host software. As a simple example, consider a protocol specification that contains an enumeration of values for a particular header field – e.g., a type field, a port number, or an error code; this enumeration must be assumed to be incomplete. Thus, if a protocol specification defines four possible error codes, the software must not break when a fifth code shows up. An undefined code might be logged (see below), but it must not cause a failure.

Note the examples. The network is filled with malevolent entities, so you should deal with “every conceivable error”. Presumably by handling the error appropriately without crashing, not by ignoring it and moving on. And you must be prepared to deal with predictable change, like open enums. Again, this doesn’t read to me as saying that if someone unambiguously violates the spec (in a way that isn’t conceivably due to using a later version of the spec than you were built for), you should not error.
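
The open-enum requirement from RFC 1122 is easy to picture in code. A sketch (mine, not the RFC’s):

#[derive(Debug)]
enum ErrorCode {
    BadRequest,
    Timeout,
    Unreachable,
    Refused,
    Unknown(u8), // codes added by later versions of the spec land here
}

fn parse_code(raw: u8) -> ErrorCode {
    match raw {
        1 => ErrorCode::BadRequest,
        2 => ErrorCode::Timeout,
        3 => ErrorCode::Unreachable,
        4 => ErrorCode::Refused,
        // "the software must not break when a fifth code shows up":
        other => ErrorCode::Unknown(other),
    }
}

fn main() {
    println!("{:?}", parse_code(7)); // logged, not fatal
}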

                                And finally, in “Architectural Principles of the Internet” in 1996 (RFC 1958): https://www.rfc-editor.org/rfc/rfc1958

                                3.9 Be strict when sending and tolerant when receiving. Implementations must follow specifications precisely when sending to the network, and tolerate faulty input from the network. When in doubt, discard faulty input silently, without returning an error message unless this is required by the specification.

                                Pretty unambiguously, “you should accept garbage”.

                                1. 16

It kinda adds to the point that the post itself doesn’t highlight why this may be unsound and why MaybeUninit is needed. The problem is that given a value of a type that expresses constraints on its representation - say, the NonNull pointer - it is absolutely crucial that none of the forbidden representations are ever accidentally expressed, at any point in time. So let n: NonNull<T> = ... expresses that the right hand side is never NULL. mem::zeroed or uninitialized memory may break that contract. The compiler relies on this for optimisations, for example for optimising Option<NonNull<T>>. The reason why I said may above is that for some types (say u32), where all memory patterns are legal representations, this doesn’t matter and everything is actually fine.

It boils down to the difference that in Rust, expressing “A is NonNull” while it isn’t is immediate UB; in C, it is UB on use.
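
Both halves of that, sketched (the assert is guaranteed by the niche optimisation mentioned above):

use std::mem::{size_of, MaybeUninit};
use std::ptr::NonNull;

fn main() {
    // None reuses the all-zeroes pattern that NonNull promises never to
    // hold, so the Option adds no tag byte.
    assert_eq!(size_of::<Option<NonNull<u8>>>(), size_of::<NonNull<u8>>());

    // Holding a not-yet-valid NonNull safely: MaybeUninit asserts no
    // invariant about its contents.
    let _slot: MaybeUninit<NonNull<u8>> = MaybeUninit::uninit();

    // By contrast, this would be immediate UB - a "never null" value that is null:
    // let bad: NonNull<u8> = unsafe { std::mem::zeroed() };
}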

One isn’t necessarily that much harder than the other - the base rule is easy, conceptually. But Rust up until now has not done a stellar job of teaching people the structure of unsafe, because it currently mostly focuses on teaching people how to use safe Rust proper.

                                  1. 3

                                    (I don’t mean this in an accusing way) I think that this is a bit having the cake and eating it too about Rust’s whole narrative on unsafe.

                                    There’s an acknowledgement that Rust without unsafe is restrictive (see the whole “can’t write linked lists” thing). So what do people say? They say “you can use unsafe to get around this!”

I think the blog’s example (not being able to fully initialize an object at its definition site) is an extremely common code organization thing in basically any language, with its pitfalls etc. But in the hypothetical where you have a case that you have “proven to be right” or whatever, you could in a theoretical world say “OK, make space in memory for this object. I’ll fill it out over several statements, for code cleanliness or because I need this weird circular reference or whatever.”

                                    Of course you can probably write that stuff, but the post makes a real convincing argument that you have to negotiate with the tooling to get it to do that. So you’re futzing around with all of this to prove this guarantee that you in your case might not even need.

                                    I imagine that having rules apply even in unsafe makes a lot of stuff a lot easier, and there’s I bet some foundational reasoning here, but I feel like this requires you to kinda restructure your code even if you know the theoretical machine code the compiler could generate to do what you want safely.

                                    Kinda aligns with the general “Rust is harder than C”, which… I feel is noncontroversial. I never need unsafe for the kinds of stuff I write, but now I’m going to be even more careful cuz it seems like it might be real easy to mess up.

                                    (EDIT: “prove your code to be right” is of course a bit wrong in Rust, because of your point about Rust’s rules about initialized memory. More of a conceptual idea based on how you would imagine your code to be as machine code…)

                                    1. 1

                                      Yes, uninitialized memory is an extremely common situation, but Rust does have an extremely simple solution: use Option. MaybeUninit and unsafe are there for cases when Option isn’t good enough, but you can program in Rust for years without encountering such cases.
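
A minimal sketch of the Option approach (the type and address are made up):

use std::net::TcpStream;

struct Connection {
    addr: String,
    stream: Option<TcpStream>, // not connected yet: safely represented as None
}

fn main() -> std::io::Result<()> {
    let mut conn = Connection {
        addr: "localhost:8080".to_string(),
        stream: None, // "partially initialized", with no unsafe in sight
    };
    // ... later, once the remaining information is available:
    conn.stream = Some(TcpStream::connect(&conn.addr)?);
    Ok(())
}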

                                      1. 4

As a data point, I’ve been programming in Rust for 7ish years, including 2 professionally, and I’ve never wanted to partially initialize an object, nor noticed anyone else wanting that. If you don’t have all the information required to construct a Whichamahoosit, why would you want an invalid half of a Whichamahoosit? Maybe I just always solve this by using multiple pieces of data, or using the builder pattern, and in C people are used to doing things a different way?

                                        1. 1

I feel like initialization and “tying the knot” between related data structures after some bookkeeping is an extremely common code organization concern in every programming language. Fortunately we tend to have good patterns for dealing with it, and I think that Rust offers good data structures for many scenarios that help to avoid exposing this issue too much.

                                          Option kinda sucks because you now have this idea that you need to handle partially-init’d data structures everywhere. Builder pattern definitely feels like the cleaner solution in general, it’s sometimes futzy but that’s fine, and you can encapsulate any trickiness with good API design.
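
A minimal builder sketch (hypothetical Config): the half-built state is confined to the builder, and build only hands back a fully formed value:

struct Config {
    host: String,
    port: u16,
}

#[derive(Default)]
struct ConfigBuilder {
    host: Option<String>,
    port: Option<u16>,
}

impl ConfigBuilder {
    fn host(mut self, h: &str) -> Self {
        self.host = Some(h.to_string());
        self
    }

    fn port(mut self, p: u16) -> Self {
        self.port = Some(p);
        self
    }

    // None if anything is missing: no invalid Config can escape.
    fn build(self) -> Option<Config> {
        Some(Config {
            host: self.host?,
            port: self.port?,
        })
    }
}

fn main() {
    let cfg = ConfigBuilder::default().host("example.com").port(443).build();
    if let Some(c) = cfg {
        println!("{}:{}", c.host, c.port);
    }
}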

                                  1. 6

                                    Having really weak string support like this really kills a language’s ergonomics for me. Rust also blew it in this regard. I understand why for both Zig and Rust but it’s still disappointing.

                                    1. 8

                                      I can understand problems with Zig, but Rust? What is an issue there?

                                      1. 7

                                        Yeah, Rust string methods are as complete as any other language I know. Presumably kris meant that Rust strings are not ergonomic? Because borrow checker and String vs. &str vs. &mut str vs. whatever str is. (Which is all a necessary consequence of borrow checking and zero-cost abstractions, but that doesn’t make it less confusing to a beginner.)

                                        1. 4

                                          I agree that it is confusing, but strings in general are hard, Rust just exposes that from the day one instead of pretending that everything is ASCII until it is not.

                                      2. 4

                                        What would you like to see in the way of string support in a Zig/Rust-like?

                                        1. 4

An immutable string type is just so convenient that it’s really hard to justify doing string stuff in a language without it.

                                          1. 2

In Rust, immutable and mutable strings would have exactly the same API and would have the same ergonomics. The only difference between the two would be in performance:

                                            operation | “immutable” | “mutable” 
                                            clone     | O(1)        | O(N)
                                            push char | O(log(N))   | O(1)
                                            

                                            http://smallcultfollowing.com/babysteps/blog/2018/02/01/in-rust-ordinary-vectors-are-values/
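
The O(1)-clone row is already expressible with std alone; the O(log N) push needs a rope or another persistent structure, as the linked post describes. A sketch:

use std::sync::Arc;

fn main() {
    // Clone in O(1): both handles share one immutable buffer.
    let a: Arc<str> = Arc::from("hello");
    let b = Arc::clone(&a);

    // "Pushing" onto an immutable string rebuilds it: O(N) here,
    // O(log N) with a tree-based representation.
    let c: Arc<str> = Arc::from(format!("{a}!"));

    println!("{b} {c}"); // hello hello!
}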

                                      1. 3

                                        Caveat: I don’t have a lot of experience with C (or C++).

                                        I had the impression from reading various internet discussions that null-terminated strings were considered a mistake. After some searching, I found multiple impassioned defenses of them on Stack Overflow. This gives me more context and understanding for why null terminated strings were chosen for C, but doesn’t provide any reason why almost no languages since uses them.

                                        What is Zig’s rationale for using null terminated strings?

                                        1. 4

Zig has support for both null-terminated and non-null-terminated strings. The []const u8 type, which is the convention for strings, is not null-terminated. The default type for a string literal is *const [N:0]u8. This can then coerce into a []const u8, which is a slice. Null-terminated strings are useful for C interop, but slices are very useful also.

                                          1. 3

                                            As someone who only knows a little of Zig, my guess is that the decision is a consequence of Zig’s origin. Zig is meant to be a better C. C uses null-terminated strings and (nearly) every C library does. Therefore, supporting them in an essential way seems hard to get away from.

                                            1. 3

                                              EDIT: looks like g-w1 actually knows: Zig has both kinds of strings, and the null-terminated ones are for C interop.

                                            2. 2

Relying on the null terminator causes problems because calculating lengths (and doing bounds checks on random access) is O(n). C used null terminators because space was very constrained. A length field the same size as a null byte (as Pascal used) limited strings to 255 characters, which caused a lot of problems. If you have a 32-bit or 64-bit size field, you’re typically not losing much (especially if you do short-string optimisation and reserve a bit to indicate whether the short string is embedded in the space used by the size and pointer).

                                              In contrast, having the null terminator can make C interop easier because you don’t need to copy strings to convert them to C strings. How much this matters depends a lot on your use case. Having the null terminator can cause a lot of problems if you have one inconsistently. For example:

                                              $ cat str.cc
                                              #include <string>
                                              #include <cstring>
                                              #include <iostream>
                                              
                                              int main()
                                              {
                                                      std::string hello = "hello";
                                                      auto hello_null = hello;
                                                      hello_null += '\0';
                                                      std::cout << hello << " == " << hello_null << " = " << (hello == hello_null) << std::endl;
                                                      std::cout << "strlen(" << hello << ".c_str()) == " << strlen(hello.c_str()) << std::endl;
                                                      std::cout << "strlen(" << hello_null << ".c_str()) == " << strlen(hello_null.c_str()) << std::endl;
                                              }
                                              $ c++ str.cc && ./a.out
                                              hello == hello = 0
                                              strlen(hello.c_str()) == 5
                                              strlen(hello.c_str()) == 5
                                              

                                              Converting a C++ standard string to a C string implicitly strips the null terminator (it’s there, you just can’t see it), which means that strlen(x.c_str()) and x.size() will be inconsistent.

The biggest mistake that a string library can make is coupling the string interface to a string representation. A contiguous array of bytes containing a UTF-8 encoding is fine for a lot of uses of immutable strings, but what happens if you want to iterate over grapheme clusters (or even unicode code points)? If you do this multiple times for the same string then you can do it much more efficiently if you cache the boundaries with the string.

For mutable strings, there are a lot more problems. Consider adding a character to the middle of a string with the contiguous-array representation. It’s an O(n) operation in the length of the string, because you have to reallocate and copy everything. With a model that over-allocates the buffer, it’s O(n) in the length of the tail of the string, with periodic O(n) copies when the buffer is exhausted (amortised to something better depending on the policy). With a twine-like representation, insertion can be cheap but indexing may be more expensive.

The optimal string representation depends hugely on the set of operations that you want to perform. If your string operations aren’t abstracted over the representation then there’s pressure to use a non-optimal representation.

                                              Objective-C did this reasonably well. Strings implement a small set of primitive methods and can implement more efficient specialised versions. The UText interface in ICU is very similar to the Objective-C model, with one important performance improvement. When iterating over characters (actually, UTF-16 code units), implementations of UText have a choice of providing direct access to an internal buffer or to a temporary one. With a twine-like implementation, you can just update the pointer and length in the UText to point to the current segment, whereas with NSString you need to copy the characters to a caller-provided buffer.

                                            1. 2

                                              Neat article that helped give some formal structure to what I’ve been doing. When I write user facing code, I’ve always tried to use my type system to catch risks like this: a username isn’t just a username, it’s a TaintedUsername and you then have to explicitly go through some guarded path to get untainted data. This always fit nicely with the ports and adapters model - your core logic only accept clean things, your interfaces only handle tainted things.

                                              Of course, there were huge limitations here. if the library I grafted in would give you root if the user passed in {#root-shell-please!}, I had to know to look for that string in my cleaning function, to aggressively whitelist only certain inputs, or how to configure my library with less magic. So the unknown unknowns were always scary.

                                              Now I’m going to read the linked thesis and discover this problem was solved decades ago ;)

                                              1. 2

                                                Parse, don’t validate is even closer to what you’re describing. Capabilities are about securing access to “the outside world” (filesystem, network, …) while you’re talking about data in the application. These things are complementary and similar in some ways but not exactly the same thing…

                                                1. 2

Isn’t user-generated content always tainted? What does cleaning it mean? If you put a username into your DB, you’ve protected against SQL injection, but that string is still tainted and unsafe to stick in an HTML attribute, for example.

                                                  1. 2

                                                    If I’m correctly guessing what you’re saying, here’s some advice: don’t think of a username with special characters in it as “tainted”. It’s not tainted. It’s a regular string, and its possible values are all possible strings (possibly with a length limit or whatnot).

However, don’t confuse strings with SQL code snippets or HTML snippets. “'” (single quote) is a valid string, and a valid HTML snippet, but not a valid SQL snippet. “&” is a valid string, and a valid SQL snippet, but not a valid HTML snippet.

                                                    If you write:

                                                    "<p>Hello, <b>" + username + "</b>!</p>"
                                                    

                                                    that’s akin to a type error. The type of HTML should be different than the type of strings. A “better” way to write this would be

HtmlP(HtmlText("Hello, "), HtmlBold(HtmlText(username)), HtmlText("!"))
                                                    

Notice the implied type signature of HtmlText: it takes a string and turns it into HTML. Now this is verbose enough that you can see why it’s common to represent HTML as text. Which is an OK thing to do, as long as you realize it, and realize that when you jam a username into HTML, you need to do the type conversion from string to HTML by escaping special characters in the username.
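
For illustration, here’s roughly what that escaping conversion looks like (a sketch; real code should use a vetted escaping library):

// Order matters: escape '&' first, or you mangle your own escapes.
fn html_text(s: &str) -> String {
    s.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

fn main() {
    let username = "<blink>x&y</blink>";
    println!("<p>Hello, <b>{}</b>!</p>", html_text(username));
}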

                                                  1. 1

                                                    Instead, there is exactly one Network object ever in existence, and it is passed in to the program at one location, perhaps as an argument to main. (Actually, if the operating system was capability-safe and Java was cooperating with it, then main would be given a Network if and only if the executable was given network access.)

                                                    This is actually a huge point. From what I know about capabilities, they require OS support. If a library could just use the Network object without being granted permission, then having the more granular types doesn’t solve anything.

                                                    1. 7

                                                      No, it only requires language design and library support. By “main”, they mean a top-level entrypoint for a program, and this entrypoint happens to be language-specific, not part of the OS API. The language runtime has all of the raw power given by the OS, and must wrap it into objects like the “Network” object discussed in the post. This process is known as taming.

                                                      To give a precise example from my Monte language, a relative of E, here are two entrypoint declarations. The first one has network access; specifically, it has a IPv4 outgoing TCP dialer.

                                                      def main(argv, => makeTCP4ClientEndpoint) as DeepFrozen:
                                                      

                                                      And this second one does not:

                                                      def main(argv) as DeepFrozen:
                                                      

                                                      In practice, this is very useful for quickly estimating a program’s isolation from other programs. For example, the entrypoint for the Monte compiler indicates usage of system timers, the filesystem, and stdio; but not any network access. OS support could make this more robust against bugs in the runtime, but the OS itself also could have bugs in its sandboxing; it’s not necessarily better.

                                                      1. 4

                                                        You can have capabilities in the OS, and capabilities in the language, and they’re complementary. At the language level, you state what capabilities main expects, and control how they flow through the program, which lets you reason about what your dependencies (and their transitive dependencies!) can access, relative to the rest of your program. At the OS level, you control what capabilities each process is given.

                                                        If a library could just use the Network object without being granted permission

                                                        Is the library (log4j / jndi) simply being called (possibly transitively) from the application’s main function, then all you need is capabilities in the language, as my post describes. The result is that as an application author it is extremely visible to you whether log4j might access the network. (Though with the number of people seeming not to get it, I’m worried I might have described it poorly.)
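
                                                        To sketch what that visibility looks like (Rust-flavored, with a made-up Network type and logger functions; a real capability-safe language enforces more than this sketch does):

                                                          // Only the entrypoint is handed a Network; nothing else can mint one.
                                                          struct Network;
                                                          
                                                          mod logging {
                                                              use super::Network;
                                                          
                                                              // Network use is visible in the signature at every call site.
                                                              pub fn network_logger(net: Network) -> impl FnMut(&str) {
                                                                  move |msg| {
                                                                      let _ = (&net, msg); // would send `msg` over `net` here
                                                                  }
                                                              }
                                                          
                                                              // A logger built without a Network provably cannot phone home.
                                                              pub fn stderr_logger() -> impl FnMut(&str) {
                                                                  |msg| eprintln!("{}", msg)
                                                              }
                                                          }
                                                          
                                                          fn main() {
                                                              let net = Network; // pretend the runtime passed this in
                                                              let mut log = logging::network_logger(net);
                                                              log("hello");
                                                          }

                                                        Whether a given logger can touch the network is answerable by reading signatures, without auditing the implementation.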

                                                        Are you (amw-zero) thinking of the case where log4j or jndi is running in a separate process, communicating over IPC? That came up on the orange site, and it makes the story a lot more complicated. In that case I think you might need capabilities in both the language and the OS to get things under control.

                                                        1. 4

                                                          You didn’t describe things poorly; capabilities are a tough concept to understand. I wrote a short story about what it feels like to have this conversation repeatedly.

                                                          1. 1

                                                            Made me smile, thank you.

                                                          2. 3

                                                            If it were written for capabilities, it might always ask for network access (possibly much more than it really needs), because part of log4j’s purpose is for you to be able to write logs to all kinds of places like syslog or Cassandra or many other things. See loads listed here: https://logging.apache.org/log4j/2.x/manual/appenders.html

                                                            I think a hypothetical log4monte with a good design would ask for few/no capabilities in the main body of the library but let you pass in an “Appender” object which asked for the specific capabilities required to talk to just the right backend.
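
                                                            As a rough sketch of that design (Rust-flavored, all names invented):

                                                              // The logging core holds no capabilities of its own; whatever authority
                                                              // is needed lives in the Appender that the caller constructs and passes in.
                                                              trait Appender {
                                                                  fn append(&mut self, line: &str);
                                                              }
                                                              
                                                              struct StderrAppender; // needs only stdio, nothing else
                                                              impl Appender for StderrAppender {
                                                                  fn append(&mut self, line: &str) {
                                                                      eprintln!("{}", line);
                                                                  }
                                                              }
                                                              
                                                              struct Logger {
                                                                  appender: Box<dyn Appender>,
                                                              }
                                                              
                                                              impl Logger {
                                                                  fn log(&mut self, msg: &str) {
                                                                      // Formatting is pure; there is no ambient authority here with
                                                                      // which to look up remote user data during formatting.
                                                                      self.appender.append(&format!("[log] {}", msg));
                                                                  }
                                                              }
                                                              
                                                              fn main() {
                                                                  let mut log = Logger { appender: Box::new(StderrAppender) };
                                                                  log.log("hello");
                                                              }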

                                                            Of course a hypothetical log4monte with a really bad design could ask for massive network capabilities always, and filesystem too, in order to make the formatting feature work like log4j’s and be able to do things like look up user data from remote servers and local files during message formatting.

                                                            Maybe an interesting exercise would be to try making Monte equivalents of popular libraries like this to see if they can be made both ergonomic and feature rich. :)

                                                            (Aside: totally weird open question as to whether you can make them configurable with dependency injection controlled by some XML file that is read and then used to invoke classes through reflection… but actually maybe the ability to do that is an anti-feature.)

                                                            1. 2

                                                              One of the features of E that we didn’t include in Monte, URI expressions, is very similar to the magical invocation of ambient authority in log4j. A URI expression is supposed to be safe and tamed somehow, but I’m not sure that I see how that could possibly be safe enough.

                                                              We also considered something like your “Appender”; the Strategy pattern would make it easy to change logging backends. This would make sense for a fancy logging module, but there’s a couple design issues still. The first one is that we try not to pass around objects with incomplete responsibilities; we don’t actually want something with sufficient power to interpret URI expressions, because such objects can be misused. Compare two ways to attenuate the core of such power:

                                                              def overpowered(x, log) { return log("http", x) }
                                                              
                                                              def attenuate(log):
                                                                  return def HTTPonly(x) { return log("http", x) }
                                                              def attenuated(x, log) { return log(x) }
                                                              # Now change overpowered(x, cap) to attenuated(x, attenuate(cap))
                                                              # And once that's done, we can lambda-lift calls to attenuate()
                                                              # At the very top, the attenuated cap is a Strategy
                                                              

                                                              The second issue is that logging should be discouraged in production. To split it into subissues, a first problem is that we want logging to happen immediately, instead of being queued up as a networking or stdio effect; otherwise, it’s not useful for debugging. The E/Monte builtin println function does exactly this, and prints to stderr immediately and impurely.

                                                              And this leads nicely to the other reason to avoid logging in production: there are better ways to debug at scale, so debugging with println should not be deployed widely. Instead, I recommend metrics and monitoring agents. Monte does have a literate module implementing a Prometheus client, and I’ve linked to a relevant header. The .processMetrics/1 method takes a very powerful capability and immediately forgets about most of it, wrapping it so that we can safely export metrics to Prometheus without exposing the entire runtime to a potentially-untrusted HTTP application.

                                                              1. 2

                                                                your “Appender”

                                                                FWIW I’m just calling it that because that’s what log4j calls it. :) And yes, this is an instance of the strategy pattern, though they didn’t use that word in the name. (Rightly so IMO.)

                                                                The second issue is that logging should be discouraged in production

                                                                I’m not interested in discussing that here because it’s not germane to this specific discussion.

                                                                URI expressions, is very similar to the magical invocation of ambient authority in log4j. A URI expression is supposed to be safe and tamed somehow, but I’m not sure that I see how that could possibly be safe enough.

                                                                Looking at the documentation, I agree these do seem incredibly overpowered. I think they’d be fine in code or config files but I can’t see how you could ever use variable ones in application code safely. I mean, there’s one that loads and runs arbitrary class files!

                                                      1. 5

                                                        I am not too familiar with Go; however, in the sample code,

                                                          if err := requests.URL("http://foo/t.json").ToJSON(&t).Fetch(ctx); err == nil {
                                                            return nil, err
                                                          }
                                                        

                                                        err is defined as the result of the requests call, and then compared against nil. If it is nil, then nil, err are both returned.

                                                        Is that not just returning nil, nil? How is err a useful response value here? It seems functionally useless.


                                                        On a separate point, I find it interesting that you decided to declare headers as a method call on the existing class, as opposed to a single header method that can be chained or that accepts Go’s varargs equivalent. Do you have a generic header method? If not, you might consider one for the future, as currently this doesn’t support custom headers that some web APIs might need, and means it is less future-compatible.

                                                        1. 2

                                                          you’re missing the &t part which essentially acts as an ‘out param’ that writes into the var t T declared before. with that in place, the actual return value can be just an (optional) err value

                                                          1. 12

                                                            No, they’re talking about the ‘err’ in ‘nil, err’. Which does appear to be nil. Presumably that’s supposed to say ‘err != nil’? I’m taking this as further evidence of Go’s error handling being poor due to lack of ADTs; that mistake would have been much more obvious with a Result type…
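
                                                            For example, with stub types (a Rust sketch, just to show the shape):

                                                              // Stub types just for illustration.
                                                              struct Json;
                                                              struct Error;
                                                              
                                                              fn fetch() -> Result<Json, Error> {
                                                                  Ok(Json)
                                                              }
                                                              
                                                              fn run() -> Result<(), Error> {
                                                                  // The two outcomes are distinct variants, so there's no way to
                                                                  // write the equivalent of the `err == nil` mixup and have it compile.
                                                                  match fetch() {
                                                                      Ok(_t) => { /* use the value */ }
                                                                      Err(e) => return Err(e),
                                                                  }
                                                                  Ok(())
                                                              }
                                                              
                                                              fn main() {
                                                                  let _ = run();
                                                              }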

                                                            1. 6

                                                              oops, yeah, i guess that condition should’ve read err != nil instead. which kinda proves your point: it’s very easy to accidentally make mistakes. on top of that, i think that the golang habit to put statements in between the if keyword and the actual condition only adds to the confusion. case in point: by the time i reached the end of that jam-packed line i got so tired that it didn’t stand out that the condition was accidentally flipped 🙃

                                                          2. 1

                                                            Typo fixed, thanks.

                                                          1. 3

                                                            This seems to assume there’s a clear separation between the content and the markup, but I don’t think that’s always the case. For example, everyone recognizes that if you remove the quotes from a sentence, you are liable to change its meaning. But the same could be said about a quote block or code block. And to a lesser extent: italics, which change the emphasis of a sentence, but in extreme cases that emphasis can probably dramatically change the meaning of the sentence.

                                                            EDIT: Also, hmm, how do bulleted/numbered lists work in this system?

                                                            1. 4

                                                              that emphasis can probably dramatically change the meaning of the sentence.

                                                              Great point! Thanks.

                                                              I updated the text with: “Another problem of Aftertext is when markup is semantic and not just an augmentation. ‘I did not say that’ is different from ‘I did not say *that*’. Without embedded markup in these situations meaning could be lost.”

                                                              Also, hmm, how do bulleted/numered lists work in this system?

                                                              Aftertext and lists are currently orthogonal node types in Scroll. That is a footgun. I haven’t given much love to lists yet. There is only basic support, demo’d here

                                                              1. 2

                                                                And to a greater extent, missing strikethroughs can pretty greatly change the meaning of a piece of text.

                                                              1. 20

                                                                This is not just about transgender folks (though they’re the ones most likely to suffer abuse if you don’t do it right). A lot of folks in the US use their middle name in preference to their first name. A system that allows your display name to be configured lets them use their middle name in preference. This is something that Active Directory / Exchange / Teams actually do very well. I have several colleagues who set the display name to match what they actually use in face-to-face interactions. The only folks who see their legal name are folks who need access to HR systems.

                                                                1. 13

                                                                  My dad only learned his mother’s legal first name when he was in his 30s. There is no necessary relationship between legal name, billing name, and display name. Legal name is for interactions with the government, and if someone is going to verify your identity by looking at your driver’s license or something. Most applications, unless they’re doing identity verification themselves, shouldn’t care about legal names at all; that’s for the government. Billing name could belong to a totally different person, because someone else might be paying for your service. Display name could be as you say a middle name, or a nickname, or a shortening of the first name, or a name of a different gender because changing one’s legal name is difficult or impossible, or the name could be different because the person is from India and they don’t all have the same first/last name distinction that we do, so their legal name gets jumbled.

                                                                  Legal name, billing name, and display name, all need to be changeable, because they all really do change. It’s not even much of an edge case.
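
                                                                  If it helps to see it spelled out, here’s one way to model that (a sketch, not a prescription; field names are made up):

                                                                    // Three independent, mutable fields; none is derived from another.
                                                                    struct User {
                                                                        display_name: String,         // what everyone sees; freely changeable
                                                                        billing_name: Option<String>, // may belong to a different person entirely
                                                                        legal_name: Option<String>,   // only if you truly do identity verification
                                                                    }
                                                                    
                                                                    fn main() {
                                                                        let mut u = User {
                                                                            display_name: "Dave".to_string(),
                                                                            billing_name: None,
                                                                            legal_name: None,
                                                                        };
                                                                        u.display_name = "D. Middlename".to_string(); // names change; let them
                                                                        println!("{}", u.display_name);
                                                                    }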

                                                                1. 7

                                                                  I’m increasingly of the opinion that it’s better to have gotchas than two subtly different ways to do something.

                                                                  1. 3

                                                                    Yeah. They talk about this as removing a “wart”, but nothing is actually being removed from the language! You cannot eliminate a gotcha by adding to the language: anyone who knows to use the late binding syntax has already sidestepped the gotcha. The advantage of this proposal isn’t that it removes a wart or fixes a gotcha, it’s that it improves convenience for those who already know Python’s default value behavior.

                                                                    1. 2

                                                                      I am hopeful that once this gets added, the previous default-argument syntax will get increasingly flagged by linters and such, reserving that syntax exclusively for optimization, and preventing newbies from accidentally stepping on it.

                                                                  1. 3

                                                                    Author here. Ask me anything :-)

                                                                    1. 4

                                                                      This is great; I feel like concatenative languages should be studied more.

                                                                      In the long run, I expect optimizing compilers for Dawn terms to consistently produce higher performing machine code than for all but the most highly tuned C procedures, thanks in no small part to Dawn’s extremely simple and highly analyzable core syntax and semantics.

                                                                      My understanding is that “compile down to a simple and highly analyzable IR, then optimize the hell out of it” is a widely used strategy. Like I think LLVM fits this description, though I don’t really know the details. Do you think Dawn will have some advantage here, over IRs that use variables/names? It does have the advantage(?) of being naturally linear, as you said.

                                                                      Nonetheless, recursion is possible by passing a copy of a quoted term to itself using higher order expressions such as [clone apply] clone apply.

                                                                      Have you tried finding a type for dup apply? I discovered equirecursive types that way :-)

                                                                      1. 4

                                                                        My plan is for Dawn to act as its own IR, with the lower levels of IR defining memory layouts and having access to platform specific terms, with the lowest level essentially being assembly. I haven’t prototyped any of that yet, though, so we’ll see how it works out in practice.

                                                                        Dawn’s primary advantages over C, in terms of optimizability, will be that pointer aliasing is disallowed (or explicit) and that memory layout is not specified in the highest level of the language, and therefore can be optimized by the compiler. I also hope to enable the user to directly control compilation, including adding their own optimizations. I want compilation to be much less of a black box than it typically is. Again, though, none of this is prototyped, so we’ll see how it plays out.

                                                                        Assigning a type to dup apply / clone apply requires a form of self types, which I don’t currently plan to include.

                                                                        1. 4

                                                                          My plan is for Dawn to act as its own IR, with the lower levels of IR defining memory layouts and having access to platform specific terms, with the lowest level essentially being assembly. I haven’t prototyped any of that yet, though, so we’ll see how it works out in practice.

                                                                          That sounds like a lot of fun!

                                                                          I also hope to enable the user to directly control compilation, including adding their own optimizations.

                                                                          This frightens me a bit; it means I could encounter a language-level bug due to a library I import. I guess the assumption is that people won’t add optimizations unless they really know what they’re doing.

                                                                          pointer aliasing is disallowed (or explicit)

                                                                          This is hard! Or rather, it’s hard to do while making the language nice to use. Are you thinking of having a Rust-like borrow system, and distinguishing mutable references from shared references? If you are, have you seen withoutboats’ post on what a simpler Rust would look like?

                                                                          Assigning a type dup apply / clone apply requires a form of self types, which I don’t currently plan to include.

                                                                          I think that’s a good decision. Equirecursive types are neat, but don’t seem to be very useful in practice. You can’t give a type to lambda f. f f either, for the same reason (and thus can’t type the Y combinator, as it uses that), but no one seems to complain.

                                                                          1. 4

                                                                            This frightens me a bit; it means I could encounter a language-level bug due to a library I import.

                                                                            Such a bug would be scoped to uses of the library, so it’s really no different from existing libraries. My hope is for all optimizations to be formally verified, though. But that might have to come later.

                                                                            This is hard! Or rather, it’s hard to do while making the language nice to use.

                                                                            The hard part is making it performant and ergonomic at the same time, i.e. having explicit reference types which statically enforce mutually exclusive mutability. I have plans for how to do this without requiring a separate borrow checker, but we’ll have to see how ergonomic it is in practice. And, of course, there has to be a solution for cyclic references and other patterns that are not possible in safe Rust.

                                                                            Yes, I did read those posts a while back, but they’re probably worth rereading.

                                                                            Equirecursive types are neat, but don’t seem to be very useful in practice.

                                                                            There’s one very interesting use for self types: https://homepage.divms.uiowa.edu/~astump/papers/fu-stump-rta-tlca-14.pdf

                                                                            The Kind language, and the other offshoots from FormCore, make excellent use of them. There’s a lot of work needed to adapt dependent types to the concatenative calculus, though, and I don’t expect I’ll be able to tackle that alone or in reasonably short order. If they can be made to work, though, they would be excellent for formal verification.

                                                                            1. 1

                                                                              Such a bug would be scoped to uses of the library, so it’s really no different from existing libraries.

                                                                              My point is that it is different from a bug in a library in a typical language. If a library is allowed to muck with things in a way that violates invariants that the language relies on, that can cause a more-or-less arbitrary bug at a more-or-less arbitrary location. For example, if I use a badly written library in Rust that uses unsafe, it could potentially cause my variables to have two mutable references to the same data, or cause a segfault in my code. Bugs inside unsafe are nonlocal; while the problem originates inside the unsafe block, it may manifest itself much later. Thus “unsafe” code is important to audit, and likewise user-written compiler optimizations are important to audit (or formally verify!).
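
                                                                              A contrived sketch of the kind of bad library I mean (deliberately wrong code; do not use):

                                                                                // A deliberately bad "library" function: it type-checks, but hands out
                                                                                // two &mut to the same data, violating an invariant Rust relies on.
                                                                                fn evil_split<T>(x: &mut T) -> (&mut T, &mut T) {
                                                                                    let p = x as *mut T;
                                                                                    unsafe { (&mut *p, &mut *p) }
                                                                                }
                                                                                
                                                                                fn main() {
                                                                                    let mut v = vec![1, 2, 3];
                                                                                    let (a, b) = evil_split(&mut v);
                                                                                    a.push(4); // may reallocate the buffer...
                                                                                    b.push(5); // ...while `b` still points at the old one: UB, far from here
                                                                                    println!("{:?}", v);
                                                                                }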

                                                                              I have plans for how to do this without requiring a separate borrow checker, but we’ll have to see how ergonomic it is in practice.

                                                                              Care to share your high-level idea? :-)

                                                                              There’s one very interesting use for self types

                                                                              Oh interesting, thanks for the link.

                                                                              1. 2

                                                                                Sorry, I worded that poorly. The point I mean to make is this. The class of bugs that can be introduced in unsafe code in Rust is much worse than the class of bugs that can come from safe code; likewise the class of bugs a library could introduce with (non formally verified) compilation optimizations is much worse than it could introduce without.

                                                                                1. 1

                                                                                  Yes, I certainly agree with that.

                                                                                2. 2

                                                                                  Care to share your high-level idea? :-)

                                                                                  Static reference counting, essentially, by leveraging qualified types with substructural restrictions and equality constraints.

                                                                        2. 3

                                                                          Slightly OT, but why is the “clone” rule called “contraction”?

                                                                          1. 3

                                                                            In the sequent calculus, it looks more like contraction than like clone: https://en.m.wikipedia.org/wiki/Structural_rule

                                                                          2. 2

                                                                            Do you know about dmbarbour’s Awelon? There seem to be similarities between Awelon and Dawn.

                                                                            1. 2

                                                                              I did not! Looks interesting!

                                                                            2. -1

                                                                              In simple English, what does it mean to be transcendental?

                                                                              1. 1

                                                                                FYI, in “ask me anything”, usually it is implied that the question should relate to the topic.

                                                                            1. 4

                                                                              I think that there are some issues with the semantics as presented:

                                                                              • Minor technical issue: the way compose is defined, it can only ever build expression-lists of size 2; for example [e1] [e2] [e3] compose compose should leave [e1 (e2 e3)] on the stack, instead of [e1 e2 e3] as seems to be assumed in the rest of the post. (Note that e1 (e2 e3) and e1 e2 e3 are both valid and distinct according to the grammar of expressions.)

                                                                              • Conceptual issue: in the lambda-calculus, “church encodings” give a systematic way to encode any (algebraic) datatype: from say type bool = true | false or type nat = zero | succ nat, you can mechanically derive the encodings in the lambda-calculus (technically we call this systematic encoding the Boehm-Berarducci encoding, but it is a generalization of the Church encoding of booleans and natural numbers). Here it is unclear in your description whether a systematic encoding exists, or whether one has to “solve an equation” to find the terms in an inventive way. There may be a systematic approach that gives the definition but with non-normal terms, and you somehow normalized/optimized the encodings for the purpose of the presentation, but this process (if it exists) is not clear to the reader.

                                                                                Similarly I would expect that there is a systematic way to derive functions on booleans or natural numbers from a primitive elimination construct. In the lambda-calculus, the truth-table of boolean or can be expressed as if x then (if y then true else true) else (if y then true else false), which can be expressed in Church-encoded form \x y. x (y true true) (y true false), and then optimized into the equivalent form \x y. x true y. I would expect that it is possible to systematically derive or instead of having to “guess” it.

                                                                              • Performance issue: with the encoding of natural numbers given, n+1 contains the definition of n duplicated two times. This means that the encoding of n has size exponential in n, which seems extremely impractical. It may be that a better encoding exists. (For example you give an encoding of succ later, so I guess just zero succ succ ... succ would qualify.)

                                                                              1. 3

                                                                                Thanks for the constructive feedback! I think the issue with composition should be easy to address. I specifically defined composition as n-ary in order to avoid this issue, but I see your point that the compose intrinsic is not defined clearly in the text.

                                                                                As for the encodings, that’s good to know! My main goal for those was to exercise the calculus a bit. Ultimately, I do not plan to use such encodings in Dawn. I’ll try to clarify that these are not systematic or efficient encodings. Perhaps someone else who is particularly interested in encodings can explore how to improve on those aspects!

                                                                                1. 2

                                                                                  Minor technical issue: the way compose is defined, it can only ever build expression-lists of size 2

                                                                                  e1 (e2 e3) is equivalent to e1 e2 e3. This equivalence is so fundamental to the concatenative calculus that I would regard it as a mistake to even allow them to have different representations. I think a better definition would be:

                                                                                  Sequence s = e_1 ... e_n
                                                                                  Expression e = i | [s]
                                                                                  

                                                                                  Performance issue: with the encoding of natural numbers given, n+1 contains the definition of n duplicated two times

                                                                                  Working it out:

                                                                                  N succ
                                                                                  = N quote [apply] compose [[clone]] swap clone [[compose]] swap [apply] compose5
                                                                                  = [N apply] [[clone]] swap clone [[compose]] swap [apply] compose5
                                                                                  = [[clone]] [N apply] [N apply] [[compose]] swap [apply] compose5
                                                                                  = [[clone]] [N apply] [[compose]] [N apply] [apply] compose5     
                                                                                  = [[clone] N apply [compose] N apply apply]                 
                                                                                  

                                                                                  Applying it to a function [F]

                                                                                    [F] [clone] N apply [compose] N apply apply
                                                                                  = [F] ...n+1 [F] [compose] N apply apply
                                                                                  = [F ...n+1 F] apply
                                                                                  = F ...n+1 F
                                                                                  

                                                                                  Which does definitely duplicate N. I think this is avoidable, though: instead of making n copies of F and then composing them, clone F just once, then apply F n times (using N), then apply F one last time.

                                                                                  1. 2

                                                                                    Yes, you can define a helper term to make that approach to defining natural numbers easy. I started with that encoding, but then I changed to this encoding because it applies all of the copies at the same time, so it strictly fulfills the big-step semantics I provided, rather than applying the expression one at a time.

                                                                                  2. 1

                                                                                    I’ve updated the post to address your comments: https://www.dawn-lang.org/posts/foundations-ucc/

                                                                                    Thanks again for the feedback!

                                                                                  1. 1

                                                                                    Oh fun! I was just thinking about test case generation. I haven’t grokked the bounds of this approach yet; I tried something similar once but I’m not sure it’s the same. My guess is that this approach can’t (exhaustively and efficiently) generate all binary trees of size up to 5? Most approaches break down there.

                                                                                    Here’s another approach, which constructs a bijection between the values you want to generate and the natural numbers. So you ask it for “array number 347838299483”, and it quickly produces that array: https://users.cs.northwestern.edu/~robby/pubs/papers/jfp2017-nfmf.pdf (Importantly, the time it takes to produce “value number N” is polynomial in the number of bits of N.) This approach can’t generate binary trees fairly, though.
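
                                                                                    To give the flavor of such a bijection (a toy example, not the paper’s construction): the inverse Cantor pairing function maps each natural number to a unique pair of naturals, so “pair number N” is cheap to compute:

                                                                                      // Inverse Cantor pairing: a bijection from naturals to pairs of naturals.
                                                                                      fn unpair(z: u64) -> (u64, u64) {
                                                                                          // w = largest w with w*(w+1)/2 <= z (float sqrt is fine for small z;
                                                                                          // a real implementation would use an integer isqrt).
                                                                                          let w = (((8 * z + 1) as f64).sqrt() as u64 - 1) / 2;
                                                                                          let t = w * (w + 1) / 2;
                                                                                          let y = z - t;
                                                                                          let x = w - y;
                                                                                          (x, y)
                                                                                      }
                                                                                      
                                                                                      fn main() {
                                                                                          for z in 0..6 {
                                                                                              println!("pair number {} is {:?}", z, unpair(z));
                                                                                          }
                                                                                          // Every pair appears exactly once:
                                                                                          // (0,0), (1,0), (0,1), (2,0), (1,1), (0,2)
                                                                                      }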

                                                                                    There’s another approach that can generate recursive data structures like binary trees. The API is that there’s a notion of “size” of the values you want to generate, and you can ask things like “tell me how many values of size 25 there are” or “give me an iterator over all values of size 25” or “give me a value of size 25 chosen uniformly at random”. For a tuple of natural numbers, the “size” would be the sum of the numbers. For a binary tree, it could be the number of nodes in the tree. Here’s a full-fledged Haskell library for this approach: https://hackage.haskell.org/package/testing-feat And my (independently discovered) prototype, which could be a good explanation just because it’s short: https://github.com/justinpombrio/generators/blob/main/gen.hs

                                                                                    (Aside: did you ever realise that the number of ways to pick two objects out of n is equal to the sum of first n natural numbers?)

                                                                                    No! But I thought of an explanation. First of all, a correction: it’s the sum of the first n-1 numbers. For example, there are 10 ways to pick two things out of five, and 10 = 1 + 2 + 3 + 4. Now for the explanation: to pick two things out of five, put them in a line. How far apart are the two things you choose? There are 4 ways you could pick adjacent things (distance = 1), 3 ways to pick things with distance 2, 2 ways to pick with distance 3, or 1 way to pick with distance 4. 4 + 3 + 2 + 1 = 10.
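
                                                                                    A quick sanity check of the identity, choose(n, 2) = 1 + 2 + … + (n-1):

                                                                                      fn main() {
                                                                                          for n in 2u64..12 {
                                                                                              let choose_2 = n * (n - 1) / 2;  // ways to pick 2 of n
                                                                                              let sum: u64 = (1..n).sum();     // 1 + 2 + ... + (n-1)
                                                                                              assert_eq!(choose_2, sum);
                                                                                              println!("n = {:2}: {} = {}", n, choose_2, sum);
                                                                                          }
                                                                                      }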

                                                                                    1. 2

                                                                                      all binary trees of size up to 5?

                                                                                      Is this the same thing that Catalan numbers are counting? If it is, here’s the (exhaustive and efficient) equivalent code to generate all balanced parentheses:

                                                                                      #[test]
                                                                                      fn gen_parenthesis() {
                                                                                          let n = 5;
                                                                                      
                                                                                          let mut g = Gen::new();
                                                                                          while !g.done() {
                                                                                              let l = g.gen(n); // total number of '(' to emit
                                                                                              let mut s = String::new();
                                                                                              let mut t = 0; // number of '(' emitted so far
                                                                                              let mut b = 0; // currently unmatched '('
                                                                                              while t < l {
                                                                                                  if b > 0 && g.gen(1) == 1 {
                                                                                                      s.push(')');
                                                                                                      b -= 1
                                                                                                  } else {
                                                                                                      s.push('(');
                                                                                                      b += 1;
                                                                                                      t += 1;
                                                                                                  }
                                                                                              }
                                                                                              s.push_str(&")".repeat(b));
                                                                                          }
                                                                                      }
                                                                                      

                                                                                      Though admittedly, for naturally recursive data, a recursive solution would be nicer.

                                                                                      1. 1

                                                                                        Yes, Catalan numbers count the number of binary trees of a given size. I think that’s related to but not quite the same as balanced parentheses, which are more like forests? Though there might be some bijection I’m not thinking of.

                                                                                        Anyhow, you’re right; your approach totally can generate all binary trees of a given size, exhaustively and efficiently and straightforwardly:

                                                                                        #[derive(Debug)]
                                                                                        enum Tree {
                                                                                            L,
                                                                                            B(Box<Tree>, Box<Tree>),
                                                                                        }
                                                                                        
                                                                                        fn gen_tree(g: &mut Gen, size: u32) -> Tree {
                                                                                            if size == 0 {
                                                                                                return Tree::L;
                                                                                            }
                                                                                        
                                                                                            let left_size = g.gen(size - 1);
                                                                                            let right_size = size - left_size - 1;
                                                                                            Tree::B(
                                                                                                Box::new(gen_tree(g, left_size)),
                                                                                                Box::new(gen_tree(g, right_size)),
                                                                                            )
                                                                                        }
                                                                                        
                                                                                        fn main() {
                                                                                            let size = 3;
                                                                                            let mut g = Gen::new();
                                                                                            let mut trees = vec![];
                                                                                            while !g.done() {
                                                                                                trees.push(gen_tree(&mut g, size));
                                                                                            }
                                                                                            for tree in &trees {
                                                                                                println!("{:?}", tree);
                                                                                            }
                                                                                            println!("{}", trees.len());
                                                                                        }
                                                                                        

                                                                                        What your approach can’t do is generate a uniformly random tree of a given size. For example, suppose you wanted to test something on all Json values of size up to 6 (which can reasonably be enumerated), plus a million uniformly random Json values of size 30 (of which there are too many to list). My approach / Haskell-feat can do that. The downside is that the code gets much more complex, and it requires bignums (because e.g. the number of Json values of size 30 might not fit in an int).

                                                                                        EDIT: you might immediately think it’s easy to generate a uniformly random tree. Just have each call to gen return a number uniformly random from its range. But this doesn’t work because the result of the first call to gen influences what future calls get made. For example in the line let left_size = g.gen(size - 1);, there are different numbers of trees for each possible left_size.
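
                                                                                        To sketch the counting-based fix (reusing the Tree type above; u128 instead of bignums, and count should be memoized in real code, so treat this as a toy): count the trees of each size, then pick tree number r uniformly at random from 0..count(size) and unrank it. Each left_size accounts for count(left) * count(right) of the ranks, which is exactly the weighting the naive uniform g.gen misses.

                                                                                          // Number of binary trees with `size` internal nodes (the Catalan numbers).
                                                                                          // (Memoize this in real code; the bare recursion recomputes a lot.)
                                                                                          fn count(size: usize) -> u128 {
                                                                                              if size == 0 {
                                                                                                  return 1;
                                                                                              }
                                                                                              (0..size).map(|l| count(l) * count(size - 1 - l)).sum()
                                                                                          }
                                                                                          
                                                                                          // Tree number `rank` (0-based) among all trees of the given size.
                                                                                          // A uniformly random rank in 0..count(size) yields a uniformly
                                                                                          // random tree: each left_size gets its fair share of the ranks.
                                                                                          fn nth_tree(mut rank: u128, size: usize) -> Tree {
                                                                                              if size == 0 {
                                                                                                  return Tree::L;
                                                                                              }
                                                                                              for left in 0..size {
                                                                                                  let right = size - 1 - left;
                                                                                                  let block = count(left) * count(right);
                                                                                                  if rank < block {
                                                                                                      let rc = count(right);
                                                                                                      return Tree::B(
                                                                                                          Box::new(nth_tree(rank / rc, left)),
                                                                                                          Box::new(nth_tree(rank % rc, right)),
                                                                                                      );
                                                                                                  }
                                                                                                  rank -= block;
                                                                                              }
                                                                                              unreachable!() // only reachable if rank >= count(size)
                                                                                          }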