Threads for jtm

  1. 4

    This is a very type-theoretic way of looking at types, the sort of thing you do if you want to take the maximalist approach of defining all computation as simply relaxing expressions to their normal form. Does this approach actually lead to ergonomic code, though? It was certainly useful for learning about dependent typing as presented in books like The Little Typer, but I’m doubtful expressing everything about types in terms of intro and elim forms will help you do things with a programming language. Do any extant languages actually do this?

    1. 9

      I would argue that any language in the ML family works this way. When you define a type, the only thing you can do with it is introduce it (via construction) or eliminate it (via pattern matching). In some sense, both ML and Haskell are about relaxing expressions to normal forms.
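
      For what it’s worth, the same intro/elim shape is easy to see in Rust as well; here is a minimal sketch (the type and names are mine, purely illustrative):

      ```rust
      // A type defined by its constructors (introduction forms)...
      enum Shape {
          Circle(f64),
          Rect(f64, f64),
      }

      // ...and consumed only through pattern matching (its elimination form).
      fn area(s: &Shape) -> f64 {
          match s {
              Shape::Circle(r) => std::f64::consts::PI * r * r,
              Shape::Rect(w, h) => w * h,
          }
      }

      fn main() {
          let c = Shape::Circle(1.0);                          // introduction
          assert!((area(&c) - std::f64::consts::PI).abs() < 1e-9); // elimination
          assert_eq!(area(&Shape::Rect(2.0, 3.0)), 6.0);
      }
      ```

      Outside the defining module, those two operations are the whole interface to the type.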

      1. 5

        Yeah, it’s fine I think - you don’t necessarily need to use this as a guide for writing code… it’s more a way of understanding the deeper things going on with types. For example, you can see pattern matching as a convenient way of building up elimination forms. You may or may not implement it this way (there are pros and cons to elaborating to elimination forms - some languages do it, others don’t), but I do think it’s useful to think in those terms as a language designer.

        1. 4

          Types in Lean often have a method called elim as a primitive pattern matching deconstructor, but it probably wouldn’t be seen in idiomatic code.

          1. 4

            Sure, an extremely simple example that shows up all over the place is “smart constructors”. In Rust:

            pub struct NonNegativeF32(f32);

            impl NonNegativeF32 {
              pub fn introduce(value: f32) -> Option<NonNegativeF32> {
                if !(value < 0.0) { Some(NonNegativeF32(value)) } else { None }
              }

              pub fn eliminate(&self) -> &f32 { &self.0 }
            }

            From a perspective outside of this module (which thus respects pub declarations) the properties of the intro ensure that for all x: NonNegativeF32, x.eliminate() >= 0.0. The properties of the type are exactly specified in the intro/elim rules.
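
            To make that contract concrete, here is the example in compilable form together with a hypothetical caller (the call sites and assertion values are mine):

            ```rust
            pub struct NonNegativeF32(f32);

            impl NonNegativeF32 {
                // Intro form: the only way to obtain a NonNegativeF32.
                pub fn introduce(value: f32) -> Option<NonNegativeF32> {
                    if !(value < 0.0) { Some(NonNegativeF32(value)) } else { None }
                }
                // Elim form: the only way to observe one.
                pub fn eliminate(&self) -> &f32 { &self.0 }
            }

            fn main() {
                // The intro rejects negatives outright...
                assert!(NonNegativeF32::introduce(-1.0).is_none());
                // ...so the elim can only ever yield values the intro admitted.
                let x = NonNegativeF32::introduce(3.5).expect("non-negative");
                assert!(*x.eliminate() >= 0.0);
            }
            ```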

            1. 2

              It’s worth noting that NBE isn’t the only way that types apply to computation. There’s also native type theory: take the values which arise on your concrete machine, collect them into types, and then describe the behaviors of your machine using functions on those types.

              Edit: Changed second link; “previously, on Lobsters…”

            1. 4

              I would love to have this in printed book format.

              1. 6

                I would also love a printed format! FWIW there are also pdf and epub versions of the book.

                1. 2

                  Thanks for these links!

                  And that is a pretty PDF as well.

              1. 1

                Great read. @icefox you might be interested to know that this is almost exactly how Coq implements typeclasses. For example, the definition of Eq from your example would look something like:

                Class Eq (A : Type) := {
                  equal : A -> A -> bool;
                }.

                Instance Int_Eq : Eq int := {
                  equal := fun x y => (* ... definition of eq for ints *);
                }.

                The Class and Instance keywords in Coq are essentially the same as defining and instantiating a struct (you can pass around an Int_Eq just like any other ol’ struct). The only magic is that Instance updates a database of terms to be implicitly resolved, in the same way that you use the implicit keyword in your generic_sum example. Using the above definitions, one could then define:

                Definition explicit_func (A : Type) (H : Eq A) : unit.
                Definition implicit_func (A : Type) {H : Eq A} : unit.
                                                 (* ^ curly braces *)

                At which point calling (explicit_func int Int_Eq) and (implicit_func int) are equivalent.

                I personally find Coq typeclasses a little clunky (typeclasses were added later, and sort of compete with the module system inherited from OCaml). Coq also primarily deals with mathematics, which has some steep hierarchies of inheritance (e.g. a total order :> partial order :> reflexive etc) which can be pretty cumbersome.

                One takeaway from Coq that you might find useful is the notion of sections, which can be used to automatically append typeclasses to function arguments when used in a function body. For example, in some pseudo-garnet, this might look like:

                section WithLen
                  -- Any function in this section that uses `len_impl`
                  -- will have `len_impl` appended to its arguments
                  context len_impl: Len(@C)
                  fn foo(container: @C): int =
                    let len = len_impl.len(container)
                  fn bar(container: @C): int =
                    return foo(container)
                -- foo is now foo(container: @C, len_impl: Len(@C))
                -- bar is now bar(container: @C, len_impl: Len(@C))
                --   ^ because it called foo, which requires len_impl

                This is a rough example (you’d probably need some extra logic / syntax to unify all those @C’s), but hopefully this is some more food for thought. It might prove useful in say, a list library, where Cmp, Eq, and friends are pervasive.

                1. 35

                  Pretty ironic to see this post hosted on a .sh domain name. Yes, .io domains are harmful but so are .sh domains!

                  1. 16

                    true!! I wish someone had informed me before I hitched my life to a .sh domain. i hope my post helps someone avoid similar regrets.

                    1. 1

                      Well the web isn’t (yet) hotel california, domains can move hinthint

                      1. 3

                        easier said than done when my email is hosted on that domain as well & everyone polling my RSS feed would need redirection 😥 plus, all my historical links would break. i would like to move but it’s a lot of work and churn for a site that i’ve built up for years.

                        1. 5

                          I hosted my website, RSS feeds, and email on a .in domain for 16 years. Then I decided to move everything to a .net domain. Since my entire setup is version controlled, it took only a few commits to move my website and email from the old .in domain to the new .net domain:

                          The migration went well without any issue. There has been no loss of RSS feed subscribers. The web server access logs show that all the RSS clients and aggregators were successfully redirected to the new domain.

                          1. 5

                            receive emails for both old and new domains

                            I agree with you that it’s not hard to switch domains (I’ve also done it myself), but, because I used my old domain for email, I’m essentially forced to keep paying for it. Otherwise, someone could just buy up my old domain and start receiving my emails. I’ve gone through the process of updating my email for just about every service I know, but even after four years I still get the occasional email sent to my old domain. Maybe if I can go five years without receiving any emails for my old domain I’d feel more comfortable giving it up, but even then it still feels like a security risk.

                            Switching domains for ethical reasons seems moot if you still have to keep paying for it.

                  1. 3

                    Awesome list. I think the list missed this one: in C++, if an enum is defined inside a struct, its values are namespaced. Whereas in C, the enum values are in the global namespace (i.e. C’s only namespace).

                    struct S {
                      enum { E } e;
                    };

                    struct S s;
                    s.e = E;    // OK in C, error in C++
                    s.e = S::E; // OK in C++, error in C
                    1. 2

                      The article mixed a couple of things together:

                      • Code that compiles as one out of C or C++ and is an error in the other.
                      • Code that compiles as C/C++ but has different semantics.

                      The first of these is not very interesting because 90% of C++ is in this category. The second is a lot more interesting. In particular, things like void foo() are very bad because they disable type checking on arguments and force the K&R calling convention in C, whereas in C++ they are equivalent to void foo(void). C23 fixes this by making an empty parameter list mean the same thing as (void) in C too.

                      The typical examples of mismatched semantics depend on the fact that C++ doesn’t require tags for referencing struct definitions, and so operators such as sizeof, which take an expression or a type, will see a different value depending on the language.

                      I think that dropping file extensions from C++ standard library headers was one of the biggest mistakes that the original C++ standard made. It would be much better to have three different extensions for C, C++ and C/C++ headers. Hopefully the modules tooling will grow support for importing C headers (e.g. the Linux kernel headers that use a lot of syntax that is invalid C++ but that don’t do anything that you can’t map to a C++ AST).

                    1. 9

                      Previous discussion on The creator, @slightknack, is also on!

                      1. 14

                        Hey y’all, I was surprised to see this here today! I guess a few updates since the last time this was posted:

                        • Currently we’re in the middle of a massive rewrite of the entire compiler pipeline. It currently lives in the big-refactor branch, and we’re trying to get it done by July to create the next release.

                        • Since then I’ve done a lot of research and thinking on traits and effect systems, and think I have a neat unified approach that I’d like to start prototyping. We’ve also been working on the way Passerine integrates with Rust, and we hope to provide a safe sandboxed parallelizable runtime with a high-level API that integrates well with existing Rust libraries.

                        • We’ve also been rewriting the macro system to allow for compile-time evaluation. This will be a much more powerful lisp-style procedural macro system. Lisp is a powerful programming language for manipulating lists, which is why lisp macros, which operate on lists, fit the language so well. Passerine aims to be a powerful programming language for manipulating ADTs, so by representing the language as an ADT to be operated on by macros, we hope to capture the same magic and power of lisp’s macro system. (Note that the current rule-based macro system will still be available, just implemented in terms of the new one.)

                        • The majority of the discussion around the development of the language happens on our Discord server[1]. We have meetings the first Saturday of each month with presentations on PLT, compiler engineering, and other neat stuff.

                        • I’ve been working on an experimental BEAM-style runtime called Qualm that has a custom allocator that supports vaporization (the memory management technique Passerine uses, essentially tagged pointer runtime borrow checking.) I’m not sure it’ll be ready for the next release (as it requires type layouts to be specified, and the typechecker is currently WIP), but it is a nice demo for what I think is possible for the language.

                        I’m here to answer any questions you may have about the language! I’m based in Europe so I might not see them till tomorrow morning, but don’t be afraid to ask, even if you think it’s a silly question.

                        [1]: I tried starting a Matrix channel after people on brought it up last time. After a grand total of zero users had joined six months later, I went ahead and scrapped it. I love Matrix, so I might consider bridging the server in the future.

                        1. 3

                          I’m very curious about the differences between what Passerine does and what Perceus does in terms of both compile-time and runtime memory management!

                          (For context, I’m working on a programming language that currently uses Perceus.)

                          1. 3

                            Sorry for the late response. I started writing something, but like 2k words later I realized it should probably be like a blog post, and not a random comment. I’ll summarize the gist of the argument, and link to the blog post later once I’ve finished it:

                            So, as I’m sure you know, Perceus is a form of reference counting (limited to inductive types) that drastically reduces the number of reference count instructions that are produced. This makes reference counting more efficient, but Perceus is still reference counting at its core.

                            Vaporization uses a single tag bit to keep track of whether a reference is currently immutable shared or mutable owned. This is a bit different than reference counting, as the number of shared references is not kept track of.

                            When passing an object reference to a function, we set the reference tag to immutable shared if the reference is used again later in the caller’s scope. If this is the last-use of a value, we leave it as-is, allowing for efficient in-place updates in linear code. To update an object, the reference we have to it must be mutable owned; if the reference is immutable shared instead, the runtime will make a copy of the portion required. All references returned from functions must be mutable owned; when a function returns, all other mutable owned references tied to that stack frame are deallocated.

                            In effect, a mutable owned reference is owned by a single stack frame; ownership can be passed up or down the call stack on function call or return. When calling a function, we create a child stack frame that is guaranteed to have a shorter lifetime than the parent stack frame. Therefore, we can make as many immutable references as we’d like to data owned by parent stack frames, because all immutable references to that data will disappear when the child stack frame returns to the parent stack frame.

                            Both Perceus and Vaporization do not allow cyclic data structures to be created (without requiring users to manage circular references themselves). This operating assumption drastically limits the type of data structures that can exist (in terms of pointer graphs). Because object graphs can only be trees rooted at the stack (e.g. anything trivially serializable to JSON), it’s very simple to keep track of when things should be pruned from the heap.

                            Someone much smarter than I am described Vaporization as “using the stack as a topological sort of the acyclic object graph.” I’m not sure whether this statement is fully correct, but I think it captures the gist of the idea.

                            So, to answer your question, here’s what Vaporizations does with respect to compile-time and runtime memory management:

                            • At compile-time, we annotate the last use of every variable in a given scope. When generating code, if we encounter a non-last-use variable, we emit an instruction to set the tag of the reference to shared immutable. (We also ensure that all closure captures are immutable).

                            • At runtime, we update reference tags as annotated at compile-time, and we create deep copies of (portions of) objects as required when converting references from immutable shared to mutable owned.

                            If we know type layouts, it’s possible to inline the code responsible for copying data and updating reference tags such that there is no runtime dependency. This makes Vaporization suitable for both static and dynamic languages alike.
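
                            To illustrate the mechanism, here is a toy sketch of the general low-bit-tagging technique described above (this is my own illustration, not Passerine’s or Qualm’s actual runtime):

                            ```rust
                            // Toy sketch: steal the low bit of an aligned pointer as the
                            // "immutable shared" vs. "mutable owned" tag. Names and layout
                            // are illustrative only.
                            const SHARED_BIT: usize = 1;

                            #[derive(Clone, Copy)]
                            struct TaggedRef(usize);

                            impl TaggedRef {
                                fn new_owned(ptr: *const u64) -> Self {
                                    // Requires at least 2-byte alignment so the low bit is free.
                                    debug_assert_eq!(ptr as usize & SHARED_BIT, 0);
                                    TaggedRef(ptr as usize)
                                }
                                fn as_shared(self) -> Self { TaggedRef(self.0 | SHARED_BIT) }
                                fn is_shared(self) -> bool { self.0 & SHARED_BIT != 0 }
                                fn addr(self) -> *const u64 { (self.0 & !SHARED_BIT) as *const u64 }
                            }

                            fn main() {
                                let value: u64 = 42;
                                let owned = TaggedRef::new_owned(&value);
                                assert!(!owned.is_shared());        // mutable owned: in-place update OK
                                let shared = owned.as_shared();     // non-last use: mark immutable shared
                                assert!(shared.is_shared());        // mutating via this would force a copy
                                assert_eq!(unsafe { *shared.addr() }, 42); // untagged address still reads fine
                            }
                            ```

                            The point of the sketch is only that the tag flip is a single bitwise operation, which is why inlining it (given known type layouts) removes the runtime dependency.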

                            Hope this helps!

                            PS—Oh, I see you’re the author of Roc! I remember watching your talk “Outperforming Imperative with Pure Functional Languages” a while back, it was quite interesting!

                            1. 1

                              Very helpful, thank you so much for the detailed explanation!

                              Also, glad you found the talk interesting. Feel free to DM me if you have any questions about Roc!

                          2. 2

                            I’m interested in the run-time borrow checking idea. The part that makes sense to me is doing copy-on-write with refcounting: so you have pass-by-value semantics, but you can also do an in-place update when the refcount is 1.

                            But by “borrow checking”, do you mean you want to give the programmer a way to say: “destroy this value at the end of this scope; if I mess up and retain a reference to it, let me know”? As opposed to: “keep this alive as long as I have references to it”.
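
                            For what it’s worth, Rust’s Rc::make_mut implements exactly that copy-on-write pattern: in-place mutation when the refcount is 1, a clone otherwise. A small demonstration (values are mine):

                            ```rust
                            use std::rc::Rc;

                            fn main() {
                                let mut a = Rc::new(vec![1, 2, 3]);
                                Rc::make_mut(&mut a).push(4);     // refcount 1: updates in place
                                let b = Rc::clone(&a);            // refcount now > 1
                                let mut c = Rc::clone(&a);
                                Rc::make_mut(&mut c).push(5);     // refcount > 1: clones first, then mutates
                                assert_eq!(*a, vec![1, 2, 3, 4]); // original is unchanged
                                assert_eq!(*c, vec![1, 2, 3, 4, 5]);
                                drop(b);
                            }
                            ```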

                            1. 1

                              See my sibling answer for some more info on vaporization. We essentially use a single bit embedded in the pointer for the refcount, which can either be ‘1’ (mutable owned) or ‘more than 1’ (immutable shared).

                              All values not returned from a function and not passed in as parameters to a function will be destroyed at the end of a given scope. The only way to retain a reference to an object would be to return it, which makes it very hard to mess up and retain a reference to it.

                            2. 1

                              While you’re open to big ideas, have you considered capability security?

                              One of the coolest things I’ve added to awesome-ocap lately is Newspeak, with a really cool module system:

                              Newspeak is an object-capability programming platform that lets you develop code in your web browser. Like Self, Newspeak is message-based; all names are dynamically bound. However, like Smalltalk, Newspeak uses classes rather than prototypes. The current version of Newspeak runs on top of WASM.

                          1. 18

                            1 int is enough if it’s a big int.

                            1. 26

                              Accessing RAM is just slicing into one really big int.

                              1. 1

                                I’m gonna be that person: someone please do it.

                                1. 2

                                  Speaking of verification, I have a seemingly simple problem that the systems I tried (TLA+, Coq) seem unable to address (or, more likely, I don’t have the tools).

                                  So I have these two integers a and b, that are in [0, K] (where K is a positive integer). I would like to prove the following:

                                  • a + b ≤ 2 K
                                  • a × b ≤ K²

                                  Should be easy, right? Just one little snag: K is often fairly big, typically around 2^30 (my goal here is to prove that a given big number arithmetic never causes limb overflow). I suspect naive SAT solving around Peano arithmetic is not going to cut it.

                                  1. 4

                                    This should be pretty easy for any modern SMT solver to prove. I’m not exactly an expert, but this seems to work for Z3:

                                    ; Declare variables/constants
                                    (declare-const K Int)
                                    (declare-const a Int)
                                    (declare-const b Int)
                                    ; Specify the properties of the variables
                                    (assert (> K 0))
                                    (assert (>= a 0))
                                    (assert (>= b 0))
                                    (assert (<= a K))
                                    (assert (<= b K))
                                    ; Now let's prove facts
                                    (push) ; Save context
                                    ; Note how I've actually inserted the opposite statement of what you are trying to prove, see below as to why
                                    (assert (> (+ a b) (* 2 K)))
                                    (check-sat)
                                    ; If you get an `unsat` answer, it means your statement is proved
                                    ; If instead you get a `sat` answer, you can use the (get-model) command here
                                    ; to get a set of variable assignments which satisfy all the assertions, including
                                    ; the assertion stating the opposite of what you are trying to prove
                                    (pop) ; Restore context
                                    (assert (> (* a b) (* K K)))
                                    (check-sat)
                                    ; See above for the comment about the (get-model) command

                                    Save that to a file and then run z3 <file.smt>.

                                    Z3 should give you 2 unsat answers in a fraction of a second, which means that your 2 statements were proven to be true. Notably, it proves this for any K > 0 (including 2^30, 2^250, 2^1231823, etc…)

                                    As far as I understand, the biggest gotcha is that you have to negate the statement that you are trying to prove and then let the SMT solver prove that there is no combination of values for the a, b and K integers that satisfy all the assertions. It’s a bit unintuitive at first, but it’s not hard to get used to it.

                                    1. 3

                                      my goal here is to prove that a given big number arithmetic never causes limb overflow

                                      I’m not exactly sure what you mean here, is it that you’re using modulo arithmetic? If not, I’ve got a little proof here in Coq:

                                      Theorem for_lobsters : forall a b k : nat,
                                        a<=k /\ b<=k -> a+b <= 2*k /\ a*b <= k*k.
                                      Proof.
                                        intros a b k H. split.
                                        - lia.
                                        - now apply PeanoNat.Nat.mul_le_mono.
                                      Qed.

                                      I think even if you’re doing modulo arithmetic, it shouldn’t be too hard to prove the given lemmas. But you might need to put some tighter restrictions on the bounds of a and b. For example requiring that a and b are both less than sqrt(k) (though this is too strict).

                                      1. 1

                                        My, I’m starting to understand why I couldn’t prove that trivial theorem:

                                        • I’m not sure what “split” means, though I guess it splits conjunction in the conclusion of the theorem into 2 theorems…
                                        • I have no idea what lia means.
                                        • I have no idea how PeanoNat.Nat.mul_le_mono applies here. I guess I’ll have to look at the documentation.

                                        Thanks a lot though, I’ll try this out.

                                        1. 2

                                          I’m not sure what “split” means, though I guess it splits conjunction in the conclusion of the theorem into 2 theorems…

                                          Yes, that’s essentially it: the split tactic turns a goal of the form A /\ B into two separate subgoals, A and B, which the two bullets then prove.

                                          I have no idea what lia means.

                                          The lia tactic solves arithmetic expressions. It will magically solve a lot of proofs that are composed of integer arithmetic. The docs on lia can be found here. Note that I omitted an import statement in the snippet above. You need to prepend From Coq Require Import Lia to use it.

                                          I have no idea how PeanoNat.Nat.mul_le_mono applies here. I guess I’ll have to look at the documentation.

                                          The mul_le_mono function has the following definition, which almost exactly matches the goal. I found it using Search "<=".

                                               : forall n m p q : nat, n <= m -> p <= q -> n * p <= m * q

                                          I used now apply ... which is shorthand for apply ...; easy. The easy tactic will try to automatically solve the proof using a bunch of different tactics. You could do without the automation and solve the goal with apply PeanoNat.Nat.mul_le_mono; destruct H; assumption, if you’re so inclined.

                                          I hope this is helpful!

                                    1. 12

                                      Nice! Regarding pv(1), I recently noticed that OpenBSD uses ftp(1) to create arbitrary progress bars. For example to get a progress bar when extracting tarballs, you can do:

                                      $ ftp -VmD "Extracting" -o - file:archive.tgz | tar -zxf -
                                      Extracting archive.tgz 100% |*********************| 7580 KB    00:00

                                      It’s a clever trick that turns ftp(1) into cat(1) with a progress bar. The interface for pv(1) is much nicer, but sometimes it’s nice to use tools that are in the OpenBSD base install.

                                      This technique is also how OpenBSD displays the progress bars during install.

                                      1. 3

                                        I have mixed feelings about this. On one side, it is cool, but on the other side, it looks like ftp is becoming something else. For starters, it is called ftp while it is used for other protocols as well. Now a (cool) trick to make it work as a progress bar. Where’s the next stop? The init system!?! (just joking).

                                        1. 3

                                          …it is called ftp while it is used for other protocols…

                                          Well, the p stands for program, not protocol, these days 😉:

                                          The fact it supports locally mounted file systems is news to me! I never knew.

                                      1. 2

                                        Re: content type, the M in MIME stands for mail. :-)

                                        Go has a package for reading multipart mime bodies:

                                        1. 1

                                          Thanks! But where is Content-Type as a mail header defined? In which RFC I mean?

                                          1. 3

                                            It looks like it’s defined in the MIME RFC, specifically section 5 of RFC 2045. Though, there is a note mentioning that Content-Type predates MIME, and was first defined in RFC 1049.

                                            1. 2

                                              This RFC suggests proposed additions to the Internet Mail Protocol, RFC-822, for the Internet community, and requests discussion and suggestions for improvements.

                                              Perfect, thank you!

                                        1. 12

                                          I’m also no cryptographer, but it seems to me that deterministically generating private keys from passwords… almost totally subverts the security model of public/private key based cryptography? The private key is supposed to be the “something you have” bit, not the “something you know” bit; you generate them for each e.g. device, not each user?

                                          1. 4

                                            Yea, I’m agreeing that it kinda subverts it, that’s why “toy” or whatever you wanna label it :) However, looking at cryptocurrency (I’m referring to BIP-39 and BIP-32 for example), they have done it for many years now. If you get a hardware wallet, be prepared to write down 24 words on paper which could recover your key if lost.

                                            My idea was to do the same with backup keys for servers, cause I’m working in environments with FreeIPA and/or CA authentication which sadly can fail, and where you wish you had a physical vault somewhere with root ssh keys. (Luckily it never happened to my workplace, but it’s a realistic threat if the right services/servers go down at the same time - recall Facebook’s lockout some months back.)

                                            So at least for my usage, it’s not for everyday use but only for having an analog (pen-paper) backup, where I don’t have to manually type a PEM-encoded private key into vim, which seemed horrible :)

                                            So at least in my case, it’s not something I know, it’s something I have - the secret is too long for me to remember :)

                                            BIP refs;

                                            1. 4

                                              A quick glance at BIP-39 suggests that they’re deriving the 24 words from a randomly generated key, rather than deriving a key from a list of 24 words. Is that what you’re trying to accomplish? You could generate an ed25519 key and convert it to a word list using the same methodology as BIP-39 (i.e. divide the key into 11-bit chunks and look up the unique word associated with each chunk), no? (Correct me if I’m wrong about what BIP-39 is doing.)
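
                                              For illustration, that 11-bit chunking step might look something like this (a toy sketch of my own; the real BIP-39 also appends a checksum, which is omitted here):

                                              ```rust
                                              // Slice a byte string into 11-bit indices, each of which would
                                              // index a 2048-word list in a BIP-39-style scheme.
                                              fn eleven_bit_chunks(bytes: &[u8]) -> Vec<u16> {
                                                  let mut out = Vec::new();
                                                  let mut acc: u32 = 0;
                                                  let mut bits = 0;
                                                  for &b in bytes {
                                                      acc = (acc << 8) | b as u32; // shift in the next byte
                                                      bits += 8;
                                                      while bits >= 11 {
                                                          // Emit the top 11 bits accumulated so far.
                                                          out.push(((acc >> (bits - 11)) & 0x7FF) as u16);
                                                          bits -= 11;
                                                      }
                                                  }
                                                  out
                                              }

                                              fn main() {
                                                  // 11 bytes = 88 bits -> exactly 8 indices, each < 2048.
                                                  let chunks = eleven_bit_chunks(&[0xFF; 11]);
                                                  assert_eq!(chunks.len(), 8);
                                                  assert!(chunks.iter().all(|&c| c < 2048));
                                                  assert_eq!(chunks[0], 0x7FF); // all-ones input gives the max index
                                              }
                                              ```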

                                              1. 1

                                                Yea, my toy tool is in the opposite order with BIP39.

                                                In BIP39 they make a seed first, then calculate 24 words for you to save, which are revertible back into the seed. The way I use it myself for the backup keys is that I pipe something equal to the entropy of the seed into it, which for example could look like this (a randomly generated seed I just made): “470273efc564f694df89980967f9ee9d26df59ac7ea1eda9dda7458fc952fe3f4affa6162e71c70d5c0db44fdbac67f06619334412d82412319c0dc96510a1b4”

                                                However, in my opinion this (a hex key) is waaaay better to re-type into the digital world during a disaster recovery than something PEM base64-encoded, like the example below. Fun when you lack an “A” in the rows of A’s, for example, or miss a capital/lowercase. Exactly what you would need when you’re stressed because production is down :P It’s like installing “sl” on production servers, if you know the prank tool.

                                                So for me, it’s about pleasure in a future horrible disaster recovery. What someone else that find this useful for is up to them, but I also tried to make it very clear that stupid usage of this tool can be a disaster in itself :)


                                                1. 1

                                                  base32 or even base58 would be improvements over base64 for manual entry for sure.

                                                  rsa keys were quite large back in the day though, and I don’t think manual entry was much expected, so base64 made a bit more sense then.

                                          1. 22

                                            It’s magic.

                                            1. 18

                                              It’s not magic if it doesn’t do something. It’s magic if it does something.

                                              1. 2

                                                (Given what it is, it’s close to magic that it can do anything at all.)

                                              2. 9

                                                < short circuiting noises intensify >

                                              1. 1

                                                @solomon is the podcast related to your blog at Or is the name coincidental?

                                                1. 3

                                                  Yeah, it’s related, but Sandy created the podcast. A little group of us has formed and we are trying to make a few projects together.

                                                1. 10

                                                  So, my idea of using formal methods to help develop our core systems (they need to be very robust, payments and money stuff) has been received with excitement, so I’ll be writing models for one of the systems to show the rest of the team and start teaching them how this works and what the process will be from now on.

                                                  Super excited about this.

                                                  1. 2

                                                    The dream! What tools/languages are you using to create the models?

                                                  1. 11

                                                    Can we get integration with the weekly “what are you doing this week?” threads? That would be cool.

                                                    1. 11

                                                      It would also be neat if hosted a finger server. For example, finger could return some of the information found on. Maybe users’ profiles could also have a “plan” section for “what are you doing this week” answers? Not sure of the utility of all this, but the idea tickles me!

                                                      1. 5

                                                        Well, there’s this at least:

                                                        1. 4

                                                          That would be really cool indeed

                                                      1. 15

                                                        Seeing the examples use floats for currency made my eye twitch uncomfortably. None of the code I’ve written for financial institutions did that. Is that really done in this space?

                                                        1. 37

                                                          I used to work at a firm that did asset valuation for bonds, and we used floats :).

                                                          It’s generally fine to use floats when it comes to asset valuation, since the goal is to provide an estimate for the price of a security. Our job was to estimate the value of a bond, discounting its price based on 1) the chance that the issuer will go bust or 2) the chance that the issuer will pay off their debt early.

                                                          My understanding is that floats are never used in places where a bank is dealing with someone’s physical assets (it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors). Since our firm was not dealing with money directly, but instead selling the output of statistical models, floats were acceptable.

                                                          1. 9

                                                            That makes absolute sense to me. Thanks for sharing the difference. We were dealing with transactions (and things like pro-rated fees, etc.) so even for things where it made sense to track some fraction of a cent, it was “millicents” and integer arithmetic. I wasn’t thinking in terms of model output.
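                                                            For instance, a pro-rated fee in that style might look like this (hypothetical numbers; the rounding rule, round half up on the remainder, is just one of the conventions such systems pin down explicitly):

```python
# Pro-rate a $9.99 monthly fee over 7 of 30 days using integer
# "millicents" (1 cent = 1000 millicents), so no floats are involved.
# The exact rounding rule is a business decision; round half up here.
fee_millicents = 999_000                 # $9.99
days_used, days_in_month = 7, 30

prorated = (fee_millicents * days_used + days_in_month // 2) // days_in_month
print(prorated)  # 233100 millicents, i.e. $2.331
```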

                                                            1. 4

                                                              it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors

                                                              IME the really really hard thing is that summing floats gives different answers depending on the order you do it in. And summation operations appear everywhere.
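                                                              A two-line illustration of that order dependence (values chosen to make the effect obvious):

```python
# Summing the same three doubles in two orders gives two answers:
# 1.0 is smaller than the gap between adjacent doubles near 1e16,
# so it vanishes when it is added to 1e16 first.
a = (1e16 + 1.0) - 1e16   # 1.0 absorbed before the cancellation
b = (1e16 - 1e16) + 1.0   # cancellation first, so 1.0 survives
print(a, b)  # 0.0 1.0
```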

                                                            2. 7

                                                              @jtm gave you a bit more detail; the original post offers this in the Other notes section:

                                                              One of things that tends to boggle programmer brains is while most software dealing with money uses multiple-precision numbers to make sure the pennies are accurate, financial modelling uses floats instead. This is because clients generally do not ring up about pennies.

                                                              1. 6

                                                                Ah I missed this, but yes – exactly this.

                                                                This is because clients generally do not ring up about pennies.

                                                                An amusing bit about my old firm: oftentimes, when a bond is about to mature (i.e. the issuer is about to pay off all of their debt on time), the value of the bond is obvious, since there is a near-zero chance of the issuer defaulting. These bonds would still get run through all the models, and accrue error. We would often get calls from clients asking “why is this bond priced at 100.001 when it’s clearly 100?” So sometimes we did get rung up about pennies :).

                                                                1. 2

                                                                  If that was there when I read it, I overlooked it because my eye was twitching so hard.

                                                                  1. 2

                                                                    It’s completely possible they added the Other notes section later! Just wanted to share since it addressed your question directly.

                                                                2. 3

                                                                  I never wrote financial code, but I also never understood the desire to avoid floats / doubles. They should have all the precision you need.

                                                                  Decimal is a display issue, not a calculation issue. I think the problem is when you take your display value (a string) and then feed it back into a calculation – then you have lost something.

                                                                  It’s like the issue with storing time zones in your database vs. UTC, or storing escaped HTML in the database (BAD), etc.

                                                                  Basically, if you do all the math “right”, with full precision, then you should be less than a penny off at the end. I don’t see any situation where that matters.

                                                                  Although on the other side, the issue is that “programmers make mistakes and codebases are inconsistent”, and probably decimal can ameliorate that to some extent.

                                                                  I also get that it’s exact vs. inexact if you advertise a 0.1% interest rate, but I’d say “meh” if it’s a penny. It’s sort of like the issue where computer scientists use bank account balances as an example of atomic transactions, whereas in real life banks are inconsistent all the time!

                                                                  1. 11

                                                                    I also never understood the desire to avoid floats / doubles.

                                                                    Addition isn’t associative, so the answers you get from summations are less predictable than you would like.

                                                                    1. 7

                                                                      I think in practice the issue may actually be that floats can be too precise. Financial calculations are done under specific rules for e.g. rounding, and the “correct” result after multiple operations may actually be less mathematically accurate than if you’d just used 64-bit floats, but the auditors aren’t going to care about that.

                                                                      1. 4

                                                                        It’s not just that, it’s that the regulations are usually written to require that they be accurate to a certain number of decimal digits. Both the decimal and binary representations have finite precision and so will be wrong, but they’ll be differently wrong. Whether the binary floating-point representation is ‘too precise’ is less important than the fact that it will not give the answer that the regulators require.

                                                                      2. 4

                                                                        Like @lann and @david_chisnall mentioned, it’s not about being precise, it’s about getting the answer expected by the accountants and bookkeepers and finance people. Back when they were doing it all on paper, they built certain rules for handling pennies, and you have to do it the same way if you want to be taken seriously in the finance/banking/accounting industries. Back then they couldn’t cut a physical penny in half, so they built rules to be fair about it. Those rules stuck around, are still here today, and are sometimes codified into law[0]

                                                                        As for “meh, it’s a penny”: they generally don’t care much about anything smaller than a penny, but they absolutely care about pennies. I regularly see million-dollar transactions held up from posting because the balancing was off by one penny. They then spend the time it takes to track down the penny difference and fix it.

                                                                        0: PDF paper about euro rounding

                                                                        1. 1

                                                                          How do you store 1/3 with full precision?

                                                                          1. 1

                                                                            Not with a decimal type either :)

                                                                            1. 1

                                                                              Sorry, I misread your post.

                                                                      1. 9

                                                                        This is nice. Though, you could do some of the same things with XPath. For example, the first two examples could be done using xmllint:

                                                                        $ curl -s | xmllint --html --xpath "//*[@id='get-help']" -
                                                                        $ curl -s | xmllint --html --xpath "//@href" -

                                                                        Unfortunately xmllint doesn’t support html5, and complains about the <header> and <main> tags in the above example.
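                                                                        If html5 tags are the sticking point, here’s a rough stdlib-only stand-in for the //@href query (hypothetical snippet; Python’s html.parser doesn’t validate, so it tolerates <header> and <main> just fine):

```python
# Collect every href attribute from an HTML document using only the
# Python standard library's tolerant, non-validating parser.
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        for name, value in attrs:
            if name == "href" and value is not None:
                self.hrefs.append(value)

p = HrefCollector()
p.feed('<main><a href="/about">About</a><a href="/help">Help</a></main>')
print(p.hrefs)  # ['/about', '/help']
```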

                                                                        1. 5

                                                                          XPath is really powerful, but it’s also really hard to grok (because it was designed for a much more powerful use case). I wonder if something that used the jQuery selector format wouldn’t be more appreciated :)

                                                                            1. 2

                                                                              Hxselect is super handy

                                                                              1. 2

                                                                                This tool is great, and already in Debian. Interesting that you need to give it XML syntax, but it comes with tools hxclean and hxnormalize -x that solve that for you.

                                                                                1. 1

                                                                                  Wow! Fantastic, thanks!

                                                                            1. 9

                                                                              Added new riscv64 platform for 64-bit RISC-V systems.

                                                                              Awesome stuff. Just got my hands on a HiFive Unmatched and I’m very excited to tinker with this.

                                                                              1. 1

                                                                                that board looks cool! Just out of curiosity, did you have a project in mind you are working on with it?

                                                                                1. 4

                                                                                  To be honest, I’m not entirely sure yet. There are a number of open source projects trying to target this board (Haiku and OpenBSD, among others), and I’d love to contribute somehow. If not there, I’d like to maybe try writing my own (crappy) OS for fun, since the components on the board are open and fairly well documented. If you have any fun ideas, let me know!

                                                                              1. 1

                                                                                This is a nonsensical argument

                                                                                1. 1


                                                                                  1. 1

                                                                                    Well, Gödel’s Theorem has basically nothing to say whatsoever about the difference between brains and computers. It’s just a statement about formal systems. Neither brains nor computers nor general mechanical objects are formal systems.

                                                                                    1. 1

                                                                                      But computational systems are formal systems. The essay is making the case against Mechanism: that brains are purely computational systems. I think the author’s argument is weird, but sound; a formal system should have certain properties and limitations, and the brain seems to neither have those properties nor be bounded by those limitations. In particular, we are conscious of inconsistencies.

                                                                                      I think it’s a strange essay to read because most of us think that Mechanism isn’t true. I also think that trying to disprove Mechanism via Gödel is unusual. But, I think it’s clever, and I think the essay deserved a better rebuttal than “this is a nonsensical argument”.

                                                                                      1. 1

                                                                                        Mechanism isn’t true

                                                                                        I think it’s most likely that mechanism is true. I have some formal training as a neuroscientist (PhD), and I think most neuroscientists think mechanism is true.