Threads for ianthehenry

  1. 3

    Neat! The benchmark that the homepage highlights is a bit of a silly workload, but it’s cool to see how lightweight fibers are. My go-to scripting language is Janet, and Janet fibers are a little heavy. I was curious how much heavier:

    $ hyperfine --warmup 10 'zig-out/cyber/cyber test/bench/fiber/fiber.cy' 'janet fiber.janet' 'lua test/bench/fiber/fiber.lua' 'luajit test/bench/fiber/fiber.lua'
    Benchmark 1: zig-out/cyber/cyber test/bench/fiber/fiber.cy
      Time (mean ± σ):      31.2 ms ±   2.3 ms    [User: 19.3 ms, System: 11.5 ms]
      Range (min … max):    27.9 ms …  44.0 ms    93 runs
    
    Benchmark 2: janet fiber.janet
      Time (mean ± σ):     195.9 ms ±  12.1 ms    [User: 157.8 ms, System: 36.3 ms]
      Range (min … max):   184.2 ms … 219.8 ms    14 runs
    
    Benchmark 3: lua test/bench/fiber/fiber.lua
      Time (mean ± σ):     249.9 ms ±  16.2 ms    [User: 185.4 ms, System: 59.2 ms]
      Range (min … max):   230.2 ms … 277.5 ms    12 runs
    
    Benchmark 4: luajit test/bench/fiber/fiber.lua
      Time (mean ± σ):      86.4 ms ±   6.1 ms    [User: 58.2 ms, System: 26.6 ms]
      Range (min … max):    78.6 ms … 105.1 ms    34 runs
    
    Summary
      'zig-out/cyber/cyber test/bench/fiber/fiber.cy' ran
        2.77 ± 0.28 times faster than 'luajit test/bench/fiber/fiber.lua'
        6.28 ± 0.61 times faster than 'janet fiber.janet'
        8.01 ± 0.79 times faster than 'lua test/bench/fiber/fiber.lua'
    

    That’s running this very direct translation of the benchmark into Janet:

    (defn main [&]
      (var count 0)
    
      (defn inc []
        (+= count 1)
        (yield)
        (+= count 1))
    
      (def fibers @[])
      (for _ 0 100000
        (def f (fiber/new inc))
        (resume f)
        (array/push fibers f))
    
      (each f fibers
        (resume f))
    
      (print count))
    

    I used to write a lot of code in a language with ARC and I think it’s a pretty nice memory model, but I admit I leaned heavily on static analysis to help me with retain cycles. It sounds like Cyber’s memory model is… ARC + a GC?

    By default, references that outlive the first release op are tracked by the VM. The VM then checks for abandoned reference cycles automatically and frees them. The check can also be explicitly triggered in the user’s script. For embedders, the automatic check can be turned off and triggered manually by the VM host.

    To my amateur ear this sounds like it’s describing a generational garbage collector? I don’t think I understand this. I guess one difference is that destruction still happens eagerly in the case that a value is explicitly released, instead of waiting for the next GC cycle? But that seems like a small difference. Would like to hear more about this.
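
    For comparison, CPython happens to combine the same two mechanisms (reference counting plus a separate cycle detector), so the difference between eager release and a deferred cycle check can be observed directly. This is only an analogy using Python's `gc` and `weakref` modules, not a description of how Cyber implements it; the `Node` class is made up for illustration:

```python
import gc
import weakref


class Node:
    """Hypothetical object used only to observe when memory is reclaimed."""


gc.disable()  # turn off the automatic cycle check so we control when it runs

# Case 1: no cycle. Dropping the last reference frees the object immediately.
plain = Node()
plain_ref = weakref.ref(plain)
del plain
freed_eagerly = plain_ref() is None

# Case 2: a self-referencing cycle. Reference counting alone never frees it.
cyclic = Node()
cyclic.me = cyclic
cyclic_ref = weakref.ref(cyclic)
del cyclic
alive_before_check = cyclic_ref() is not None

gc.collect()  # explicitly trigger the cycle check
freed_after_check = cyclic_ref() is None

gc.enable()
print(freed_eagerly, alive_before_check, freed_after_check)
```

    On CPython all three flags come out true: the acyclic object dies at the `del`, while the cycle survives until the check runs.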

    Very interested to hear more about the gas mileage thing too; that’s the first I’ve heard of this as a first-class concept in a language (prior art I can look at?). My only real experience with embedded languages is Janet, and in order to interrupt the VM you have to spawn a separate OS thread in order to pre-empt it, which is very annoying normally and just impossible (as far as I can figure out) if you’re running it in WebAssembly.

    1. 3

      Very interested to hear more about the gas mileage thing too; that’s the first I’ve heard of this as a first-class concept in a language (prior art I can look at?). My only real experience with embedded languages is Janet, and … to interrupt the VM … is very annoying normally and just impossible (as far as I can figure out) if you’re running it in WebAssembly.

      I haven’t looked at Cyber or Janet, and I’m not very familiar with the WebAssembly landscape, but I recall at least one Wasm implementation, Wasmtime, having a concept built in of “fuel” that is consumed by executing Wasm operations, with execution being interrupted if it consumes too much fuel.

      1. 2

        I haven’t implemented the GC aspect of it yet. It’s different in the sense that it won’t run on a separate thread. But it will have to build a graph and detect cycles. When does this happen? I think it would perform this action once in a while after a release op if it detects that it needs more memory. How frequently this happens can also be manually controlled by the user script or runtime. Also, providing weak refs should help lessen the load if you know that there will be a cycle somewhere.

        As for the mileage check, I think it will just be interrupt ops placed in function calls and at the beginning of loops. You specify a threshold. It wouldn’t be counting the number of instructions, just the hops from one interrupt instruction to the next.

        1. 2

          Each “interrupt” operation should be able to know at compile time how many instructions will occur before the next one, more or less. I guess that comes down to knowing the size of the basic block you’re in. When you enter a basic block you know that BB is X instructions long, and by definition contains no jumps until the end. So the start of every basic block just checks “am I out of fuel? If no, subtract X from the fuel consumed”. I thiiiiink that should let you account for fuel both pretty efficiently and pretty accurately.
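
          A minimal sketch of that per-block accounting, written in Python purely for illustration. The `BasicBlock`, `execute`, and `OutOfFuel` names are invented here, and each block's body is modeled as a plain function that returns the label of the next block:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


class OutOfFuel(Exception):
    """Raised when the fuel budget cannot cover the next basic block."""


@dataclass
class BasicBlock:
    cost: int  # number of instructions in the block, known at compile time
    run: Callable[[dict], Optional[str]]  # executes the block, returns next label


def execute(blocks: Dict[str, BasicBlock], entry: str, fuel: int) -> int:
    """Run blocks starting at `entry`; return the fuel left over."""
    state: dict = {"i": 0}
    label: Optional[str] = entry
    while label is not None:
        block = blocks[label]
        if fuel < block.cost:  # one check per block, not per instruction
            raise OutOfFuel(label)
        fuel -= block.cost
        label = block.run(state)
    return fuel


def loop_body(state: dict) -> Optional[str]:
    # A three-iteration loop: jump back to "loop" until the counter hits 3.
    state["i"] += 1
    return "loop" if state["i"] < 3 else None


program = {"loop": BasicBlock(cost=4, run=loop_body)}
remaining = execute(program, "loop", fuel=20)
```

          Three passes through a 4-instruction block consume 12 units of fuel, so `remaining` is 8; with a budget of 10 the third pass would raise `OutOfFuel` instead.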

          1. 1

            I think you might be right. To handle branching, interrupts could be placed at jump instructions and those would contain info about how many instructions until the next jump or branching instruction.

      1. 3

        Lots of annoying non-writing things that I need to do to publish my book about Janet. Switching from Hugo to a custom build system, improving the syntax highlighting, autogenerating the table of contents instead of manually keeping it in sync… dumb things like that I’ve been putting off for a while.

        1. 1

          What is Janet? As a technical book (wannabe?) author myself I am curious as to how other people do their thing!

          1. 3

            Indeed, the programming language. It’s such a niche language that there are very few syntax highlighters for it, so I had to write one – or, really, extend one that I had written a while ago to actually cover the whole language. The book features a lot of repl sessions as well, so I did some stuff to highlight those correctly… obviously none of that was necessary, but I think most programmers are used to looking at color, so it’ll make it easier to read.

            When I started writing this book I just opened a text editor and started going, and used Hugo to generate the website because I had used it before. But Hugo is incredibly inflexible, so I knew I’d have to replace it at some point, and I managed to put that off until the book was ~80% written. So now I’m using redo to build it directly with a hacked together mdast parser – JavaScript is not my first choice, but I wrote the syntax highlighter as a CodeMirror extension (because the book includes a repl) so using JS lets me re-use the same parser to highlight the code snippets in the book.

            Anyway, all that work is behind me now, so I can get back to actually finishing the book!

            1. 1

              Nice. From one (putative) author to another: good luck! I do think the key is keeping on with the grind.

            2. 2

              Not OP, but i’d guess https://janet-lang.org/

          1. 5

            Amazing DIY effort!

            Check out keymouse (dot com) for the original split with a trackball. It used to be wireless (that’s the one I run), but they gave up with that and went back to wired.

            1. 1

              Thanks! Ha, keymouse seems… interesting. I’m not sure I would actually use two trackballs, heh. I’d probably get used to only using one with my right hand.

              1. 2

                I would love two trackballs; one mapped to scroll and the other to mouse movements.

                1. 1

                  took more than a fair bit to get used to, but an eye tracker combined with a 3d mouse - scale it down and combine in a way like this (https://hackaday.com/2017/07/27/unholy-mashup-of-spacemouse-and-sculpt-keyboard-is-rather-well-done/ ) and it might be something. Twisting left/right is a nice scroll up/down, lifting/pushing as zoom in/out and the tilting to pan. Eye tracker gets you the coarse initial warp-to point while the 3d mouse adds the missing precision.

                  1. 1

                    I have scroll horiz and scroll vert on two different layers, so one thumb press (not even sure which one now since it’s all muscle memory now) and I’m scrolling away.

                    Are you saying that you actually use an eye tracker for this? And it works well? Holy cow that is freaking amazing!

                2. 2

                  I have both the original Alpha model and the Track. Both are good, but I far prefer the track. (I have had serious RSI issues over the years.)

                  I do mostly use the track ball in the right hand, but I do a bit of both. It’s pretty neat … worth trying if you can get your hands on one. And Heber (the founder) is a tech guy who has a passion for it … definitely not a get-rich-quick scheme 🤣

                  1. 1

                    How do you like the thumb clusters? With the distance between the thumb keys and the trackball, it seems like you have to stretch the Thumb quite a bit to move to one or the other?

                    (I thought I had pretty much seen it all after following /r/ergomechboards for a while, but somehow hadn’t seen keymouse yet…)

                    1. 2

                      I have the original and not the current. Thumb clusters are good. Personally, I’d drop the furthest thumb reach (the down and away button) and add two proper keys at the top of the existing cluster. Also, I don’t use the outside pinky column at all (I have a completely custom layout). Any extra reach is really hard on RSI, so I do 99% on 3 rows (I rarely use the numbers row) plus the thumb keys, with only one-key horizontal stretch on either the forefinger (easy) or pinky (less easy). I generally write 50-100kloc per year, plus lots of non-code stuff. No RSI in years now.

                      1. 2

                        Great to hear! I switched to a Kinesis Advantage a couple of months ago and my wrist pains have disappeared. But I am still very much interested in designs that push the state of the art forward.

                        1. 3

                          Kinesis are great keyboards. I have 2 of the Advantage Pros (with foot pedals) 🤣 and that’s all I used for many years.

                          1. 3

                            What do you use the foot pedals for?

                            1. 2

                              It’s been a while since I’ve used them, but the foot pedals are mainly for modifier keys or changing layers (e.g. accessing Kinesis macros). The Kinesis models that I have are a bit old now, so they don’t have the amazing level of programmability that you’d expect today from new keyboards with built in ARM chips or whatever. But I did use them to write a few software products (i.e. I personally typed in many hundreds of thousands of lines of code) without inflaming my horrible RSI, so I have a great deal of love and appreciation for Kinesis 😊 … so if you’re in doubt, always give Kinesis the benefit of the doubt by default.

                              On my Keymouse setup, I’ve added a dedicated layer for each hand to put all of the modifiers (ctrl, shift, alt, cmd) on home row. So one layer turns the left hand into a dedicated modifier set (and leaves the right hand unchanged), and another layer turns the right hand into a dedicated modifier set (and leaves the left hand unchanged). Then I have a layer for num pad (left hand is all modifiers, right hand is num pad), and a layer for function keys (left hand is all modifiers, right hand is all function keys). Here’s my layouts as of a year ago: https://1drv.ms/w/s!Al7tOqyQS2IveWlYnwO2D9msNHE

                              (Edit: I should explain a bit about the layers. I often have to type crazy combos like shift-cmd-8 or alt-command-f7 or whatever. This is the IDE keystroke hell that programmers have to deal with sometimes to avoid the mouse, e.g. in the amazing IntelliJ IDEA debugger.)

                              Coincidentally, this week I had hand surgery (unrelated to RSI) and I now have a literal club hand wrapped with an inch of protective stuff with a few fingers semi-sticking out, so for the first time in years, I’m using a normal keyboard 🤣

              1. 6

                So, ultimately, expect is equivalent to assert (String.equal stdout_string expected_string)? It’s very surprising to hear this kind of idea coming from Ocaml people - I’d have expected them to prefer keeping the safety of their strong typing rather than go with a stringly-typed testing framework. I am not certain this is a good idea even if you have strong confidence in your pretty printers, I can imagine it resulting in weird code contortions when attempting to test something hard to print/that doesn’t have a readily-available printer.

                1. 3

                  It’s possible to get a false negative if you don’t design your pretty-printers well. For example, if you print a string directly, without quotes, then you might not notice when it’s missing/empty. But in practice, 1) debug representations are often hardened against that anyways, because precision here is also useful for debugging, and 2) overall, I think the productivity gains and low friction of snapshot testing outweigh the risks.
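
                  A small illustration of that false negative, sketched in Python rather than OCaml. The `show_plain` and `show_debug` helpers are hypothetical stand-ins for a pretty-printer; the point is only that unquoted output can make two different values print identically:

```python
def show_plain(items: list) -> str:
    # Strings are printed directly, without quotes.
    return ", ".join(items)


def show_debug(items: list) -> str:
    # repr() quotes each string, so boundaries and empty values stay visible.
    return ", ".join(repr(item) for item in items)


one_item = ["a, b"]
two_items = ["a", "b"]

# With the plain printer, a snapshot test cannot tell these two lists apart.
plain_collides = show_plain(one_item) == show_plain(two_items)
debug_collides = show_debug(one_item) == show_debug(two_items)
print(plain_collides, debug_collides)  # True False
```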

                  Jest, I believe, will actually embed the data structure into the test when possible, rather than just a string representation. Besides type-safety, the practical question is whether you have a good diff tool for structured objects. If you don’t, then diffing against the string representation will produce better error messages when fixing a broken test. Pytest, for example, will let you assert equality of various objects, but it’ll show you the string diff in addition to a structured message (like “item 1 in list did not match”) because it’s often higher-quality.

                  1. 3

                    Well you need to know the type of something in order to print it, so the type system is still working for you in test code. But I think you’re right that this doesn’t make sense without a reasonable debug representation for your types.

                    I can imagine it resulting in weird code contortions when attempting to test something hard to print/that doesn’t have a readily-available printer.

                    But since this isn’t the only way to write tests, you don’t have to contort anything if this isn’t a good fit for the particular thing you’re testing. From the article:

                    Classical assertion-style unit tests still have their place—just a much smaller one.

                    In practice though I’d usually define a new type that contains all of the relevant data that I want to assert on and then print that out to keep the ergonomic benefits of writing expect tests. (If an assertion is anything more complicated than an equality check, a property test is often a better fit anyway.)

                    I feel like the Hardcaml example is a pretty compelling argument for spending the time to write a decent pretty-printer, though – imagine what that test would look like with assertions!

                  1. 2

                    A previous post on this blog had a few more examples of expect tests in practice: https://blog.janestreet.com/computations-that-differentiate-debug-and-document-themselves/

                    1. 2

                      Continuing to work on my book about the Janet programming language. Last weekend I added a repl to the (website version of the) book, which I think is pretty neat, though I still need to implement history for it.

                      The next chapter is about embedding Janet in larger programs, so I need to come up with an interesting microproject to use as an example for that… I’m thinking maybe an art generator? Turtles? Spirograph machine? Something like that?

                      1. 13

                        This is still frighteningly common. Untyped (or at best integer) duration parameters, writing “kilobytes” when we mean “kibibytes”, datetimes without a time zone or TZDB version number.

                        1. 9

                          Which codebases actually use different types for pennies and dollars, or minutes and seconds? I’ve never seen it in real life.

                          I’d guess that approximately 100% of codebases we use from day to day (including say lobste.rs) don’t have such a distinction in the type system. They use integers or floats. I’d be interested to learn otherwise.

                          While it’s admirable that the author dove into some code she didn’t “own” or understand, I’d also say that the code lacked tests. Whenever I fix a bug, the first thing I do is write a test that reproduces it and fails. Then I verify that the fix makes it pass. This seems obvious, but I still see a lot of people “reading and hoping” rather than testing.

                          1. 12

                            I don’t think that creating distinct types for “dollars” and “pennies” is a very good idea, although I realize this is the fix that the article proposes.

                            The codebases I have worked on that handle money use an opaque type for “quantity of money,” with constructors for creating these quantities out of pennies or out of dollars or whatever. Same with time spans – that way you never write a function that says “I need this argument in milliseconds,” you write a function that says “I need a time span,” and the caller can decide how it wants to give you one. The type represents the logical thing you’re measuring, not the specific unit that you measure it with, and all unit conversions go through the opaque type interface.

                            Of course it’s still possible to make mistakes, and write code like time_span_of_ms(time_span_to_minutes(span)), but I claim without evidence that that sort of code would stick out like a sore thumb. Generally the only time you leave the safe confines of the opaque type is on the edges of your program, when you’re interacting with the outside world or a third-party API, and that code is going to receive a lot of scrutiny and test attention.
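
                            A rough sketch of what such an opaque type might look like, in Python for brevity. The `TimeSpan` class, its constructor names, and the choice to store milliseconds internally are all assumptions made for this example, not taken from any particular codebase:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeSpan:
    """A length of time; callers never see which unit is stored."""

    _ms: int  # private representation; an arbitrary choice for this sketch

    @classmethod
    def of_ms(cls, ms: int) -> "TimeSpan":
        return cls(ms)

    @classmethod
    def of_minutes(cls, minutes: int) -> "TimeSpan":
        return cls(minutes * 60_000)

    def to_ms(self) -> int:
        return self._ms

    def __add__(self, other: "TimeSpan") -> "TimeSpan":
        if not isinstance(other, TimeSpan):
            return NotImplemented
        return TimeSpan(self._ms + other._ms)


def retry_after(delay: TimeSpan) -> str:
    # Asks for "a time span", not "an argument in milliseconds".
    return f"retrying in {delay.to_ms()} ms"


message = retry_after(TimeSpan.of_minutes(2) + TimeSpan.of_ms(500))
```

                            The unit only appears at the edges (`of_minutes`, `to_ms`), which is where a mismatched pair of conversions would be easy to spot.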

                            1. 3

                              OK having a separate type for all money sounds a lot more reasonable. So you’re using operator overloading in those codebases?

                              But yeah that is not what the original article suggests.

                              1. 6

                                Yeah, as one example adding two quantities of money uses an overloaded +. But it’s not like a transparent “numeric” interface thing – you can’t multiply two quantities of money, for example, or divide by a quantity of money. (You can divide a quantity of money by a quantity and arrive at a price, but those are separate types because you really don’t want to be able to mix them up, and this operation is not spelled /.)
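
                                Roughly, and only as a Python sketch with invented names (`Money`, `Price`, `of_pennies`, `price_per_unit`), the restricted interface could look like this. Exact fractions stand in for whatever fixed-precision representation a real codebase would use:

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Price:
    """Money per unit; deliberately a different type from Money."""

    _dollars_per_unit: Fraction


@dataclass(frozen=True)
class Money:
    _dollars: Fraction  # hidden representation, an assumption of this sketch

    @classmethod
    def of_pennies(cls, pennies: int) -> "Money":
        return cls(Fraction(pennies, 100))

    def to_pennies(self) -> Fraction:
        return self._dollars * 100

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self._dollars + other._dollars)

    # No __mul__ or __truediv__: Money * Money raises TypeError.

    def price_per_unit(self, quantity: int) -> Price:
        # A named method rather than `/`, so the intent is explicit.
        return Price(self._dollars / quantity)


total = Money.of_pennies(250) + Money.of_pennies(750)
unit_price = total.price_per_unit(4)
```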

                                1. 1

                                  Custom infix operators plus an abstract type allow you to do that easily, e.g., make a *$ operator for multiplying an opaque type for money by an integer. E.g., in OCaml:

                                  module type MONEY = sig
                                    type t
                                    val ( +$ ) : t -> t -> t
                                    val ( *$ ) : t -> int -> t
                                    val from_int : int -> t
                                    val to_int : t -> int
                                  end
                                  
                                  module Money : MONEY = struct
                                    type t = int
                                    let ( +$ ) l r = l + r
                                    let ( *$ ) amount multiplier = amount * multiplier
                                    let from_int amount = amount  
                                    let to_int amount = amount
                                  end
                                  

                                  Since OCaml has local module opening (let open Money in (Money.from_int 10) *$ 2), one could as well reuse *, but a distinct *$ name makes it safe to open the module in a wider context.

                                  1. 1

                                    Calling the constructor from_int just moves the problem up the stack.

                                    1. 3

                                      1. Conversion of raw inputs to money is doomed to have to be done somewhere, so I’m not sure what your point is.
                                      2. Any constructor is better than using raw ints to mean money because it’s easy to find where that conversion happens.
                                      3. I’m talking about an interface to the money type that would allow a convenient but safe way to calculate prices, which is a distinct problem.

                                      1. 2

                                        Yeah, you can implement custom operators to do whatever you want. I claim that that is just not an operation that I want to do very often, so I don’t want it to be very easy to do. It’s like a code smell – did you really mean to multiply dollars? You probably meant to multiply by a price – a distinct type – to arrive at a “quantity of dollars.” There might be cases where you actually want to scale an amount of money (duplicating a transaction multiple times…?) but I would rather have to reach for a named function that makes it very clear why I’m doing that.

                                        Regarding to_int/of_int, to my original point I would not write those functions in my code. I don’t know what they do; they seem scary. I would write to_pennies and of_dollars and similar functions, that act over some fixed-precision type (i.e. not integers, since you often want to represent sub-penny values). Internally, sure, you probably represent Money.t as some integral quantity, but I don’t want my application to be able to observe that implementation.

                                        1. 1

                                          Why would you have different types for prices and quantities of money? How would they differ?

                                          1. 4

                                            Helps you catch mistakes. Physically, they’re the same, but the more explicit you are in the type system the less likely it is that you’ll accidentally forget to multiply out a price. Mistakes can be very expensive when you’re working with money!

                                            1. 3

                                              Haven’t thought about this before, but I’d draw an analogy to distance. A price is like a rate while a quantity of money is like a distance. You want separate units for $ and $/ea for the same reason you want separate units for km and km/hr.

                                        2. 1

                                          Sorry, when I say “calling _ _”, I mean “naming _ _” or “referring to _ as _”. I just mean it could probably have a name that will reduce mistakes in construction, like from_pennies or from_smallest_subunits or something.

                                          You’re right that regardless of the name, having a single source of truth constructor is still better. And yes, it wasn’t entirely relevant to the issue of calculating prices, but it’s relevant to the broader thread topic of practices that will help avoid mistakes.

                                2. 2

                                  A nice thing about Go is that pretty much any function that deals with durations takes a time.Duration instance. Poor JavaScript is stuck with window.setTimeout that uses milliseconds. Python actually has a timedelta class, but in my experience very few functions actually accept one, which sucks.

                                  1. 1

                                    The problem is that Durations are limited to 200 years and Duration*Duration makes no sense but is required.

                                    1. 1

                                      Have you run into a use for Durations longer than 200 years? My main experience is that if you subtract a current Time from a zero Time or vice versa, you get back a saturated Duration (which led to me adding Duration.Abs() after I hit a bug with naively trying to convert the minimum duration to a positive number), but I don’t have any use for durations longer than the few years in which my databases have been running.

                                      1. 2

                                        Sure. How old is the Magna Carta? When will the US Sextacentennial be? (In fairness, it’s 290ish years, not 200)
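
                                        The 290ish figure follows from time.Duration being a signed 64-bit count of nanoseconds. A quick check of the arithmetic (in Python, using a 365.25-day year as an approximation):

```python
NANOSECONDS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, an approximation

max_duration_ns = 2**63 - 1  # largest signed 64-bit integer
max_years = max_duration_ns / NANOSECONDS_PER_SECOND / SECONDS_PER_YEAR
print(round(max_years, 1))  # a little over 292
```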

                                        I’m more bothered by division and multiplication.

                                        1. 3

                                          You can’t represent the signing of the Magna Carta in time.Time because it doesn’t know about https://en.m.wikipedia.org/wiki/Old_Style_and_New_Style_dates 🙂

                                3. 6

                                  There are duration data types in most mainstream OO languages, like Java and Python. Alternatively, there’s Hungarian notation, as in duration_millis. For money there’s, for example, Currency in Java or, again, Hungarian notation if that’s not available. Their absence is due to lack of foresight, laziness, or overzealous variable shortening.

                                  To be clear, you don’t need different types for different multiples of the same unit; having a Pennies class and a Dollars class would just be wasteful.

                                  1. 2

                                    I think what you’re suggesting is reasonable, but the original article wrote something different:

                                    If that function had demanded a type called “dollars” and the caller had another one called “pennies”, it simply would not have passed the type checker/compiler.

                                    Agree about instant vs. duration being different types.

                                  2. 5

                                    I used to work at a company that worked with cryptocurrency, and that had code that assumed that numeric types represented amounts denominated in whatever cryptocurrency asset was specified in another field. Often these numeric types were Python floats, which was itself a problem - currency values should never be represented as a floating-point number!

                                    At one point, I needed to write some code that worked more closely with the Ethereum ecosystem, where it’s common to represent amounts as integer numbers of wei. A wei is 1.0*10^-18 of an ETH, so 1) reasonable values of currency would not fit into a float without risking losing precision, and 2) it would be disastrous if we ever used a numeric value representing wei in a context where ether was expected. I was paranoid that I’d screw this up somehow and tried to fix the representation of currency amounts everywhere in the codebase to minimize the chance of misusing a numeric value (and while I was at it, got rid of every place we represented a currency amount with a float since that was never correct to begin with). Anyway, another reason why programmers should use programming languages that allow them to easily create new types that will allow the compiler to protect them from themselves.
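
                                    To make the precision point concrete, here is a short Python check. It assumes nothing beyond the standard library and shows that a single wei is lost as soon as a 1 ETH balance passes through a float:

```python
from decimal import Decimal

WEI_PER_ETH = 10**18

one_eth_plus_one_wei = WEI_PER_ETH + 1

# A 64-bit float has a 53-bit significand, so it cannot hold this exactly.
as_float = float(one_eth_plus_one_wei)
lost_a_wei = int(as_float) != one_eth_plus_one_wei

# Integers (or Decimal) keep every wei.
as_eth = Decimal(one_eth_plus_one_wei) / Decimal(WEI_PER_ETH)
print(lost_a_wei, as_eth)  # True 1.000000000000000001
```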

                                    1. 4

                                      I think you might be misunderstanding the idea. The idea isn’t that pennies and dollars are different types, but that you have a Money class and then you create your credit like ad_credit = Money(250, 'USD'). You have to feed the values into constructors with clear intentions and then can use those objects to shift stuff around.

                                      It’s a similar problem to untrusted input. You do only have “raw” numbers or bytes at the start and end of your codebase, so the instant you receive untrusted input you sanitize/validate/whatever, and end up with useful business logic in 95% of your code. End result is you don’t have to be “careful” all over.

                                      Another recent example is that Rust will accept a Duration, so you do things like sleep(Duration::from_millis(500)).

                                      I think none of this is extremely modern or niche though. Simple stuff like timestamp manipulation will often be done by passing this stuff into a date library first

                                      1. 2

                                        The idea isn’t that pennies and dollars are different types

                                        That’s what is suggested in the article:

                                        If that function had demanded a type called “dollars” and the caller had another one called “pennies”, it simply would not have passed the type checker/compiler.

                                        What I’m saying is that this solution isn’t common. I’d also doubt that it’s a good solution. Some things require testing and types are limited.

                                        Instants and durations should indeed be different types; that’s a different argument.

                                      2. 2

                                        Which codebases actually use different types for pennies and dollars, or minutes and seconds? I’ve never seen it in real life.

                                        Most of the codebases I see at $WORK do this, but that’s probably because they’re written in Ada, which makes defining new numerical types easy and whose users have a culture of defining new numerical types for the different things they represent.

                                        I’ve also seen it in one of our internal Ocaml codebases, where we have different string types to represent absolute paths, relative paths and so on.

                                        1. 1

                                          a reasonable middle ground that I have seen real life code use is to forbid positional arguments, and have your keyword argument named something like duration_ms so it’s clear what is expected.
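
                                          In Python, for instance, that convention can be enforced with keyword-only parameters. The `wait` function below is a made-up example:

```python
def wait(*, duration_ms: int) -> str:
    # The bare `*` makes every following parameter keyword-only.
    return f"waiting {duration_ms} ms"


result = wait(duration_ms=500)

try:
    wait(500)  # positional call is rejected with a TypeError
    positional_allowed = True
except TypeError:
    positional_allowed = False
```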

                                      1. 2

                                        I’m writing a (free, online) book about the Janet programming language. It’s a delightful small language that I’ve been having a lot of fun with, and I think it’s a shame that more people haven’t heard of it.

                                        I just finished the chapter on PEGs, and I’m about to start on Concurrency and Coroutines. So still chipping away, but hopefully I’ll be done with it some time in January. But we’ll see. Writing a book takes a long time!

                                        I still need to come up with a name for the book. Suggestions welcome!

                                        1. 2

                                          Potential book title from the Rocky Horror Picture Show: Damn it, Janet, I love you

                                        1. 2

                                          Hosting a big Friendsgiving potluck! So lots of cleaning and prepping that will leave no time for code.

                                          1. 8

                                            Started a new job today! So I’ll be adjusting to the schedule change (working New York-ish hours from California – for once in my life, the end of daylight time worked in my favor) and hopefully remembering how to write OCaml.

                                            I’m probably not going to get anything done outside of work, but recently I’ve had an urge to write an introductory book about the Janet programming language. It’s a great little scripting language that I’ve used to make command-line tools, browser apps, and desktop games. So I might putter around with that for a while before I decide that it’s too much work.

                                            1. 1

                                              Working on a street starting with J?

                                              1. 2

                                                Hey now, there are dozens of shops using OCaml in the industry!

                                                but umm yeah in this particular case yeah

                                                1. 1

                                                  I’ve now found that Facebook and Dropbox and Bloomberg use it. None of them have a primary address starting with the letter J.

                                                  1. 3

                                                    Jropbox

                                              2. 1

                                                Nice. I haven’t stumbled upon the Janet programming language. Share your book!

                                              1. 4

                                                When I was a kid I used to play this DOS game called Sopwith. You were a little CGA plane; you flew around and dropped bombs and sometimes you fought another plane.

                                                Last weekend I played a game called Luftrausers, which is sort of a bullet-hell score attack thing, but it’s also a 2D side-scrolling plane game. It was fun for a little while, but I couldn’t help but think of Sopwith, and I missed the satisfaction of dropping a well-aimed bomb. (In Luftrausers you can only shoot directly forward, despite the fact that you’re fighting targets that are eminently bombable.)

                                                It brought a sort of nostalgia trip and I longed for a game that was sort of a hybrid of the two. I want to play a “modern Sopwith,” whatever that’s supposed to mean. Does that exist? I started making one and will probably continue working on it this weekend unless someone points me at something that will scratch this itch.

                                                It was fun to play with the “physics” of it. The plane generates lift based on its velocity, and the faster you’re going, the less sharply you can pitch. By subtly tweaking thrust, drag, lift, and pitch rate, you wind up with very different feeling planes. Little demo here (mp4, 18 MB). There’s no game yet, but it’s fun to fly around a little world you made. It’s very peaceful. At the moment.
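
                                                A rough sketch of what a flight model like that could look like. This is not the game’s actual code: every constant and formula below is an assumption made for illustration (lift is linear in speed, and pitch rate falls off as speed rises).

```python
import math
from dataclasses import dataclass

# All constants are made-up tuning values, not the game's real numbers.
THRUST = 40.0
DRAG = 0.5
LIFT = 0.12
GRAVITY = 9.8
MAX_PITCH_RATE = 3.0  # radians per second at a standstill


@dataclass
class Plane:
    x: float = 0.0
    y: float = 0.0
    vx: float = 0.0
    vy: float = 0.0
    pitch: float = 0.0  # radians; 0 is level flight


def pitch_rate(speed: float) -> float:
    """The faster the plane is going, the less sharply it can pitch."""
    return MAX_PITCH_RATE / (1.0 + speed / 20.0)


def lift(speed: float) -> float:
    """Upward acceleration grows with speed (linear here for simplicity)."""
    return LIFT * speed


def step(plane: Plane, dt: float, pitch_input: float, throttle: float) -> None:
    """Advance the plane by dt seconds; pitch_input is in [-1, 1]."""
    speed = math.hypot(plane.vx, plane.vy)
    plane.pitch += pitch_input * pitch_rate(speed) * dt
    # Thrust acts along the nose, drag opposes velocity, lift opposes gravity.
    ax = throttle * THRUST * math.cos(plane.pitch) - DRAG * plane.vx
    ay = (throttle * THRUST * math.sin(plane.pitch)
          - DRAG * plane.vy + lift(speed) - GRAVITY)
    plane.vx += ax * dt
    plane.vy += ay * dt
    plane.x += plane.vx * dt
    plane.y += plane.vy * dt
```

                                                Changing any one of the four constants shifts the balance between them, which is where the different feeling planes come from.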

                                                Also! I optical illusioned myself. I was very surprised to see that the fog appears to be stronger near the base of mountains, and tapers off as the elevation increases. I was very surprised because I didn’t code it that way. And in fact it doesn’t: it’s just an illusion. There’s no variation in mountain color over the Y axis, but I still perceive a variation because the tops have a higher contrast against the sky. Neat, right?

                                                1. 2

                                                  I still think the optical illusion is caused by dithering. Surely the fog disappears if you stop dithering the mountains… right? I’m very unsure. I’m not an optical illusion expert.

                                                  1. 2

                                                    If anything I think the effect is more apparent without dithering: https://ianthehenry.com/drops/undithered.png

                                                    Taking out the sky gradient greatly reduces the effect for the most distant mountain, but I still see it for the other two: https://ianthehenry.com/drops/flat-sky.png

                                                    I think my brain is just seeing what it expects to see.

                                                1. 10

                                                  It was fun to read this. I used to work on an exchange implementation, so it was fun to see an outsider’s take on it – although this is clearly from a crypto exchange standpoint, which I don’t know very much about (traditional exchanges don’t have a concept of “accounts” or “balances” – they just pair orders).

                                                  You definitely want to split your buy and sell orders into separate structures, as well as keeping a separate order book for every symbol (I think in your case that would be “per AssetInfo;” I couldn’t tell if you’re already doing this). That way when you’re matching you just find the appropriate book and then traverse it in price/time priority order. So you can always start at “the front” of the order book and iterate until your order is filled or you run out of resting orders that match the limit price. No need to hop around through all the price levels on the other side. Which brings me to:

                                                  Storing orders in a tree is an extravagant luxury :). One of those fun actual performance vs algorithmic complexity paradoxes – you’d have to have a huge order book before a tree came anywhere near the performance of a flat array. You want to make sure that your order book is packed into memory as densely as possible (at least for a symbol/side pair) so that you’re matching out of cache. This is especially true because “most” orders are purely taking orders that will never sit on the books, so you want to optimize heavily for traversal rather than insertion. (I have no idea if this is true in cryptoland.)
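
                                                  To make the last two paragraphs concrete, here is a minimal sketch: one book per symbol, one flat sorted list per side, and matching that walks from the front. All of the names (Order, Book, Exchange, submit) are hypothetical, prices are integer ticks, and a real engine would pack orders far more densely than a Python list of objects can.

```python
from bisect import insort
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    side: str   # "buy" or "sell"
    price: int  # limit price in integer ticks
    qty: int


class Book:
    """Resting orders for one symbol, one flat list per side."""

    def __init__(self) -> None:
        self.bids = []  # (-price, seq, order): highest price first
        self.asks = []  # (price, seq, order): lowest price first
        self._seq = 0   # arrival counter, gives time priority

    def submit(self, order: Order) -> list:
        """Match against the opposite side; rest whatever is left over."""
        self._seq += 1
        buying = order.side == "buy"
        resting = self.asks if buying else self.bids
        fills = []
        done = 0  # number of fully filled resting orders at the front
        while order.qty > 0 and done < len(resting):
            top = resting[done][2]
            crosses = top.price <= order.price if buying else top.price >= order.price
            if not crosses:
                break
            traded = min(order.qty, top.qty)
            fills.append((top.order_id, top.price, traded))
            order.qty -= traded
            top.qty -= traded
            if top.qty == 0:
                done += 1
        del resting[:done]
        if order.qty > 0:
            key = -order.price if buying else order.price
            insort(self.bids if buying else self.asks, (key, self._seq, order))
        return fills


class Exchange:
    """One Book per symbol."""

    def __init__(self) -> None:
        self.books = {}

    def submit(self, symbol: str, order: Order) -> list:
        return self.books.setdefault(symbol, Book()).submit(order)
```

                                                  Each fill is a (resting order id, price, quantity) tuple. Deleting from the front of a list is linear, so a denser version might keep the best price at the end of the array instead.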

                                                  I realize this is just for fun, so I won’t harp on the use of floating point for money in your sample code :)
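
                                                  For anyone wondering why that matters, a tiny illustration of the usual alternative: keep amounts as integer cents and only format them for display. format_cents is a hypothetical helper and handles non-negative amounts only.

```python
# Binary floats cannot represent most decimal fractions exactly.
total_float = sum([0.10, 0.20])

# Integer cents add up exactly.
total_cents = sum([10, 20])


def format_cents(cents: int) -> str:
    """Render a non-negative amount of cents as dollars.cents."""
    return f"{cents // 100}.{cents % 100:02d}"
```

                                                  Here total_float is not equal to 0.30, while total_cents is exactly 30.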

                                                  There’s something very pleasing about order books and matching algorithms. It’s a neat, pure, well-defined problem, and one where the simplest implementation is, generally, the best implementation.

                                                  Everything else about exchanges, meanwhile…

                                                  1. 4

                                                    One of those fun actual performance vs algorithmic complexity paradoxes…

                                                    I always enjoy those :)

                                                    I think my favorite was a multidimensional search that someone had done with a k-d tree: they insisted that it was the fastest way to do it and had a lovely Big-O argument to prove it. We used a couple of heuristics to bound the search space then just did a brute-force search and came out orders of magnitude faster. Caches change everything!

                                                    1. 3

                                                      The lovely Big-O argument simply forgot one complicated truth - memory access is not a constant time operation!

                                                    2. 2

                                                      Glad you enjoyed the post, thanks for your insight here! Really helpful perspective. 🙏

                                                      although this is clearly from a crypto exchange standpoint

                                                      Guilty as charged, I do come from a crypto background but I also tried to design this to be as flexible as possible across different types of assets.

                                                      You definitely want to split your buy and sell orders into separate structures

                                                      I think this is going to be in the part 2 update! I kind of independently reached this conclusion as well, since splitting them would let me avoid the weird and ugly buy-to-sell-side matching logic in the attemptFill function. Keep an eye out for that, I’ll publish it on here, too.

                                                      as well as keeping a separate order book for every symbol

                                                      Yes, the intended design is to create any arbitrary number of markets under a single exchange, with each market handling one buy and sell side for its asset.

                                                      One of those fun actual performance vs algorithmic complexity paradoxes … You want to make sure that your order book is packed into memory as densely as possible (at least for a symbol/side pair) so that you’re matching out of cache

                                                      This is really interesting. My naive approach assumed that the lookup speed of a tree would be faster than searching via flat array iteration, but I hadn’t considered the memory and implementation details of that. I do plan to run a series of benchmarks on my design once I’ve got it to a good place; maybe part of that will be testing out a different order storage implementation and seeing how it performs with just throwing them into slices.

                                                      I realize this is just for fun, so I won’t harp on the use of floating point for money in your sample code :)

                                                      😅 Yes, integers would certainly be better, I just didn’t want to focus on that as a problem here, I was more interested in the exchange problem itself because, exactly as you pointed out, they’re super interesting pure problems. Great for beating your head against on a long flight. Probably another good item for part 2 of this series.

                                                    1. 2

                                                      Over the weekend I started a documentation page for Bauble, which meant adding a compact display mode, custom resolutions, and lazily loading multiple GL contexts on a single page. So now I need to actually flesh out the docs. Also maybe I’ll finally tackle custom lighting… the last big missing feature.

                                                      1. 2

                                                        Still hacking on Bauble. Bit of a tech debt/refactoring cycle this week, as I want to add new features but I have outgrown the “ehhh I’ll just update the DOM by hand” prototyping phase. So I’m trying out a UI framework, and finally getting to experience firsthand the hell that is the modern JavaScript/TypeScript ecosystem.

                                                        Main goal is to make it possible to render multiple Bauble views on a single page with a reduced UI, so that I can write a help document with interactive examples. I think Bauble is pretty neat now, but I also think there’s no way that anyone but me can use it because there’s no documentation beyond the tutorial. So, next step: document stuff!

                                                        1. 12

                                                          For the last six weeks I’ve been working on https://bauble.studio and it’s basically… done-ish? Almost done? There are still tons of things to add, but animation and instanced repetition were the last “must have” things. So this week I’m going to play with it! And add an “Export to Shadertoy” feature so that people can save and share their creations without my having to write a backend. Although it would be a great excuse to try Gleam…

                                                          1. 3

                                                            Oh, that is delightful. Good job!

                                                          1. 6

                                                            I’ve been working on a playground for making art with lisp and math, and this week I wrote a lisp-to-GLSL compiler so that you can write arbitrary expressions instead of just combining built-in shapes and transformations. This is pretty fun by itself, but it also means that I can implement animation pretty easily now. Which means I have a bit of a UI project ahead of me this weekend: I want to make a decent scrubber, looping controls, play/pause buttons etc so that it’s easy to debug animations.
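
                                                            As a toy illustration of the lisp-to-GLSL idea (not the real compiler, which isn’t shown here), nested tuples can stand in for s-expressions and get turned into a GLSL expression string. compile_expr and the tuple encoding are assumptions made up for this sketch.

```python
INFIX = {"+", "-", "*", "/"}


def compile_expr(expr) -> str:
    """Compile a nested-tuple expression into a GLSL expression string."""
    if isinstance(expr, (int, float)):
        # Spell numbers as floats so GLSL treats 1 as 1.0.
        return repr(float(expr))
    if isinstance(expr, str):
        # A bare string is a variable name such as "t" or "p.y".
        return expr
    op, *args = expr
    compiled = [compile_expr(arg) for arg in args]
    if op in INFIX:
        return "(" + f" {op} ".join(compiled) + ")"
    # Anything else is emitted as a function call, e.g. sin(t).
    return f"{op}({', '.join(compiled)})"


glsl = compile_expr(("+", ("sin", "t"), 1))
```

                                                            With a time variable like t available to expressions, animation mostly falls out of the same mechanism.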

                                                            Also I’ll probably spend a lot of time just applying different functions to distort 3D space and going “ooh” when I land on a cool looking shape.

                                                            1. 5

                                                              I think OCaml’s syntax won’t present much of a challenge for any seasoned programmer, especially for someone with a passing knowledge of a similar language like Haskell.

                                                              I find the hardest part of OCaml is remembering the syntax :). I made https://ocamlsyntax.com because there are certain constructs that just will not stick in my brain and I have to look them up every time I need them (looking at you, polymorphic abstract type variables). And that doesn’t even include any of the object-oriented stuff! That’s a whole other mirror universe of different syntax on top of the rest of the language.

                                                              no syntax for list comprehensions (I often use those in Clojure, Haskell and Erlang, although I’m well aware they are not something essential)

                                                              Definitely a personal style thing – even in Haskell I prefer an explicit do to list comprehensions, which OCaml… sort of has… if you install a syntax extension… okay yeah not great.

                                                              the compiler is super fast, but that was expected given that OCaml is famous for it.

                                                              Really?? That’s fascinating. My only significant experience with OCaml was working on a large-ish codebase, and the unbelievable slowness of even incremental recompilation was one of my biggest gripes with the language. And doing an optimized build, especially post-flambda… is it really famous for being a fast compiler? (Is this a “compared to GHC” situation?)

                                                              I was amused to see that the popular book Real World OCaml directly recommends replacing the standard library with a third-party alternative.

                                                              For context, one of the co-authors of Real World OCaml wrote the third-party alternative that the book advocates. (I mean, he didn’t literally write it all by himself, but you know what I mean.)

                                                              There’s also support to target JavaScript from OCaml sources, which looks interesting, but I haven’t tried it yet.

                                                              For the curious, I think js_of_ocaml is shockingly good. It really does what it says on the tin. The generated output isn’t slim – I don’t know if there’s some kind of tree-shaking thing you can do to improve that – but if you can afford to serve 1 MB .js files, you really can write a webapp in OCaml.

                                                              1. 1

                                                                My only significant experience with OCaml was working on a large-ish codebase, and the unbelievable slowness of even incremental recompilation was one of my biggest gripes with the language.

                                                                With a large codebase surely the build time is mostly influenced by what build tool is used and how the build is done. Was the build optimized properly i.e. were modules being rebuilt that didn’t need to be? And were the correct settings used to optimize for dev workflow for incremental compilation e.g. the -opaque flag?

                                                                1. 2

                                                                  I don’t know! It was a long time ago, and I never tried to dig into the build system to see if I could speed it up. There was a team dedicated to build performance (and tooling), and I trust that they knew what they were doing, but I can’t make any more intelligent claims than that. Incremental recompilation was definitely much faster than not-incremental recompilation, and they built a system to make almost all builds into incremental builds with a shared artifact cache, so I assume they were doing all the right things. But I couldn’t swear to it. We had knobs to compile modules without linking anything, which made incremental rebuilds tolerably fast, but that meant no tests… so yeah.

                                                                  Anyway I am genuinely asking about that point, because I know very little about OCaml outside of my bubble. And because this was the largest codebase I have ever worked on in any compiled language, I have no intuition for what a “fast” or “slow” compiler feels like at that scale.

                                                                  1. 2

                                                                    If it was a few years ago it may have also changed drastically. The compiler has always been very fast, but the build tools need to exploit it (and parallelize and avoid rebuilds where possible). Anecdotally, I worked on a project that counted about a million lines of OCaml. It used to build with omake, which would take about 10 minutes (not including tests). When we ported it to jbuilder/dune, we could build the full toolstack (so the huge project + all its related libraries and extra components + tests components) in less than two minutes, with rebuilds (usually) on the order of seconds.

                                                              1. 4

                                                                I’ve been working on a playground for making 3D art with lisp and math (using “signed distance functions,” if that means anything to you). At the moment it’s mostly a curiosity: you can make cubes and spheres and stuff, apply a very limited set of spatial transformations, assign simple surfaces, and that’s pretty much it.

                                                                But! Last weekend I started working on adding expressions to the tool, so you can say things like “the color of this shape gets darker when the surface normal’s y component is negative” or color = red + perlin-noise * blue or whatever. Which is fun, but the real fun is writing expressions for shapes: “the radius of this sphere is the sine of its y coordinate” – and now you have a wobbly sphere.
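
                                                                The wobbly sphere can be sketched in plain Python, assuming the radius is allowed to be a function of the point. The exact wobble formula here is invented so the radius stays positive; it is not the tool’s DSL.

```python
import math


def sphere(radius):
    """Distance function for a sphere whose radius may depend on the point."""
    def distance(p):
        x, y, z = p
        return math.sqrt(x * x + y * y + z * z) - radius(p)
    return distance


# An ordinary sphere: the radius ignores the point entirely.
plain = sphere(lambda p: 1.0)

# A wobbly sphere: the radius varies with the sine of the y coordinate.
wobbly = sphere(lambda p: 1.0 + 0.25 * math.sin(4.0 * p[1]))
```

                                                                Once the radius varies, the result is no longer an exact distance, only an estimate, which is usually fine for this kind of rendering as long as the wobble is gentle.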

                                                                My thing will never be as flexible/general as writing shaders by hand, but this is going to significantly narrow the gap. And it will mean that I can support animations! Which I’m very excited about. So my goal this week is to wrap up the expression rewrite, update the “tutorial,” and then probably just play with procedural animation for a while. Teach my yorp how to walk. Maybe record a video of myself making something?

                                                                1. 4

                                                                  High-level pitch: programmatic 3D modeling software in Lisp.

                                                                  A few weeks ago I wrote a little proof-of-concept toy for composing signed distance functions in a language called Janet, compiling them to OpenGL shader language, and then raytracing them in a WebGL canvas. I’ve continued to hack on it a little bit here and there, and I think it’s pretty close to being something that other people could have fun with too. A big missing piece was figuring out how to do surfacing, but I took a stab at it yesterday and I’m feeling pretty good about the approach I landed on.
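
                                                                  For readers who haven’t met signed distance functions, here is a small CPU-side sketch of composing them and marching a ray against the result. It is plain Python rather than Janet or shader code, and every name in it (sphere, translate, union, raymarch) is an assumption for illustration.

```python
import math


def sphere(radius):
    """Signed distance to a sphere centered at the origin."""
    return lambda p: math.sqrt(sum(c * c for c in p)) - radius


def translate(sdf, offset):
    """Move a shape by evaluating it at the shifted point."""
    return lambda p: sdf(tuple(c - o for c, o in zip(p, offset)))


def union(a, b):
    """The union of two shapes is the nearer of the two surfaces."""
    return lambda p: min(a(p), b(p))


def raymarch(sdf, origin, direction, max_steps=100, epsilon=1e-4, max_dist=100.0):
    """Sphere-trace along a unit direction; return the hit distance or None."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < epsilon:
            return t
        # It is always safe to advance by the distance to the nearest surface.
        t += dist
        if t > max_dist:
            break
    return None


scene = union(sphere(1.0), translate(sphere(1.0), (3.0, 0.0, 0.0)))
hit = raymarch(scene, (0.0, 0.0, -5.0), (0.0, 0.0, 1.0))
```

                                                                  A shader does the same march once per pixel, which is why compiling the composed function to shader code is the natural next step.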

                                                                  This weekend I’m going to try to get it into “alpha” status and publish it somewhere. Which means making it work at all on mobile, making the DSL a little more friendly, and adding more primitives. But mostly it means writing some kind of tutorial…

                                                                  1. 2

                                                                    Errm, making sure: you’ve seen libfive, right? It has this (well, in Scheme)

                                                                    1. 1

                                                                      I have not! Thanks for sharing that; that looks really cool. I especially like the feedback between the rendered image and the source – that’s something that I’ve vaguely wished for, but can’t think of a good way to implement. I’ll have to look through this for shapes and combinators to add :)

                                                                  1. 4

                                                                    Janet is an interpreted language that looks a lot like Clojure, but its semantics are more similar to JavaScript. It can produce native executables that statically link the Janet runtime and interpreter, although the Janet code you write is only compiled to bytecode. But the single executable means that it’s very easy to deploy, and the runtime is tiny. There is a friendly HTTP framework that powers, e.g., https://janetdocs.com/.

                                                                    How’s SCI’s runtime overhead? I have no idea if it’s up to the task of running an HTTP server, since I’ve mostly heard of it through Babashka. But it would probably work, right?

                                                                    There are also people writing ClojureScript, compiling it to JS, and running the result via Node. Which is not at all “native,” but fits the bill for a lower memory-overhead way to run Clojure code.

                                                                    Common Lisp is the “obvious” answer for native binaries, but it’s not very Clojure-like at all. Chicken Scheme also compiles to native code. So can Racket, apparently? But if this is only for fun, you can probably get away with one of those interpreters.