Threads for nwjsmith

  1. 1

    Are there plans for the Elixir compiler to leverage these optimizations at some point?

    1. 1

      Doesn’t Elixir generate BEAM bytecode? If so, I’d expect it to also benefit from this. That said, I was a bit confused by reading the announcement because a lot of the things that they were talking about sounded a lot like things from the original HiPE paper, and I thought HiPE had been merged around 15 years ago.

      1. 1

        HiPE has been deprecated in favour of this approach

      2. 1

        Anything generating BEAM byte code benefits at runtime

      1. 6

        Reclaiming 900GB+ of disk space on a Pg server after dropping a JSONB column from a table with 2.5B rows. We’re using pg_repack to avoid extended downtime. After that, burying my work computer in the garden while I take a week off.

        Race a dinghy on Sunday morning, and celebrate my birthday in the afternoon.

        1. 3

          I’m pretty sure that JSONB column is my fault, happy to see you’ve found a way to get rid of it!

          Never again

          1. 2

            After that, burying my work computer in the garden while I take a week off.

            Understandable :)

          1. 4

            using SQLAlchemy (which makes it hard for developers to understand what database queries their code is going to emit, leading to various situations that are hard to debug and involve unnecessary operational pain, especially related to the above point about database transaction boundaries)

            I’m also a fan of raw SQL. If I were asked for mistakes I have made, I would also name the database abstraction (in my case peewee). Two things annoy me: the database is now mixed into my models, and peewee has very global connection management. I would much rather have the model be totally independent of the database layer and do explicit connection management. E.g.

            # no active record, no implicit connection tied to the model
            my_object = Customer(name='John Doe')
            # CustomerDbLayer must contain all the logic to understand
            # how to store a Customer into the DB
            # (as well as how to read one from the DB)
            cdb = CustomerDbLayer(db_connection)
            cdb.save(my_object)
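
A minimal sketch of how that separation could look with the standard-library sqlite3 module. The table layout, the `load` method, and the in-memory database are assumptions made for illustration; only `Customer`, `CustomerDbLayer`, `db_connection`, and `save` come from the snippet above.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Customer:
    """Plain model: it knows nothing about the database."""
    name: str


class CustomerDbLayer:
    """All persistence logic for Customer, using an explicitly passed connection."""

    def __init__(self, db_connection):
        self.conn = db_connection
        # Hypothetical schema, created on demand so the sketch is self-contained.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS customer "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def save(self, customer):
        # Insert one row and return its generated id.
        cur = self.conn.execute(
            "INSERT INTO customer (name) VALUES (?)", (customer.name,)
        )
        self.conn.commit()
        return cur.lastrowid

    def load(self, customer_id):
        # Read one row back into a plain Customer, or None if it does not exist.
        row = self.conn.execute(
            "SELECT name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return Customer(name=row[0]) if row else None


db_connection = sqlite3.connect(":memory:")
cdb = CustomerDbLayer(db_connection)
customer_id = cdb.save(Customer(name="John Doe"))
loaded = cdb.load(customer_id)
```

The connection is created by the caller and handed in, so nothing about connection management is hidden inside the model.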
            

            It’s a bit surprising that at the end of the article they mention GraphQL and Kubernetes as some of their good decisions, two technologies which I would definitely not list in the “boring” section. At least a few years ago both seemed quite the hype to me.

            1. 4

              Kubernetes reminds me a lot of way back in the day when I was first learning about sendmail and was like “wait, this thing’s config is so complex people have to use and configure another tool just to generate the config file for it?” Except with Kubernetes it’s often multiple layers of that going on.

              The specific case with SQLAlchemy there is one that I’ve seen be a problem at a previous employer (where control of exactly when, what, and in what order queries ran was important for regulatory compliance), but is more an issue with the session/unit-of-work/magic-optimization approach than with ORM or DB libraries in general. Plenty of ORMs and DB libraries give you much more fine-grained control of when queries run, when things get flushed to DB, etc.

              1. 2

                I found out about https://github.com/nackjicholson/aiosql last week and it hits the perfect balance imho. I knew about a similar lib in Clojure from a few years back (https://github.com/krisajenkins/yesql) but I hadn’t found anything equivalent for Python.

                1. 2

                  Have you seen PugSQL?

                2. 1

                  GraphQL is just an old school RPC protocol. How is it not boring?

                  1. 4

                    GraphQL doesn’t have a well-defined story for caching or error handling, both of which are otherwise boring HTTP built-ins.

                1. 13

                  Author here. Ask me anything!

                  1. 9

                    just wanted to say congratulations on what you’ve got so far; it was interesting for sure, and I’m looking forward to future parts.

                    it makes me wonder if you’re basically modeling parallel dataflow; linear types probably means each stack can be executed pretty safely in parallel, yes? the moving values into a new stack feels very “spawn a thread” to me.

                    1. 5

                      Thanks!

                      Yes, that’s absolutely correct. Operations on separate stacks could trivially be parallelized by a properly designed runtime. Eventually, high-level Dawn code could be compiled to OpenCL or CUDA, similar to in Futhark. Alternatively, a compiler to assembly for a superscalar CPU could interleave operations on the different stacks to take advantage of instruction-level parallelism.

                    2. 6

                      This looks very interesting! I’ll eagerly await the next installment and look forward to playing with the language.

                      Reading the post, I wondered how the stack names are scoped. In the example code, one can reasonably (I think?) guess that $x, $y, and $z are scoped to the function whose {spread} operation created them, or maybe they are bound to the current stack. But looking at the comment you linked to in an earlier discussion about I/O, it seemed like there were references to named stacks like $stdout from some external scope.

                      Perhaps I’m wrong to assume that “scope” is even a relevant concept. But presumably there has to be some way to avoid stack name collisions.

                      (Edit) …Huh, maybe I am wrong to assume that scope matters. I am still trying to wrap my brain around it, but maybe if the language can guarantee linearity across the whole application, you don’t even have to care about name collisions because if there is already a $x stack from some calling function, it doesn’t matter if you reuse it and push values onto it as long as you’re guaranteed to have removed them all by the time your caller looks at it again (assuming your caller wasn’t expecting you to leave behind a value on that stack, but then “leaves behind one value on $x” would be part of your function signature).

                      I’m not quite sure that actually works, but now I’m even more eager to read the next post.

                      1. 4

                        But presumably there has to be some way to avoid stack name collisions.

                        Yes, there is. Take for example the expression: {$a swap} where swap is defined as

                        {fn swap => $a<- $b<- $a-> $b->}
                        

                        Expanded, this becomes

                        {$a $a<- $b<- $a-> $b->}
                        

                        Or, equivalently,

                        {$a {$a push} {$b push} {$a pop} {$b pop}}
                        

                        Without some mechanism to handle the stack collision between the inner and outer $a stack, this would not behave properly. Since we want functions to behave the same way regardless of what stack context they are executed from, that would be unacceptable. So this is handled by checking for nested stack name collisions and renaming the inner stack. So the latter would effectively be rewritten to

                        {$a {$$a push} {$b push} {$$a pop} {$b pop}}
                        

                        In the existing type checker prototype, renamed stacks are distinguished by a prefix of more than one $. Then, if one of these temporary stack names escapes up to the inferred type for a user-defined function, an error is raised. This ensures expected behavior while ensuring we don’t need to monomorphize each function to operate on different stacks.
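
The renaming step described above can be sketched roughly as follows. This is an illustrative Python model, not the prototype type checker: the representation of a body as `(stack, operation)` pairs and the function name `rename_colliding_stacks` are assumptions, and only a single level of nesting is handled.

```python
def rename_colliding_stacks(outer, body):
    """Rename stacks in `body` that collide with the enclosing stack `outer`.

    `body` is a list of (stack_name, operation) pairs. A colliding inner
    stack gets one extra leading `$`, mirroring the `$$a` example above.
    """
    renamed = []
    for stack, op in body:
        if stack == outer:
            stack = "$" + stack  # e.g. "$a" becomes "$$a"
        renamed.append((stack, op))
    return renamed


# Body of swap: $a<- $b<- $a-> $b->, written as push/pop on named stacks.
swap_body = [("$a", "push"), ("$b", "push"), ("$a", "pop"), ("$b", "pop")]

# Evaluate swap inside the outer $a stack context.
result = rename_colliding_stacks("$a", swap_body)
```

Here `result` pairs the inner `$a` operations with `$$a` and leaves `$b` untouched, matching the rewritten expression above.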

                      2. 4

                        Your introduction to Dawn was clear and compelling, really excited to follow along with Dawn’s development.

                        {spread $x $y $z} is syntactic sugar

                        Do you intend to expose a macro system to Dawn programmers?

                        Conceptually, these named stacks can be considered to be different parts of one big multi-stack.

                        How are Dawn’s named stacks represented internally? I don’t have much familiarity with stack-based languages, but it seems like it would be straightforward to understand how the machine runs a program just by reading the source. Is that lost with the introduction of named stacks, or is there a mental mapping that can be made?

                        Dawn is really exciting!

                        1. 2

                          Thanks!

                          I’m undecided on syntactic macros, but the plan is absolutely to provide meta-programming. I haven’t prototyped this at all yet, but I hope and expect for compiler passes to be implemented in something I’ve been calling meta-Dawn in my notes—a form of staged compilation. The plan is for Dawn to be its own intermediate representation. We’ll see how that works out in practice, of course.

                          1. 2

                            And to answer your other questions, in the existing interpreter the named stacks are a Map String (Stack Val). The first compiler, which will be through translation to C, will use static analysis (basically the inferred types) to turn all stack slots, regardless of which stack they are on, into function parameters.
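
As a rough illustration of the `Map String (Stack Val)` representation, here is a small Python sketch with a dict of lists. The class and method names are hypothetical, and it assumes that `$a<-` moves the top of the current stack onto `$a` while `$a->` moves it back; it is not the actual interpreter.

```python
class MultiStack:
    """A current (unnamed) stack plus named stacks stored in a dict."""

    def __init__(self, values=()):
        self.current = list(values)
        self.named = {}  # stack name -> list used as a stack

    def move_to(self, name):
        """Like `$name<-`: pop from the current stack, push onto the named one."""
        self.named.setdefault(name, []).append(self.current.pop())

    def move_from(self, name):
        """Like `$name->`: pop from the named stack, push onto the current one."""
        self.current.append(self.named[name].pop())


# swap written as: $a<- $b<- $a-> $b->
ms = MultiStack([1, 2])
ms.move_to("$a")
ms.move_to("$b")
ms.move_from("$a")
ms.move_from("$b")
```

After these four steps the two values on the current stack have traded places and both named stacks are empty again, which is the linearity property discussed earlier in the thread.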

                            Eventually, I plan to write an assembler in/for Dawn, in which stack names will correspond to architecture-specific registers.

                          2. 2

                            I’m looking forward to the rest of the series. Is there any chance you could put an RSS/Atom feed on the site? It’s a much nicer experience than subscribing to a mailing list.

                              1. 1

                                Thanks!

                              2. 2

                                Thanks for the suggestion. I’ll take a look at what that entails.

                              3. 1

                                First of all, it’s a really interesting concept! One question about the linear typing aspect though: What happens if multiple functions want to access the same read-only data structure? I assume that for small values clone just copies by value, but what if the data structure is prohibitively expensive to copy? Are some values cloned by reference? If yes, don’t you need to track how many readers / writers access the data so that you know when to free the memory?

                                I guess another way of phrasing this question would be: How do you handle the more complex cases of borrowing data with the semantics described in the post? Rust’s borrow checker is of course useful even for simple cases of stack-allocated values, but it really shines (and this is where its complexity lies) in cases of more complex heap-allocated data structures. (And of course, even Rust with its borrow semantics needs Rc/RefCell as escape hatches for situations where these guarantees cannot be statically checked, is there something comparable in Dawn?)

                                1. 1

                                  Great question! I’m going to get to that in a future post, but there is a solution, and it doesn’t require a separate borrow checker.

                                2. 1

                                  Nice! Also looking forward to future parts.

                                  a way to bind values to local named variables. Unfortunately, this breaks one of the primary advantages of concatenative notation: that functions can be trivially split and recombined at any syntactic boundary—i.e. their factorability.

                                  You gave an example of how Dawn still has this, but can you give an example of why Factor or Kitten do not? Or more generally, why a concatenative language that binds values to local named variables cannot?

                                  Here’s an example of Fibonacci from Flpc (Disclaimer: I’m the author).

                                      [ 1 return2 ] bind: base-case
                                      [ newfunc1 assign: i
                                        pick: i pushi: 3 < pushf: base-case if
                                        pick: i 1 - fib pick: i 2 - fib + return1
                                        ] bind: fib
                                  

                                  Any part of the body can be split off. For example

                                      [ 1 return2 ] bind: base-case
                                      [ pick: i 1 - ] bind: one-less
                                      [ pick: i 2 - ] bind: two-less
                                      [ newfunc1 assign: i
                                        pick: i pushi: 3 < pushf: base-case if
                                        one-less fib two-less fib + return1
                                        ] bind: fib
                                  
                                      [ 1 return2 ] bind: base-case
                                      [ pick: i s21 - ] bind: recurse-fib
                                      [ newfunc1 assign: i
                                        pick: i pushi: 3 < pushf: base-case if
                                        1 recurse-fib 2 recurse-fib + return1
                                        ] bind: fib
                                  

                                  For your example, this would be

                                  [ newfunc3 assign: z assign: y assign: x
                                    pick: y square pick: x square + pick: y abs - return1 ] bind: f
                                  

                                  (z need not be dropped explicitly since return1 takes care of that.)

                                  1. 1

                                    In your example of splitting up fib, what happens if you reuse one-less in another function? Does the source essentially get inlined, so that pick: i refers to the assign: i? If so, then this appears to be similar to Dawn, but without the linearity restriction.

                                    In Factor and Kitten, I don’t believe local variables work like that, though. I believe they behave more like they do in most existing languages, e.g. Python, where the pick: i in one-less would be an undefined variable error.
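
To make the Python comparison concrete, here is a small sketch of what happens when a helper refers to a name that is only bound in its caller. The function names `one_less` and `fib_step` are made up for illustration.

```python
def one_less():
    # `i` is neither a local nor a global here, so the lookup fails at call time.
    return i - 1


def fib_step(i):
    # `i` is local to fib_step; Python's lexical scoping does not expose it
    # to functions that fib_step merely calls.
    try:
        return one_less()
    except NameError as exc:
        return "NameError: " + str(exc)


outcome = fib_step(5)
```

With lexical scoping the helper cannot see the caller's `i`, so `outcome` holds the error text rather than `4`; an inlining or dynamically scoped design would resolve `i` to the caller's binding instead.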

                                    1. 2

                                      In your example of splitting up fib, what happens if you reuse one-less in another function? Does the source essentially get inlined, so that pick: i refers to the assign: i?

                                      Yes, that’s exactly right.

                                      In Factor and Kitten, I don’t believe local variables work like that, though. I believe they behave more like they do in most existing languages, e.g. Python, where the pick: i in one-less would be an undefined variable error.

                                      Oh I see. I’m wondering if it’s because of scoping. With what I’m doing and maybe what you’re doing, you can’t (easily) get lexical scoping whereas the other way you could.

                                1. 2

                                  Completing my Ruby bindings to macOS Big Sur’s Virtualization.framework. Then I can hopefully get started on a Vagrant provider.

                                  1. 9

                                    I would love to read a thread where someone explains why the app is so gigantic to begin with. It does not do a whole lot, so what is all this code doing?

                                    1. 19

                                      The author explains this in a reply here: https://mobile.twitter.com/StanTwinB/status/1337055778256130062

                                      The app looks simple from the user perspective. But on the global scale the business rules are insanely complicated. Every region has custom rules/regulations. Different products have different workflows. Some regions have cash payments. Every airport has different pick rules…

                                      See also the people replying to that discussion, they talk about how unreliable networking means they can’t do this stuff on the backend.

                                      1. 3

                                        they talk about how unreliable networking means they can’t do this stuff on the backend.

                                        Uber is infamous for having over 9000 microservices. They even had to “re-invent” distributed tracing, because NIH syndrome or something (Jaeger, now open-tracing). I really, really doubt they do it all on the client. They have to make sure the transaction is captured anyway and without network, how can you even inform drivers?

                                        1. 4

                                          Presumably they verify everything on the backend anyway because you can’t trust the client. My guess is that the network is reliable enough to submit a new trip, but not reliable enough to make every aspect of the UI call out to the backend. Imagine opening the app, waiting for it to locate you, then waiting again while the app calls out to the network to assess regulations for where you are. Same thing picking a destination. That latency adds up, especially perceptually.

                                      2. 12

                                        Yeah that’s the part that gets me, it’s like the dril tweets about refusing to spend less on candles. Just write less code! Don’t make your app the Homer Simpson car! What is the 1mb/week of new shit your app does good for anyways? It’s easy, just play Doom instead of typing in Xcode.

                                        I don’t have hypergrowth mentality though, I guess that’s why Uber rakes in all the profits.

                                        1. 6

                                          I guess that’s why Uber rakes in all the venture capital dollars to subsidize their product

                                          Fixed it for ya ;-)

                                          1. 1

                                            I would love to see Chuck Moore’s (RIP) take on 1 MB/week. He might just have a seizure.

                                            1. 5

                                              Thankfully, Chuck Moore is alive.

                                              1. 6

                                                Ah, man, well I have egg on my face. I feel quite silly. I’m going to slink into a corner somewhere.

                                          2. 5

                                            I recall a few years ago someone disassembled Facebook’s (I think) Android app and found it had some stupendously huge number of classes in it. Which led to a joke about “I guess that’s why their interviews make you implement binary trees and everything else from scratch, every developer there is required to build their own personal implementation every time they work on something”.

                                            Which is cynical, but probably not too far off the mark – large organizations tend to be terrible at standardizing on a single implementation of certain things, and instead every department/team/project builds their own that’s done just the way they want it, and when the end product is a single executable, you can see it in the size.

                                            1. 4

                                              I wonder if there could also be some exponential copy-paste going on, where instead of refactoring some existing module to be more general so it satisfies your need, you copy-paste the files and change the bits you need. Then somebody else comes along and does the same to the package that contains both your changed copy and the original package, so there are now 4 copies of that code. The cancerous death of poorly managed software projects.

                                              1. 2

                                                Scary… and probably very true.

                                          1. 2

                                            https://theinternate.com

                                            Looks more or less exactly how it’s made: Markdown on a Solarized Light background

                                            1. 2

                                              I like it! I’d put “overflow: auto” on “figure.highlight pre” to make the scrollbars only appear when required on code snippets. Otherwise cool idea :)

                                            1. 1

                                              I’m in

                                              1. 5

                                                My experience so far with Clojure’s spec library is that it provides exactly the experience Michael describes here. Start at the REPL to get a sketch together, then “finish” it with specifications of arguments/return values and property-based testing.

                                                1. 3

                                                  My Struggle Volume I by Karl Ove Knausgaard. Really enjoying it so far.

                                                  1. 16

                                                    This is a mix of great advice, things that are hopefully obvious, and a few terrible ideas just to spice things up. In particular the example of (comp str/capitalize str/trim) being labeled as “hard to read” compared to the lambda version is just bizarre, and the convention of marking reference types with * “because it resembles C/C++ pointers” would be a huge red flag if I saw it during code review. Adding in extra newlines sounds like a great idea if you always work plugged into an external display or two but shows a disregard for your colleagues who may prefer to work from a laptop.

                                                    1. 6

                                                      I thought the * prefix was interesting in that it offers up a solution to naming references. I find it difficult to name references something other than the thing they point to.

                                                      1. 2

                                                        I think you could make a case for a special naming convention for top-level def reference types, but it’s not helpful when it’s a local, which hopefully is the more common case.

                                                      2. 5

                                                        I had similar thoughts. It seems peculiar to offer a recommendation “Avoid higher-order functions” in a functional language.

                                                        The article did put me in mind of Stuart Sierra’s series of posts Clojure Dos and Don’ts. I also like How to Name Clojure Functions.

                                                        Regarding the recommendation to replace the threading macro with a let: an advantage (for me) of the threading macro, is that it’s immediately clear that the result of a function is not used again. It’s a temporary value that doesn’t matter after the next function in line has been called. In a let you have to scan the surrounding code to see how the value is used.

                                                        As an aside: I quite dislike that the code examples are images without alt text.

                                                        1. 1

                                                          Also, avoiding the threading operator just to obscure the computation going on with some (probably badly) named steps (remember, naming is hard) is not the best.

                                                          At work we used ->> a lot, because it both frees you from giving strange names to intermediate computation results and allows you to plug in whatever function you want to intercept the computation without having to rewrite your code.

                                                        1. 12

                                                          Gitless has no staging area

                                                          I frequently use git add -p to stage chunks of diffs, so that I can make separate commits of unrelated changes in a single file. I would miss that feature greatly.

                                                          1. 9

                                                            Gitless has a --partial flag that can be used to interactively select segments of a file to commit.

                                                            1. 6

                                                              As durin42 mentioned elsewhere, hg gets around this (without having a staging area) by having support for interactive commit selection with hg commit --interactive.

                                                              1. 3

                                                                For non-trivial changes, how do you know for sure that your code still compiles afterwards? Is this not a problem in practice?

                                                                1. 7

                                                                  For non-trivial changes, I don’t know. But if it gets to the point where I have non-trivial changes in my working copy, I’ve already lost.

                                                                  In my experience, a common pattern is:

                                                                  • start editing some function
                                                                  • realise this function would be easier/safer/simpler with a change to the helper function it’s based on.
                                                                  • go off and edit the helper function, write some tests, make sure it’s solid
                                                                  • come back and finish the outer function, to make sure that the changes to the helper function actually accomplish the thing I needed
                                                                  • git add -p the changes to the helper function and the tests, which I’m confident can stand alone because the existing tests still pass, plus my new ones
                                                                  • commit the changes to the outer function

                                                                  …times as many stack-levels of yak-shaving as you can stand.

                                                                  1. 1

                                                                    Once you have made small commits, you can do an interactive rebase to build every version and make edits if necessary. With e.g. Gerrit and Jenkins you can automatically get feedback whether each commit builds/passes tests.

                                                                    1. 1

                                                                      You can use git stash -k to stash unstaged changes, run your compiler / test suite, and commit them if they all pass.

                                                                    2. 1

                                                                      I think I could make do with their --partial flag, but the lack of git rebase -i is a bit of a deal-breaker for me.

                                                                    1. 3

                                                                      Just started The Confusion by Neal Stephenson, and it’s really fun

                                                                      1. 4

                                                                        I really loved Anathem. Been meaning to read Snow Crash as I’ve heard it’s his best…

                                                                        1. 5

                                                                          IMO “diamond age” is his best; “snow crash” had a lot of nice ideas but the novel as a whole was a bit weak

                                                                          1. 3

                                                                            Hah, I disagree with that completely. I thought Snow Crash was a great novel, and The Diamond Age was the one with a lot of nice ideas but a bit weak as a novel. Especially the ending, but of course Stephenson is weak at endings.

                                                                      1. 4

                                                                        Feature flags are technical debt, they’re inseparable. You’re trading for a number of benefits now for some complexity that will be dealt with later.

                                                                        1. 6

                                                                          If you haven’t used Smalltalk before, it can be eye-opening just how good the programming environment is (and Pharo looks great).