Threads for piranha

    1. 1

      Great article!

      I really appreciate it when people take the time to write up their builds. Looking at someone else’s decision tree can be very helpful in coming to your own conclusions for building solutions like this.

      Also related there’s this IMO pretty good list of alternatives to ngrok in this and similar spaces - awesome tunneling on Github.

      1. 2

I should add to the article that this list was where I started :)

        1. 1

          That’s a great idea. I think it would help people who wanted to dig a bit deeper and understand the other options.

    2. 11

It’s simple! I create a wildcard domain (something like *.xxx; “xxx” is for “real domain is none of your business” 😁), and then reverse-proxy everything through an SSH tunnel from a server to my laptop.
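
A sketch of that tunnel, with made-up hostname and ports; the ssh invocation is the interesting part:

```shell
# Sketch only -- hostname and both ports are placeholders.
SERVER="user@server.example.com"   # the public server running Caddy
REMOTE_PORT=9000                   # server-side port Caddy reverse-proxies to
LOCAL_PORT=8000                    # the dev server on the laptop
# -N: no remote command, -R: reverse forward (server port -> laptop port)
CMD="ssh -N -R ${REMOTE_PORT}:localhost:${LOCAL_PORT} ${SERVER}"
echo "$CMD"
```

Server-side Caddy then reverse-proxies the wildcard hostnames to that loopback port.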


      The first problem is that handling * in Server-Caddy makes Caddy request a wildcard certificate.

      You and many others likely know this, but for anyone who doesn’t, every request for a TLS certificate in the past several years follows this workflow:

      1. Your certificate authority validates your right to get a signed certificate for the given domain (TXT records, email, whatever)
      2. Your CA issues a precertificate – a standard TLS certificate with a special “poison extension” that essentially says “no way in hell can you use this in a browser!”
      3. The CA sends this to a certificate transparency server (or several)
      4. The CT server returns a signed certificate timestamp – basically a blob of data that says “I see that you plan to issue this TLS certificate with the following parameters, I saw it at this time, and I’m signing it with my key”
      5. Your CA revokes the precertificate and issues a leaf cert, almost always including the SCT as an extension

      This allows any user to take a TLS certificate, go to a CT search system, and look up the precert. This is an essential part of the security model of CAs; you can guarantee that a given TLS cert was not issued in advance, and you can guarantee that your CA issued exactly what it told the CT servers it would.

      This also means that your security model must not depend on hiding your DNS. Looking up a domain in CT logs is a cheap, easy way to turn bugs into bounty dollars.
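
One consequence: anyone can enumerate the certificates issued for a domain. For example, crt.sh (a public CT search frontend) supports JSON queries; example.com below is a placeholder:

```shell
# Look up issued certs for a domain in CT logs via the crt.sh frontend.
# %25 is a URL-encoded "%" wildcard; example.com is a placeholder domain.
DOMAIN="example.com"
URL="https://crt.sh/?q=%25.${DOMAIN}&output=json"
curl -s --max-time 10 "$URL" | head -c 300   # peek at the first few entries
```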

      1. 3

Right, it’s not a security model, it’s just that I don’t want my wildcard domain published on the internet. :)

    3. 8

      Another way to do this is Tailscale. If you want to share your laptop server generally, you put Caddy and Tailscale on the server and then put Tailscale on your laptop and tell Caddy to reverse proxy to your laptop’s Tailscale address. You can also connect directly to your laptop from your phone on LAN if you put Tailscale on the phone.
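
If it helps, the server-side Caddyfile for that could look something like this (the domain and the Tailscale address are placeholders; Tailscale hands out addresses in the 100.64.0.0/10 range):

```
dev.example.com {
    # proxy to the laptop's Tailscale address
    reverse_proxy 100.64.0.2:8000
}
```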

      1. 2

That’s true, it’s a replacement for ssh tunnel + local Caddy, but you’ll need to run your projects on your laptop (or on the Tailscale IP). Plus there is still a need to map domain name to port, so instead of wildcard DNS on the server there is a need for imports (like import tmpl dev 8000).

        So the general theme is still applicable. :)

    4. 5

      I used to use ssh tunnels in the past, but I am now just using wireguard. It is a lot more robust and also a lot faster. No crazy ssh invocation magic, just use IP:port. You could even have DNS entries for the IPs.
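
A wg-quick config sketch for that; the keys, addresses, and port are all placeholders:

```
# /etc/wireguard/wg0.conf on the server
[Interface]
PrivateKey = <server-private-key>
Address = 10.0.0.1/24
ListenPort = 51820

[Peer]
# the laptop
PublicKey = <laptop-public-key>
AllowedIPs = 10.0.0.2/32
```

After `wg-quick up wg0` on both ends, `10.0.0.2:8000` (or a DNS name pointing at it) works like any other host.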

      1. 3

        I personally found Wireguard incredibly hard to get running. Between the client and server configurations and the relevant firewall/port configuration it was just too much for me to be willing to use on an ongoing basis.

        I hear it’s gotten better and I know a lot of people use it but Tailscale is such incredible secret sauce I can’t see myself ever doing anything else :)

        1. 2
          1. 1

            That’s REALLY great to know about. Thank you!

            I don’t see myself throwing out Tailscale for my personal stuff because it’s just sooo convenient and works across mobile and everything with zero extra work on my end, but there are certainly places where stock Wireguard probably makes more sense and having a way to implement that without my head exploding is a definite win :)

            1. 2

              This has also been on my radar for a while as an alternative to dsnet

              1. 1

One of the things that keeps me gravitating back to Tailscale is the fact that it’s more than just a point-to-point VPN link. It creates a virtual network layer that bridges all my infrastructure - mobile, desktop, cloud, home VMs/containers - together into one seamless whole that I can access from anywhere.

                I can understand why folks run away from the commercial aspect (although as a pragmatist I’m NOT personally averse to throwing money at someone to solve problems for me).

                However I do appreciate the desire for a 100% FLOSS infrastructure setup, so I plan to look into Headscale in the future. The project has legs and in fact has contributions from Tailscale employees.

                1. 3

                  I don’t really understand where the difference is supposed to be. On my wireguard network all hosts can reach each other too, so it is a network, not a point-to-point connection. My server in another country can ping my raspi at home, my phone can ping my laptop etc. It is a real network. I also have DNS names for the hosts. I never bothered to look into tailscale too deeply as it is closed source, so I may be missing something obvious here.

                  1. 1

                    One difference I see is that Tailscale does automatic NAT traversal so I don’t need to poke holes in my firewall/NAT to make it work.

      2. 2

        I have to try replacing ssh with wireguard (or tailscale fwiw)!

    5. 1

I take some quibble with what the author presents as the “naive solution” - I would have done something like <textarea oninput="nextElementSibling.textContent = value.length;"></textarea><span></span> for this - but the general principle of queuing up changes and only processing once is very good.

      In my desktop gui library, I used to have it automatically recalculate layout when any widget is added to the tree. After someone complained about bad performance in a loop, I changed it to something very similar here - on next idle loop, recalculate if needed. Now the loop is fine. But on the other hand, if you ask for width after adding something but before it recalculates, you will get an invalid thing. I suppose I could make that a property getter that forces eager eval as well. But meh still I’m a LOT happier with it this way and 99% of the time, all users see is the better performance.

      1. 1

That eager getter is kind of like style recalculations on property access in the DOM API, btw. :)

        1. 1

          Yeah, and it’s a performance cliff. You may have a fast DOM modification loop, add one new widget that happens to query styles, and suddenly performance tanks.

          It’s especially painful when you build your UI out of components. The abstraction barrier makes it basically impossible to coordinate DOM access across components, and you may easily end up with lots of interleaved DOM reads and writes.

In one app I’ve worked on, we had to redesign the whole component architecture and tear components apart to let them schedule DOM reads and writes, so that instead of WRWRWRW loops, a global scheduler could turn it into WWWWRRRR.

    6. 10

      Throughput, cpu and memory usage are all nice, but the main thing to test is latency. It would be cool to see this compared, since going to GPU theoretically makes it worse.

      1. 3

        Yeah, latency is something you win on by keeping it simple. And throughput seems irrelevant to me - you can’t read text flying by that fast anyway, so there’s no real benefit to even trying to show it vs just settling for a slower update speed. The only point of showing anything is to let the user know it hasn’t frozen up and you don’t need to bother showing the actual screen in real time to get that across.

        1. 2

          Terminal latency is not about reading text. It’s about typing text.

          1. 2

            Yeah, I know. The first sentence is about [input] latency being a win. Then I changed the topic to [output] throughput, which I think is irrelevant.

      2. 3

        Yah, it’s possible to have just as good latency with GPU rendering as CPU rendering but you need to be careful about exactly how you present frames. Terminals also vary greatly in how much they buffer input from the shell before rendering a frame, which can have a big impact on latency.

        It’s definitely a cool project, the compute shader rendering is neat. But when I think about how I use a terminal the only two things where I’ll notice speed are input latency and when I print megabytes of plain text, so I don’t care too much about efficiency of escape code handling since that’ll probably be a fraction of a millisecond per frame.

        I’ve found that Kitty beats Alacritty in latency at least on macOS:

        1. 1

Yeah, I’m using it because of this, see for measurements

      3. 2

        I don’t agree beyond a certain point. It’s many years since I used a terminal (except over remote X11 and one written by someone for fun in JavaScript) where the latency from keypress to character appearing was large enough that I noticed it at all. At a 60Hz refresh of most LCD monitors, you’ve got over 15ms to render the update, maybe 5ms if another process is running when the key press comes in and you don’t get scheduled immediately. On a modern system, 5ms is an astonishing amount of compute and I still probably wouldn’t notice if it took several screen refresh cycles before the character appeared. It’s hard for me to imagine anyone doing such a bad job at implementation that I’d notice.

        In contrast, I have in the last six months hit a case where I typed a command on a remote SSH session that generated a few MBs of output and then had to wait for the terminal to consume it. The command took under a second to produce all of the output, it took a few seconds for ssh to transfer all of it and then a minute for the terminal to finish scrolling. That prevented me from doing any work for a minute and so is something I really, really care about.

        1. 3

          and then a minute for the terminal to finish scrolling

          See, that’s an absurd situation. What my terminal does there is just… not scroll. It sees that a lot has changed, updates its internal data structures, then prints out the result. There’s just no benefit in showing scrolling when it knows there’s a bunch more data already in the buffer.

        2. 1

          I still probably wouldn’t notice if it took several screen refresh cycles before the character appeared. It’s hard for me to imagine anyone doing such a bad job at implementation that I’d notice.

          You’d be surprised, but I see it with my eyes in iTerm2 and Alacritty. It’s not only noticeable, it’s extremely irritating.

      4. 1

I’ve been trying to wrap my head around this, that “going to GPU theoretically makes it worse”. What do you mean by that? As long as the frame is complete before the 16.67ms deadline, wouldn’t that be the minimal possible latency on an LCD without tearing (assuming no gsync or freesync or similar)?

    7. 2

      Marques Brownlee a.k.a. MKBHD, an early YouTube-famous tech reviewer, did a video several years ago in which he responded to someone asking how to get started making review videos. I can’t find it now quickly, but I’ll bet it’s in a playlist about helping YouTubers.

      In this video, Marques talks about the equipment he used early on. It was some low-end, early model selfie cam that had a decent mic. IIRC, he moved to a nice webcam and a USB mic after a few videos got relatively popular and he developed some fans. It wasn’t until he was at something like 100k subscribers that he bought the really professional equipment to produce the incredibly high-quality videos he’s now known for and he’s upgraded a lot since then. I think he’s using a RED camera nowadays and that’s out of reach for just about anyone not making a living from the content they produce and living well off it.

      This article is good. It lists some great, high-quality equipment. Frankly, as someone who streams talks online and ran an online conference in 2020, it makes me salivate to have had the budget for it for myself and all of my presenters.

      However, that equipment comes at a significant price. This is a setup to build up to, not one to get started on. If you want to get started livecoding, get a $40 Blue Snowflake USB mic, use your crappy built-in camera or buy anything you can find right now (Goodwill?), and start producing quality content. The audio is important but webcam video isn’t. It’ll be 240-360 pixels high at the most in your stream! Once you realize you enjoy livecoding or you’re gaining followers to put you on a trajectory to make some money from it, then enjoy your $500 mic and mixer setup, studio lighting, and prosumer DSLR camera with amazing bokeh.

      1. 1

        It’s a hobby. Do I need to make money from it to make a setup I enjoy?

What to do with my photo hobby then? It’s been dead for some time, but before that I spent 4x on a nice camera (Fuji X-T1, which I mention in the article) and for a few years enjoyed making photos a lot. Made zero cents on that, even spent some on Flickr Premium…

    8. 3

      I, uh… I don’t want to see programmers when I watch coding videos. I want to see the code.

      1. 1

        Do you watch coding streams?

        1. 1

Sometimes. I find most of them pretty awful. It’s pretty clear in most that people are more interested in “being like game streamers” than in creating valuable content.

          1. 1

            What do you mean “being like game streamers”? I’m genuinely interested!

            If you have examples of both (good and bad) I’d really appreciate it!

            1. 1

              I like my coding videos a lot more like you see on egghead or udemy than just a solid unedited recording where you have to sit through every boring real-time minute.

    9. 1

      This is just a bunch of product reviews.

      Also, it’s a little weird seeing somebody drop hundreds of dollars in kit without even knowing what they’re gonna be streaming.

Finally…when the hell did we get to this coding as performance art thing? This dramatization and theatrical approach just rubs me the wrong way. When did we decide that production value and presentation matter more than, you know, awesome or useful or clever code?

      1. 4

Finally…when the hell did we get to this coding as performance art thing? This dramatization and theatrical approach just rubs me the wrong way. When did we decide that production value and presentation matter more than, you know, awesome or useful or clever code?

        This is cynical and unfair. The two don’t have to be mutually exclusive.

        With all that’s going on in the world these days, let’s not let the nihilists win.

      2. 4

        This is just a bunch of product reviews.

I’m sorry you feel that way. I tried to write down most of the things I learned from watching countless YouTube videos (since this stuff is rare to read in written form), but I guess it’s not perfect… But really, try to forget you read that and go buy a setup. What do you buy? How do you even put a camera on your desk?

Finally…when the hell did we get to this coding as performance art thing? This dramatization and theatrical approach just rubs me the wrong way. When did we decide that production value and presentation matter more than, you know, awesome or useful or clever code?

When I was 20-ish years old, I was sure that good code and great functionality are all the world needs. And look at me now! I’m using Ecamm Live, because the interface of OBS is ugly. Design and experience and marketing are crazy important.

Also, it’s not performance art in my case; I’m just doing the usual stuff I do, but I hate those videos where you sound like a dying dog inside of a steel pipe, filmed through a peephole. Honestly, I’m not doing this just to fill YouTube servers with useless data; I hope some people will watch that stuff, and nobody will watch low-quality content. Nobody watches it right now, actually, hahaha. :))

        Also, it’s a little weird seeing somebody drop hundreds of dollars in kit without even knowing what they’re gonna be streaming.

Do you have a car? :) When I sold my Z4 after owning it for 20 months (and I bought it really cheap), I calculated it cost me $550/month. My brother’s WRX, amortized over 6 or 7 years of ownership and the eventual sale, came out to $400/month. I mean, this little hobby is just peanuts compared to what cars cost. And I knew I was going to stream! I had no idea if it would make any sense or if I’d like it, but that’s another thing, right? You won’t know without trying. Also, now my Zoom game is over the top, which I just love. :-)

        1. 1

          Well, for most of us, $550 is a (good chunk of) rent/mortgage. I’m fortunate enough to be able to live without a car, but for many people it’s a necessity - that I’m sure most try to keep well under $550/month, and $300 on a hobby we’re not sure about also isn’t going to happen :)

          So, I admit to being icked out as well. Your story contains a lot of product photos, even of things you didn’t buy, so it looks a lot like an advertorial. It’s only the lack of affiliate links that makes me realize it’s not.

      3. 3

        Finally…when the hell did we get to this coding as performance art thing?

If someone had told me 2 years ago that one of my favorite ways to spend my time would be watching people play video games on Youtube, I’d have laughed in their face.

        Turns out, if you find the right “content creator”, it’s actually very nice!

Now, I don’t watch coding streams, but I’ve seen it become a thing online, and I figure: these people are enjoying it, they’re finding an audience, what’s the harm?

        Now, this particular post might be off-topic for this site, but that’s a separate issue from the phenomenon of coding streaming in general.

    10. 2

      Hm this looks cool! What does the IR look like?

      I was thinking of doing something similar with a common table expression as the base unit of the IR. I don’t know if that’s how ORMs already work, or if that kind of IR is optimized efficiently by engines, etc.

      Do ORMs actually have an SQL IR or are they more like text processors?

      1. 2

There isn’t exactly an IR. The Preql-Ast is compiled into “interactive” objects that construct the Sql-Ast on the fly, by being applied to each other. Then the Sql-Ast is compiled into actual SQL text, according to the target database. That’s basically how SQLAlchemy does it, and I imagine most other ORMs, except that they start with the interactive objects.

        Using CTEs is pretty convenient, but it’s actually bad for performance. Many database engines, for example Postgres, optimize the individual queries inside the WITH, but won’t optimize across the WITH. So unless you’re using it to eliminate repetition, your query will optimize better as a single huge select.

        1. 5

Since Postgres 12, CTEs are not an optimization fence anymore, so they should perform more or less like an inlined query.

            1. 3

              Note that that’s only true for CTEs that are used once, if I recall correctly! Then they will be inlined. Otherwise they’re still an optimization fence.
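
For reference, Postgres 12 also added syntax to control this explicitly; a sketch (the table and predicate are made up):

```sql
-- force the old fence behaviour:
WITH w AS MATERIALIZED (SELECT * FROM stuff WHERE key > 100)
SELECT * FROM w;

-- or forbid it, allowing inlining even with multiple references:
WITH w AS NOT MATERIALIZED (SELECT * FROM stuff WHERE key > 100)
SELECT * FROM w;
```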

        2. 1

          OK interesting, thanks for the info.

I’d be interested if sqlite can optimize across CTEs… That is my likely target. If anyone knows or has a reference, please share :)

    11. 2

Don’t really want to rain on your parade, but you basically built a 720p webcam without a microphone, with high chances of failure (SD cards are not the most reliable components out there).

While that is definitely a great learning project (I did something similar myself before, for fun), you can get 720p with a microphone, better durability, and a warranty for peanuts nowadays (around $30 or so). You can also find some 1080p webcams (with microphone, obviously) for about $55 on Amazon today.

      So, I’m not sure your setup is necessarily economically smart, especially after including work and future maintenance. Kudos for making things yourself though! 😊

      1. 4

        With a much better sensor and lens though. I’d be concerned about latency and agree it would be a sensible extension to figure out onboard audio.

        The thing I keep hoping to find in a webcam replacement is full 3D LUT color correction.

        1. 1

          The thing I keep hoping to find in a webcam replacement is full 3D LUT color correction.

          Hm, what would you use it for?

    12. 11

I always felt like SPAs were created to make data on pages load faster while simplifying things for the mobile web, but they ended up making development more complicated, introducing a bunch of bloated frameworks, requiring tooling to trim the bloat, and ultimately trying to unnecessarily rewrite HTTP.

      1. 13

        Yeah, we started with “no need to load data for the header twice” and ended up with bloated multi-megabyte javascript blobs with loading times in tens of seconds. :(

        1. 6

I think the focus shifted from “need to load data faster” to “need to be able to properly architect our frontend systems”.

Even though I still “just use jQuery like it’s 2010”, I can’t deny there are problems with the ad-hoc DOM manipulation approach. One way to see this is that the DOM is this big global state thing that various functions are mutating, which can lead to problems if you’re not careful.

          So some better architecture in front of that doesn’t strike me as a bad thing as such, and sacrificing some load speed for that is probably acceptable too.

          That being said, “4.09 MB / 1.04 MB transferred” for the homepage (and that’s the minified version!) is of course fairly excessive and somewhat ridiculous. I’ve always wondered what’s in all of that 🤔

          1. 5

“need to be able to properly architect our frontend systems”

            An absolute shitload of websites are built with React that could be built entirely with server rendered HTML, page navigations, and 100 lines of vanilla JS per page. Not everything is Google Docs.

Recent example: I was recently apartment hunting. All the different communities had SPAs to navigate the floor plans, availability, and application process. Fancy pop-up windows when you click a floor plan, loading available apartments only when clicking “Check Availability”, and so on.

            But why? The pop up windows just made it incredibly obnoxious to compare floor plans. They were buggy on mobile. The entire list of available units for an apartment building could have been a few kilobytes of server rendered HTML or embedded JSON.

            Every single one of those websites would have been better using static layouts, page navigations, and regular HTML forms.

            1. 5

              One reason for a lot of that is that people want to build everything against an API so they can re-use it for the web frontend, Android app, iOS app, and perhaps something else. I wrote a comment about this before which I can’t find right now: but a large reason for all of this SPA stuff is because of mobile.

              Other than that, yeah, I feel most websites would be better with just server-side rendering of HTML with some JS sprinkled on top. But often it’s not just about the website.

              1. 6

I don’t think an API and server-side rendering have to be incompatible; you could just do internal calls to the API to get the data to render server-side.

                1. 5

That’s what we’re doing. We even make the call to the API without HTTP and JSON serialization, but it’s still a call to an API which will be used by the mobile app.

                2. 1

                  Having done this, I feel this is the way to go for most apps. Even if the backend doesn’t end up calling a web API, just importing a library or however you want to interface is fine too, if not preferable. I’m a big fan of 1 implementation, multiple interface formats.

            2. 1

              I’ve worked in that industry. A large portion of it is based on the need to sell the sites. Property management companies are pretty bad at anything “technical,” and they will always choose something flashy over functional. A lot of the data is already exposed via APIs and file shipment, too, so AJAX-based data loading with a JavaScript front end comes “naturally.”

      2. 3

I agree. So, to the titular question, I would answer: websites and native applications.

        Developing “stuff” that feels like a website fitting the HTTP paradigm is mostly straightforward, pleasant, inexpensive, and comparatively unprofitable.

        Developing “stuff” that feels like an application fitting the native OS paradigm is relatively straightforward, occasionally pleasant, often expensive, and comparatively unprofitable.

        If we’re limiting our scope to a technical discussion, it seems straightforward to answer the question. Of course, for better or worse, we don’t live in a world where tech stack decisions are based on those technical discussions alone; the “comparatively unprofitable” attribute eclipses the other considerations.

      3. 1

        That’s how I remember it. I also remember building SPAs multiple years before React was announced, although back then I don’t recall using the term SPA to describe what they were.

    13. 2

Starship has a bunch of this built in, including conditional username display; I love that the username only shows up conditionally:

      It also has command runtime for slow commands, colours for exit codes, AND it runs incredibly fast. It’s one of the best things I’ve added to my CLI.
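
A sketch of the relevant starship.toml options; the option names come from starship’s documentation, and the values are just examples:

```toml
# ~/.config/starship.toml

[username]
show_always = false   # only shown when root or in an ssh session by default

[cmd_duration]
min_time = 2000       # show runtime for commands slower than 2s (milliseconds)

[character]
error_symbol = "[❯](bold red)"   # prompt char turns red on a non-zero exit
```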

      1. 2

You could argue that those things are also built into zsh. :))

        1. 1

          Not OP, but one of the things I like about starship is that I can use it from different shells, and it looks the same. So at my work, where I can’t use fish, I can still use starship.

    14. 4

      I have had pretty good experience so far using Starship. It doesn’t have any significant slowdown, and it offers a wide array of informational and formatting options.

      1. 3

        +1 to starship - it’s nice having the prompt character change color and shape when the previous command fails:

        dotfiles on main
        △ false
        dotfiles on main
        × true
        dotfiles on main
        1. 2

          Changing shape is easy enough with the code from the article, no need for separate binaries:
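
A sketch of that idea in zsh (the article’s actual code may differ; this uses the %(?.a.b) exit-status conditional):

```shell
# green ">" on success, red "×" after a failed command
PROMPT='%(?.%F{green}>.%F{red}×)%f '
```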

          1. 1

Good point! I have been tempted to ditch starship for some native fish shell goodness … someday :)

    15. 7

I wonder why people seem to generally dislike two-line prompts? I fell in love with the idea as soon as I realized it was possible:


Having the command at a fixed offset from the terminal edge makes scanning the history of commands much easier, the full path gives essential context, and time & git status are just nice touches.

      Am I missing some reasons why single line prompts are more convenient?

      1. 5

        Those are my reasons: predictability/ease of scanning. Most of my prompt is built to be as visually quiet as possible to help me focus. It’s a micro-optimization, but I love the feeling of it.

        The only real issue with two line prompts that I know of is that fish has a few open bugs around handling of redraws in the presence of those. But that’s about it.

      2. 5

        For me it’s mostly about reducing visual noise, everything I want in a prompt (directory and git status) fits comfortably in one line, optionally showing the exit status if it’s different from 0.

      3. 2

        I just stick all that crap into RPROMPT, why waste two lines with optional information when the right prompt can deal with it and get overwritten if what you type gets longer?

        1. 1

I’ve tried the right prompt, but two lines work better for me, amusingly for exactly the same reason :)

Vertical space is cheap with infinite scroll; horizontal space feels significantly more crowded.

          1. 2

            We’ll have to disagree I guess then. My prompt in $HOME is literally:

            $ .......a long way over to the right ->~

The right fills up with git status and dir information as I chdir around, but otherwise I can’t stand my prompt taking up a bajillion characters based on the directory I’m in. I want all the crap I type to be at index 2, always. But that’s just my weirdness, really.

Also means fewer lines to delete when copy/pasting my history.

      4. 2

        My prompt is 2 lines, but the second line has nothing. It’s really nice to start commands at column 0, and nudges me to use multi-line commands more.

      5. 1

        Mine is two lines as well, which kind of freaks out some people who don’t know it can even do that. This is mine, but I’d like to check out the return value stuff:

# define the unprinting start and end escape sequences, so bash doesn't
# count these as taking up room on the command line
LCYAN="\[\e[1;36m\]"
YELLOW="\[\e[1;33m\]"
NEUTRAL="\[\e[0m\]"
export PS1="*** $LCYAN\@$NEUTRAL *** $YELLOW\w$NEUTRAL  ***\n"

        I add the \h for host name in my work shell, because I ssh to a lot of places

        1. 2

          You can replace all that color stuff with %F{...} syntax, it’s going to be more readable.
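
In zsh that looks something like this (an illustrative prompt, not the poster’s actual one):

```shell
# yellow current directory, cyan time, no raw ANSI escapes needed
PROMPT='%F{yellow}%~%f %F{cyan}%T%f $ '
```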

          1. 1

            That just shows you how long I have dragged this along! And is that true as well for bash?

            1. 1

              Honestly, idk. I discovered %F stuff a week ago, before that my config had a lot of vars with ANSI escapes too! :)

      6. 1

Like you, I prefer two-line prompts, primarily for the ease of scanning. My informational line does not drastically differ between locations and projects, but having a set size/location for my commands makes it very easy for me to scan.

    16. 4

Indicating the return status of the last command is really useful, but I’ve found that just colour is too subtle most of the time. If you do this, you may want to experiment with putting the error code in the prompt if it’s not zero. Something like

%(?..(%?) )

to print (1) if the exit status was 1, for example.

      1. 1

        Riiight, I tried doing that and didn’t like it. So far that color-coded prompt seems to be noticeable enough for me, but maybe that’s because of novelty, I’ll see. :)

        1. 1

          Fwiw, I just use:

          setopt print_exit_value

          That way my prompt never changes, and I get the actual return code printed out blatantly like so:

          zsh: exit 129   ~/

          I’d rather know the actual return code, and do nothing if everything is ok and not clutter the prompt at all.

          1. 2

            That’s exactly what I replaced with colorful >. :-) I’ve been living with print_exit_value for a long time but it’s nice when command output is exactly command output. :)

            1. 1

Heh, I’d rather have the extra lines in this case so it’s obvious that something exited non-zero. To each their own!

      2. 1

        for those who care about a visible exit code, I found

        setopt PRINT_EXIT_VALUE

a nice personal solution; it stays out of your prompt entirely.

        1. 1

I gotta refresh my tabs more often (heh, jinx), but I agree entirely.

      3. 1

        IMO, I’m not a huge fan of this, as the fact that the exit status is not zero does not always mean an error has occurred. I suppose it really comes down to workflows and what tools you use all the time. But my prompt turning red because my compile failed is not any more informative than the three pages of errors I got prior to that. :)

        1. 2

          …exit status is not zero does not always mean an error has occurred.

Indeed. I just use it as a useful point of information for commands whose exit codes are significant, or where I can’t otherwise tell whether they failed. It’s always a matter of personal preference, though.

    17. 5

      As someone with only tangential exposure to databases, the way offsets affect performance was new to me. (I thought the point of alternatives was simply a matter of offering some stability to the result set.)

      When using a pointer to a specific row as the basis of the pagination, how do you normally handle having to remove the row? Do you soft-delete it, or just sacrifice bookmarked URLs referencing it?

      As for using an ordinal aspect, such as a timestamp, one thing that can happen — and has coincidentally happened to me recently with a tool I’m using — is ending up with more than N records sharing a timestamp, where N is the number of results per page, thus compromising the pagination.

      1. 5

        The row doesn’t need to exist, it’s just a boundary for a WHERE clause. This query will work correctly regardless of what exists in the DB:

        SELECT * FROM stuff WHERE key > 100 LIMIT 10

        This pattern is also applicable to non-SQL stores like S3, which provides the start-after parameter to ListObjectsV2 for the same functionality.
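A runnable sketch of that query pattern in Python with SQLite (the table and column names are made up for illustration):

```python
import sqlite3

# In-memory table with a monotonically increasing key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stuff (key INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO stuff (key, val) VALUES (?, ?)",
                 [(k, f"row-{k}") for k in range(1, 201)])

def page_after(last_key, page_size=10):
    # Keyset pagination: no OFFSET, just a lower bound on the key.
    # The boundary row itself doesn't need to exist.
    return conn.execute(
        "SELECT key, val FROM stuff WHERE key > ? ORDER BY key LIMIT ?",
        (last_key, page_size),
    ).fetchall()

first = page_after(100)          # rows 101..110
conn.execute("DELETE FROM stuff WHERE key = 110")
after_deleted = page_after(110)  # boundary row is gone; still rows 111..120
```

Deleting the boundary row between requests doesn’t break the next page, which answers the soft-delete question above.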

      2. 1

        more than N records sharing a timestamp

In this case you need a “tie-breaker” column: you order by timestamp, id or something like that, and then use the pair [timestamp, id] as a cursor.
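A sketch of that compound cursor in Python with SQLite (the schema is hypothetical); the OR-form below is equivalent to the row-value comparison (ts, id) > (last_ts, last_id) and works on engines without row-value support:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, ts INTEGER)")
# Fifteen rows share ts=1000: more than one page's worth of ties.
rows = [(i, 1000) for i in range(1, 16)] + [(i, 2000) for i in range(16, 21)]
conn.executemany("INSERT INTO events (id, ts) VALUES (?, ?)", rows)

def page_after(last_ts, last_id, page_size=10):
    # Order by (ts, id) and bound by the same pair, so rows with
    # identical timestamps are still paged through deterministically.
    return conn.execute(
        """SELECT id, ts FROM events
           WHERE ts > ? OR (ts = ? AND id > ?)
           ORDER BY ts, id LIMIT ?""",
        (last_ts, last_ts, last_id, page_size),
    ).fetchall()

page1 = page_after(0, 0)              # ids 1..10, all sharing ts=1000
last_id, last_ts = page1[-1]
page2 = page_after(last_ts, last_id)  # ids 11..15 (ts=1000), then 16..20
```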

    18. 3

      Nice, I wish there was a benchmark as suggested elsewhere. The article confused me a little because of how it uses the word cursor (I thought it was referencing database cursors, which is an alternative for offset pagination as well, but it’s just talking about row IDs…). To me this is more keyset pagination, no?

I’ve used keyset pagination instead of simple offsets with great results, although it was tricky to implement in some cases: when you start adding multi-column sorts and forward/reverse pagination, it gets a little complicated making sure the query is properly bounded by the “previous” result set.

Also, being able to use id as a bound for the next page only works sometimes (with v1 UUIDs, for example, which have the time encoded in them, it can work, but with v4 UUIDs it breaks).

        1. 2

          Direct link to the benchmark results embedded in that article:

      1. 1

        Here’s a page about a similar pagination technique it calls “seek”, with a benchmark showing that it is much faster than “offset” pagination when visiting later pages:

      2. 1

        To me this is more keyset pagination, no?

It is keyset pagination; somehow that term never stuck in my head. :)

        Also, being able to use id as a bound for the next page only works sometimes (if you use v1 UUIDs for example, which have time encoded in them, it can work, but v4 UUIDs it can break).

Yeah, id in this case was just a simple example; I assumed a plain autoincrementing integer field. :)

    19. 2

      What are the actual gains in performance? A benchmark would be really interesting.

      1. 4

We ran benchmarks at my place of employment. Read time increases linearly with the offset, which is problematic for some of our customers with large numbers of entities. A small minority of calls take over 500 ms due to large offsets, which is terrible and ruins our p999 times. The benchmarks were run both sequentially and randomly; the access pattern doesn’t seem to affect performance much (Postgres).

On the other hand, cursor-based reads take constant time regardless of how deep you page.

        Unfortunately we’re going to have to go through a deprecation process now to sort this out :(

      2. 2

        Here is more information on the topic including a reference to slides with benchmarks (see page 42 for the comparison).

      3. 2

I added a really simple comparison, and a link to Markus’ article, to the post.

        1. 1

          Great! Thank you :)

    20. 6

      I solved this problem by lifting their query DSL into types and making it (as much as I could, anyhow) impossible to construct an invalid query:

      1. 2

My problem was less invalid queries and more that the code which built ES queries from user input wasn’t very clear. In fact, it was hard to read and hard to update.

        So I think types may help, but the overall approach is what’s more important.

        1. 2

          more that code which built queries to ES from user’s input wasn’t very clear. In fact, it was hard to read and hard to update.

          Yes, that’s why I wrote Bloodhound. I know people that don’t use Haskell that still use Bloodhound anyway to generate complicated queries, using the Haskell code as a nicer, more maintainable template in effect. Look at the tests for example:

          I mentioned invalid queries because that’s the harder problem to solve. Just making something that’ll at least tidy up the API is the first step. Tightening it up so you eliminate opportunities for users to make query structures that don’t make sense is where it starts to really come together.

          The types are the interface that make it self-documenting and easier to maintain. I’ve been using ES off and on since pre-1.0 and I hated the string blob templates that I had in Python before. After I learned Haskell, I had the idea that you could use types to reify the query DSL into an interface. And I was right, it works great.

          1. 2

            Sorry, I fail to see how types are more maintainable than just plain maps. This thing:

                  let query = TermQuery (Term "user" "bitemyapp") Nothing

            is not better than just {:term {"user" "bitemyapp"}}. I’d argue it’s worse since you have to know the mapping rather than just writing this stuff directly.

            What I’m talking about is one step higher: it’s a design of a query builder. Not of a query itself, this thing is awful and ElasticSearch receives a lot of heat online for its query language design, and not for nothing. But building those queries has nothing to do with types. Invalid queries were not my problem.
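Not Bloodhound, but a tiny Python sketch of the typed-DSL idea under discussion: lift the query shape into types so the constructor (and a type checker) can catch structural mistakes, while still serializing to the plain map form. The class and field names here are my own invention for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    field: str
    value: str

@dataclass(frozen=True)
class TermQuery:
    term: Term

    def to_dict(self):
        # Serialize back to the raw map the search engine expects.
        return {"term": {self.term.field: self.term.value}}

# Mirrors the Haskell example above and its map equivalent.
q = TermQuery(Term("user", "bitemyapp"))
# If the engine's API changes shape, you change TermQuery once and a
# type checker flags every construction site that no longer matches.
```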

            1. 4

              is not better than just {:term {“user” “bitemyapp”}}.

It is better because Elasticsearch changes its API with some regularity, and with types, when we update Bloodhound to match the new API structure, you get a list of type errors everywhere in your code that you need to fix before your stuff will work again. You can get migrations done a lot faster. This isn’t hypothetical: we have production users that love this. It’s also a general benefit of types.

              You probably aren’t aware but I was a Clojure user before Haskell and maintained some libraries like Korma. I know what it’s like to maintain production Clojure code.