1. 2

    I think query builders are pretty nifty tech in that their surface area is super small, and help out a bit (though tbh manual joins when you have a lot of relational data is a pain)

    What I still haven’t seen but want is some way for us to have client-side plan builders. Instead of giving Postgres an SQL string and for it to pass through an optimiser, I would love to be able to describe a query plan which sidesteps the planner and gives the executor the “ideal” way of handling things for the shape of my data.

    I actually know what my data looks like, and in some cases I know that if I could convince the planner to do a thing, it would go much faster. But this is rarely given to us. Given how many people love dropping down to C/assembler for stuff, I’m surprised to have never seen someone try to circumvent SQL to build nicer plans.

    1. 1

      Do you experince a lot of query planner inefficiency? I’ve noticed that (at least for MySQL/Percona) it provides decent results as long as you don’t have any low cardinality indexes.

      1. 2

        So for 95% of stuff this is absolutely no problem.

        There’s a couple really tricky queries where I have chosen to optimise for maintainability of the query/just relying on the ORM over trying to get the machine to do “what I want”. This often ends with me reaching for subqueries that just conceptually can’t be aware of the data properties I’m aware of. Stuff like “well for rows with this property I don’t actually need this subquery result”. But this stuff is hard

    1. 4

      In my experience, there has always been more nuance between when to use ORMs/query builders/raw queries. There’s no reason why you can’t use an ORM for inserts/updates/simple selects and drop down to raw SQL for more complex queries. Query builders vs. raw SQL is another tradeoff: with query builders you have to learn new syntax, and it makes things more difficult to debug in production ( especially when paged at 3AM ;) ). With raw SQL though, you can write broken/insecure queries.

      1. 6

        One thing that happens with ORM, at least in the 20 odd years I’ve been grappling with them, is that they encourage object thinking, which is almost without fail a gigantic detriment to the design of the database. ORM thinking is closely correlated with the “nobody will ever access this data without the gigabytes of runtime” school of thought, and tends to impose severely suboptimal decisions onto the database, which is treated as a simple persistence layer.

        1. 1

          In my experience that has less to do with ORMs and more to do with people designing their objects before (or instead of) designing their tables. Hand-rolling all of your SQL won’t save you if you didn’t put the effort into the table-design (although it might alert you to how dogshit your tables are to work with, that assumes you didn’t farm all of the boilerplate queries out to some junior who doesn’t know any better and doesn’t complain, and you’re still stuck refactoring your objects if you started the wrong-way around). It’s the upfront table design that’s truly the important thing, not the mechanics of how queries get written afterwards.

          With proper upfront Table design, I’ve found ORMs useful for eliminating a lot of boilerplate scut-work, while still allowing direct access to SQL whenever you need to get smarter with your queries.

          1. 2

            But I don’t deny that actually modelling your data before getting started is critical.

            1. 1

              Well, sure. But ORMs push the pain further out until it’s probably too late to fix the catastrophe.

            2. 1

              Biggest complaint for me is ORMs for updating information on data objects that are deeply nested.

              From a performance standpoint, what should be a one-off update_one or UPDATE turns into the clown car of loading in the object, fetching associations, eventually setting a value, and shoving it all back into the DB.

              Granted, I also hate deeply-nested objects, soooooo…

              1. 1

                Relatedly, transactionally updating a single field via an ORM is a deadlock footgun (since it’ll fetch without using for update, then try to lock the record).

          1. 2

            How does this book compare to the other books about competitive programming? Eg. https://cpbook.net/ or https://cses.fi/book.html ?

            1. 4

              I’d like to know what people are using for the diagrams I always see in these books, and in technical presentations.

              The ones where we have a nice diagram of a data frame and its fields, or of a stack and how the frames are aligned in order.

              Are people just preparing these in GIMP or is there a simpler answer for vector graphics like these?

              1. 2

                I think people typically use LaTex for these sorts of diagrams. You can also use a tool like plantuml to generate a certain class of diagrams.

                1. 2

                  In LaTeX there’s TikZ; in groff there’s pic; in markdown/html you can try ditaa or some later implementations of a similar idea (quite a lot of reinvented wheels, most of them interesting). Other than that there’s dia or other standalone apps for drawing diagrams. Then, there’s Inkscape, Adobe Illustrator, and other generic vector drawing programs. And finally, you can fall back to raster drawing.

                  1. 1

                    I’m not sure what people are using in general, but the author is using Monodraw for some of his diagrams.

                    1. 1

                      A lot of Mac users use omnigraffle and I bought a license this year. On the surface it doesn’t appear to be extraordinarily feature ridge for a vector drawing tool, but it is incredibly good and fast for drawing illustrations for technical documents.

                    1. 2

                      A link to something more substantial than Amazon would be appreciated.

                      1. 1

                        Looks like @amontalenti managed to find a talk by the author about the book in the comments.

                      1. 3

                        Have you read it? Would you mind reviewing it? Who should read it and why?

                        1. 3

                          I’m about halfway through right now. I’d recommend it to anyone writing software; it’s like a shorter, more concise version of “Code Complete” but with better examples and with some unconventional wisdom (such as long methods aren’t bad if they encapsulate complex logic from start to finish)

                        1. 2

                          Interesting article!

                          The method getRestaurantMenus, when simultaneously invoked by many coroutines, will result in one of the coroutines winning the race condition and successfully entering the body to execute fetchMenuFromRemoteCacheOrDatabase.

                          It looks like this is solving the cache stampede problem with the locking approach, but using deferred coroutines for the locking. Couple of questions for the author:

                          1. Have you considered working with a CDN cache to eliminate stampedes? With a one second cache, DoorDash should be able to reduce the number of incoming requests to a single menu to the number of CDN PoPs per second.
                          2. For the other requests that are waiting, do they serve stale data and return, or just wait until the winning coroutine’s database read completes?
                          1. 2

                            Hey, If you look closely we are using the deferred not as a using mechanism but as a grouping mechanism. The best part about this approach is the late comers. So if your reads are expensive the readers coming towards the end (when Deferred is about to be fulfilled); see lesser latency. To answer your question:

                            1. The above mentioned scenario is just used as an example, of-course one can use CDN to prevent for this scenario. We have done something similar at places where it was applicable. We use this techniques at various places including identity systems, where putting up such information will be a bad idea.
                            2. Other coroutines just wait for winning coroutine to complete it’s read. You can have all sort of variations on top of it, e.g. have some sort of timeout and return stale data if scenario permits or start your own DB read. The gist resides in using promises to avoid repeated reads.
                          1. 4

                            Have you considered leaving voicemails directly on someone’s phone without a missed call? It’s possible to dial into voice mailboxes directly (eg. SlyDial).

                            1. 2

                              Hm, that’s a possibility we haven’t explored yet, thanks for the idea!

                              However, accessing your voice mail while on roaming costs money, doesn’t it?

                              1. 1

                                However, accessing your voice mail while on roaming costs money, doesn’t it?

                                Can cost money even when not roaming depending on your contact, it’s only been the last few years in the last 15 I’ve had a contract that hasn’t charged extra for voicemail access. (Although I guess if I’m in my residential country I could just get emails over data, so that’s less of a concern.)

                            1. 9

                              I wouldn’t write off desktop mail clients just yet. Both Pine (without Maildir patch) and Eudora use the mbox format, which is prone to corruption as it is essentially all of your messages concatenated in one plaintext file. For large volumes of email, it’s better to use the Maildir format, which stores each message in a separate file.

                              1. 6

                                Maildir is certainly better than mbox (which is an abomination), but it is still susceptible to the problems outlined in the OP.

                                1. 1

                                  Doesn’t thunderbird store messages in a SQLite DB?

                                  1. 1

                                    I dunno. My thunderbird corrupted (again) recently and i gave up.

                                    1. 1

                                      It uses the mbox format too. I wonder indeed why they don’t use something more robust like SQLite.

                                  1. 4

                                    Or… just stick with short options and reduce the clutter. After all, help is just a manpage away.

                                    1. 7

                                      Two counter-arguments:

                                      1. When reading commands you’ve written (in a shell script, for example), long opts are much more self-explanatory.
                                      2. Sometimes you need more than 26 flags

                                      Edit with a third: when two common ops occupy the same alphabetical space, such as --version and --verbose.

                                      1. 7

                                        Everybody knows -V for version and -v for verbose. :)

                                        1. 2

                                          Except for the cases where it’s the other way around, and whenever there isn’t a verbose switch -v is generally used for version. Sort of annoying when you have to look up the version flag in the manpage (if it exists) if you don’t want to program to actually start doing whatever.

                                          1. 2

                                            I’m also very not-a-fan of having to guess whether the syntax for “more verbose” is -vvv, -v3, -v=3, -v 3, –verbose=3, –verbose 3, …

                                            But this is a lost cause.

                                            1. 4

                                              This is the chapter of the Unix-Hater’s Handbook that has held up the best; X11, while still atrocious, has gotten less warty and terrible. C++ is a much better language than it was in 1991. sendmail is functionally extinct. But the state of Unix terminal software is still utter pants.

                                              1. 1

                                                I would have guessed that “pants” as an intensifier had positive valence, rather than negative. Does the manpage…? :)

                                                Heh, anyway, as you say. It’s nice to at least be able to look at progress in some areas.

                                    1. 2

                                      If anyone is interested in more advanced RE exercises, I have found wapiflapi’s exrs to be a very good series.