You can, but should you?

    1. 9

      It depends!

      For something like sum(), average(), max(), or the like, the answer’s an emphatic yes: despite doing “more” on the SQL server, you’ll actually consume less RAM (tracking a single datum of a sum/average/max/min, rather than building up a full response list) and send less over the network. That’s a definite win/win.
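
A rough illustration of the aggregate case, using Python's built-in sqlite3 module. The `orders` table and its contents are made up for this sketch, and SQLite runs in-process, so there is no real network hop here; the point is only the shape of the result each query produces.

```python
import sqlite3

# Hypothetical table, populated with the integers 1..1000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(n,) for n in range(1, 1001)],
)

# Client-side: every row is materialized in a Python list before summing.
rows = conn.execute("SELECT amount FROM orders").fetchall()
client_total = sum(amount for (amount,) in rows)

# Database-side: the engine keeps one running total and returns a single row.
(server_total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

print(len(rows), client_total, server_total)
```

Both totals agree; the difference is that the first query hands a thousand rows to the client and the second hands over one.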

      For some of the other stuff in the article, the answer’s murkier. String operations in general, including the ones they’re mentioning (e.g. GROUP_CONCAT, which is specific to MySQL, but has equivalents in other databases) are fast in MySQL, but not necessarily in other databases. On the reverse side, some complex but amazingly useful queries (such as subselects) that are fast in SQLite and PostgreSQL are slow in MySQL, because MySQL’s design requires the generation of temporary tables (or at least still did as of roughly a year ago). PostgreSQL can likewise do some complex JSON ops server-side highly efficiently, and SQL Server can do similar stuff for XML, but I’m not sure I’d recommend doing that in, say, SQLite, even if you mechanically can (via extension methods), because it can defeat the query optimizer if not very carefully implemented.
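
To make the string-aggregation case concrete, here is a small sketch comparing a database-side `group_concat` with the equivalent grouping done in application code. It uses SQLite's `group_concat` (via Python's sqlite3 module) as a stand-in for MySQL's GROUP_CONCAT; the `tags` table is hypothetical, and nothing here measures which side is faster on a given engine.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (post TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("a", "sql"), ("a", "db"), ("b", "rust")],
)

# Database-side: one row per post, tags already joined into a single string.
# SQLite does not guarantee the concatenation order, so compare as sets.
db_side = {
    post: set(joined.split(","))
    for post, joined in conn.execute(
        "SELECT post, group_concat(tag, ',') FROM tags GROUP BY post"
    )
}

# Application-side: fetch every (post, tag) pair and group them in Python.
app_side = defaultdict(set)
for post, tag in conn.execute("SELECT post, tag FROM tags"):
    app_side[post].add(tag)

print(db_side == dict(app_side))
```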

      So: should you? Sometimes! Unfortunately, you need to learn your database to know the answer.


        Also, speed concerns aside, declarative languages are nice to read and write, as the author points out at the beginning of the article.


          String operations in particular seem to vary widely in how easy or hard they are between database engines. I remember being amazed at how few string functions MS SQL Server has - IIRC, substring and indexOf and that’s about it. Super clumsy to do anything but the most basic things. On the other hand, PostgreSQL has a full regexp find and replace engine and enough functions to do just about anything you could want.


          Another benefit is that you can put all your SQL code into stored procedures, and then the rest of your code can just call that functionality. Use multiple languages and it’s still all standardized in your stored procedures.


            Aren’t stored procedures a config management nightmare? As in, you distribute your business logic between code and the db instance.


              I would argue that no business logic belongs in the stored procedures, but I’d further argue that making statements about your relations isn’t business logic. If you’re using a database as a source of truth, that’s basically all it should be worrying about: what is true about your domain, and what may be true about your domain (constraints).

              As for versioning- no, stored procedures aren’t a config management nightmare. It’s just that few organizations bothered to put any config management around them for decades. It’s not hard to implement versioned update scripts which roll the database schema forward or backwards. Honestly, it’s easier than some deployment solutions I’ve seen for application code.
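
A minimal sketch of what versioned update scripts can look like, assuming SQLite and its `PRAGMA user_version` counter as the place to record the schema version. The `MIGRATIONS` list and the `migrate` helper are hypothetical names for illustration; SQLite has no stored procedures, so the steps here create a table and an index, but on an engine that supports them the same pattern would carry `CREATE PROCEDURE` / `DROP PROCEDURE` pairs.

```python
import sqlite3

# Hypothetical migrations: (version, upgrade SQL, downgrade SQL).
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE users"),
    (2, "CREATE INDEX idx_users_name ON users (name)",
        "DROP INDEX idx_users_name"),
]

def current_version(conn):
    """Read the schema version SQLite stores in the database header."""
    return conn.execute("PRAGMA user_version").fetchone()[0]

def migrate(conn, target):
    """Roll the schema forward or backward until it is at `target`."""
    version = current_version(conn)
    if target > version:
        steps = [(v, up) for v, up, _ in MIGRATIONS if version < v <= target]
    else:
        steps = [(v - 1, down) for v, _, down in reversed(MIGRATIONS)
                 if target < v <= version]
    for new_version, sql in steps:
        conn.execute(sql)
        # PRAGMA values cannot be bound as parameters, hence the int() cast.
        conn.execute(f"PRAGMA user_version = {int(new_version)}")
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn, 2)   # forward to the latest schema
migrate(conn, 1)   # back one step: the index is dropped, the table stays
```

The scripts themselves live in version control next to the application code, which is the config management part.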


            If there are going to be multiple consumers of the data, then it’s probably a good idea to make sure that the data as stored in the DBMS is correct under the definition of correctness you’re using. That can be a strong reason to pull much of the logic around the data into the DBMS itself. There are other concerns that pull in the opposite direction, of course, and perhaps you simply are using your database as a simple persistence layer. But if you expect ad-hoc reporting, for instance, or systems that depend on the data in the database being canonical, you are definitely a strong candidate for moving the logic into the DBMS.
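
As a small sketch of keeping correctness in the DBMS itself: with constraints declared on the table, every consumer gets the same rules, whichever language or tool it writes from. The `accounts` schema and the `try_insert` helper are assumptions made up for this example, run against SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id            INTEGER PRIMARY KEY,
        email         TEXT    NOT NULL UNIQUE,
        balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
    )
""")

def try_insert(email, balance_cents):
    """Return True if the database accepted the row, False if a constraint refused it."""
    try:
        conn.execute(
            "INSERT INTO accounts (email, balance_cents) VALUES (?, ?)",
            (email, balance_cents),
        )
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("a@example.com", 500)            # valid row
rejected_negative = try_insert("b@example.com", -1)    # violates the CHECK
rejected_duplicate = try_insert("a@example.com", 10)   # violates UNIQUE
print(accepted, rejected_negative, rejected_duplicate)
```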


              For small data that you are more frequently reading than writing (and you care about performance), you shouldn’t.

              Because then, it makes sense to cache it. Read everything once in a blue moon (whenever it changes) instead of doing many small reads in your fast path. Your own internal data structures will easily outperform SQL: simple lookups are the low-hanging fruit; then if you need the power of a relational database, I made a simple binary-search-based framework for left-joining tables of tuples in C++, and so can you.
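
For a feel of how little code a binary-search left join needs, here is a sketch in Python (not the C++ framework mentioned above, which is not shown in this thread). It assumes the right-hand table is cached in memory, sorted by the join column, and has unique keys; `left_join` is a hypothetical name.

```python
from bisect import bisect_left

def left_join(left, right, key=0):
    """Left-join two lists of tuples on column `key`.

    Assumes `right` is sorted by that column with unique key values.
    Unmatched left rows are padded with None, as in SQL.
    """
    right_keys = [row[key] for row in right]
    width = len(right[0]) if right else 0
    joined = []
    for row in left:
        i = bisect_left(right_keys, row[key])      # O(log n) lookup
        if i < len(right_keys) and right_keys[i] == row[key]:
            joined.append(row + right[i])
        else:
            joined.append(row + (None,) * width)
    return joined

users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, "book"), (3, "pen")]                 # already sorted by id
print(left_join(users, orders))
```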

            1. 3

              Deliberate Git by Stephen Ball remains my favorite talk on the subject.

              1. 6

                Unfortunately I don’t use Clojure. My setup for generative “art”* is Rust and Cairo. I recently tried using Python and Cairo and found it a lot easier to work because it was more forgiving.

                Tyler Hobbs talks about using Quil for artwork here, and Ben Kovach mentions some things about infrastructure and tooling for generative art that you might find interesting.

                Quil looks really cool so I might try that out next :)

                * I’m putting quotes around art because I don’t think what I make is art. I just have fun by writing simple programs and adding some randomness. There’s not a lot of purpose or emotion behind what I do, since a lot of it is for learning Rust.

                1. 2

                  Do you have any examples to share? I’m interested in generative art but never really have the gumption-time to do it.

                1. 14

                  I’m all for unionizing, not out of “unionize all the things”, but because I’m interested in how different power structures might effect different results: experimentation, if you will, but not at the cost of free association.

                  I grew up in a family of public school teachers. I’ve seen unions mess up badly.

                  1. 24

                    My wife is a teacher in the public system and we both have mixed feelings about the union representing teachers. It does some wonderful things, but then goes out of its way to defend some absolute garbage people simply because they are in the union. And heaven forbid you have any criticisms of how the union operates. Neither of us thinks it’s so much a union thing as it is the lack of care put into building a large organization, since similar problems exist in companies.

                    1. 14

                      I’m not super-attached to the idea of unions, but it’s pretty obvious to me that we are getting exploited by the companies–especially startups–that we work for.

                      I’m not sure that a full-blown union system is the answer, mostly because I trust the soft skills and systems thinking of engineers about as far as I can throw them, but we need to start organizing as a class of labor on some basic things that keep screwing up the market for all of us:

                      • Forced arbitration
                      • Broad NDAs
                      • Broad non-competes
                      • Broad assignments of invention and other IP
                      • Lack of profit sharing
                      • Bad equity for early-mid stage engineers
                      • Uneven salary systems

                      Every company and startup gets some of these wrong, and few (if any of them) right, but because it’s accepted as “standard practice” we all end up having to endure them.

                      I don’t think we can find a one-size-fits-all solution for, say, salary ranges or other more esoteric issues, but my belief is that those specific things enumerated above are both achievable and universally beneficial for developers. They would benefit both the folks that think they can be the smartest engineer in the company and somehow make out like in the 90s, and the lifers who just quietly and competently do their jobs and switch companies when it’s time.

                      We need to push for them.

                      1. 2

                        “I trust the soft skills and systems thinking of engineers about as far as I can throw them”

                        I was a bit surprised to read that. I know engineers are infamous for falling short on “soft” skills but isn’t systems thinking supposed to be a forte of engineers?

                        1. 2

                          One would think so!

                          In my experience the first thing most smart (note: not wise, just smart) engineers reach for when confronted with a misbehaving situation, especially one involving humans, is a system. They have this idea that some intricate set of deterministic protocols and social customs will save them from the ickiness and uncertainty of dealing with other sentient rotting meat. They’re invariably wrong.

                          Outside of dealing with other people in meatspace, my current work in web stuff has similarly colored my opinion of “systems thinking”, to the point where I basically don’t trust anybody to reliably engineer anything larger than a GET route backed by a non-parameterized query to a sqlite database–they tend to want to add extra flexibility, containers, config files, a few ansible scripts for good measure, maybe some transpiler to the mix to support a pet stage 1 language feature, and all this other nonsense.

                          So, sadly, I’m reluctant to trust those folks who overengineer and underempathize to successfully build and manage a union.

                          1. 2

                            Engineers are famous for thinking that a new bit of technology could revolutionize systems which include human social behaviors.

                            I’ve met 2-3 engineers in the past decade who I would call ‘systems thinkers’. I’d like to make it onto my own list, someday.

                        2. 6

                          I have on my reading list https://www.press.uillinois.edu/books/catalog/47czc6ch9780252022432.html, which talks about the self-organized unions in the 1930s that preceded the NLRB, and the ways in which they were more democratic and more responsive to membership.

                          1. 1

                            I eagerly await your synopsis of it and maybe I’ll pick it up myself. I enjoy your writing!

                            1. 0

                              Taft-Hartley in 1947 had a terrible effect on unions, partly by banning wildcat strikes and boycotts, both of which forced union leaders to be responsive to members.

                          1. 1

                            @SirCmpwn This is pretty neat! I’m always interested in this stuff; I was an early user of and contributor to Gitlab and Gitbucket.

                            I signed up and poked around. One thing I was kind of expecting is some kind of a list of projects or users currently on sr.ht, specifically git.sr.ht. Is discoverability of projects, users, or organizations on the service a goal?

                            1. 1

                              Thanks for signing up! These sorts of social features are something I’m avoiding for the time being, to focus on delivering good engineering tools first.

                              1. 1

                                Excellent goal! Keep up the good work.

                                One last question: what are your plans for code review?

                                If I may offer something, I’d love to see more web interfaces for the git-appraise way of doing reviews, which stores the review data in the git repo itself.

                                1. 1

                                  Thanks for the kind words!

                                  I intend to build this on top of mailing lists, actually, based on the workflow used by projects like Linux and git itself. Today lists.sr.ht provides minimal code review tools, like simple patch highlighting:


                                  This will be fleshed out to allow you to participate from the web with the same kinds of emails being sent underneath.

                                  1. 1

                                    I’ll be watching to see how the workflow works out.

                            1. 3
                              • Another contribution to an OSS project on behalf of work was accepted, but this time, a release is months away instead of days away. So, we’re scoping out the effort necessary to maintain an internal fork with a faster release cadence.
                              • Abstractions conference planning is really ramping up - we got our first sponsor last week and should hear from a few more this week before commencing a big push to get on 2019 budgets.
                              • I’m investigating building a CLI conference sponsorship contract generation tool because I’ve really come to dislike Google Docs for contract writing. I’d welcome any pointers to tools that exist that could facilitate this. I’ll probably bang out something with Ruby + ERB & LaTeX + Pandoc in an evening…
                              1. 1

                                optar is another option, better for data that’s more than ~2 KiB.

                                1. 3

                                  Not really ready-made tools but some good overview of the theory behind audio segmentation:

                                  Maybe a NN/GRU-based approach like what has been integrated in Opus 1.3 may be a good practical starting point: https://people.xiph.org/~jm/opus/opus-1.3/

                                  1. 1

                                    Neat, thanks for the links!

                                  1. 2

                                    Something you might want to try instead is looking for an open-source closed-captioning tool. If you have a variety of news sources / speeches / etc etc that you run CC on, then you can create a big database of speech transcriptions. Later, when you have a “suspicious” recording, you could CC that too and then do an index lookup against your DB to see how closely it matches original context.

                                    It’s not the same as detecting edits “from nowhere”, but it could be useful for quickly spotting shady editing for high-profile sources like political figures and celebrities.
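
The index-lookup step could be sketched like this, assuming transcripts are already available as plain text. The function names, the choice of three-word shingles, and the sample sentences are all made up for illustration; a real closed-captioning tool would add timing data and transcription noise that this ignores.

```python
from collections import defaultdict

def shingles(text, n=3):
    """Overlapping runs of n consecutive words, lowercased."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def build_index(transcripts, n=3):
    """Map each shingle to the set of sources whose transcript contains it."""
    index = defaultdict(set)
    for source, text in transcripts.items():
        for sh in shingles(text, n):
            index[sh].add(source)
    return index

def coverage(index, suspect, source, n=3):
    """Fraction of the suspect's shingles that also occur in `source`."""
    sh = shingles(suspect, n)
    if not sh:
        return 0.0
    hits = sum(1 for s in sh if source in index.get(s, ()))
    return hits / len(sh)

index = build_index(
    {"speech": "we will not raise taxes on working families this year"}
)
verbatim = coverage(index, "raise taxes on working families", "speech")
spliced = coverage(index, "we will raise taxes on working families", "speech")
print(verbatim, spliced)
```

A verbatim excerpt scores 1.0, while the clip with a word removed drops below that because the shingles spanning the cut no longer match anything in the original.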

                                    1. 1

                                      I’d considered automatic transcription, then fall back to the harder stuff, but I’m concerned that the quality might not be enough. However, presumably, the transcription system would output the same text for the same input, even if cut up, right?

                                    1. 1

                                      If I remember right, the most common way to do this is to build a spectrogram of your sampled audio (which is basically an FFT over time) and look at what spectrograms of reference audio it appears to be a subset of. There’s no reason why you couldn’t adapt another implementation to report not just a match, but also where the match was found. You might find it needs tuned because there’s more information carried in music than speech, or that the overall approach doesn’t work too well, but it’s what I know for now.

                                      As an aside, what does “silence on the waveform” actually mean? A zero crossing point? A number of samples all at 0? This might be a worthwhile step forward but it’s trivially defeated by overlaying small amounts of noise, or carefully putting the two subsections back together after removing a word, etc.
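
A toy version of the spectrogram-matching idea, in pure Python so it runs without extra libraries. It is only a sketch under heavy assumptions: frames do not overlap, no window function is applied, the needle is assumed to line up with the haystack's frame boundaries, and the DFT is the naive quadratic one. Real implementations typically use an FFT library and more robust feature matching; `spectrogram`, `best_match`, and `tone` are hypothetical names.

```python
import cmath
import math

def spectrogram(samples, frame=32):
    """Magnitude spectrum of each non-overlapping frame (naive DFT)."""
    frames = []
    for start in range(0, len(samples) - frame + 1, frame):
        chunk = samples[start:start + frame]
        mags = []
        for k in range(frame // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame)
                      for n, x in enumerate(chunk))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

def best_match(needle_spec, hay_spec):
    """Return (frame_offset, squared_distance) of the closest alignment, or None."""
    best = None
    for off in range(len(hay_spec) - len(needle_spec) + 1):
        dist = sum((a - b) ** 2
                   for nf, hf in zip(needle_spec, hay_spec[off:])
                   for a, b in zip(nf, hf))
        if best is None or dist < best[1]:
            best = (off, dist)
    return best

def tone(freq_bin, frame=32):
    """One frame of a sine wave centred on the given DFT bin (test signal)."""
    return [math.sin(2 * math.pi * freq_bin * n / frame) for n in range(frame)]

haystack = tone(1) + tone(3) + tone(5) + tone(7) + tone(2)
needle = tone(5) + tone(7)
match = best_match(spectrogram(needle), spectrogram(haystack))
print(match)   # (frame offset, distance); multiply the offset by the frame size for a sample position
```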

                                      1. 1

                                        what does “silence on the waveform” actually mean

                                        Forgive me, I’m still building my vocabulary in this context! I think I mean 0 for an extended time: Audacity shows the waveform flat at 0 when zoomed really far in. In the candidate haystacks, there’s virtually none of that.

                                        trivially defeated

                                        Yeah, it would be. Detection beyond a sloppy “copy and paste” job is out of scope right now.
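
The "flat at 0 for an extended time" definition could be sketched as a scan for long runs of zero-valued samples. The function name, the default run length, and the optional threshold are assumptions for illustration, and as noted above, a little overlaid noise would defeat it.

```python
def silent_runs(samples, min_len=100, threshold=0):
    """Return (start, end) index pairs for runs of at least `min_len`
    consecutive samples whose absolute value is <= threshold.
    `end` is exclusive."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            if start is None:
                start = i                  # a quiet run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the recording.
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples)))
    return runs

# Synthetic signal: 100 noisy samples, 200 exact zeros, 100 noisy samples.
signal = [3, -2] * 50 + [0] * 200 + [1, -1] * 50
print(silent_runs(signal))
```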

                                        1. 1

                                          Cool. I think it’s a worthwhile and probably interesting project anyway. As mentioned, one approach would be to create a spectrogram, and then identify features in the time-frequency-intensity space, and look for those same features in other places.

                                          There’s probably useful research on this in computer vision, where they instead view it as X-Y-intensity for b/w images.

                                      1. 2

                                        A question: how could you tell who did the audio edit?

                                        A splice by the creator of the podcast should be indistinguishable from a malicious one from someone else.

                                        1. 1

                                          should be indistinguishable

                                          It should be, but in comparing one target needle with a few candidate haystacks, the haystacks weren’t ever silent: there was always a light hum or some background noise when someone wasn’t talking, i.e. the active speaker was taking a moment to pause.

                                        1. 4

                                          This seems to be a lot of, erm, work. Is there a spin of OpenBSD or another BSD that just kinda… works out of the box?

                                          1. 2

                                            OpenBSD-based distros show up periodically but usually disappear. I occasionally Google them to see what’s out there hoping we eventually get an Ubuntu or Mint… even a fraction of that focused on critical things… based on it. The ones I remember finding were Anonym.OS, OliveBSD (link’s dead), and MirOS. Last one still has a website up.

                                            1. 1

                                              TrueOS, nee PC-BSD, based upon FreeBSD?

                                              1. 1

                                                This is close to what I was looking for. I see Project Trident is a spin-off of that…

                                                1. 2

                                                  GhostBSD too.

                                                  TrueOS itself was a ready to use desktop, but now they’re moving towards just being a fork with some differences (LibreSSL, OpenRC etc.)

                                              2. 1

                                                I’ve been very happy with NixOS for that. I’ve been using it at work for 6+ months, and when I received my new XPS 13, it was so straightforward to get something close to what I’m used to that it would be very hard for me to go back to something else…

                                              1. 4

                                                Several of my peers at Arcadia.io’s Pittsburgh office are hiring.

                                                If you see “data ingestion pipeline” in any other job description, that’s my team. I’m building a pipeline right now for a start sometime later in Q1 next year on my team doing stuff in Scala, Groovy, Ruby, and Rust.

                                                1. 1

                                                  I don’t suppose you’re looking for remote engineers or sponsoring visas? Couldn’t find anything in the posting.

                                                  1. 1

                                                    We’d consider remote for someone who fits the position perfectly or is willing to move near Pittsburgh or Boston within a few months.

                                                    My team is actually “local remote”: we all live in the greater Pittsburgh area but only go into the office once a week. However, the other teams are the opposite: they work in the office daily but remotely one day per week.

                                                    We do visa sponsorship for the right candidate.

                                                    1. 2

                                                      Sounds good. I’ll give it a shot :-)

                                                1. 5

                                                  I’m picking up my election day materials. I’m probably one of the few Crustaceans that’s in elected office: Judge of Elections for my district! It’s a start on learning more about how elections are conducted. The election on Tuesday will be my second in my position.

                                                  TL;DR VOTE TUESDAY NOVEMBER 6 and you’ll see me if you live within ~500 yards of me, haha.

                                                  1. 3

                                                    I mean, it’s right on https://commonsclause.com that anything bearing the Commons Clause does not meet the definition of open source.

                                                    Is this “Open Source”?


                                                    “Open source”, has a specific definition that was written years ago and is stewarded by the Open Source Initiative, which approves Open Source licenses. Applying the Commons Clause to an open source project will mean the source code is available, and meets many of the elements of the Open Source Definition, such as free access to source code, freedom to modify, and freedom to re-distribute, but not all of them. So to avoid confusion, it is best not to call Commons Clause software “open source.”

                                                    Emphasis mine.

                                                    I proffer that anyone pretending that their copyrighted, Commons Clause-licensed software is open source is deluding themselves and fraudulently representing their software to be open source when it’s really not. That’s the part that’s not OK to me.

                                                    1. 4

                                                      I was laid off last week[1], so I should be applying for jobs but instead I’m taking a moment to collect myself.

                                                      [1]: Company wide lay offs, unfortunately.

                                                      1. 2

                                                        What kind of gig will you be looking for?

                                                        1. 1

                                                          Not really sure, so that’s why I’m also not rushing into anything.

                                                          Heck, I might even get a minimum wage job for the time being. It’s nice not having to code or meet hard deadlines.

                                                          1. 2

                                                            Yeah it’s probably a good time to take some time off if you’ve gotten a decent severance.

                                                            1. 1

                                                              if you’ve gotten a decent severance

                                                              I got three weeks of severance, so probably will need to figure out something money-wise fairly soon.

                                                      1. 2

                                                        The first round of Abstractions conference sponsorship pitch emails are going out later this week after getting a verbal commitment from what may be our top sponsor late last week!

                                                        The thing I’ve been building for work for the last 16 months is going into customer hands later this week! There are still some bugs to work out, though.

                                                        I counted up my hard drives and I’ve got nearly 30 drives with nearly 36 TB of storage, but only 16 TB of that is powered, with only about 8 TB of 12 TB in RAID6 actually occupied. All of the drives are out of warranty now, as is the NAS I’ve used for nearly 10 years, so I need to figure out something to do with these drives that totally still work while also gearing up for buying a new NAS and populating it.

                                                        1. 10

                                                          Funny how accountability never translates into bonuses.

                                                          1. 3

                                                            Yeah without the ability to provide financial incentives it severely limits the manager’s toolbox. It’s like driving with your elbows.

                                                            1. 2

                                                              OTOH, opportunity sometimes lurks when there are lapses in accountability.

                                                              1. 1

                                                                While I think this assertion is generally true, there’s something to be said about the personal merits of undertaking projects exactly like what is described in the post. I turned a bet on a porting project in Q2 2014 into a leadership position in Q4 2014 (got a small raise) and then into a new product with me as its architect and lead dev in Q3 2015, which led to a new job building a similar product from scratch in Q2 2017 (massive raise). There was another porting project I undertook during that time that failed!

                                                                It all started because I wanted to see if I could make tests for this app run on a Mac instead of having to boot a Linux or Windows VM or push to CI for a red-green-refactor cycle.

                                                              1. 8

                                                                I might actually get around to setting up a Matrix Voice Pi-hat that I’ve had for a few months now. I’d like to have a voice assistant that is more customizable than Google Home!

                                                                I’ve had a Blue Yeti microphone that needs its USB port replaced. I got some USB ports in from China a couple of weeks ago and might try my hand at desoldering and resoldering for the first time in a long time.

                                                                I’m probably going to harvest the rest of my garden before it gets frosty next week. I’ve also got a gaming rig that was hit by a Windows update that broke its USB ports. I got some PS/2 equipment and just need to allow the time to fire it all up.

                                                                1. 1

                                                                  If I understand it correctly, Matrix Voice is essentially the hardware for an Amazon Echo like device. You still depend on the Amazon cloud for the voice recognition? Could you use some Open Source non-cloud software, e.g. snips.ai?

                                                                  1. 2

                                                                    Yes. There are some articles on hackster for setting up Matrix Voice or Matrix Creator with Amazon Alexa, Google Assistant, or Snips.AI. I also found that the Mycroft project has something called Picroft that may be adaptable.

                                                                    I followed the Google Assistant tutorial last night. It went pretty smoothly and works fairly well, although I realize that I don’t have an unpowered speaker in my possession right now!

                                                                    Today, I’m going to set up something that can be a Chromecast receiver. Where I plan on putting this device, that is the Holy Grail thing that I really want, because I have a couple hundred dollars’ worth of Raspberry Pi, niche microphone board, accessories, and a 5.1 audio system from 2005 connected to a Bluetooth receiver that I almost never use… all to avoid paying $35 for a Chromecast Audio or $50 for a Google Home Mini. I’m basically building my own Google Home Max but with a speaker system that can shake my house.

                                                                    Edit 10/20 morning: TIL after about 2 hours of research that Google doesn’t want people making their own Chromecasts basically at all. I might be able to use it as an AirPlay server, though, and that might be good enough for my usage.

                                                                    Edit 10/20 evening: I got Shairport-sync working but it broke my Google Assistant setup. Some kernel header weirdness.

                                                                1. 3

                                                                  Something that might take more battery than a purely e-ink solution, but perhaps be easier to drive and find, would be to use a display like the one on the original OLPC. It was a color screen that could go down to a 4-bit, unlit mode that looks absolutely gorgeous in any light for a grayscale display. I wonder if that would be any easier to come by but still accomplish a lower battery usage than a regular display.

                                                                  1. 9

                                                                    The generic name of that technology is “sunlight-readable LCD”, and it’s definitely still available. Also much faster, somewhat cheaper, and more colorful than e-ink. Uses more power, but transflective displays work without backlight, though at a relatively narrow viewing angle. Some of the old Blackberries had them.