1. 25

    First off, I’m not out to belittle Ken Thompson’s efforts here. Writing an assembler, editor and basic kernel in three weeks is highly respectable work by any standard. It’s also a great piece of computer lore and fits Blow’s narrative perfectly - especially with Kernighan’s little quip about productivity at the end. Of course, we don’t know how “robust” Thompson’s software was at this stage, or how user-friendly, or what kind of features it had. I’m going to boldly claim it would’ve been a hard sell today, even if it did run on modern hardware.

    Ooh, I can answer this! Ken Thompson’s first version of Unix was just the barebones he needed to run Space Travel, a game he wanted to port from MULTICS. It wasn’t anything close to what we’d recognize as a Unix.

    Despite that, I’m willing to bet a few bucks there are more people around today (including youngsters) who can program in C than ever before, and that more C and assembly code is being written than ever before.

    I was thinking the same thing. Maybe in relative numbers there are fewer people who understand the low-level things, but the absolute numbers are orders of magnitude higher.

    1. 10

      Maybe in relative numbers there are fewer people who understand the low-level things, but the absolute numbers are orders of magnitude higher.

      This is very powerful thinking and unlocks a lot of counter narratives to popular beliefs.

      I remember when the Wii came out and people were worried that games would stop being “good” cuz so many casual games existed now (see also mobile game stuff). The ratio with sales was super weird but ultimately we were looking at bigger pie stuff.

      I think a similar thing has happened with software sales as well relative to mobile applications

      1. 3

        Ooh, I can answer this! Ken Thompson’s first version of Unix was just the barebones he needed to run Space Travel, a game he wanted to port from MULTICS. It wasn’t anything close to what we’d recognize as a Unix.

        Not only that, you can actually run it in SIMH’s PDP-7 emulator! The original source code is available, including the source for the game Space Travel. The ~3k-line kernel is bundled with ~18k lines of user-space programs (assembler, debugger, text editor, disk/file management utilities, some games, etc.). Maybe less barebones than you’d expect. To me this initial UNIX version is akin to a prototype to see if the approach would work (later evolved and refined into Research UNIX, of course).

        1. 2

          it wasn’t anything close to what we’d recognize as a Unix

          Do you have any recommendations for philosophical/historical/narrative reading produced by these folks or their contemporaries? I’ve consumed plenty of their writing on the technical side, but most of the philosophical/historical/narrative accounts I’m aware of seem to have gone through the filters of other people involved with later/divergent parts of the historical trajectory (GNU, Linux, FSF, and so on).

          1. 2

            Have a look around on multicians.org. I’ve enjoyed some of Tom Van Vleck’s pieces like his history of electronic mail.

            1. 3

              The Unix-Haters Handbook http://web.mit.edu/~simsong/www/ugh.pdf

              Lions’ Commentary on UNIX https://cs3210.cc.gatech.edu/r/unix6.pdf

        1. 42

          In my day it was called HACKING and it documented the code as it stood two years ago, if you were lucky.

          1. 9

            Where I’ve seen this done well, it has always been rolled into CONTRIBUTING as a lightweight place to point new developers toward any non-obvious logical entry points, rather than as a place to bother with articulating high-level architectural decisions.

            That is to say, it firmly sticks to the “what” instead of the “why” and makes no pretense of being a comprehensive document. As you point out, the “why” is probably out of date, but “why” also doesn’t really matter to anyone who hasn’t already wrapped their brain around the whole mental model of the application.

            No ready examples immediately jump to my mind, but I know I recently saw a good page in a go project noting the rough equivalent of a main() that was neither the primary entry point of the application itself nor a cleanly separated area of concern (i.e. module). One could argue this already suggests a poorly designed architecture, so I wish I had this example ready at hand - this is where I’d make some kind of argument about not letting “perfect” get in the way of “practical”, but of course I see the silliness of that point when I’m already speaking at a purely theoretical level.

          1. 2

            Python 2 support has been dropped by pip the Python Packaging Authority (PyPA) at the Python Packaging Index (PyPI), which is the default configuration for every distribution of pip I’m aware of. If you have critical dependencies on Python 2 packages and are unwilling to migrate to Python 3, set up your own package index (which is definitely more work than migrating to Python 3, but is a choice you have).

            Another easier alternative would be to pull the libraries you depend on directly into the vcs for your legacy project.

            1. 8

              This is a change to pip, not to PyPI. You can still use an older version of pip on Python 2 to install packages from PyPI. (That might change in the future, but it doesn’t seem to be under consideration in the maintainers’ threads on the issue.)

              1. 2

                That wasn’t what I read the change to mean, so absolutely needed your clarification. Cheers.

              2. 3

                Vendoring dependencies is pretty mechanical. Migrating to Python 3 is only partially mechanical. So I’m not sure why you would say the former is more work. It doesn’t seem like it to me. By far.
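                To make “pretty mechanical” concrete, here’s a minimal sketch of the sys.path side of vendoring. The `vendor/` directory name and the helper function are illustrative, not from any particular project:

```python
import os
import sys

# Illustrative layout: copied dependency sources live in an in-repo
# "vendor" directory, e.g. vendor/somelib/__init__.py
VENDOR_DIR = os.path.join(os.getcwd(), "vendor")

def enable_vendored_packages(path=VENDOR_DIR):
    """Prepend the vendor directory to sys.path so its packages
    shadow any system-installed versions of the same name."""
    if path not in sys.path:
        sys.path.insert(0, path)
    return sys.path[0]

# Call this before any third-party imports in the legacy project.
enable_vendored_packages()
```

                The mechanical part is copying the sources in; the one design decision is whether vendored packages should shadow system-installed ones (prepending, as above) or only fill gaps (appending).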

                1. 1

                  I wrote the parenthetical statement in reference to setting up your own index, then went back to add the alternative option (of vendoring dependencies), so that was an unintended misstatement. Thanks for the clarification; you are completely correct. Original comment now reflects the correction above.

                  1. 2

                    Interesting, okay. Depending on circumstances, setting up your own index could also be easier than migrating to Python 3 though too. Migrating to Python 3 can be exceptionally difficult. I’ve lived through it.

                    1. 1

                      I talked a friend through the decision at his $work, and they decided the site deployment wasn’t large enough to warrant going in that direction, since it presented an ongoing cost for onboarding future people and maintaining infrastructure. They ultimately pulled all libraries into their own repo, not even vendoring, after going through the process and discovering none were being actively maintained anyway.

                2. 2

                  Python 2 support has been dropped by pip the Python Packaging Authority (PyPA) at the Python Packaging Index (PyPI)

                  I don’t think that this statement is correct. Pip dropping support for Python 2.7 doesn’t inherently have any implications for what packages can be uploaded or downloaded from PyPI. I’ve not heard a peep about dropping support for any version of Python on PyPI.

                  Vendoring your unmaintained Python 2.7 dependencies may be useful for other reasons: while you’ll still be able to use pip 20.3.x for a while, it will eventually atrophy, like everything else in the 2.7 ecosystem. Vendoring may ease any sustaining engineering you do, and it’ll help you avoid relying on an unmaintained client application. However, the version of pip shipped with your Linux distribution will doubtless continue to be supported (meaning: receive security updates to the TLS stack) for years, and PyPI is unlikely to break it.

                1. 17

                      Looks like more and more people are realising that the next usability iteration on the terminal is seeing the result as you type. More applications keep implementing this workflow; probably shells will implement it in a general fashion in the years to come.

                      Some years ago I hacked together a small curses program that accepted a command with a placeholder and presented a prompt that would re-run the command with the new input on each keypress. I never published it because it was very hacky and quite dangerous if you’re not careful.
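                      The core of such a tool is small. Here’s a hedged, non-curses sketch of the re-run step (the function name and the quoting choice are mine, not the original hack):

```python
import shlex
import subprocess

def run_with_placeholder(template, query, placeholder="{}"):
    """Substitute `query` into the command template and run it,
    returning captured stdout. Quoting the query as a single shell
    argument is what keeps this from being "quite dangerous"."""
    cmd = template.replace(placeholder, shlex.quote(query))
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout

# A real tool would call this from a curses loop on every keypress
# and redraw an output pane with the result, e.g.:
#   run_with_placeholder("grep -rn {} src/", current_input)
```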

                  1. 7

                        This is very true for text editors as well IMO, which is why I use kakoune, which shows incremental results as you perform complex combinations of actions or select based on a regex.

                    1. 3

                      I think you can sort of do this with https://github.com/lotabout/skim#as-interactive-interface – perhaps even integrate that into the shell itself

                      fzf may have a similar option

                          The problem, in my opinion, is process-spawning overhead: doing it for every keystroke needs debouncing, and at that point the UI starts to lag. If apps have native support for it they can do something more efficient.

                      1. 1

                            Yes. It did essentially what the tool you linked does. The UI doesn’t need to become unresponsive; text input is decoupled from external process execution.

                        1. 2

                              I don’t mean that the textbox itself becomes unresponsive, but rather that the preview has to bear the cost of process startup and of starting from scratch; depending on the operation being previewed, that can be expensive. I have had this experience with the exact ag example… ag takes time to search things, but depending on the previous preview it may not have to search the entire space again.

                          st is very good at dampening the impact but things can be better with a different architecture

                      2. 2

                        I’m interested to see what kinds of things Jupyter might inspire in shells. The notebook workflow mostly fits tasks with requirements halfway between an interactive shell and an executable file (as in exploratory data analysis and the like), but the concept has already made its way over to the text editor side (in VSCode you can use a magic comment command to delineate and execute individual code cells within a file to view output while still editing, as if it were a notebook). I wonder what that might conceptually look like if taken to the shell side instead of the editor side.

                        1. 0

                          Aren’t you describing fish? :)

                        1. 2

                          Interesting overview of the Nix approaches, so thanks for sharing. My curiosity has been piqued by all the NixOS posts I’ve been seeing even though I don’t run it myself. Any reports from folks running a bunch of devices like the one from this post on their home network? What are you using them to do?

                                My home router/firewall/dhcp/ipsec server (atom x86-64) and file/media/proxy/cache/print server (celeron x86-64) are OpenBSD, so when I bought a Beaglebone Black (ARMv7) to toy with, it was fun to go down a rabbit hole pretending I was @tedu (his 2014 post) to learn about diskless(8) and pxeboot(8) and how to netboot via uboot. This ended up being pure experimentation since the actual parallelized work I do at home is on a single beefy Linux workstation (hard requirement on Nvidia GPU for now) and I’m not a professional sysadmin. The BBB sits disconnected in a drawer, but the setup lives on as the mere handful of config line changes required to set up tftpd(8) on the file server and point dhcpd(8) to it from the router, so I gained a more complete understanding of those as a neat side effect of experimenting. At some point in the next couple years I’m going to want to play with a RISC-V SoC, but that’s going to mean looking at Linux again unless I magically become competent to write my own drivers.

                          1. 8

                            I just converted my last non-NixOS machine yesterday, so I’ll share my experience =]

                            I currently have 5 machines running NixOS and deployed using NixOps (to a network called ekumen):

                            • laptop, ThinkPad T14 AMD (odo)
                            • workstation: Ryzen 9 3900X, 128GB RAM (takver)
                            • compute stick: quad core Intel Atom, 2GB RAM (efor)
                            • rpi: 3B+, 1 GB RAM (gvarab)
                            • chromebox: i7, 16GB RAM (mitis)

                            I set up the workstation and chromebox as remote builders for all systems, just as @steinuil did in the post. I’m using the rpi for running Jellyfin (music) and Nextcloud (for sharing calendars and files with my spouse), and setting up the chromebox to be an IPFS node for sharing research data. The laptop and workstation are using home-manager for syncing my dev environment configurations, but I do most of the dev/data analysis in the workstation (which has gigabit connections to the internet), and while the laptop is often more than enough for dev, my home connection is way too slow for anything network-intensive (so, it serves as a glorified SSH client =P)

                            They are all wired together using zerotier, and services running in the machines are bound to the zerotier interface, which ends up creating a pretty nice distributed LAN.

                                  I don’t have my configs in public (booo!), because I’ve not been too good at keeping secrets out of the configs. But @cadey’s posts are a treasure trove of good ideas, and I also enjoyed this post and accompanying repo as sources of inspiration.

                            1. 1

                              I don’t really see the value nixops provides over nixos-rebuild which can work over ssh.

                              1. 1

                                That’s a fair point. Part of using nixops was about exploring how to use it later for other kinds of deployment (clouds), and it is a bit excessive for my use case (especially since I use nixops to deploy locally in the laptop =P).

                                A lot of my nix experience so far is seeing multiple implementations of similar concepts, but I also feel like I can refactor and try other approaches without borking my systems (too much).

                            2. 2

                              On the Pi from the post I run:

                              • syncthing
                              • Navidrome so I can listen to my music library on my phone
                              • twkwk, a small program that serves my TiddlyWiki instance
                              • synapse, a torrent client which is actually the Rust program I mentioned in the post
                              • some SMB shares with Samba that serve the two drives I use for torrents and music
                            1. 14

                              I’ve been really tempted to buy a remarkable2. But the reviews I see say it’s great for note taking but not so great for just reading PDFs. Mostly I want to read PDFs. I’m still on the fence.

                              1. 14

                                As long as your PDFs don’t require color, it is 100% worth it. Definitely one of my favorite devices at the moment.

                                1. 5

                                  Same. In the month or so I’ve had one, it hasn’t caused me a single frustration (and I’m the kind of person who gets annoyed at the user interfaces of my own Apple products). It works exactly as advertised. Anyone who thinks it might be worth the price tag should watch a third party review video and check out the official and awesome list projects. It has been awhile since I’ve stayed this excited about a new device so long after buying it.

                                2. 12

                                  I picked one up recently hoping that I could migrate a lot of my ebooks and pdfs to it. I don’t plan on returning it, but I wouldn’t recommend it.

                                          I was a huge fan of the Kindle DX, but I’ve managed to break the buttons on a couple, which renders them practically useless. I was on the fence with the first reMarkable device but figured I’d give the latest iteration a shot. I figured it’d be a good DX substitute. It’s not. I want to like it - the physical design is really good - but the software sucks.

                                  I have a large collection of documents (epub/pdfs) that I was looking forward to getting on the device. Largely a mix of books published in electronic formats from regular publishers (O’Reilly, Manning, PragProg, etc.) as well as a few papers and docs I’ve picked up here and there.

                                          First, the reMarkable desktop/mobile app that you have to rely on for syncing is a little wonky. Syncing between the device and mobile/desktop versions of the app works, but leaves a little to be desired. Second, I have yet to load a pdf or epub that isn’t brutally slow to navigate (just page by page). If the document has images or graphics (even simple charts and illustrations) it will affect navigation performance. Occasionally a document will load relatively quickly, and navigate reasonably well, only to slow down after a few page turns. Epubs tend to be a little more difficult to work with - particularly if you decide to change the font. All I have to compare this device to is my broken DX, which, everything considered, positively smokes the reMarkable.

                                  It’s usable. It works alright for PDFs, less so for epubs. On the positive side, the battery life is quite good.

                                  1. 3

                                    I agree with your analysis in most regards. Syncing a lot of ebooks and pdfs to it is not something at which it would excel by default. I have a large Calibre library, and I haven’t synced it over for that reason. However, it’s something I’m looking forward to investigating with KOReader, which supports the reMarkable.

                                    I haven’t experienced the lag that you talk about, but can understand that that would be bothersome – though I definitely have experienced the “wonkiness” of the companion apps.

                                    1. 1

                                      My understanding is that epubs are converted to PDF before being synced? Is that actually the case?

                                      1. 4

                                        It renders the epub to pdf for display but that’s all in-memory. It’s still an epub on disk.

                                        1. 1

                                          I don’t know. I’ve got a couple books that are both pdf and ePub, and the pdf version behaves a little better. You can also resize and change fonts for ePub doc, but not for PDFs.

                                          1. 1

                                            Along these lines, another interesting observation I’ve made has to do with the way some kinds of text get rendered. In particular, I’ve encountered epubs with code listings that render fine in other apps and on other devices, but render horribly on the remarkable2 device. Interestingly, in some of those cases I will also have a publisher provided PDF that renders just fine.

                                            Further, epubs and PDFs are categorized differently in both the app and the device. With epubs you can change the justification, page margins, line spacing, fonts, and font size. With PDFs you have fewer options, but you do have the ability to adjust the view (which is great for papers since you can get rid of the margins).

                                          2. 2

                                            I don’t think so – from my playing around with ssh, there are definitely some epubs stored on device. I actually think the browser extension generates epubs, rather than pdfs which was surprising.

                                            1. 2

                                              Huh. Cool. Hmmm. The real reason I shouldn’t get one is that I always fall asleep with my e-reader and it often bounces off my face.

                                              1. 3

                                                That’s a pro, for the device, it weighs next to nothing. I’ve damn near knocked myself out dropping an iPad Pro on my head when reading in bed.

                                                1. 1

                                                          For me, it’s more the fact that the Kobo then ends up falling onto the floor. I’m not crazy about that with a $120 device, so …

                                        2. 7

                                          I own Gen 1 and Gen 2. I love the simplicity and focus of the device. It’s an amazing… whiteboard.

                                          Note taking is not suuuper great. Turns out marking up a PDF to take notes actually isn’t that great because the notes quickly get lost in the PDF. It’s not like in real life, where you can put a sticky note to jump to that page. The writing experience is fantastic though. I have notebooks where I draw diagrams/ideas out. I like it for whiteboarding type stuff.

                                          Reading is terrible. I mean, it works. Searching is painfully slow. The table of contents doesn’t always show up (even though my laptop PDF reader can read the TOC just fine). When you do get a TOC, the subsections are flattened to the top level, so it’s hard to skim the TOC. PDF links don’t work. Text is often tiny, though you can zoom in. EPUBs appear to get converted to PDFs on the fly and their EPUB to PDF conversion sucks. Though, I’ve found doing the conversion myself in Calibre is way better.

                                          Overall, I like the device for whiteboarding. But it’s kinda hard to recommend.

                                          1. 2

                                            Marking up PDFs works better in color, since you can pick a contrasting ink color. I do it in Notability on my iPad Pro (which is also great for whiteboarding / sketching.)

                                            I was tempted by reMarkable when the first version came out, but I couldn’t see spending that kind of money on something that only does note taking and reading. I’m glad it’s found an audience though, it’s a cool device.

                                            1. 1

                                              Turns out marking up a PDF to take notes actually isn’t that great because the notes quickly get lost in the PDF. It’s not like in real life, where you can put a sticky note to jump to that page.

                                              So far the best experience I’ve seen for this is LiquidText on an iPad Pro. While you can write on the PDF as any other annotator, there’s also a lot of more hypertext type of features, like collecting groups of notes in an index, or writing separate pages of notes that are bidirectionally hyperlinked to parts of the document they refer to. Or do things like pull out a figure from a paper into a sidebar where you attach notes to it.

                                                    The main downside for me is that you do more or less have to go all-in on LiquidText. It supports exporting a workspace to flat PDFs, but if you used the hypertext features in any significant way, the exported PDFs can be very confusing given the lack of expected context.

                                              1. 1

                                                Agreed that it is hard to find notes. There should be a way to jump to pages that have notes on them (this is how Drawboard PDF works, for example).

                                                1. 1

                                                  What is the advantage over drawing on a piece of paper or on a whiteboard, then taking a photo of what you’ve drawn, if needed?

                                                  1. 1

                                                    I tried paper note books, but I’m too messy and make too many mistakes. Erasing, moving, and reordering is hard on paper.

                                                    A whiteboard is pretty good for temporary stuff and erases better than paper. But, it can be a bit messy.

                                                    I also tried Rocketbook for a while. I got the non-microwaveable (yes you read that right) one. That was okay. A little meh for me.

                                                    And of course, you can’t read PDFs on any of these.

                                              1. 15

                                                For short and ephemeral text and images, check out the “note to self” feature of Signal. It appears as the name of one of your contacts. This requires your devices be linked, but approximates the lazy email-it-to-yourself approach with an added layer of reasonable privacy.

                                                1. 3

                                                              Signal is my usual go-to. What I’m sending is often long untypable passwords, so I keep disappearing messages set to 5 minutes as well.

                                                  1. 2

                                                    Yep! I’ve been using this too but it felt clunky still.

                                                    1. 1

                                                      How so? What would you change?

                                                      1. 1

                                                        I don’t think there’s much I could change but it isn’t as seamless as iOS Universal Clipboard or Airdrop

                                                    2. 2

                                                                  I use this feature all the time. It’s great for sending non-URL things to other devices. For URL things, I use Firefox’s “Send to Device” function, which works when you have browser sync enabled.

                                                      Edit to Add: Slightly OT, in F-Droid there is an app called Exif-Scrambler. It’s a wonderful tool for scrubbing metadata out of pictures. Share to Exif-Scrambler, then E-S will scrub metadata and present you with another share option, at which point I use Note to Self on Signal.

                                                      1. 1

                                                        Does it bother you that Signal for desktop is not encrypted?

                                                        1. 1

                                                          I assume you mean “not encrypted at rest”? Doesn’t bother me personally (if you control my user account you have ~everything anyways).

                                                      1. 13

                                                        I’ve really enjoyed reading this blog over the last few weeks. He has a great perspective and explains the legal side well. Seems like there is an “Open Source Industrial Complex” where lots of money is made selling products and having conferences about “open source”.

                                                        1. 5

                                                                          You’ll hear people who work in the field joke about a “compliance-industrial complex”. I think that started back in the early 2000s, after big companies started permitting use of open source en masse. Salespeople for nascent compliance-solutions firms would fly around giving C-level officers heartaches about having to GPL all their software. My personal experience of those products, both for ongoing use and for one-off due diligence, is that they’re way too expensive, painful to integrate, and just don’t work that well, and they only make cost-benefit sense if you ingest a lot of FUD. Folks who disagree with me strongly on other issues, like new copyleft licenses, agree with me here.

                                                          That said, I don’t mean to portray what’s going on in the open source branding war as any kind of conspiracy. There are lots of private conversations, private mailing lists, and marketing team meetings that don’t happen in the open. But the major symptoms of the changing of the corporate guard are all right out there to be seen online. That’s why I walked through the list of OSI sponsors, and linked to the posts from AWS and Elastic. It’s an open firefight, not any kind of cloak-and-dagger war.

                                                          1. 7

                                                            Agreed. I’m getting increasingly tired by some communities’ (especially Rust’s) aggressive push of corporate-worship-licenses like BSD, MIT (and against even weak copy-left licenses like MPL).

                                                            1. 17

                                                              I’m saying this with all the respect in the world, but this comment is so far detached from my perception of license popularity that I wanna know from which niche of the tech industry this broad hatred of Rust comes from. To me it seems like one would have to hack exclusively on C/C++/Vala projects hosted on GNU Savannah, Sourcehut or a self-hosted GitLab instance to reach the conclusion that Rust is at the forefront of an anti-copyleft campaign. That to me would make the most sense because then Rust overlaps with the space you’re occupying in the community much more than, say, JavaScript or Python, where (in my perception) the absolute vast majority of OSS packages do not have a copyleft license already.

                                                              1. 3

                                                                Try shipping any remotely popular library on crates.io and people heckle you no end until they get to use your work under the license they prefer.

                                                                Lessons learned: I’ll never ship/relicense stuff under BSD/MIT/Apache ever again.

                                                                1. 2

                                                                  this broad hatred of Rust comes from

                                                                  Counter culture to the Rust Evangelism Strike Force: Rust evangelists were terribly obnoxious for a while, seems like things calmed down a bit, but the smell is still there.

                                                                  1. 1

                                                                    I think it’s beneath this site to make reactionary nonsense claims on purpose.

                                                                    1. 2

                                                                      How is criticizing a (subset) of a group for their method of communication “reactionary”?

                                                                      1. 1

                                                                        I’m saying soc’s claim about Rust pushing for liberal licensing is nonsense and probably reactionary to the Rust Evangelism Strike Force if @pgeorgi’s explanation is true. My point is that “counter culture” is not an excuse to make bad arguments or wrong claims.

                                                                        1. 2

                                                                          OK, that makes a bit more sense.

                                                                      2. 2

                                                                        reactionary nonsense claims

                                                                        like talking about some “broad hatred of Rust” when projects left and right are adopting it? But the R.E.S.F. is really the first thing that comes to my mind when thinking of rust, and the type of advocacy that led to this nickname sparked some notable reactions…

                                                                        (Not that I mind rust, I prefer to ignore it because it’s just not my cup of tea)

                                                                  2. 7

                                                                    I won’t belabor the point, but I’d suggest considering that some of those project/license decisions (e.g. OpenBSD and ISC) may be about maximizing the freedom (and minimizing the burden) shared directly to other individual developers at a human-to-human level. You may disagree with the ultimate outcome of those decisions in the real world, but it would be a wild misreading of the people behind my example as “corporate worshipping”.

                                                                    As I have said before: “It’s important to remember that GNU is Not Unix, but OpenBSD userland is much more so. There isn’t much reason to protect future forks if you expect that future software should start from first principles instead of extending software until it becomes a monolith that must be protected from its own developers.”

                                                                    Not all software need be released under the same license. Choosing the right license for the right project need not require inconsistency in your beliefs about software freedoms.

                                                                    1. 6

                                                                      The specific choice of MIT/Apache dual-licensing is so unprincipled and weird that it could only be the result of bending over backwards to accommodate a committee’s list of licensing requirements (it needs to be compatible with GPL versions 2 and 3, it needs a patent waiver, it needs to fit existing corporate-approved license lists, etc). This is the result of Rust being a success-at-all-costs language in exactly the way that Haskell isn’t. Things like corporate adoption and Windows support are some of those costs.

                                                                      1. 3

                                                                        I can’t speak directly to that example, as I don’t write Rust code and am not part of the Rust community, but it would not surprise me if there were different and conflicting agendas driving licensing decisions made by any committee.

                                                                        I do write code in both Python and Go (languages sharing similar BSD-style licensing permissiveness), and my difficult relationship to the organization behind Go (who is also steward of its future) is not related in any way to how that language has been licensed to me. Those are a separate set of concerns and challenges outside the nature of the language’s license.

                                                                1. 24

                                                                  Data tech is a massive and intertwined ecosystem with a lot of money riding on it. It’s not just about compute or APIs, that’s a fairly small part.

                                                                  • What file formats does it support?
                                                                  • Does it run against S3/Azure/etc.?
                                                                  • How do I onboard my existing data lake?
                                                                  • How does it handle real-time vs batch?
                                                                  • Does it have some form of transactions?
                                                                  • Do I have to operate it myself or is there a Databricks-like option?
                                                                  • How do I integrate with data visualization systems like Tableau? (SQL via ODBC is the normal answer to this, which is why it’s so critical)
                                                                  • What statistical tools are at my disposal? (Give me an R or Python interface)
                                                                  • Can I do image processing? Video? Audio? Tensors?
                                                                  • What about machine learning? Does the compute system aid me in distributed model training?

                                                                  I could keep going. Giving it a JavaScript interface isn’t even leaning into the right community. It’s a neat idea, for sure, but there are mountains of other things a data tech needs to provide just to be even remotely viable.

                                                                  1. 6

                                                                    Yeah this is kinda what I was going to write… I worked with “big data” from ~2009 to 2016. The storage systems, storage formats, computation frameworks, and the cluster manager / cloud itself are all tightly coupled.

                                                                    You can’t buy into a new computation technology without it affecting a whole lot of things elsewhere in the stack.

                                                                    It is probably important to mention my experience was at Google, which is a somewhat unique environment, but I think the “lock in” / ecosystem / framework problems are similar elsewhere. Also, I would bet that even at medium or small companies, an individual engineer can’t just “start using” something like differential dataflow. It’s a decision that would seem to involve an entire team.

                                                                    Ironically that is part of the reason I am working on https://www.oilshell.org/ – often the least common denominator between incompatible job schedulers or data formats is a shell script!

                                                                    Similarly, I suspect Rust would be a barrier in some places. Google uses C++ and the JVM for big data, and it seems like most companies use the JVM ecosystem (Spark and Hadoop).

                                                                    Data tech also can’t be done without operators / SREs, and they (rightly) tend to be more conservative about new tech than engineers. It’s not like downloading something and trying it out on your laptop.

                                                                    Another problem is probably a lack of understanding of how inefficient big data systems can be. I frequently refer to McSherry’s COST paper, but I don’t think most people/organizations care… Somehow they don’t get the difference between 4 hours and 4 minutes, or 100 machines and 10 machines. If people are imagining that real data systems are “optimized” in any sense, they’re in for a rude awakening :)

                                                                    1. 3

                                                                      I believe andy is referring to this paper, if anyone else is curious.

                                                                      (And if you weren’t let me know and I’ll read that one instead. :] )

                                                                      1. 3

                                                                        Yup that’s it. The key phrases are “parallelizing your overhead”, and the quote “You can have a second computer once you’ve shown you know how to use the first one.” :)

                                                                        https://www.usenix.org/system/files/conference/hotos15/hotos15-paper-mcsherry.pdf

                                                                        The details of the paper are about graph processing frameworks, which most people probably won’t relate to. But it applies to big data in general. It’s similar to experiences like this:

                                                                        https://adamdrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html

                                                                        I’ve had similar experiences… 32 or 64 cores is a lot, and one good way to use them all is with a shell script. You run into fewer “parallelizing your overhead” problems. The usual suspects are (1) copying code to many machines (containers or huge statically linked binaries), (2) scheduler delay, and (3) getting data to many machines. You can do A LOT of work on one machine in the time it takes a typical cluster to say “hello” on 1000 machines…
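                                                                        To make that concrete, here’s a minimal sketch of the pattern (assuming GNU/BSD xargs with -P and a Linux-style nproc; the paths and data are made up for illustration): fan one job per input file out across all local cores, then merge the partial results in a single final pass.

```shell
# Hypothetical word-count job: one worker per input file, fanned out
# across every local core with xargs -P, merged single-threaded at the end.
mkdir -p /tmp/demo_logs
printf 'foo bar\nfoo\n' > /tmp/demo_logs/a.txt
printf 'bar\n' > /tmp/demo_logs/b.txt

ls /tmp/demo_logs/*.txt \
  | xargs -P "$(nproc)" -I{} sh -c 'tr " " "\n" < "{}" | sort | uniq -c' \
  | awk '{count[$2] += $1} END {for (w in count) print w, count[w]}' \
  | sort
# prints:
#   bar 2
#   foo 2
```

                                                                        There is no scheduler delay, no code distribution, and no data shuffle: the “cluster” is just processes sharing a filesystem, and the only serial bottleneck is the final merge.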

                                                                      2. 1

                                                                        That’s a compelling explanation. If differential dataflow is an improvement on only one component, perhaps that means that we’ll see those ideas in production once the next generation of big systems replaces the old?

                                                                        1. 2

                                                                          I think if the ideas are good, we’ll see them in production at some point or another… But sometimes it takes a few decades, like algebraic data types or garbage collection… I do think this kind of big data framework (a computation model) is a little bit more like a programming language than it is a “product” like AWS S3 or Lambda.

                                                                          That is, it’s hard to sell programming languages, and it’s hard to teach people how to use them!

                                                                          I feel like the post is missing a bunch of information: like what kinds of companies or people would you expect to use differential dataflow but are not? I am interested in new computation models, and I’ve heard of it, but I filed it in the category of “things I don’t need because I don’t work on big data anymore” or “things I can’t use unless the company I work for uses it” …

                                                                      3. 2

                                                                        The above is a great response, so to elaborate on one bit:

                                                                        What statistical tools are at my disposal? (Give me an R or Python interface)

                                                                        It’s important for engineers to be aware of how many non-engineers produce important constituent parts of the data ecosystem. When a new paper comes out with code, that code is likely to be in Python or R (and occasionally Julia, or so I’m hearing).

                                                                        One of the challenges behind using other great data science languages (e.g. Scala) is that there may be an ongoing and semi-permanent translation overhead for those things.

                                                                        1. 1

                                                                          all of the above + does it support tight security and data governance?

                                                                        1. 11

                                                                          Built a desktop machine (Ryzen 5600) and installed OpenBSD on it because FreeBSD wasn’t supporting my wireless and ethernet cards. Did a quick online search to find out that OpenBSD supports the ethernet card. Was pleasantly surprised to find out that it does support my wireless card too without any extra hassle.

                                                                          Now, I need to set up cwm so that it is closer in behaviour to a tiling window manager, and install rakubrew so that I can build Raku versions easily.

                                                                          After that’s done, I’ll need to restore some data from a backup. What I am most interested in is to copy over my ~/.ssb folder so that I can get back on the Scuttleverse after a break of a few months.
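                                                                          In case it helps anyone attempting the same, a sketch of the relevant ~/.cwmrc bindings is below. The function names are taken from recent cwmrc(5) man pages; verify them against your OpenBSD release, and treat the key choices as placeholders.

```
# ~/.cwmrc -- sketch of tiling-ish bindings for cwm
# (function names per recent cwmrc(5); check your release)

# Tile the current window against its siblings
bind-key 4-h window-htile
bind-key 4-v window-vtile

# Snap windows to screen edges
bind-key 4S-h window-snap-left
bind-key 4S-l window-snap-right

# Keyboard-driven window movement instead of the mouse
bind-key M-Up window-move-up
bind-key M-Down window-move-down

# Thin borders to keep the tiled look clean
borderwidth 1
```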

                                                                          1. 3

                                                                            Welcome to the fun! Don’t overlook afterboot(8) or that your existing preferred wm might be there in packages (though cwm is great too).

                                                                            My own workstation is the only system not on OpenBSD (some current projects require my nvidia GPU), but always interested to read postmortems from new switchers on recent hardware.

                                                                            1. 2

                                                                              Hey thanks for the link! Checking it out.

                                                                              I hope to write something after a few weeks of using this as my daily driver.

                                                                          1. 27

                                                                            With respect and love, @pushcx, this ain’t it.

                                                                            In my experience moderating internet forums there are precisely two kinds of people that are interested in moderating:

                                                                            • people motivated by a deep desire to improve the communities they participate in and who plan to moderate with the lightest possible touch in order to grow the community and allow it to express its norms and standards
                                                                            • petty fucking assholes who plan to wield their power to grind axes, antagonize enemies, and reshape the community as they themselves see fit

                                                                            These posts bring out both of these kinds of people, but unfortunately you’ll be lucky if you get a 1:10 ratio of good to bad.

                                                                            Now, you’ve said you plan to announce the new moderator slate. That means, at best, you and @irene plan to try to separate the wheat from the chaff. My $.02: don’t. Pick ten people that each of you have interacted with and think you can live with as moderators. Then ask them directly. If you are lucky you’ll get two of them to accept and only reluctantly so. Then you’ll have found good moderators.

                                                                            1. 11

                                                                              I spent the last year trying that, though I contacted nine users rather than ten. None were both interested and available, though one maybe came around a few days ago and I believe applied earlier today.

                                                                              1. 3

                                                                                None were both interested and available, though one maybe came around a few days ago and I believe applied earlier today.

                                                                                I hope like hell they did, and wish you the absolute best of luck finding a second (or more).

                                                                              2. 8

                                                                                I am deeply distressed by the direction these comment threads took for many reasons, but not least because I had believed, perhaps naively, that this community was distinct in how its members practiced a cautious self-moderation: avoiding making - or even casting votes behind - statements they lacked expert authority to make, knowing those statements could (and likely would) be seen by actual experts who could be engaged in earnest discussion.

                                                                                I post under my real name because it means I have to stand by the things I say, and it forces me to pause to consider the effect my words will have on the people I say those things to. Posting anonymously or pseudonymously removes the first obligation one has to oneself, but nothing removes the second obligation one has to others.

                                                                                I hope from the bottom of my heart that these threads are uniquely a product of the generally elevated blood pressure of all people at this particular moment, and are not representative of anything else.

                                                                                I have many things to say about the subject matter discussed in these threads, but this is not the place I will say them.

                                                                                1. 6

                                                                                  I can’t agree any more strongly with this post. This (request for moderators) is the way to get toxic moderators. @owen is absolutely right about how to get good moderators: pick them based on their existing community actions and reach out to them. Most of those you reach out to are wonderful, sane people, so they will decline… continue down the list. I have moderated communities for a couple of decades, and the path laid out by @owen is the only one I have had any success with.

                                                                                  1. 3

                                                                                    You are extremely correct. As a member of the third category, I understand that moderation is a powerful tool which I would almost certainly misuse and with which I should not be trusted. And I am simply not a nice person. But, with that said, I at least have an excellent record of antifascist posting.

                                                                                    1. 2

                                                                                      This is good advice, and in other communities I’ve been a part of it was exactly how moderators were selected. Some served for 5-10+ years, all part of the same initial friend group / community that seeded it.

                                                                                      I did assume that @pushcx would still have the ultimate say in getting rid of any false positives, so to speak, considering we have a public moderation log and it would be somewhat obvious if someone was abusing their power, so it might be OK still.

                                                                                    1. 7

                                                                                      The lack of what this proposal describes as package-level enforcement at construction time has been a source of much verbosity and a frustrating need to write obtuse validation code in my own Go projects (or rather, to remember the need to).

                                                                                      I am completely unqualified to comment on this solution as it pertains to programming language design, but I can affirm that the motivation reflects something I’ve found wanting. So, thank you for sharing.
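                                                                                      For readers who haven’t hit this in Go, the conventional workaround being alluded to looks something like the sketch below (names are hypothetical): keep the field unexported, so the only way other packages can obtain a value is through a constructor that validates the invariant once, up front.

```go
package main

import (
	"errors"
	"fmt"
)

// Port keeps its field unexported, so code outside this package
// can only obtain a value via NewPort. The invariant is checked
// once, at construction time, instead of at every use site.
type Port struct {
	n int
}

// NewPort is the single validated entry point.
func NewPort(n int) (Port, error) {
	if n < 1 || n > 65535 {
		return Port{}, errors.New("port out of range")
	}
	return Port{n: n}, nil
}

// Int returns the validated value.
func (p Port) Int() int { return p.n }

func main() {
	p, _ := NewPort(8080)
	fmt.Println(p.Int()) // prints 8080

	_, err := NewPort(-1)
	fmt.Println(err) // prints "port out of range"
}
```

                                                                                      Note that nothing stops code in the same package from writing Port{n: -1} directly, which is exactly the gap language-level construction-time enforcement would close.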

                                                                                      1. 10

                                                                                        I search lobsters for that thing I saw once a few months ago that I thought was really cool and kept open in a tab on one of my phones for a while until I ended up closing it thinking “I’ll remember how to get that.”

                                                                                        1. 4

                                                                                          Given how completely your description fits my own browsing behavior, it might be worth sharing that I’ve been largely successful at sticking to a low-friction system without relapsing to this previous approach.

                                                                                          I liberally use the “save” feature here, the “star” feature on GitHub, and the “favorite” feature on the orange site. The latter two are publicly viewable.

                                                                                          I make no categorization or organizational attempt upon saving items, nor do I pretend that there is any specific plan to return to those items. These lists merely serve as a smaller subset of items I am able to manually look over when I want to recall something I once found of interest. It has been highly effective for me, even given how low my bar is for adding something to those lists; they’re not curated, merely a log of my gut reaction that something might be of interest.

                                                                                        1. 43

                                                                                          Is this a paid position?

                                                                                          1. 57

                                                                                            Rather the opposite for you, I have a stack of past-due therapy bills you’re delinquent on.

                                                                                            1. 6

                                                                                              There’s always the option for cathartic revenge by assigning him the Victor Frankenstein hat.

                                                                                              1. 9

                                                                                                Please, let’s be reasonable here. He should delegate assigning the hat to one of the new mods.

                                                                                          1. 24

                                                                                            It is safe to say that nobody can write memory-safe C, not even famous programmers that use all the tools.

                                                                                            For me, it’s a top highlight. My rule of thumb is that if OpenBSD guys sometimes produce memory corruption bugs or null dereference bugs, then there is very little chance (next to none) that an average programmer will be able to produce secure, rock-solid C code.

                                                                                            1. -1

                                                                                              My rule of thumb is that if OpenBSD guys sometimes produce memory corruption bugs or null dereference bugs, then there is very little chance (next to none) that an average programmer will be able to produce secure, rock-solid C code.

                                                                                              Why do you think “the OpenBSD guys” are so much better than you?

                                                                                              Or if they are better than you, where do you get the idea that there isn’t someone that much better still? And so on?

                                                                                              Or maybe let’s say you actually don’t know anything about programming: why would you try to convince anyone else of anything, coming directly from a place of ignorance? Can your gods truly not speak for themselves?

                                                                                              I think you’re better than you realise, and could be even better than you think is possible, and that those “OpenBSD guys” need to eat and shit just like you.

                                                                                              1. 24

                                                                                                Why do you think “the OpenBSD guys” are so much better than you?

                                                                                                It’s not about who is better than who. It’s more about who has what priorities; OpenBSD guys’ priority is security at the cost of functionality and convenience. Unless this is average Joe’s priority as well, statistically speaking OpenBSD guys will produce more secure code than Joe does, because they focus on it. And Joe just wants to write an application with some features, he doesn’t focus on security that much.

                                                                                                So, since guys that focus on writing safe code sometimes produce exploitable code, then average Joe will certainly do it as well.

                                                                                                If that weren’t true, then it would mean that OpenBSD guys’ security skill is below average, which I don’t think is true.

                                                                                                1. 5

                                                                                                  OpenBSD guys’ priority is security at the cost of functionality

                                                                                                  I have heard that claim many times before. However, in reality I purely use OpenBSD for convenience. Having sndio instead of pulse, having no-effort/single command upgrades, not having to mess with wpa_supplicant or network manager, having easy to read firewall rules, having an XFCE desktop that just works (unlike Xubuntu), etc. My trade-off is that for example Steam hasn’t been ported to that platform.

                                                                                                  So, since guys that focus on writing safe code sometimes produce exploitable code, then average Joe will certainly do it as well.

                                                                                                  To understand you better: do you think average Joe will both use Rust and make fewer mistakes? Also, do you think average Joe will make more logic errors with C or with Rust? Do you think average Joe will use Rust to implement curl?

                                                                                                  I am not saying that you are wrong - not a C fan, nor against Rust, quite the opposite actually - but wonder what you base your assumptions on.

                                                                                                  1. 3

                                                                                                    I’d also add that there is deep & widespread misunderstanding of the OpenBSD philosophy by the wider developer community, who are significantly influenced by the GNU philosophy (and other philosophies cousin to it). I have noticed this presenting acutely around the role of C in OpenBSD since Rust became a common topic of discussion.

                                                                                                    C, the existing software written in C, and the value of that existing software continuing to be joined by new software also written in C, all have an important relationship to the Unix and BSD philosophies (most dramatically the OpenBSD philosophy), not merely “because security”.

                                                                                                    C is thus more dramatically connected to OpenBSD than projects philosophically related to the “GNU is Not Unix” philosophy. Discussions narrowly around the subject of C and Rust as they relate to security are perfectly reasonable (and productive), but OpenBSD folks are unlikely to participate in those discussions to disabuse non-OpenBSD users of their notions about OpenBSD.

                                                                                                    I’ve specifically commented about this subject and related concepts on the orange site, but have learned the lesson presumably already learned many times over by beards grayer than my own: anyone with legitimate curiosity should watch or read their own words to learn what OpenBSD folks care about. Once you grok it, you will see that looking to that source (not my interpretation of it) is itself a fundamental part of the philosophy.

                                                                                                    1. 1

                                                                                                      If that weren’t true, then it would mean that OpenBSD guys’ security skill is below average, which I don’t think is true.

                                                                                                      At least not far above average. And why not? They’re mostly amateurs, and their bugs don’t cost them money.

                                                                                                      And Joe just wants to write an application with some features, he doesn’t focus on security that much.

                                                                                                      I think you’re making a straw man. OpenBSD people aren’t going to make fewer bugs using any language other than C, and comparing Average Joe to any Expert just feels sillier and sillier.

                                                                                                      1. 3

                                                                                                        What’s your source for the assertion ‘They’re mostly amateurs’?

                                                                                                        1. 2

                                                                                                          What a weird question.

                                                                                                          Most openbsd contributors aren’t paid to contribute.

                                                                                                          1. 2

                                                                                                            What a weird answer. Would you also argue that attorneys who accept pro bono work are amateurs because they’re not paid for that specific work?

                                                                                                            Most of the regular OpenBSD contributors are paid to program computers.

                                                                                                            1. 1

                                                                                                              because they’re not paid for that specific work?

                                                                                                              Yes. In part because they’re not paid for that specific work, I refuse to accept dark_grimoire’s insistence that “if they can’t do it nobody can”.

                                                                                                            2. 1

                                                                                                              You seem to be using the word “amateur” with multiple meanings. It can mean someone not paid to do something, aka “not a professional”. But when I use it in day-to-day conversation I mean something closer to “hobbyist”, which does not say much about ability. Also, saying they are amateurs and thus do not write “professional” code implies anyone can just submit whatever patch they want and it will be accepted, which is very far from the truth. To say that, I assume with reasonable certainty that you have never contributed to OpenBSD yourself. I am not a contributor, but whenever I look at the source code, it looks better than much of what I have seen in “professional” work. This may be due to the focus on doing simple things, and also the very good reviews by maintainers. And as you said, the risk of losing money may be a driver for improvement, but it is certainly not the only one (and not at all for some people).

                                                                                                              1. 1

                                                                                                                You seem to be using the word “amateur” with multiple meanings,

                                                                                                                I’m not.

                                                                                                                as you said, the risk of losing money may be a driver for improvement, but it is certainly not the only one

                                                                                                                So you do understand what I meant.

                                                                                                        2. -1

                                                                                                          nailed it

                                                                                                    1. 23

                                                                                                      I used to think this was the case until I realized that Google funds Firefox through noblesse oblige, and so all the teeth-gnashing over “Google owns the Internet” is still true whether you use Chrome directly or whether you use Firefox. The only real meaningful competition in browsers is from Apple (God help us.) Yes, Apple takes money from Google too, but they don’t rely on Google for their existence.

                                                                                                      I am using Safari now, which is… okay. The extension ecosystem is much less robust but I have survived. I’m also considering Brave, but Chromium browsers just gulp down the battery in Mac OS so I’m not totally convinced there yet.

                                                                                                      Mozilla’s recent political advocacy has also made it difficult for me to continue using Firefox.

                                                                                                      1. 18

                                                                                                        I used to think this was the case until I realized that Google funds Firefox through noblesse oblige, and so all the teeth-gnashing over “Google owns the Internet” is still true whether you use Chrome directly or whether you use Firefox.

                                                                                                        I’m not sure the premise is true. Google probably wants to have a practical monopoly that does not count as a legal monopoly. This isn’t an angelic motive, but isn’t noblesse oblige.

                                                                                                        More importantly, the conclusion doesn’t follow–at least not 100%. Money has a way of giving you control over people, but it can be imprecise, indirect, or cumbersome. I believe what Google and Firefox have is a contract to share revenue with Firefox for Google searches done through Firefox’s url bar. If Google says “make X, Y and Z decisions about the web or you’ll lose this deal”, that is the kind of statement antitrust regulators find fascinating. Since recent years have seen increased interest in antitrust, Google might not feel that they can do that.

                                                                                                        1. 9

                                                                                                          Yes, I agree. It’s still bad that most of Mozilla’s funding comes from Google, but it matters that Mozilla is structured with its intellectual property owned by a non-profit. That doesn’t solve all problems, but it creates enough independence that, for example, Firefox is significantly ahead of Chrome on cookie-blocking functionality - which very much hits Google’s most important revenue stream.

                                                                                                          1. 4

                                                                                                            Google never has to say “make X, Y and Z decisions about the web or you’ll lose this deal,” with or without the threat of antitrust regulation. People have a way of figuring out what they have to do to keep their job.

                                                                                                          2. 17

                                                                                                            I’m tired of the Pocket suggested stories. They have a certain schtick to them that’s hard to pin down precisely but usually amounts to excessively leftist, pseudo-intellectual clickbait: “meat is the privilege of the west and needs to stop.”

                                                                                                            I know you can turn them off.

                                                                                                            I’m arguing defaults matter, and defaults that serve to distract with intellectual junk are not great. At least it isn’t misinformation, but that’s not saying much.

                                                                                                            Moving back to Chrome this year because of that, along with some perf issues I run into more than I’d like. It’s a shame, I wanted to stop supporting Google, but the W3C has succeeded in creating a standard so complex that millions of dollars are necessary to adequately fund the development of a performant browser.

                                                                                                            1. 2

                                                                                                              Moving back to Chrome this year because of that, along with some perf issues I run into more than I’d like. It’s a shame, I wanted to stop supporting Google, but the W3C has succeeded in creating a standard so complex that millions of dollars are necessary to adequately fund the development of a performant browser.

                                                                                                              In case you haven’t heard of it, this might be worth checking out: https://ungoogled-software.github.io/

                                                                                                              1. 1

                                                                                                                Except as of a few days ago Google is cutting off access to certain APIs like Sync that Chromium was using.

                                                                                                                1. 1

                                                                                                                  Straight out of the Android playbook

                                                                                                            2. 4

                                                                                                              Mozilla’s recent political advocacy has also made it difficult for me to continue using Firefox.

                                                                                                              Can you elaborate on this? I use FF but have never delved into their politics.

                                                                                                              1. 16

                                                                                                                My top of mind example: https://blog.mozilla.org/blog/2021/01/08/we-need-more-than-deplatforming/

                                                                                                                Also: https://blog.mozilla.org/blog/2020/07/13/sustainability-needs-culture-change-introducing-environmental-champions/ https://blog.mozilla.org/blog/2020/06/24/immigrants-remain-core-to-the-u-s-strength/ https://blog.mozilla.org/blog/2020/06/24/were-proud-to-join-stophateforprofit/

                                                                                                                I’m not trying to turn this into debating specifically what is said in these posts but many are just pure politics, which I’m not interested in supporting by telling people to use Firefox. My web browser doesn’t need to talk about ‘culture change’ or systemic racism. Firefox also pushes some of these posts to the new tab page, by default, so it’s not like you can just ignore their blog.

                                                                                                                1. 6

                                                                                                                  I’m starting to be afraid that being against censorship is enough to get you ‘more than de-platformed’.

                                                                                                                    1. 10

                                                                                                                      Really? I feel like every prescription in that post seems reasonable; increase transparency, make the algorithm prioritize factual information over misinformation, research the impact of social media on people and society. How could anyone disagree with those points?

                                                                                                                      1. 17

                                                                                                                        You’re right, how could anyone disagree with the most holy of holies, ‘fact checkers’?

                                                                                                                        Here’s a great fact check: https://www.politifact.com/factchecks/2021/jan/06/ted-cruz/ted-cruzs-misleading-statement-people-who-believe-/

                                                                                                                        The ‘fact check’ is a bunch of irrelevant information about how bad Ted Cruz and his opinions are, before we get to the meat of the ‘fact check’ which is, unbelievably, “yes, what he said is true, but there was also other stuff he didn’t say that we think is more important than what he did!”

                                                                                                                        Regardless of your opinion on whether this was a ‘valid’ fact check or not, I don’t want my web browser trying to pop up clippy bubbles when I visit a site saying “This has been officially declared by the Fact Checkers™ as wrongthink, are you sure you’re allowed to read it?” I also don’t want my web browser maker advocating for deplatforming (“we need more than deplatforming” suggests that deplatforming should still be part of the ‘open’ internet). That’s all.

                                                                                                                        1. 15

                                                                                                                          a bunch of irrelevant information about how bad Ted Cruz and his opinions are

                                                                                                                          I don’t see that anywhere. It’s entirely topical and just some context about what Cruz was talking about.

                                                                                                                          the meat of the ‘fact check’ which is, unbelievably, “yes, what he said is true, but there was also other stuff he didn’t say that we think is more important than what he did!”

                                                                                                                          That’s not what it says at all. Anyone can cherry-pick or interpret things in such a way that makes their statement “factual”. This is how homeopaths can “truthfully” point at studies which show an effect in favour of homeopathy. But any fact check worth its salt will also look at the overwhelming majority of studies that very clearly demonstrate that homeopathy is no better than a placebo, and therefore doesn’t work (plus, will point out that the proposed mechanisms of homeopathy are extremely unlikely to work in the first place, given that they violate many established laws of physics).

                                                                                                                          The “39% of Americans … 31% of independents … 17% of Democrats believe the election was rigged” is clearly not supported by any evidence, and only by a tenuous interpretation of a very limited set of data. This is a classic case of cherry-picking.

                                                                                                                          I hardly ever read politifact, but if this is really the worst fact-check you can find then it seems they’re not so bad.

                                                                                                                          1. 7

                                                                                                                            This article has a few more examples of bad fact checks:

                                                                                                                            https://greenwald.substack.com/p/instagram-is-using-false-fact-checking

                                                                                                                          2. 7

                                                                                                                            Media fact-checkers are known to be biased.

                                                                                                                            [Media Matters lobby] had to make us think that we needed a third party to step in and tell us what to think and sort through the information … The fake news effort, the fact-checking, which is usually fake fact-checking, meaning it’s not a genuine effort, is a propaganda effort … We’ve seen it explode as we come into the 2020 election, for much the same reason, whereby, the social media companies, third parties, academic institutions and NewsGuard … they insert themselves. But of course, they’re all backed by certain money and special interests. They’re no more in a position to fact-check than an ordinary person walking on the street … — Sharyl Attkisson on Media Bias, Analysis by Dr. Joseph Mercola

                                                                                                                            Below is a list of known rebuttals of some “fact-checkers”.

                                                                                                                            Politifact

                                                                                                                            • “I wanted to show that these fact-checkers just lie, and they usually go unchecked because most people don’t have the money, don’t have the time, and don’t have the platform to go after them — and I have all three” — Candace Owens Challenges Fact-Checker, And Wins

                                                                                                                            Full fact (fullfact.org)

                                                                                                                            Snopes

                                                                                                                            Associated Press (AP)

                                                                                                                            • Fact-checking was devised to be a trusted way to separate fact from fiction. In reality, many journalists use the label “fact-checking” as a cover for promoting their own biases. A case in point is an Associated Press (AP) piece headlined “AP FACT-CHECK: Trump’s inaccurate boasts on China travel ban,” which was published on March 26, 2020 and carried by many news outlets.” — Propaganda masquerading as fact-checking

                                                                                                                            Politico

                                                                                                                            1. 4

                                                                                                                              I’m interested in learning about the content management systems that these fact checker websites use to effectively manage large amounts of content with large groups of staff. Do you have any links about that?

                                                                                                                              1. 3

                                                                                                                                The real error is to imply that “fact checkers” are functionally different from any other source of news/journalism/opinion. All such sources are a collection of humans. All humans have bias. Many such collections of humans have people that are blind to their own bias, or suffer a delusion of objectivity.

                                                                                                                                Therefore the existence of some rebuttals to a minuscule number of these “fact checks” (between 0 and 1% of all “fact checks”) should not come as a surprise to anyone. Especially when the rebuttals are published by other news/journalism/opinion sources that are at least as biased and partisan as the fact checkers they’re rebutting.

                                                                                                                                1. 1

                                                                                                                                  The real error is to imply that “fact checkers” are functionally different from any other source of news/journalism/opinion.

                                                                                                                                  Indeed they aren’t that different. Fact-checkers inherit whatever bias is already present in mainstream media, which itself is a well-documented fact, as the investigative journalist Sharyl Attkisson explored in her two books:

                                                                                                                                  • The Smear exposes and focuses on the multi-billion dollar industry of political and corporate operatives that control the news and our info, and how they do it.
                                                                                                                                  • Slanted looks at how the operatives moved on to censor info online (and why), and has chapters dissecting the devolution of NYT and CNN, recommendations where to get off narrative news, and a comprehensive list of media mistakes.
                                                                                                                          3. 5

                                                                                                                            After reading that blog post last week I switched away from Firefox. It will lead to the inevitable politicization of a web browser where the truthfulness of many topics is filtered through a very left-wing, progressive lens.

                                                                                                                            1. 22

                                                                                                                              I feel like “the election wasn’t stolen” isn’t a left- or right-wing opinion. It’s just the truth.

                                                                                                                              1. 15

                                                                                                                                To be fair, I feel like the whole idea of the existence of an objective reality is a left-wing opinion right now in the US.

                                                                                                                                1. 5

                                                                                                                                  There are many instances of objective reality which left-wing opinion deems problematic. It would be unwise to point them out on a public forum.

                                                                                                                                  1. 8

                                                                                                                                    I feel like you have set up a dilemma for yourself. In another thread, you complain that we are headed towards a situation where Lobsters will no longer be a reasonable venue for exploring inconvenient truths. However, in this thread, you insinuate that Lobsters already has become unreasonable, as an excuse for avoiding giving examples of such truths. Which truths are being silenced by Lobsters?

                                                                                                                                    Which truths are being silenced by Mozilla? Keep in mind that the main issue under contention in their blog post is whether a privately-owned platform is obligated to repeat the claims of a politician, particularly when those claims would undermine democratic processes which elect people to that politician’s office; here, there were no truths being silenced, which makes the claim of impending censorship sound like a slippery slope.

                                                                                                                                    1. 4

                                                                                                                                      Yeah but none that are currently fomenting a coup in a major world power.

                                                                                                                                2. 16

                                                                                                                                  But… Mozilla has been inherently political the whole way. The entire Free Software movement is incredibly political. Privacy is political. Why is “social media should be more transparent and try to reduce the spread of blatant misinformation” where you draw the line?

                                                                                                                                  1. 5

                                                                                                                                    That’s not where I draw the line. We appear to be heading towards a Motte and Bailey fallacy where recent events in the US will be used as justification to clamp down on other views and opinions that left-wing progressives don’t approve of (see some of the comments on this page about ‘fact checkers’)

                                                                                                                                    1. 7

                                                                                                                                      In this case though, the “views and opinions that left-wing progressives don’t approve of” are the ideas of white supremacy and the belief that the election was rigged. Should those not be “clamped down” on? (I mean, it’s important to be able to discuss whether the election was rigged, but not when it’s just a president who doesn’t want to accept a loss and has literally no credible evidence of any kind.)

                                                                                                                                      1. 2

                                                                                                                                        I mentioned the Motte and Bailey fallacy being used and you bring up ‘white supremacy’ in your response! ‘White supremacy’ is the default Motte used by the progressive left, the Bailey being a clamp down on much more contentious issues. It’s this power to clamp down on the more contentious issues that I object to.

                                                                                                                                        1. 6

                                                                                                                                          So protest clamp downs on things you don’t want to see clamp downs on, and don’t protest clamp downs on things you feel should be clamped down on? We must be able to discuss and address real issues, such as the spread of misinformation and discrimination/supremacy.

                                                                                                                                          But that’s not even super relevant to the article in question. Mozilla isn’t even calling for censoring anyone. It’s calling for a higher degree of transparency (which none of us should object to) and for the algorithm to prioritize factual information over misinformation (which everyone ought to agree with in principle, though we can criticize specific ways to achieve it).

                                                                                                                                          1. 4

                                                                                                                                            We are talking past each other in a very unproductive way.

                                                                                                                                            The issue I have is with what you describe as “…and for the algorithm to prioritize factual information over misinformation”

                                                                                                                                            Can you not see the problem when the definition of ‘factual information’ is in the hands of a small group of corporations from the West Coast of America? Do you think that the ‘facts’ related to certain hot-button issues will be politically neutral?

                                                                                                                                            It’s this bias that I object to.

                                                                                                                                            This American cultural colonialism.

                                                                                                                                            1. 3

                                                                                                                                              Can you not see the problem when the definition of ‘factual information’ is in the hands of a small group of corporations from the West Coast of America?

                                                                                                                                              ReclaimTheNet recently published a very good article on this topic

                                                                                                                                              https://reclaimthenet.org/former-aclu-head-ira-glasser-explains-why-you-cant-ban-hate-speech/

                                                                                                                                              1. 3

                                                                                                                                                That’s an excellent article. Thank you for posting it.

                                                                                                                                                1. 3

                                                                                                                                                  You’re welcome. You might be interested in my public notes on the larger topic, published here.

                                                                                                                                  2. 3

                                                                                                                                    Out of interest, to which browser did you switch?

                                                                                                                              2. 2

                                                                                                                                if possible, try Vivaldi. Being based on Chromium, it will be the easiest to switch to; e.g. you can install Chromium extensions in Vivaldi. Not sure about their macOS support (which seems to be your use case) though, so YMMV.

                                                                                                                              1. 5

                                                                                                                                If I ever thought I might need the benefits of NoSQL, I would begin the project with PostgreSQL then start to use its NoSQL features after the need for those benefits started appearing.

                                                                                                                                1. 4

                                                                                                                                  You can index into jsonb objects already! That’s sometimes all you need.

                                                                                                                                  1. 5

                                                                                                                                    Be aware that PostgreSQL doesn’t gather statistics for jsonb columns, so despite having indexes it can generate pretty bad plans in some cases. Always check your query plans :) I wrote a detailed post about my experience with them.

                                                                                                                                    1. 2

                                                                                                                                      Ooh, good point. If you really need solid planning, might want to use a generated column instead?
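A minimal sketch of both approaches (table and column names here are made up for illustration): a GIN index over the whole jsonb column for containment queries, and a generated column that the planner can gather ordinary per-column statistics on.

```sql
-- Hypothetical schema: an "events" table with a jsonb payload.
CREATE TABLE events (id bigserial PRIMARY KEY, payload jsonb);

-- GIN index supporting containment queries like: payload @> '{"kind": "click"}'
CREATE INDEX events_payload_gin ON events USING gin (payload);

-- Alternative: extract a hot field into a generated column, which gets
-- ordinary column statistics and a plain btree index.
ALTER TABLE events
  ADD COLUMN kind text GENERATED ALWAYS AS (payload ->> 'kind') STORED;
CREATE INDEX events_kind_idx ON events (kind);

-- Always check the plan:
EXPLAIN SELECT * FROM events WHERE kind = 'click';
```

Generated columns require PostgreSQL 12 or newer; on older versions an expression index on `(payload ->> 'kind')` is the usual fallback.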

                                                                                                                                    2. 4

                                                                                                                                      jsonb with proper indexing is my favorite Mongo replacement.

                                                                                                                                      If one were really, really kinky, one could probably implement support for Mongo queries using plv8 with the appropriate side-loaded helper script. If only one had a parser.

                                                                                                                                      :^)

                                                                                                                                  1. 3

                                                                                                                                    Thanks for posting. This articulates something my own mind has been circling since reading those same threads (though I had connected fewer of the dots, so this was illuminating).

                                                                                                                                    I’d be curious to hear from anyone who concurs with the central premises of this post but has replaced their use of a more traditional shell (e.g. ksh, bash, dash, zsh, or their predecessors) with something like fish, xonsh, oil, etc.

                                                                                                                                    If you are not particularly concerned with needing the shell as a tool readily at your disposal when you connect to unknown and arbitrary remote systems, but only as a tool to converse with your primary workstation (as a “power user” first and a sysadmin only by its practical usefulness), how does the ground shift around what is valuable to invest your time and knowledge into?

                                                                                                                                    1. 4

                                                                                                                                      I’d be curious to hear from anyone who concurs with the central premises of this post but has replaced their use of a more traditional shell (e.g. ksh, bash, dash, zsh, or their predecessors) with something like fish . . . how does the ground shift around what is valuable to invest your time and knowledge into?

                                                                                                                                      👋 I’ve been using Fish for about 8 years, now, I guess. I’ve always had an intuitive understanding of “the shell” that’s aligned with the description in this post. That is: a terse way to orchestrate interactions with the OS — typically, one interaction at a time. But I can’t say that I make a deliberate effort to learn anything about it, because I’m almost always task-driven.

                                                                                                                                      My usage is usually iterative: run this test. Okay, now run it with the following env vars set, to change its behavior. Now again, capturing profiles, running pprof, and rg’ing for total CPU time used. Now again, but add a memory profile. Now again, but output all of the relevant information in a single line with printf. Now again, but vary this parameter over these options. Now again, but vary this other parameter over these other options. Now again, repeating everything 3 times, tabulate the output with column -t, and sort on this column. Oops, tee to a file, so I can explore the data without re-running the tests.

                                                                                                                                      Each of these steps is hitting up-arrow and editing the prompt. Fish is a blessing because it makes this so nice: the actual editing is pleasant, and the smart history means even if I don’t save this stuff in a file, I can easily recall it and run it again, even months later, with no effort.
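A rough sketch of the tail end of one such iteration, assuming POSIX sh (the printf loop below is a stand-in for a real test or benchmark emitting "name value" lines):

```shell
# Stand-in for a test run emitting "name value" lines, followed by the
# pipeline described above: sort on the value column and tee to a file
# so the data can be explored later without re-running the tests.
for n in 3 1 2; do
  printf 'run%d %d\n' "$n" "$((n * 10))"
done | sort -n -k 2 | tee results.txt
```

In practice the loop would be the actual test invocation, with `column -t` inserted before the sort when alignment matters.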

                                                                                                                                      I don’t know if this actually answers your question… maybe it does?

                                                                                                                                      1. 2

                                                                                                                                        During my last year as an undergrad, 2004-2005, I used a Perl-based shell. It was very much like a REPL: both a REPL for Perl and a REPL for Linux. I loved it. I don’t recall why I stopped using it, though the reason is probably as simple as “I lost my primary workstation to a burglar who took almost everything of value”. I was also starting to migrate away from Perl at the time. At the time, it was great, because my deep knowledge of Perl was directly translatable to shell use.

                                                                                                                                        What I’d really love is scsh with interactive-friendly features.

                                                                                                                                        1. 4

                                                                                                                                          What I’d really love is scsh with interactive-friendly features.

                                                                                                                                          Hi, I’m the author of Rash. It’s a shell embedded in Racket. It’s designed for interactive use and scripting. It’s super extensible and flexible. And I need to rewrite all of my poor documentation for it.

                                                                                                                                          Currently the line editor I’m using for interactive Rash sessions leaves a lot to be desired, but eventually I plan to write a better line editor (basically a little emacs) that should allow nice programmable completion and such.

                                                                                                                                          Also, salient to the OP, job control is not yet supported, though that has more to do with setting the controlling process group of the host terminal. You can still run pipelines in the background and wait for them, you just can’t move them between the foreground and background and have the terminal handle it nicely.

                                                                                                                                          Replying more to the parent post, for the few scripts that I really need to run on a system that I haven’t installed Rash on, I write very short scripts in Bash. But realistically I just treat Rash as one of the things I need to get installed to use my normal computing environment with extra scripts. And a lot of scripts are intended to run in a specific context anyway – on some particular machine set up for a given purpose where I have already ensured things are installed correctly for that purpose.

                                                                                                                                          Writing scripts in Rash instead of Bash is nice because my scripts can grow up without a rewrite, and because as soon as I hit any part of the program that can benefit from a “real” programming language I can effortlessly escape to Racket.

                                                                                                                                          Using Rash instead of plain Racket (you could substitute, say, Python instead if you want) is nice because I can directly type or copy/paste the commands I would use interactively, with the pipeline syntax and everything. In practice, Rash scripts end up being a majority of normal Racket code with a few lines of shell code – most scripts ultimately revolve around a few lines doing the core thing you were doing manually that you want to automate, with the bulk of the script around it being the processing of command line arguments, munging file names, control flow, error handling… lots of things that Bash and friends do poorly.

                                                                                                                                          1. 1

                                                                                                                                            Thanks for making this, and for pointing it out here!

                                                                                                                                          2. 3

                                                                                                                                            I came to open source and the scripting world first through Perl, and that journey taught me about, and more importantly to think “in” data structures such as arrays and hashes, and combinations of those. For that I’ll be ever grateful - plus the community was absolutely wonderful (I attended The Perl Conference back in the day, and was a member of London Perl Mongers). Now I’m discovering more about the shell and related environment, such as Awk and Sed, I’m looking at Perl again through different eyes (as in some ways it’s an amalgam of those and other tools).

                                                                                                                                          3. 1

                                                                                                                                            Thanks, yes this has been brewing for a while in my head, and I finally found the opportunity to write it down. I would also be curious to hear from folks about what you say above, for sure. Always learning. Cheers.

                                                                                                                                          1. 11

                                                                                                                                            My current work is inside the largest local government in the US, which employs ~325,000 people.

                                                                                                                                            The team my role exists within has a mission to proactively find and correct what this link might describe as “production failures”; instead of serving requests for city services, we proactively seek out residents who would qualify for available social services and “case manage” them through accessing those services.

                                                                                                                                            We target residents that historical data shows go underserved, e.g. contacting residents recently released from correctional facilities to guide them towards accessing public health insurance options. We also respond to executive calls for special projects, e.g. activating a task force to contact nursing homes in order to confirm they acknowledge and have publicly posted COVID safety guidelines.

                                                                                                                                            A friend of mine - a former USMC helicopter pilot turned data scientist - has likened this to the preflight checklists they now use in another complex system: mechanical human flight.

                                                                                                                                            Everything we do is redundant, but it also extends beyond established boundaries of departmental/agency mission ownership. This comes with the same set of challenges you might expect if you’ve had your own experiences of the political boundaries between conflicting departments of any large private sector organization.

                                                                                                                                            I believe what our team does is a Good Thing, so I share it as an example of what I believe a version of “better government” looks like to anyone who found the shared link compelling.

                                                                                                                                            But the technical team responsible for all data infrastructure and analysis supporting our team’s activities is 2 people: myself and one report. I have 5 other roles that sit budget-approved but unfilled. The pandemic-related fiscal crisis triggered a hiring freeze, but those roles had sat unfilled well prior to the pandemic. I share this as context for one closing thought.

                                                                                                                                            If you are feeling lost and powerless in these challenging times, please consider taking your technical skills to a critical complex system in a constant state of failure: your local government.

                                                                                                                                            Such work might provide you with deeper meaning (if that is what you are searching for), but you should also be aware that it comes with tremendous challenges and personal costs. In full disclosure, I am not sure how much longer I can tolerate aspiring to effectively do this work while so dramatically under resourced, but I will never lose the perspective I have gained that these complex systems are made up of real people trying their best.

                                                                                                                                            1. 23

                                                                                                                                              Part I starts with a faulty premise. This means that our explanations might not fulfill the explainable-AI requirements. Why? Because the discovery of the Higgs boson came from theories about why particles have mass having various implications, and from those implications being followed through with formal logic. In this arena of reasoning, we are not doing statistical analysis on lots of pseudo-continuous data; instead, our inputs are proof statements.

                                                                                                                                              If I had to explain to a lay reporter why we expected the Higgs boson, I would start my attempt with something like, “Electricity and magnetism are two ways to look at the same single happenstance in physics, where negatively- and positively-charged objects interact with each other. When we look at particle physics, we see a way to combine the behaviors we see, with atoms splitting and joining, with electromagnetism. This helps us build our models. But imagine that we switched ‘positive’ and ‘negative’. Look up the history with Ben Franklin, it’s a great story. The point is that those labels could have been switched, and that gives us a sort of symmetry for particles. The Higgs boson is a particle that should exist according to what we know about particle symmetries, and we call it ‘Higgs’ after one of the people who first noticed that it should exist.”

                                                                                                                                              Crucially, this explanation uses a real example as an analogy to bootstrap intuition about the unknown. Rather than handwaving, poisoning the well, or appealing to authority, the explanation lays out a context, including specific symbols (keywords) which can link to further details. The explanation does not rely on a vague statistical argument made using many weak indicators, but uses one main concept, symmetry, as its focus.

                                                                                                                                              Now, having said all this, I strongly suspect that the author might reasonably reply that the question they wanted to ask was more like, “what pattern in experimental data prompted the theoretical work which led to the proposal of the Higgs mechanism?” This does sound like something that could be answered with a data-driven correlation. And, indeed, that is what happened; the models of that time were faulty and predicted that certain massive particles should be massless. But the actual statistical techniques that were used were the standard ones; the explanation could begin and end with a t-test.

                                                                                                                                              All of this context is necessary to understand what will happen to poor Steve. Historically, Steve’s last name might be the most important input to the algorithm, or possibly their ethnic background if the algorithm can get hold of it more directly. And Steve’s inability to participate in society is explained away by the reassurance that there are millions of parameters which might have influenced the decision. This is exactly the sort of situation that explainable-AI proponents are trying to avoid.

                                                                                                                                              But in both cases, the reasoning is not simple, there’s no single data point that is crucial, if even a few inputs were to change slightly the outcome might be completely different, but the input space is so vast it’s impossible to reason about all significant changes to it.

                                                                                                                                              I don’t agree with this. Specifically, I don’t think that those millions of parameters are actually used much. Instead, I think that NNAEPR holds and that there are only a handful of parameters which account for almost all of the variance in loan amounts, and that the error of the remaining parameters is subject to roundoff. Similarly, only one measurement, mass, needed to be wrong to provoke the development of the Higgs mechanism in theory.

                                                                                                                                              The explanation in part III is not a valid proof, because correlation is not transitive. I do appreciate the exploration here into epistemology and the nature of justification. But I can’t ignore the fact that the maths are incorrect; if an AI can generate English which successfully bullshits people, then is it really explaining or just lying? In a hypothetical world where AIs have civil rights, we would expect AI explanations to be just as cross-examinable as human explanations, and thus to stand up under scrutiny. What difference is there between an opaque AI loan officer and an opaque human loan officer?

                                                                                                                                              As we have explored here before, we must be extremely skeptical of the argument that it is simply too hard to explain ourselves to others, in the context of the immense social harm which results from being judged by opaque corporations. Specifically, when they claim that they cannot be legible to outsiders, they are trying to find ways to be less responsible for their own actions; be assured that the corporation is legible to itself.

                                                                                                                                              1. 10

                                                                                                                                                we must be extremely skeptical of the argument that it is simply too hard to explain ourselves to others, in the context of the immense social harm which results from being judged by opaque corporations

                                                                                                                                                Just want to say that I think this is a really thoughtful and true thing, beyond the rest of your commentary. Ultimately the worth of these tools, surely, must be measured in how beneficial they are to society.

                                                                                                                                                If a neural net loan officer saves society a few tens of thousands of human-labor-hours a year, subsequently making loans slightly cheaper and more profitable, that’s good. But if they do that while also making it impossible to answer the question “why was this loan denied”, then well, the net effect is that you made the world worse and more open to exploitation and your approach should be curtailed.

                                                                                                                                                1. 5

                                                                                                                                                  Back in 1972 John Kemeny (co-developer of BASIC) was warning about uninterrogable decision-making (in Man and the Computer):

                                                                                                                                                  I have heard a story about the design of a new freeway in the City of Los Angeles. At an open hearing a number of voters complained bitterly that the freeway would go right through the midst of a part of the city heavily populated by blacks and would destroy the spirit of community that they had slowly and painfully built up. The voters’ arguments were defeated by the simple statement that, according to an excellent computer, the proposed route was the best possible one. Apparently none of them knew enough to ask how the computer had been instructed to evaluate the variety of possible routes. Was it asked only to consider costs of building and acquisition of property (in which case it would have found routing through a ghetto area highly advantageous), or was it asked to take into account the amount of human suffering that a given route would cause? Perhaps the voters would even have agreed that it is not possible to measure human suffering in terms of dollars. But if we omit considering of human suffering, then we are equating its cost to zero, which is certainly the worst of all procedures!

                                                                                                                                                  (This message brought to you by the Campaign for the Rehabilitation of BASIC.)

                                                                                                                                                  1. 3

                                                                                                                                                    You raise an important point about model interpretability. All models that predict the future by training on historical data propagate historical bias. This is an effect, not a side-effect.

                                                                                                                                                    A simple example can be found in natural language processing, where words become numbers to be usable as model features. With a model trained on a corpus of human-written documents, you’ll be able to “subtract” the word “man” from the word “king”, add “woman”, get the result “queen”, and think yourself quite clever. Then you’ll run the same arithmetic starting from “doctor” and find yourself uncomfortable to discover the result is “nurse”.
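As a toy sketch of the classic king − man + woman analogy arithmetic (the 3-d vectors below are made up for illustration; real embeddings have hundreds of learned dimensions):

```python
import numpy as np

# Hypothetical 3-d "embeddings", chosen by hand so the analogy holds;
# a real model would learn these from a corpus.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
}

# king - man + woman lands (by construction) on queen.
target = emb["king"] - emb["man"] + emb["woman"]
nearest = min(emb, key=lambda w: np.linalg.norm(emb[w] - target))
print(nearest)  # queen
```

The bias problem in the comment above is exactly this mechanism: whatever regularities the corpus contains, flattering or not, become directions in the vector space.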

                                                                                                                                                    An additional example drawing from the above comment: if it is illegal and unethical to deny a loan on the basis of race, but you build an opaque model to predict loan outcome that (under the hood) incorporates e.g. census block as a feature, you will still have built a redlining AI that reinforces historical racial segregation.

                                                                                                                                                  2. 1

                                                                                                                                                    I don’t agree with this. Specifically, I don’t think that those millions of parameters are actually used much. Instead, I think that NNAEPR holds and that there are only a handful of parameters which account for almost all of the variance in loan amounts, and that the error of the remaining parameters is subject to roundoff. Similarly, only one measurement, mass, needed to be wrong to provoke the development of the Higgs mechanism in theory.

                                                                                                                                                    Okay, how do you explain why you believe that fewer parameters are necessary? If you want to counter his argument on the grounds that the maths are wrong, you have to explain why you think the maths are wrong. And that, in some sense, is playing straight into his argument.

                                                                                                                                                    1. 4

                                                                                                                                                      I strongly suggest that you spend some time with the linked paper. From a feature-based POV, polynomial regression directly highlights the relatively few robust features which exist in a dataset. Neural nets don’t do anything desirable on top of it; indeed, they are predicted and shown to have a sort of collinearity which indicates redundancy in their reasoning and can highlight spurious features rather than the robust features which we presumably desire.
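As a numerical toy (my own illustration, not the paper’s method): when a target depends on only a couple of its many candidate inputs, a plain least-squares fit recovers essentially all of the variance with a handful of weights, and the remaining coefficients collapse to zero.

```python
import numpy as np

# Synthetic data: ten candidate inputs, but the target is driven by two.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1]

# Ordinary least squares recovers the two real weights; the other eight
# coefficients come out numerically zero.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 3))
```

Real loan data is of course noisier, but the same diagnostic — how much variance the top few fitted features explain — is what distinguishes “a handful of robust features” from “millions of parameters all mattering”.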

                                                                                                                                                      Even leaving that alone, we can use the idea of entropy and surprise to double-check the argument. It would be extremely surprising if variance in Steve’s last name caused variance in Steve’s loan qualifications, given the expectation that loan officers do not discriminate based on name. Similarly, it would be extremely surprising if variance in Steve’s salary did not cause variance in Steve’s loan qualifications, given the expectation that salaries are correlated with ability to pay back loans. This gives us the ability to compare an AI loan officer with a human loan officer.

                                                                                                                                                    2. 1

                                                                                                                                                      This means that our explanations might not fulfill the explainable-AI requirements. Why? Because the discovery of the Higgs boson was made by theories about why particles have mass having various implications, and those implications being followed through with formal logic. In this arena of reasoning, we are not doing statistical analysis on lots of pseudo-continuous data, but instead our inputs are proof statements.

                                                                                                                                                      If I had to explain to a lay reporter why we expected the Higgs boson, I would start my attempt with something like, “Electricity and magnetism are two ways to look at the same single happenstance in physics, where negatively- and positively-charged objects interact with each other. When we look at particle physics, we see a way to combine the behaviors we see, with atoms splitting and joining, with electromagnetism. This helps us build our models. But imagine that we switched ‘positive’ and ‘negative’. Look up the history with Ben Franklin, it’s a great story. The point is that those labels could have been switched, and that gives us a sort of symmetry for particles. The Higgs boson is a particle that should exist according to what we know about particle symmetries, and we call it ‘Higgs’ after one of the people who first noticed that it should exist.”

                                                                                                                                                      I think we might have different intuitions for what “explanation” means here.

                                                                                                                                                      The above is the kind of news-conference explanation that is not at all satisfying. It’s a just-so story, not something you’d want from, e.g., an emergency system controlling the launch of nuclear missiles… or even from an organ transplant model that decides who the most likely to benefit patients are.

                                                                                                                                                      Maybe, if you actually know physics yourself, try to answer a question like:

                                                                                                                                                      “Why is the mass of the Higgs boson not 125.18 ± 0.15 GeV/c^2, instead of 125.18 ± 0.16 (as per Wikipedia)?”

                                                                                                                                                      or

                                                                                                                                                      “What would it take for the mass to be 125.18 ± 0.15? How would the theory or experiments have to differ?”

                                                                                                                                                      Those are the kind of explanations that, I presume, a physicist working on the Higgs boson could give (not 100% sure how accessible they are; maybe anyone with a physics PhD could, given a bit of digging). But the likelihood of me understanding them is small, and you can’t make a “just-so story” to explain those kinds of details.

                                                                                                                                                      Yet ultimately it’s the details that matter; the metaphysics are important, but explaining those does not give a full picture… they are more like intuition pumps to begin learning from.

                                                                                                                                                      I don’t agree with this. Specifically, I don’t think that those millions of parameters are actually used much. Instead, I think that NNAEPR (“neural nets are essentially polynomial regression”) holds, and that there are only a handful of parameters that account for almost all of the variance in loan amounts, with the error of the remaining parameters subject to roundoff. Similarly, only one measurement, mass, needed to be wrong to provoke the development of the Higgs mechanism in theory.
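                                                                                                                                                      That few-big-parameters intuition can be sketched with a toy example (everything here is invented: the synthetic data, the 0.1 pruning threshold, the split of 3 large vs. 47 negligible coefficients): fit a plain least-squares model, zero out the small coefficients, and check that the fit barely changes.

```python
# Toy sketch of "a handful of parameters carry almost all the variance".
# All data and thresholds here are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 50

X = rng.normal(size=(n, p))
# True coefficients: 3 large ones, the remaining 47 near zero.
beta = np.zeros(p)
beta[:3] = [5.0, -3.0, 2.0]
beta[3:] = rng.normal(scale=0.01, size=p - 3)
y = X @ beta + rng.normal(scale=0.5, size=n)

# Fit ordinary least squares.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Prune: zero out every coefficient with small magnitude.
w_pruned = np.where(np.abs(w) > 0.1, w, 0.0)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

print("kept parameters:", int(np.count_nonzero(w_pruned)), "of", p)
print("R2 full  :", round(r2(y, X @ w), 3))
print("R2 pruned:", round(r2(y, X @ w_pruned), 3))
```

                                                                                                                                                      On this synthetic data the pruned model keeps only a few of the 50 coefficients while the R² is essentially unchanged, which is the shape of the claim above, not evidence that real loan models behave this way.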

                                                                                                                                                      This I actually agree with, and my last 4 months of leisure-time research have basically been spent reading up on parameter pruning and on models that use equations that are quick to forward-prop but take long to fit (quick to forward-prop is often roughly equivalent to easy to understand, but I prefer to focus on the former since it has industry application).

                                                                                                                                                      That being said, it’s not at all clear to me that this is the case. It could be the case in some scenarios, but (see III.2 and IV in the article) it’s probably an NP-hard problem to distinguish those scenarios. And maybe my intuition that most models are over-parameterized and using backprop-optimized operations is just wishful thinking.

                                                                                                                                                      Again, not to say our intuitions differ here, but I tried to go against my intuitions when writing this article, as stated in it.