1. 9

I really think ‘package manager for ${LANGUAGE}’ is one of the most insidious anti-features that’s eating software engineering at the moment. First, it assumes that all of your dependencies are written in the same language. Modern software is written in multiple languages (Chrome and Firefox were both at around 30 languages at last count), so how well do all of these compose? Integrating them with the build system makes things worse. CMake’s support for Objective-C is still very new and has bugs (hopefully they’ll reopen some of the ones that I filed that were closed because Objective-C was not supported), and Objective-C can be built with exactly the same compiler and compiler driver and linked with the same linker as C and C++. For anything else, it is even worse.

Even if you do have all of your dependencies in the same language (for example, if you’re a Go developer and don’t want to suffer through the nightmare that is Go’s C interop layer), now you are causing pain for packagers and folks doing security audits. Consider a simple example: how would the FreeBSD package infrastructure respond to a bug in OpenSSL? (I presume most Linux distros have similar machinery, but this is the one I’m most familiar with.) A quick look at the package graph finds everything that (indirectly) requires the OpenSSL package as a build or run-time dependency. This gives a safe over-approximation of everything that needs fixing. The VuXML entry is published with the vulnerability, and so pkg audit can automatically notify any of the users of affected packages. When there’s a fix, the OpenSSL package is updated and everything that depends on it is rebuilt (typically, in fact, everything is rebuilt, because it only takes about 24 hours on a big beefy machine and it’s easier than figuring out what needs incremental building). Now what happens if your program has used some language-specific package manager to include OpenSSL?
If you’ve done dynamic linking, you may pick up the packaged version, but if you’ve statically linked then you won’t, and so the package maintainer needs to separately update your program. Not too bad for one program, but imagine if 10,000 packages (around a third of the total) did it. Now spare a thought for whoever is responsible for the security process for a large deployment. After Heartbleed, everyone on *NIX systems could check if they were vulnerable by querying their package manager, and if they had a package set from after the fix was pushed out, they were fine. With things pulling in dependencies that the package manager has no visibility into, this is much worse.

Now imagine that you want your language-specific package manager to integrate with the platform-specific package manager. Now you have an N:M problem, where every language builds a package manager and needs an interop layer for every target OS. What I really want is a portable format for expressing dependencies (canonical name, version, any build options that I need enabled, and so on) and querying any package manager to provide them. On Windows or macOS, this could just grab NuGet packages or CocoaPods that I bundle with my program; on platforms with a uniform package management story it could simply drive that.

1. 2

That’s why Bazel and Nix are very important: both are language-agnostic package managers. I use Bazel in a setup with C dependencies, Python dependencies, and Swift dependencies, and it works like a charm. That said, these language-agnostic package managers have to integrate with the rest of the language-specific package managers to really take off, unfortunately.

1. 1

Now what happens if your program has used some language-specific package manager to include OpenSSL? If you’ve done dynamic linking, you may pick up the packaged version but if you’ve statically linked then you won’t and so the package maintainer needs to separately update your program.
Not too bad for one program, but imagine if 10,000 packages (around a third of the total) did it.

I think the answer to this is: if you’ve done something over and above what your operating system’s package manager provides, you’re responsible for it and your operating system is not. Language-specific package managers were around and in active use at the time of Heartbleed and didn’t pose any special problems. I worked on a big public website at the time, and we updated the operating system packages, then the relevant Ruby packages, and then regenerated certs. It was an operational nightmare for many reasons, but I have to say that, from memory, finding and updating the relevant software two times was not one of them. The problem seems to revolve around static linking (and containerisation too, which has the same properties as static linking for this discussion) and not language-specific package managers.

1. 4

Has anyone used Bazel’s rules_docker for building OCI images and deploying? I am interested in that, but not sure if it is a popular option and what the support looks like.

1. 3

Currently using this where I work. We use Docker format rather than OCI, but I suspect the experience would be the same. We haven’t been bitten by too many things and I find the performance pretty good. Two things I would call out: 1. if you are pushing multiple images, that requires multiple bazel run invocations unless you build a container_bundle + push-all [0]. This can have some overhead if you do bad things to your analysis cache. 2. (Haven’t investigated this too much yet, but) there appears to be an issue with detecting in a remote repo (AWS ECR) whether an image has actually changed, and it chooses to always repush. Suspect it has something to do with stamping [1].

1. 5

I guess the most REPL-y experience most programmers have is bash… it even has two namespaces à la Common Lisp (./myfile and cat myfile)

1. 2

It also has a persisted objects system, a.k.a. files :)

1. 3

Thanks for submitting.
I was pleasantly surprised to see the author recommending a very simple database schema. I don’t like SQL, but I think I would be content to use it for the incremental update and random access benefits if the schema were simple. It’s easy for the OpenDocument case to just say “use plain text files, images and git”, but I’m convinced that using SQLite or designing your own careful file format is a good solution for applications.

1. 5

I’ve been using SQLite as the application format for various projects. It is easy to use because it is ubiquitous. I can package SQLite myself; I can use SQLite from the distro; I can update as I go; I don’t need to worry about saving taking a long time. As long as I stick with the basics, it works everywhere. That said, it is interesting that in many places, the application format ends up as simple key-value pairs, or a web (tree?) of hierarchical objects. SQLite can model both; however, neither is tabular per se. Something like HDF5 seems to model my hierarchical objects better. However, packaging HDF5 with various wrappers is a challenge by itself, and SQLite has APIs I am already familiar with and enjoy.

1. 1

SQLite can model both; however, neither is tabular per se

Did you try the SQLite LSM extension? It is available in the sqlite3 tree and you can use it with a SQLite database (or standalone).

1. 21

A long time ago I worked on Transactional NTFS, which was an attempt to allow user-controlled transactions in the filesystem. This was a huge undertaking: it took around 8 years and shipped in Vista. The ultimate goal was to unify transactions between the file system and SQL Server, so you could have a single transaction that spans structured and unstructured data. You can see the vestiges of that effort on docs.microsoft.com, although if you click that link, you’ll be greeted with a big warning suggesting that you shouldn’t use the feature. One of the use cases mentioned early in development was atomic updates to websites.
In hindsight, I’m embarrassed not to have reflexively called “BS” then and there. Even if we could have had perfectly transactional updates to a web server, there’s no atomicity with web clients, who still have pages in their browser with links that are expected to work, or are even actively downloading HTML which will tell them to access a resource in future. If the client’s link still works, it implies a different type of thinking, where resources are available long after there are no server-side links to them, which is why clouds provide content-addressable blob storage which is used as the underpinnings for web resources. Stale resources are effectively garbage collected in a very non-transactional way. Once you have GC deleting stale objects, you also don’t need atomic commit of new objects either.

The majority of uses we hoped to achieve didn’t really pan out. There’s one main usage that’s still there, which is updates: transactions allow the system to stage a new version of all of your system binaries while the system is running from the old binaries. All of the new changes are hidden from applications. Then, with a bit of pixie dust and a reboot, your system is running the new binaries and the old ones are gone. There’s no chance of files being in use, because nothing can discover the new files being laid down until commit. I really thought I was the last person alive still trying to make this work when writing filter drivers in 2015 that understand and re-implement the transactional state machine so the filter can operate on system binaries and the system can still update itself.

Somebody - much older and more experienced in file systems - remarked when we were finishing TxF that file system and database hybrids emerge every few years because there’s a clear superficial appeal to them, but they don’t last long. At least in our case, he was right, and I got to delete lots of code when putting together the ReFS front end.

1. 2

This was a super interesting read, thanks for sharing it!

Even if we could have had perfectly transactional updates to a web server, there’s no atomicity with web clients

This seems to become more of an issue when clients run code. When there is no client-side code, it seems to be a non-issue to me. (Say, all assets can be pushed via HTTP/2 to make sure the version is right.) If there is client-side code, one could force a reload when the server-side codebase has changed. That aside, I’m not talking about transactions for application changes, I’m talking about transactions for user data changes. That is currently unsolved, unless one stores all user-uploaded images in the DB.

remarked when we were finishing TxF that file system and database hybrids emerge every few years because there’s a clear superficial appeal to them, but they don’t last long

Haha, interesting! I guess only time can tell. :)

1. 1

This seems to become more of an issue when clients run code.

That’s half of Fielding’s thesis on REST right there ;) It’s a bit unfortunate that the need (and I agree it is a need) for encryption/confidentiality/privacy led to the current state of HTTP/2 and TLS, where a lot of the caching disappeared, leaving only client cache and server/provider cache (no more intermediary LAN caches), which makes REST less interesting, even for applications/websites where the architecture is a great fit. Recommended reading (still) for those that have not read it (just remember modern web apps/SPAs are not REST; they’re more like applets or Word files with macros): https://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

2. 1

What do you see as a problem when implementing transactions on a file system? It seems doable (as ZFS’s snapshots have shown, partially), but why is it not more prominent? Are there trade-offs that made it that bitter?

1. 4

I don’t think the problem was implementation. It was more a case of a solution looking for a problem.
In hindsight, among other things, a file system is an API that has a large body of existing software written to it. Being able to group a whole pile of changes together and commit them atomically is cute, but it doesn’t help if the program reading the changes is reading changes from before and after the commit (torn state). Although Transactional NTFS could support transactional readers (at the file level), in hindsight it probably needed volume-wide transactional read isolation, but even if it did, that implies that both the writer and reader are modified to use a new API and semantics. But if you can modify both the producer and consumer code, there’s not much point trying to impersonate a file system API - there are a lot more options.

1. 1

I’d say the big issue is that classic OSs are not transactional, so adding a transactional FS to them just doesn’t make sense. What if a process starts a transaction, keeps it open for months, then crashes? To make a transactional FS you also need to build a transactional OS around it.

1. 3

I struggle with this kind of thing because I don’t really know what a ${LANGUAGE} package manager is. In the .NET / Java world, this is something I can just about understand for pure-Java/.NET packages: it’s a set of bytecode files that provide a set of classes in a given namespace. It gets a bit more interesting when this depends on some native library. If, for example, I want to have a Java wrapper around libavcodec, do I expect a Java package manager to provide the libavcodec shared library? Now it’s not a Java package manager, it’s a Java + Linux/Windows/macOS/*BSD/whatever package manager and needs to know how to get the right version of the library for each of my build targets. If I have a .NET or Python wrapper, does the same apply? Do all three need to have their own logic for fetching the libavcodec binary? If I’m running on a different architecture, do they all need to now have the logic for building a large C library from source?
Do they also fetch and build all of the dependent libraries? (The FreeBSD package lists 9 build dependencies and 22 dependent libraries for ffmpeg. The Python wrapper, when installed from the OS package manager, picks these up automatically; I have no idea how it works with pip.)

For C/C++, this makes even less sense. The difference between the install package for a library and the development one is just whether it contains header files (and possibly debug symbols), and on non-Debian systems everyone realises that headers are tiny and so puts them in the same package. The only reason that you need a C/C++ package manager is if you want to bundle the library along with your program for distribution (either via static linking or by shipping the shared library). If you do that, then the library can’t be updated independently of the program if someone finds a security vulnerability. If you’re doing anything open source, then you’re making work for downstream packagers by using a tool like this, because they’ll want to avoid duplication and depend on the version of the library that is already packaged for the OS that they’re building packages for.

So the real use of this seems to be for proprietary software depending on open-source libraries and wanting to ship bundles of a program and all dependencies.

The thing I’d actually like is some kind of consistent naming so that I can declare in a manifest for my program that I depend on libfoo.so version 4.2.1 or later and have that mapped automatically to whatever package installs that library, its headers, and its pkg-config files. Then apt, yum, pkg, and so on can all just be extended to parse the manifest and install the build dependencies for me (and when anyone wants to build packages for any of these, they just need to run the tool once on the dependency manifest to generate the package build and run-time dependency lists).
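A minimal sketch of that idea in Python. Everything here is illustrative: the manifest schema, the per-OS map, and the native package names are all made up, since the whole point of the proposal is that no such standard exists yet.

```python
# Hypothetical portable dependency manifest: canonical names + minimum versions.
manifest = [
    {"name": "libfoo", "min_version": (4, 2, 1)},
]

# Per-OS translation table each platform would maintain (names invented here).
os_package_map = {
    "debian": {"libfoo": "libfoo-dev"},
    "freebsd": {"libfoo": "foo"},
}

def resolve(manifest, target_os):
    """Translate canonical dependencies into native package names,
    which could then be handed to apt, pkg, etc."""
    table = os_package_map[target_os]
    return [table[dep["name"]] for dep in manifest]

print(resolve(manifest, "debian"))   # native packages for apt to install
```

The key property is that the manifest itself never names an OS package: the mapping lives with each platform, so one manifest can drive N package managers.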

1. 1

In a C / C++ context, it only makes sense to me if you want to statically link everything. And if you want to do that, a package manager fetching binary blobs would definitely not be the right generic approach (at the risk of pointing to a different libstdc++, libc, etc.). Dynamic libraries, as much as I dislike them, are handled OK by apt / yum / pkg and so on (as you mentioned). Since you are doing system-wide changes with these anyway, wrapping them in a Docker image is fine.

1. 1

I can declare in a manifest for my program that I depend on libfoo.so version 4.2.1 or later and have that mapped automatically to whatever package installs that library

The problem is that ‘libfoo.so version 4.2.1 or later’ is actually not as specific as you might think.

Sometimes, there are multiple libraries with the same name. Different operating systems choose differently how to deal with the collision.

Sometimes, there are multiple libraries with different names but which nominally implement the same API. For example, blas and openblas. Maybe you targeted blas for your project, but it would also work fine with openblas.

Or, maybe you require some openblas-specific or ncurses-specific features that aren’t guaranteed to be offered by providers of libblas.so/libcurses.so.

Different libraries have different standards for stability. If you targeted libfoo version 4.2.1, then version 4.2.6 will probably work fine too. Will 4.3.0? 5.0.0?
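The stability question above can be encoded as a heuristic; this is just the commenter's intuition written out (same major and minor, equal-or-newer patch is probably safe), not any real package manager's policy:

```python
def probably_compatible(required, candidate):
    """Heuristic only: a patch bump (4.2.1 -> 4.2.6) is probably fine;
    a minor (4.3.0) or major (5.0.0) bump offers no such guarantee."""
    return required[:2] == candidate[:2] and candidate[2] >= required[2]

probably_compatible((4, 2, 1), (4, 2, 6))  # likely fine
probably_compatible((4, 2, 1), (4, 3, 0))  # no guarantee
probably_compatible((4, 2, 1), (5, 0, 0))  # no guarantee
```

Of course, this only works to the degree that the library actually follows the versioning convention, which is exactly the problem being raised.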

1. 1

The problem is that ‘libfoo.so version 4.2.1 or later’ is actually not as specific as you might think.

That’s exactly my point. The thing that I want is a canonical name for each library, all of its build-time options, and its version. I then want each OS to provide a map from each of those to things available in the package system.

1. 2

I follow Swift concurrency discussions quite closely, especially now that I have more time to invest in Swift development. From what I read, the developers are quite aware of the thread explosion problem affecting libdispatch and are trying not to repeat it when implementing the actor model.

libdispatch suffered from thread explosion because it cannot differentiate between code that cannot make progress because of limited concurrency and code that cannot make progress because of a bug (see the PINCache thread a few years back about the fake deadlock due to the concurrency limitation in libdispatch: https://medium.com/pinterest-engineering/open-sourcing-pincache-787c99925445). That directly resulted in more complex APIs, such as queue targeting to inherit both QoS and concurrency so that multiple private serial queues can multiplex on one serial queue.

I am hopeful that with the actor model, we can figure out something that can efficiently multiplex on a fixed number of threads without growing the number of threads dynamically.
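The "fixed number of threads" idea can be sketched with a bounded pool, shown here in Python as a rough analogy (this is not how Swift's runtime works): with a hard cap on workers, extra tasks queue up instead of spawning new threads, so there is no thread explosion by construction.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

pool = ThreadPoolExecutor(max_workers=4)  # hard cap: tasks queue, threads don't grow

seen_threads = set()
lock = threading.Lock()

def task(i):
    # Record which worker thread ran us, to show the multiplexing.
    with lock:
        seen_threads.add(threading.current_thread().name)
    return i * i

# 100 tasks multiplexed onto at most 4 threads.
results = list(pool.map(task, range(100)))
pool.shutdown()
print(len(seen_threads))
```

The hard part, which this sketch ignores, is the problem from the parent comment: with a fixed pool, a task that blocks waiting on another queued task can deadlock, so the runtime must be able to tell "waiting on limited concurrency" apart from "stuck".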

1. 6

I spent the last two weeks reading about incremental computing and how to apply it to incremental data pipelines, and somehow I totally missed Differential Dataflow and DDlog. The graph example is spot-on, and this article breaks down every bit in a very clear way. Looking forward to the next one!

1. 5

I’ve been reading about incremental computing for the past week or two as well! One thing I cannot quite wrap my head around is how the “incremental” part is done. The “track which part changed and recompute” part is pretty straightforward, but I cannot find good material that explains the “incremental” part to me.

For example, in this article, it talks about scores.average(), so in “incremental” fashion, if a new student score is added, it somehow can derive that I just need to compute previous_average_scores * previous_count / new_count + new_added_score / new_count, or (previous_total_scores + new_added_score) / new_count. How does the system decide which intermediate values to keep track of and how to do the computation?
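One common answer (a sketch, not how any particular system does it) is that the intermediate state is chosen per operator: an incremental average keeps (total, count) rather than the average itself, which makes each update O(1) and matches the (previous_total_scores + new_added_score) / new_count formulation above.

```python
class IncrementalAverage:
    """Keep (total, count) as the tracked intermediate state; each new
    score updates the average without rescanning all previous scores."""
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, score):
        self.total += score
        self.count += 1
        # (previous_total_scores + new_added_score) / new_count
        return self.total / self.count

avg = IncrementalAverage()
for s in [80, 90, 100]:
    current = avg.add(s)
print(current)  # 90.0
```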

In more complicated cases, for example, if we do scores.top(10).average(), how can an incremental computing system derive that the minimal work should be previous_average_scores - lowest_score_so_far / 10 + new_added_score / 10, and keep track of lowest_score_so_far in the system?
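For the top-k case, a hand-rolled sketch (insertions only; this is my illustration, not what any of the mentioned systems actually generate) keeps a min-heap of the current top k, so the heap root is exactly the lowest_score_so_far that the delta update needs. Handling retractions (a score being deleted) would require keeping more than k elements, which is part of why deriving this automatically is hard:

```python
import heapq

class TopKAverage:
    """Track only the k largest scores in a min-heap; the heap root is
    the lowest score among the top k, i.e. the extra state the
    incremental update formula needs."""
    def __init__(self, k):
        self.k = k
        self.heap = []   # min-heap of the current top-k scores
        self.total = 0

    def add(self, score):
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, score)
            self.total += score
        elif score > self.heap[0]:
            evicted = heapq.heapreplace(self.heap, score)
            self.total += score - evicted   # delta update, no rescan
        return self.total / len(self.heap)

top3 = TopKAverage(3)
for s in [50, 80, 70, 90, 60]:
    cur = top3.add(s)
print(cur)  # average of {70, 80, 90} = 80.0
```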

Somehow I believe these issues are solved problems in the incremental computing paradigm, but the material on the net is sparse on the details.

1. 3

I think there are varying degrees to which systems are incremental. Some, like Adapton, are based on having an explicit dataflow and reusing previous computation results, thus avoiding redoing computations unaffected by the change – that’s essentially the same concept as build systems.

A truly incremental system, however, would consume a delta (a change) on the input (starting with ∅) and produce a delta on the output for maximum efficiency. In other words, an incremental system would ideally be Δinput → Δoutput instead of input → output, with the key operations of the pipeline all consuming and producing deltas. That seems to be the idea behind differential dataflow.
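A hand-written Δinput → Δoutput operator can be sketched for sum, in the differential-dataflow style of (value, multiplicity) deltas, where a negative multiplicity is a retraction. This is my own toy illustration of the shape of such an operator, not the paper's implementation:

```python
def inc_sum():
    """A Δinput → Δoutput operator for sum: consumes (value, multiplicity)
    deltas and emits only the change to the output, never seeing the
    whole input collection."""
    state = 0
    def step(value, multiplicity=1):
        nonlocal state
        delta_out = value * multiplicity
        state += delta_out
        return delta_out, state
    return step

s = inc_sum()
s(10)                 # insert 10  -> output delta +10
s(5)                  # insert 5   -> output delta +5
d, total = s(10, -1)  # retract 10 -> output delta -10, running sum 5
print(d, total)
```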

But as you point out, you sometimes (actually often) need to reuse the previous state to calculate the output (whether it’s a delta or not), and in some cases you need the entire input anyway (like doing a SHA1 sum on the inputs). I haven’t seen this spectrum of being incremental articulated clearly in the literature so far, but the idea of differential is certainly that the computations should operate on deltas (differences) as opposed to complete values.

The How to recalculate a spreadsheet article has a bunch of links that you might find interesting on the topic. The Build systems à la carte paper might be an interesting read for you as well.

1. 2

Thanks! Equipped with what you said, I re-read the Differential Dataflow paper, and now it is much clearer. Their difference operators work on deltas, and the so-called generic conversion from normal operators to difference operators simply does δoutput = f(input_t + δinput) − output_t. To make it work efficiently in the LINQ setting, a few difference operators (especially for aggregations, such as sum and count) are implemented manually, because these can work with the δ alone.
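That generic conversion can be sketched in a few lines (my own toy rendering, with lists standing in for multisets): it keeps the full input and previous output around, recomputes from scratch on every delta, and only the *emitted* value is a difference, which is why the hand-written aggregation operators are needed for efficiency.

```python
def lift(f, zero=0):
    """Generic (and deliberately inefficient) conversion of a plain
    operator f into a difference operator:
    δoutput = f(input_t + δinput) − output_t."""
    input_t, output_t = [], zero
    def step(delta):
        nonlocal input_t, output_t
        input_t = input_t + delta          # apply δinput to the stored input
        new_output = f(input_t)            # full recomputation — the cost
        delta_out = new_output - output_t  # but only the difference is emitted
        output_t = new_output
        return delta_out
    return step

inc_count = lift(len)
inc_count([3, 1, 4])   # three insertions -> δoutput = 3
d = inc_count([1, 5])  # two more        -> δoutput = 2
print(d)
```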

It is still very interesting, but much less generic / magic than I initially thought it could be.

2. 2

The biggest thing for incrementality is tracing data interdependence. If relation B draws in data from A and C, changes to those relations trigger a chain reaction, propagating said changes. For aggregates (the .group_by() clause of the post) things are pretty coarse-grained: any change to the input relation(s) will trigger a recomputation of the aggregate, so changes to Test or Student will cause scores to be recomputed. While in a reasonably trivial example like the scores one this seems stupid and/or wasteful, for more complex rules things get exponentially more difficult, since worst-case joins are commonplace within ddlog code.
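The coarse-grained propagation described above can be sketched as a tiny dirty-flag scheme (purely illustrative; ddlog's actual machinery is far more sophisticated): changing an input relation marks every relation that reads it as dirty, and dirty aggregates are recomputed from scratch.

```python
# Which relations each derived relation reads (names from the post's example).
depends_on = {"scores": ["Test", "Student"]}

relations = {"Test": [90, 80], "Student": ["ann", "bob"], "scores": None}
dirty = set()

def change(relation, new_value):
    """Apply a change and propagate: anything reading `relation` is dirty."""
    relations[relation] = new_value
    for out, inputs in depends_on.items():
        if relation in inputs:
            dirty.add(out)

def recompute():
    """Coarse-grained: recompute the whole aggregate, not a delta."""
    if "scores" in dirty:
        relations["scores"] = sum(relations["Test"]) / len(relations["Test"])
        dirty.discard("scores")

change("Test", [90, 80, 100])   # marks `scores` dirty
recompute()
print(relations["scores"])  # 90.0
```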

1. 3

What are the benefits wrt another high performance GC? O(1) allocation and de-allocation is the norm.

1. 7

Thank you for asking. This implementation provides short O(1) pause times, regardless of working set size, and it also compacts. Maybe such things do already exist, but I couldn’t find such things when I searched as I was thinking about this project, so I thought this would be new. I’m happy to compare to another implementation if you could point me to one.

1. 3

I see. Given the expensive read barrier, I think there are other ways to achieve similar characteristics with better trade-offs (I can find references tomorrow, but it is a bit late here, sorry). However, your design seems to make a lot of sense for a compacting database (the expensive read barrier can be factored in the cost to access external pages, storage overhead might be a problem though).

Small question: when the GC thread is active, what happens if the main thread fills the region before the GC can complete?

1. 1

Thanks, I’ll be happy to read them. Regarding your question: I use the term “region” in the code to refer to a range within the CB/ring-buffer. That region is allocated by the main thread for use by the GC thread, and then the main thread never touches it. If you’re referring to the CB/ring-buffer itself, as in “what happens if the main thread fills the CB before the GC completes?”, then the answer is that the main thread will resize the CB upwards to the next power of 2, copy all of the contents of the old CB (including the still-being-written GC consolidation region), and then, when the main thread observes the GC to have completed, it will copy forward the data from the old CB’s consolidation region to the equivalent range in the new CB. In this way, the GC thread only knows about the CB that was given to it at the start of the GC/consolidation, and the main thread is responsible for the shift of the CB to a larger one.

Thanks! –Dan

2. 2

Yeah, it looks like an exercise in writing such a GC. Other pauseless GCs have similar characteristics and are arguably cheaper in many cases. This implementation requires a hash table on the side for lookups and most likely requires a read barrier (to rewire objects moved from zone A to B to C).

1. 3

Thanks for taking a look! Yes it does require a read-barrier for the objects which have shifted locations. I’d be interested if you could link me to the best implementations you know of.

1. 3

Reading what you did led me immediately to Azul’s Pauseless GC. Both do tons of work on the read side to ensure the GC thread can finish marking and compacting in a predictable way. In their case, I think they trap pointer reads to outdated pages to update the reference (in the compacting phase).

Edit: thinking through their marking phase and yours (from my understanding, your zone B is for marking): one thing I don’t quite understand is how you can make sure mutations in zone A have no pointers back to zone B / C objects, so that you can do the marking phase correctly without consulting zone A?

1. 2

I cheat by not using pointers but instead handles. Each allocation is just given an increasing integer ID, which is “dereferenced” indirectly through the O(log32(N)) objtable. This should be a reasonably shallow data structure, but is definitely not free, so adds overhead to the dereferences.
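The handle scheme can be illustrated with a toy table (my sketch; a dict stands in for the O(log32(N)) trie, and none of the real implementation's zones or concurrency appear here): user data holds only integer IDs, so the GC can relocate an object by updating a single table entry, and every dereference pays the indirection.

```python
class ObjTable:
    """Toy handle-based indirection: objects are referenced by an
    ever-increasing integer ID; relocation updates one table entry and
    no handle held by user data ever changes."""
    def __init__(self):
        self.next_id = 0
        self.table = {}   # stand-in for the shallow objtable trie

    def alloc(self, obj):
        handle = self.next_id
        self.next_id += 1
        self.table[handle] = obj
        return handle

    def deref(self, handle):
        return self.table[handle]       # the per-access indirection cost

    def relocate(self, handle, new_copy):
        self.table[handle] = new_copy   # what compaction does, per object

t = ObjTable()
h = t.alloc(("tuple", 1, 2))
t.relocate(h, ("tuple", 1, 2))  # moved copy elsewhere; handle h unchanged
print(t.deref(h))
```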

1. 1

Yeah, by pointers, I meant handles. I guess whenever zone A tries to reference an object already moved to zone B, you copy it out of zone B back to zone A? Otherwise I cannot see how the marking phase can avoid missing some objects referenced from zone A.

1. 1

Yeah, the vision is that all objects are either “small” (e.g. tuples under some arity [envisioned, but not yet present in this implementation]) and are copied from either B or C into A when they need to become mutable, or they are larger (e.g. maps/collections) but are persistent objects which are mutated via path-copying of the path-to-mutated-leaf, where such copied path is written out in A and is similarly considered “small”.
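The path-copying technique mentioned for the larger persistent objects is the standard one for immutable trees; here is a generic sketch (not the project's code): only the root-to-leaf path is copied into the new version, and every untouched subtree is shared between versions.

```python
class Node:
    """Immutable binary-search-tree node for a persistent map sketch."""
    def __init__(self, key, value, left=None, right=None):
        self.key, self.value, self.left, self.right = key, value, left, right

def insert(node, key, value):
    """Path copying: copy only the nodes on the path to the changed leaf;
    untouched subtrees are shared between the old and new versions."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return Node(node.key, node.value, insert(node.left, key, value), node.right)
    if key > node.key:
        return Node(node.key, node.value, node.left, insert(node.right, key, value))
    return Node(key, value, node.left, node.right)   # replace value at key

v1 = insert(insert(insert(None, 2, "b"), 1, "a"), 3, "c")
v2 = insert(v1, 3, "C")      # new version; v1 is untouched
print(v1.right.value, v2.right.value)  # c C
```

The copied path is small (proportional to tree depth), which is what lets it be treated like the "small" objects written out in zone A.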

2. 3

If you haven’t seen it already, you might find some of the work on Shenandoah interesting - specifically how they’ve optimized forwarding pointers/read barriers.

1. 3

Without a corresponding announcement from JetBrains, this kinda feels like a hostile fork, especially given how the article lacks any kind of info about its future relationship with “JetBrains Kotlin”.

The extensions themselves … look a lot like they were done by people with little design experience, opting for the “let’s add keywords and special syntax all over the place” approach.

1. 2

We will see in the next few days whether it is in collaboration with JetBrains. But yes, the whole announcement looks weird. It seems to take very similar design points from Swift for TensorFlow and transplant them to Kotlin. Maybe it is a PyTorch vs. TensorFlow rivalry thing again? But there is no mention of PyTorch either. A few days with their code release would clarify a lot of things.

1. 1

let’s add keyword and special syntax all over the place

Supporting that: wouldn’t it be much nicer if every function were “differentiable” by default and the compiler removed the attribute where it doesn’t apply, instead of it being specified explicitly? Or, if it has to remain, Kotlin already has annotation syntax.

1. 16

Having recently introduced a “please explain to me how a | is used in a bash shell” question in my interviews, I am surprised by how many people with claimed “DevOps” knowledge can’t answer that elementary question given examples and time to think it out (granted, on a ~60 sample size).

Oh, this is a gem! It will go right next to “why stack the memory area has the same name as stack the data structure” into the pile of most effective interview questions.

1. 12

Do these questions even work? Seriously. I remember interviewing someone who didn’t have the best grasp of Linux, the shell, etc., but he knew the tools that were needed for the DevOps role and he got the job done; knowing things like what a shell pipeline is doesn’t factor in for me.

In terms of the article itself, like I said above, people know AWS and know how to be productive with the services and frameworks for AWS. That alone is a figure hard to quantify. Sure, I could save money bringing all the servers back internally or using cheaper datacenters, but I worked at a company that worked that way. You end up doing a lot of busy work chucking bad drives, making tickets to the infrastructure group, and waiting for the UNIX admin group to add more storage to your server. With AWS I can reasonably assume I can spin up as many c5.12xlarge machines as I want, whenever I want, with whatever extras I want. It costs about an eighth of a million a year, roughly. I see that as an eighth of a million that cuts out a lot of busy work I don’t care about doing, and an eighth of a million that simplifies finding people to do the remaining work I don’t care about doing. The author says money wasted; I see it as money spent so I don’t have to care, and not caring is something I like; hell, it isn’t even my money.

1. 4

I remember interviewing someone who didn’t have the best concepts of linux, shell,etc but he knew the tools that were needed for the DevOps role and he gets the job done

I have to admit, I’ve never interviewed devops, only engineers. And in my experience, it’s more important for an engineer to dig into fundamental processes that he’s working with, and not just to know ready-made recipes to “get the job done”.

1. 7

I agree completely with this statement, and I think this is exactly what the article mentions as one of the lock-in steps. That the person can “get the job done” because “they know the tools” is exactly the issue: the person picked up the vendor-specific tools and is efficient with them. But in my experience, when shit hits the fan, the blind copy-pasting of shell commands starts, because the person doesn’t understand the pipe properly.

Now, I don’t mean by that that the commenter above you is wrong. You may be still saving money in the long run. I’m just saying that it also definitely increases that vendor lock in.

2. 3

I feel like saving your company, of whatever scale, $15,000 a year per big server is worthwhile, as long as it doesn’t end up changing your working hours. I know that where I work, if I found a way to introduce massive savings, I would be rewarded for it. Shame SIP infrastructure is so streamlined already…

1. 2

It is optimized for accuracy, not recall. This question may have some positive correlation with good DevOps. It may just have a positive correlation with years of experience, and hence with good DevOps. Hard to quantify.

2. 2

Too bad the author didn’t specify how many is “many”. I would expect some of the interviewees not answering because of interview stress, misunderstanding the question, etc.

1. 25

This is not an answer in vogue, but I don’t want ops people who get too stressed to be able to explain shell pipelines.

1. 12

In my experience, a lot of people that get stressed during interviews don’t have any stress problems when on the job.

1. 6

Indeed. I once interviewed an engineer who was completely falling apart with stress. I was their first interview, and I could tell within minutes they had no chance whatsoever of answering my question. So I pivoted the interview to discuss how they were feeling, and why they were having trouble getting into the problem. We ended up abandoning my technical question entirely and chatting for the rest of the interview. Later, in hiring review, the other interviewers said the candidate nailed every question. Strong hire ratings across the board. Had I pressed on with my own question instead of spending my hour helping them de-stress and get comfortable, we likely never would have hired one of the best people I’ve ever worked with.

2. 7

I quite disagree with this, perhaps because I’m the type of person that gets very stressed out by interviews. What you’re saying makes sense if we assume that all stressors are uniform for all people, but that doesn’t really match reality at all.
For me, social situations (and interviews count as social situations) are incredibly, sometimes cripplingly stressful. At worst, I’ve had panic attacks during interviews. However, throughout my entire ops career I’ve worked on-call shifts and had incidents with millions of dollars on the line, and those are not anywhere near the same. I can handle myself very well during incidents because it’s an entirely different type of stressor.

1. 4 Same in my company. All engineering is on-call for a financial system, and it’s very hard to hire someone who gets stressed out during the interview when this person would have to respond to incidents with billions in transit.

1. 4 Yep. I have a concern that in our push to improve interviewing we are overcorrecting.

2. 5 I’m helping my company interview some people in that area. We have a small automated list of questions (around 10 to 12) that we send to candidates that apply, so nobody loses time with things that we’ve agreed interviewees should know. Less than 10% manage to answer questions like “Which command can NOT show the content of a file?” (with a list having grep/cat/emacs/less/ls/vim). When candidates pass this test, we interview them, and less than 5% can answer questions like the author mentions, at least in a

1. 3 Kinda unrelated to the article, it was just an anecdote to say “there’s a load of people that can’t really use a classic server and need more modern IaaS to operate”. For the sake of defending my practices though, I did give people 5 minutes to think of the formulation and gave examples via text of how one would use it (e.g. ps aux | grep some_name). I think the amount of people that couldn’t answer was ~2/5. As in, I don’t think I did it in an assholish “I want you to fail” way. It’s basically just a way to figure out if people are comfortable~ish working with a terminal, or at least that they have used one more than a few times in their lives.

1. 5 On the other hand, I can operate a “classic server”, but struggle with k8s and, to some degree, even with AWS. Although I’m sure I could learn, I simply never bothered to do so, as I never had a reason or interest. I suppose it’s the same with many who were raised on AWS: they simply never had a reason to learn about pipes.

1. 1 I didn’t imply malpractice, rather statistical error. That said, anywhere close to 2/5 in the conditions you described… It’s way higher than what I would expect. I didn’t hire any DevOps recently tho, so maybe I’m just unaware how bad things got.

2. 1 This is always true for interviews, but this is a measurement error that would be present for any possible interview question.

1. 1 Yeah, that was my point.

1. 19 How did anyone look at this and say “yeah nah, selecting text isn’t an important use case on the web”? Why does selecting text matter, you ask? Some people use it to aid with reading by selecting the text they are currently reading, other people use it to (and I know this is wild) copy parts of the text, and people with dyslexia use tools that read out selected portions of the text to help them read. Does any of this work with Flutter’s default, unselectable text? Fuck no! One more broadside in the campaign against users by providers to take all the advantages of automation for themselves and leave none for us.

1. 8 Unfortunately, I have a pretty good idea about how that happened, and it’s one of those things I file under “Battles not worth fighting anymore”, in the “Reasons why web apps should die a fiery death” binder. With complex enough layouts built using enough CSS framework layers, the semantics of the page barely resemble what’s on the screen anymore. That leads to tragically funny situations where you want to select two or three lines of text in a column, but as soon as you get past the first line’s boundary, you’ve selected half the damn page, including four or five hidden elements that show up when you paste them.
Fixing the damn layout is, by now, a Herculean task where you’re wrestling not just against the CSS engine but also against whatever framework you’re using. In time, it has led to this pseudo-fix where some parts of the page are just labeled unselectable. Can’t select them by mistake if you can’t select them deliberately, either, eh? For single-page web apps, virtually all of the page is supposed to be unselectable by default at this point. This prevents user panic: some people (especially with touchpad gestures) end up accidentally dragging their cursor up or down for a bit and thus end up selecting the whole page.

How do I know this: yours truly, definitely not a web developer, implemented something like that eons ago. I was working on a prototype for a hardware gizmo that had a web interface, among other things. Some higher-up threw a tantrum about the sub-par user experience on his MacBook because of said selection accident, and I wired something up, largely by copy-pasting off StackOverflow, since I had not written any web-related code in 7+ years at that point, did not know, or care, about any of that, and had far more important problems to solve with the prototype (like, say, the fact that it had a crap USB controller and enumeration didn’t always work as planned…).

A few years later a former colleague asked me if I remembered my solution having any browser portability problems (I didn’t; it took me a while to recall what solution he was asking about, and it was hardly my solution, either). Turns out it had bubbled its way up to a bunch of other components, precisely in order to “fix” that kind of bug everywhere else. It also turned out that, over the years, the “bug” had been reported as “page background changes when scrolling” on at least one occasion, too, as sometimes the behaviour was so weird that at least one person didn’t even realize they were selecting the damn thing.

1. 5 Fundamentally, a single-page application is an application, not a document. The issues with “selecting text” / “restyling title” should be treated as accessibility issues in an application, not as a webpage failing to conform to a document standard anymore. From that lens, this post basically complains about the generic accessibility issues of all cross-platform UI toolkits that are ambitious enough to re-implement everything from the drawing layer. I happen to think this is the right way to build cross-platform UI with any cross-platform consistency. It just requires a lot of people power to get the basics done, then native accessibility right, and then ongoing maintenance to add more widgets and more platform-specific abilities. There are a lot of failed attempts along this direction, but Flutter may have a chance with the vast resources from Google. Who knows.

1. 1 I’m interested in how this space evolves to take advantage of (e.g.) Cloudflare Durable Objects, etc.

1. 2 Yep. You could also potentially implement an SQLite VFS on top of the Durable Objects specification to make it work within a wasm runtime. It would be interesting to explore asynchronous messaging (a.k.a. an email server) in this context.

1. 4 So what I didn’t get: What is FaaS? What exactly is the use case for this service? Defining some function which can then be called from anywhere (probably “sold”)?

1. 9 It’s like CGI billed per request on a managed server.

1. 7 NearlyFreeSpeech.net does something sort of like that. Not per request, but based on RAM/CPU minutes. I find it pretty convenient.

1. 5 Yeah, I was hoping for billing by resource use (RAM, CPU and data transfer through syscalls) in a way that would give you a more precise view into how long your programs were taking to run. This would also give people an incentive to make faster code that uses less RAM, which I would really love to see happen.

2. 3 I think this whole FaaS is a very interesting movement.
Combined with more edge pods deployed through Fastly / Cloudflare, we are onto something quite different from the big cloud services we see with Facebook / Twitter / Gmail today. Imagine how you would do email today (or really, any personalized messaging system like WhatsApp). These edge pods, with deployment bundles like wasm, enable us to implement an always-online inbox, potentially with high availability right at the edge. So your device syncing will be extremely fast. At the same time, it is hosted online, so everyone can reach you with minimal central coordination.

It is unlikely that, in the beginning, these things on their own will be successful. Decentralization is not a big consideration at this time. But it could deliver benefits today if implemented correctly, even for existing big players like Facebook. You could have a wasm edge node close to the device materialize the feed GraphQL fragments in anticipation of a new user request. And since the edge node knows what the last synced feed fragment was, it can also do so incrementally. I am optimistic about this. Would love to see wasm-based deployment taking off, especially for edge nodes.

1. 1 This is an approach and idea that DFINITY (https://dfinity.org/) is pursuing, to provide a fully decentralized computing platform. The system runs wasm as the basic unit of execution, and charges for the cycles, memory, and bandwidth used. Currently, it is in beta, but it should become available next year. Disclaimer: I work for DFINITY.

1. 1 Thanks! Yep, I looked at DFINITY before. One thing that would be compelling to me is closeness to the customers. With our cloud computing moving into low-latency territory (most significantly, cloud gaming), closeness of the edge nodes is a necessity. This is often overlooked by many decentralized movements from the cryptocurrency space (probably because those Dapps have different focuses).

2. 2 Functions as a Service.
Basically, the use case is for people who want to run code that doesn’t run often enough to justify having a dedicated box for it, and just often enough that you don’t want to set up anything for it beforehand. In this case, I plan to start using it for webhook handlers for things like GitHub and Gitea.

1. 2 So then you plan to be administering/running Wasmcloud? The idea is that people can just upload code to you? What hosting service are you using?

This reminds me that I need to write about shared hosting and FastCGI. And open-source the .wwz script that a few people here are interested in: https://lobste.rs/s/xl63ah/fastcgi_forgotten_treasure

Basically, I think shared hosting provides all of that flexibility (and more, because the wasm sandbox is limited). I do want to stand my scripts up on NearlyFreeSpeech’s FastCGI support to test this theory, though… I think the main problem with shared hosting is versioning and dependencies – i.e. basically what containers solve. And portability between different OS versions. I think you can actually “resell” shared hosting with a Wasmcloud interface… that would be pretty interesting. It would relieve you of having to manage the boxes, at least.

1. 4 So then you plan to be administering/running Wasmcloud?

I have had many back-and-forth thoughts about this; all of the options seem horrible. I may do something else in the future, but it’s been fun to prototype a Heroku-like experience. As for actually running it, IDK if it would be worth the abuse risk doing it on my own.

The idea is that people can just upload code to you?

If you are on a paid tier, uploading the example code, or have talked with me to get “free tier” access, yes. This does really turn into a logistical nightmare in practice though.

What hosting service are you using?

Still figuring that part out, to be honest.

I think the main problem with shared hosting is versioning and dependencies – i.e. basically what containers solve.
The main thing I want to play with in this experiment is something like “what if remote resources were as easy to access as local ones?” Sort of the Plan 9 “everything is a file” model taken to a logical extreme, just to see what it’s like if you do that. Static linking against the platform API should make versioning and dependencies easy to track down (at the cost of actually needing to engineer a stable API).

I think you can actually “resell” shared hosting with a wasmcloud interface… that would be pretty interesting. It would relieve you of having to manage the boxes at least.

I may end up doing that, it’s a good idea.

1. 1 (late reply) FWIW I have some experience going down this rabbithole, going back 10 years. Basically trying to make my own hosting service :) In my case, part of the inspiration was looking for answers to the “polyglot problem” that App Engine had back in 2007. Heroku definitely did interesting things around the same time period. Making your own hosting service definitely teaches you a lot, and it goes quite deep. I have a new appreciation for all the stuff we build on top of. (And that is largely the motivation for Oil, i.e. because shell is kind of the “first thing” that glues together the big mess we call user space.)

To be a bit more concrete, I went down more of that rabbithole recently. I signed up for NearlyFreeSpeech because they support FastCGI. I found out that it’s FreeBSD! I was hoping for a “portable cloud” experience with Dreamhost and NearlyFreeSpeech, but BSD vs. Linux probably breaks that. It appears there are lots of “free shell” providers that support CGI, but not FastCGI. There are several other monthly providers of FastCGI like a2hosting, but I’m not sure I want to have another account yet, since the only purpose is to test out my “portable cloud”. Anyway, this is a long subject, but I think FastCGI could be a decent basis for “functions as a service”.
And I noticed there is some Rust support for FastCGI: https://dafyddcrosby.com/rust-dreamhost-fastcgi/ (I’m using it from Python; I don’t use Rust.)

It depends on how long the user functions will last. If you want very long-running background functions, then FastCGI doesn’t really work there, and shared hosting doesn’t work either. But then you have to do A LOT more work to spin up your own cloud. It’s sort of the “inner platform problem”… to create a platform, you have to solve all those same problems AGAIN at the level below. I very much got that sense with my prior hosting project. This applies to packaging, scheduling / resource management, and especially user authentication and security. Security goes infinitely deep… wasm may help with some aspects, but it’s not a complete solution. And even Google has that problem – running entire cluster managers, just to run another cluster manager on top! (Long story, but it is interesting.)

Anyway, I will probably keep digging into FastCGI and shared hosting… It’s sort of “alternative” now, but I think there is still value and simplicity there, just like there is value to shell, etc.

2. 1 So what I didn’t get: What is FaaS?

FaaS is a reaction to the fact that the cloud has horrendous usability. If I own a server and want to run a program, I can just run it. If I want to deploy it in the cloud, I need to manage VMs, probably containers on top of VMs (that seems to be what the cool kids are doing), and some orchestration framework for both. I need to make sure I get security updates for everything in my base OS image and everything that’s run in my container. What I actually want is to write a program that sits on top of a mainframe OS and runs in someone’s mainframe^Wdatacenter, with someone else being responsible for managing all of the infrastructure: if I have to maintain most of the software infrastructure, I am missing a big part of the possible benefit of outsourcing maintenance of the hardware infrastructure.
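To make that contrast concrete, here is a minimal sketch of the model the comment above is asking for: the user writes a single handler function, and the platform invokes and meters it. Everything here (the `handler` name, the event shape, `invoke_and_meter`) is hypothetical, invented for illustration; real FaaS platforms each define their own conventions.

```python
import time

# Hypothetical FaaS contract: the platform calls `handler(event)`.
# The user never manages a VM, container image, or orchestration layer.
def handler(event):
    name = event.get("name", "world")
    return {"status": 200, "body": f"hello, {name}"}

# Sketch of the platform side: run the function and meter CPU-seconds
# for billing (real platforms also meter RAM and network traffic).
def invoke_and_meter(fn, event):
    start = time.process_time()
    result = fn(event)
    cpu_seconds = time.process_time() - start
    return result, cpu_seconds

result, billed = invoke_and_meter(handler, {"name": "lobsters"})
```

The point is what is absent: no VM image, no container runtime, no scheduler config. The platform owns everything below the function boundary and bills only for what the invocation used.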
Increased efficiency from dense hosting was one of the main selling points for the cloud. If I occasionally need a big beefy computer, but only for a couple of hours a month, and need a tiny trickle of work done that wouldn’t even stress a first-generation RPi the rest of the time, I can reduce my costs by sharing hosting with a load of other people and having someone else manage load balancing across a huge fleet of machines. If, however, I have to bring along a VM, container runtime, and so on, then I’m bringing a fixed overhead that, in the mostly-idle phases, is huge in comparison to my actual workload.

FaaS aims to provide a lightweight runtime environment that runs your program and nothing else, and that can be scaled up and down based on load and billed by RAM MB-second, CPU-second and network traffic (often with some rounding). It aims to be a generic and scalable version of old-school shared hosting, where a load of people would use the same Apache instance with CGI: the cost of administering the base environment that executes the scripts is shared across all users, and the cloud provider can run the scripts on whatever node(s) in the datacenter make sense right now.

The older systems typically used the filesystem for read-only data and a database for persistent data. With FaaS, you typically don’t have a local filesystem but can use cloud file / object stores and databases as you need them. Again, someone else is responsible for providing a storage layer that can scale up and down on demand, and you pay for the amount of data that’s stored there and how often you access it, but you don’t need to overprovision (as you do for cloud VM disks, where you’re paying for the maximum amount of space you might need for any given VM).

TL;DR: FaaS is an attempt to expose the cloud as a useful computer instead of as a platform on which you can simulate a bunch of computers.

1. 1 How is this different from Airflow?

1. 2 I just read their documentation.
It appears that, from their perspective, Airflow deals with operations and dependencies between operations, while Dagster derives a “solid”’s (their name for an operation) dependencies from its inputs / outputs. In this way, it can drive the same operations with different data in a local development environment the same way as when deployed in your ETL pipeline, which makes it much easier to develop / debug. Since it only cares about data dependencies, input artifacts and output artifacts can be managed by Dagster too; hence, it is easier to retry without worrying about side effects.

1. 2 I wonder how pandas’ CSV parser (which is pretty optimized) compares. Whenever I have to parse huge CSV files in Python, I use pandas just for that.

1. 3 I’ve never benchmarked pandas in particular, but I have loosely benchmarked Python’s CSV parser. The inherent problem is measurement. What is your benchmark? Let’s say your benchmark is to count the sum of the lengths of all the fields. Well, that means Python will need to materialize objects for every record and every field. And that is probably what’s going to either dominate or greatly impact the benchmark, even if the underlying parser is written in C and could theoretically go faster. Pandas’ CSV parser is written in C, and if the comment at the top is true, it’s derived from Python’s csv module. Like the csv module, pandas’ CSV parser is your run-of-the-mill NFA embedded in code. This is about twice as slow as using a DFA, which is what my CSV parser uses. And the DFA approaches are slower than more specialized SIMD approaches. I’m less sure about the OP’s approach.

1. 2 Thanks! Love ripgrep! I tried cargo build --release and time ./target/release/xsv index /tmp/DOHUI_NOH_scaled_data.csv, and it took about 24 seconds for the index to complete (I assume xsv index finds the beginning / end of every cell, which is approximately what I am trying to do here for CSV parsing). Didn’t do xsv entirely due to my unfamiliarity with the Rust ecosystem. Sorry!

1. 1 Thanks.
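As an illustration of the NFA-vs-DFA point in the thread above: in a DFA-style parser, each input character causes exactly one state transition, with no backtracking. Below is a toy sketch in Python (which of course forfeits the speed advantage being discussed — real DFA parsers like the one described are written in C or Rust), handling a single record with commas and doubled-quote escaping only; `split_csv_record` and the state names are invented for this example.

```python
# Toy DFA for one CSV record. States: UNQUOTED, QUOTED, QUOTE_SEEN
# (the last one disambiguates an escaped quote "" from a closing quote).
def split_csv_record(line):
    UNQUOTED, QUOTED, QUOTE_SEEN = 0, 1, 2
    state = UNQUOTED
    fields, buf = [], []
    for ch in line:
        if state == UNQUOTED:
            if ch == ',':
                fields.append(''.join(buf)); buf = []
            elif ch == '"':
                state = QUOTED
            else:
                buf.append(ch)
        elif state == QUOTED:
            if ch == '"':
                state = QUOTE_SEEN
            else:
                buf.append(ch)
        else:  # QUOTE_SEEN: escaped quote or end of the quoted section
            if ch == '"':
                buf.append('"'); state = QUOTED
            elif ch == ',':
                fields.append(''.join(buf)); buf = []; state = UNQUOTED
            else:
                buf.append(ch); state = UNQUOTED  # lenient fallback
    fields.append(''.join(buf))
    return fields
```

The per-character dispatch on `state` is the whole trick: the transition table is fixed up front, which is what makes table-driven DFA implementations fast in low-level languages.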
How do I run an equivalent benchmark using your CSV parser? I don’t think I see any instructions.

1. 1 It is not packaged separately, and ccv can be built with zero dependencies (meaning you may not have OpenMP enabled), so it is a bit more involved to make sure OpenMP is enabled. You can first apt install libomp-dev clang, and then check out the https://github.com/liuliu/ccv repo. cd lib && ./configure configures it with OpenMP (there should be a USE_OPENMP macro enabled; the configure script should give you the exact output of flags). cd ../bin/nnc && make -j will compile the demo CSV program under ./bin/nnc

2. 1 Recently the guys from Julia started claiming that they have the fastest parser (link).

1. 4 It kind of looks like Julia’s CSV parser is cheating: https://github.com/JuliaData/CSV.jl/blob/9f6ef108d195f85daa535d23d398253a7ca52e20/src/detection.jl#L304-L309

It’s doing parallel parsing, but I’m pretty sure their technique won’t work for all inputs. Namely, they try to hop around the CSV data and chunk it up, and then parse each chunk in a separate thread, AIUI. But you can’t do this in general because of quoting. If you read the code around where I linked, you can see they try to be a bit speculative and avoid common failures (“now we read the next 5 rows and see if we get the right # of columns”), but that isn’t going to be universally correct. It might be a fair trade-off to make, since CSV data that fails there is probably quite rare. But either I’m misunderstanding their optimization or they aren’t being transparent about it. I don’t see this downside anywhere in the README or the benchmark article.

1. 2
1. Write the program in a reversible style.
2. Reverse the program.
2.5 ????
3. Insert gradient codes.

Yeah, I think I need to read the paper…

1. 1 From a glance at it, it seems to suggest you don’t need to keep a stack of previous values, which therefore reduces memory usage (no need for auxiliary memory to store the values computed in the forward pass).
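Back on the parallel CSV parsing point above: the reason chunking CSV at newline boundaries is unsafe in general can be shown in a few lines with Python’s standard csv module, since a quoted field may itself contain a newline.

```python
import csv
import io

# One logical record whose quoted field contains a newline.
data = 'id,comment\n1,"line one\nline two"\n2,ok\n'

# Correct, sequential parse: 3 rows (header + 2 records).
rows = list(csv.reader(io.StringIO(data)))

# Naive chunking at newlines sees 4 fragments; the fragment
# 'line two"' is not a valid record on its own.
fragments = data.rstrip('\n').split('\n')
```

A parallel parser whose chunk boundary lands inside the quoted field would misparse that fragment, which is why the CSV.jl code linked above has to speculate and then validate the next few rows.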
Seems plausible until some computations cannot be reversed?

1. 1 That’s also my concern. What are the actual use cases of this technology? I guess only computations where the amount of information stays constant are allowed, which would rule out many use cases. You can’t create something from nothing. Maybe a single- or multi-body physics simulation?

1. 1 Why don’t you use Dask or Apache Spark? They all read CSV files in parallel. cudf does it even with the help of the GPU, reading from disk directly into GPU memory. It’s an interesting article nonetheless :)

1. 6 I am not the author, but because it’s an interesting engineering problem? I’d rather read something like this than how to install and run Spark.

1. 1 I get that, and that’s why I said it’s an interesting read, but the author should’ve at least mentioned, or even benchmarked, already parallelized implementations.

1. 5 I’ve been working on CSV stuff for a long time (I’m the author of xsv), and I’ve never even heard of cudf. So I wouldn’t fault the OP. And just from quickly glancing at cudf, benchmarking its CSV parser looks non-trivial because the happy path is to go through Python. So you’d have to be really careful there. And what happens if the task you want to do with the CSV data can’t be done on the GPU? Similarly, I’ve never used Apache Spark. How much time, effort and work would be required to get up to speed on how it works and produce a benchmark you’d be confident in? Moreover, if I want to use a CSV library in C++ in my application, is Apache Spark really a reasonable choice in that circumstance?

1. 1 I didn’t want to be rude. From my perspective (mainly data science) everything is obvious and the tools I mentioned are very popular. I think the main difference between our “views” is that the post focused on libraries and I’m focused on frameworks. The solutions I proposed are fully fledged data processing frameworks, like pandas or R dataframes if you know those.
It’s basically Excel as a programming framework, but much faster and more capable. The abstraction level usually is very high. You would not iterate over the rows in a column, but apply a function to the whole column. These are not solutions to be used just for their CSV implementation, but as the solution for a complete data processing pipeline.

And what happens if the task you want to do with the CSV data can’t be done on the GPU?

cudf, Dask and pandas all belong to the Python scientific ecosystem and are well integrated. You would convert it to Dask or pandas (numpy, cupy). Spark belongs to the Hadoop ecosystem and is used by many companies to process large amounts of data. Again, nobody would use it just for the CSV implementation.

because the happy path is to go through Python

In all of these frameworks, Python is just glue code. All numeric code is written in faster languages.

1. 2 No worries. I get all of that. I guess my comment was more a circuitous response to your initial question: “why not use {tools optimized for data science pipelines}”, where the OP is more specifically focused on a CSV library. But I also tried to address it by pointing out that a direct comparison at the abstraction level on display in the OP is quite difficult on its own. But yeah, while I’m aware that data science frameworks have CSV parsers in them, and they are probably formidable in their own right in terms of speed, I’m also aware that they are optimized for data science. Pandas is a good example, because its API is clearly optimized for cases where all of the data fits into memory. While the API has some flexibility there, it’s clear in my mind that it won’t be, and isn’t supposed to be, a general-purpose CSV library. It may seem like a cop-out, but constraints will differ quite a bit, which typically influences the design space of the implementation.

2. 1 Dask is an interesting omission, definitely on me! It would be tricky to do though, as @burntsushi pointed out.
Dask tries to be as lazy as possible, and that can be a real challenge. OTOH, pandas’ CSV implementation is uninteresting. It is the reason I started to explore in the first place (it drives me crazy to save / load CSV in pandas!). I love pandas for other reasons. As for Spark, I simply don’t know whether it has an interesting CSV reader implementation!

1. 3 The article spends a lot of space talking about how to save / restore context and reset the stack. While that is interesting, using a proper stackless coroutine construct such as the one provided in C++20 would be much more efficient, with much less space wasted than the non-portable stackful implementation. That, I guess, is also why so many languages choose an infectious async keyword rather than the stackful implementation (with Go probably the only notable exception). What I’m really interested in from the M:N discussion would be topics related to: 1. prioritization API design; 2. structured synchronization implementation; 3. observability (since stack traces are useless at this point).

1. 4 I agree, C++20 coroutines are the elephant in the article. I kept waiting for them to be mentioned. I can imagine that implementing a complex program based on the fibers described here would require careful tuning of stack sizes. You’d need pretty exhaustive testing, or tricky static analysis, to ensure that fibers never overflowed their stacks. (Unless you have tons of RAM available and don’t care if you’re wasting stack space.) Go gets away without this because it has growable stacks, which have proven very difficult to get right — initially they used stack segments, which hurt performance if a tight loop kept falling off the end of a segment. Now they just move the stack to a larger heap block, which involves carefully relocating all pointers to stack-based data. Given all that, I get why async/await is the model that most languages seem to be converging on (C++, JS, Rust, Nim; I know I’m forgetting some others…).

1. 3 I’ve spent a little bit more time trying it out for https://github.com/liuliu/dflat. Previously, I used https://github.com/glessard/swift-atomics. It sort of works, but it has its own gotchas and is also “at your own risk”, because there was no SE-0282 to guarantee a memory model. Most importantly, with previous Swift atomics implementations (and, in a broader scope, the lightweight locks such as os_unfair_lock on iOS), you have to choose between two evils: 1. have a reference-counted object, and thus a stable memory address for either locking or atomic operations; 2. use a struct to avoid reference counting, but, unlike C++, you don’t have move / copy constructors, so you don’t know if the underlying value was moved / copied to a different location. You have to be careful.

The language itself spends a lot of effort on balancing usability, memory safety and efficiency. That is why the whole language nowadays leans very heavily on protocol + struct as the main programming paradigm (as demonstrated by SwiftUI), because it is safe (immutable by default, no object inheritance), easy to use (auto-synthesized Equatable, Hashable, etc. protocol conformances) and efficient (no reference counting). Previous implementations such as glessard/swift-atomics chose to use a struct as the base, with a giant warning of “there will be dragons!”, because for atomics you really don’t want to allocate reference-counted memory for each of them separately. At the same time, the memory address can change under you. It is a dilemma.

The new Swift Atomics library chooses a different approach. To casually use it, you can use ManagedAtomic<>, which will do the right thing, but you get a reference-counted allocation for each and every one of your “atomics”. This avoids the foot-gun situation for beginners. Seriously using the new Swift Atomics library requires more dances.
Basically, you need to first allocate the storage somewhere:

class MyOwnClassThatCanHoldTheAtomics {
    var atomic = UnsafeAtomic<Int>.Storage(0)
}

and each time you use it, everything has to be explicit:

// Use withUnsafeMutablePointer to get a pointer that stays valid until the load call finishes.
let loadedValue = withUnsafeMutablePointer(to: &atomic) {
    UnsafeAtomic<Int>(at: $0).load(ordering: .acquiring)
}

This is a lot of dances compared to a ManagedAtomic:

let atomic = ManagedAtomic<Int>(0)