What about server-side rendering/SSR? Some components can produce HTML and CSS and provide no interaction, at least as a fallback. You still need technology that will inline the component into raw HTML.
I said this in my other comment, but the way it typically works in the news industry is you make a micro-site with a traditional framework (Svelte was literally created to do this), upload it to an S3 bucket, and then inline it on an article page with an iframe and some JS to resize the iframe to the right height. The thing that makes this sustainable is that once publication is over, you essentially never touch the project again. Maybe one or two tweaks in the post-launch window, but nothing ongoing.
A different approach is some variation on “shortcodes”. There you have content saved in the database along with markers (“shortcodes”) that the CMS knows how to inflate into a component of some sort. Let’s say you have [[related-articles]] to put in an inline list of related articles, or <sidebar story="url"> for a sidebar link to a story, or {{ adbreak }}, or <gallery id="123">, or whatever. The advantage of these is that they aren’t one-and-done. You can change the layout for your related articles and sidebars, change ad vendors, and whatnot. But the downside is they tend to be pretty much tied into your CMS, and you basically can’t reuse them if you ever want to go from one CMS to another. Typically the best migration path is to just strip them out when you move to the new CMS, even if it does make some of the old articles look funky and weird.

Yeah, I considered and rejected shortcodes (or rather, Astro components in MDX files, which is the Astro equivalent) for exactly this reason. Very convenient in the short term, much more painful when migrating.
XML is just not used for the same things, and the two are not comparable.
First things first: not throwing an error when the type of an attribute is wrong when parsing a YAML schema is a mistake. A version number must be a string, not a number, so version: 1.20 shouldn’t be allowed. For example, if I remember correctly, Kubernetes is strict about data types when parsing YAML, and it works well. If <version>1.20</version> works as you would expect, it’s only because everything is a string in XML (you can use a schema to validate a string, but it is still a string).

About XML: it is a language for documents. Tags and attributes are used to provide more data around the text contained in the document. You can read a good article on XML here: XML is almost always misused (2019, Lobsters).
YAML, on the other hand, is a language for storing data structures. It has its own limitations, but also some features that are not widely known, such as anchors.
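For what it’s worth, a quick sketch with PyYAML (assuming the yaml package is available, which the thread doesn’t mention) shows both points: the unquoted 1.20 silently becomes a float, the quoted form stays a string, and anchors let you reuse a node:

    # Requires PyYAML ("pip install pyyaml") -- an assumed dependency.
    import yaml

    doc = """
    unquoted: 1.20     # parsed as a number, silently becoming the float 1.2
    quoted: "1.20"     # stays a string, which is what a version number needs
    defaults: &base    # anchor: give this mapping a name...
      retries: 3
      timeout: 30
    service:
      <<: *base        # ...and merge it here instead of repeating it
      timeout: 60
    """

    data = yaml.safe_load(doc)
    print(type(data["unquoted"]), data["unquoted"])  # <class 'float'> 1.2
    print(type(data["quoted"]), data["quoted"])      # <class 'str'> 1.20
    print(data["service"])                           # {'retries': 3, 'timeout': 60}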
I would keep most things as-is; I don’t see the point of a full renaming, it just needs some clarity added on the homepage and in the documentation. You keep the Oil Shell project as the umbrella that groups everything together (I think “Oils for UNIX” drifts away from what the project is about in the first place).
${PREFIX}/bin/oil is the main Oil Shell binary, which runs the Oil Shell with the Oil Language.

${PREFIX}/bin/osh is the compatibility binary, which runs the Oil Shell with the Oil Compatibility Language, a POSIX implementation with some additions (such as aiming for Bash compatibility).

I would stop using the uppercase name OSH, since it is not clear what it refers to. OSH the language would effectively be renamed to “Oil Compatibility Language”. Referring to the binary would use the lowercase form osh, preferably explicitly as “the osh binary”. Since “Oil Compatibility Language” is long to write out, you could also use “the osh language” as a shorthand, as long as you stay explicit about whether you mean the language or the binary.
As for why the Oil Compatibility Language’s binary should be named “osh”: it is closer to the names of commonly used shells, which all use the “sh” suffix, while “oil” is a new thing and doesn’t even use this archaic suffix.
PS: If the Oil Shell project starts aiming to reimplement a larger part of what a POSIX operating system is, using the name Oils for UNIX would start being more relevant.

I want to get away from the “compatibility” framing with regard to shell, see this comment: https://lobste.rs/s/plmk9r/new_names_for_oil_project_oil_shell#c_tolgcm

Also bin/oil and bin/osh are symlinks – we would still need a name for the binary they point to. (oils-for-unix is my suggestion. In the old Python tarball it’s actually oil.ovm, though nobody really knows that.)

I thought this was going to talk about the different ways to send HTML or plaintext emails.
What is “Oui, il est le git!” supposed to mean? I’m French and it irritates me 😂

Nothing. And judging from this thread, being French isn’t a requirement to be irritated. :^)
Too bad it isn’t free software though.
Which I think is a much more useful case than the original one he blogged about.
Except you don’t need virtual columns to do it; they’re just syntactic sugar. I’ve been doing it for years by putting the json_extract call in the CREATE INDEX statement.
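For reference, a minimal sketch of that approach using Python’s built-in sqlite3 module, assuming an SQLite build with the JSON1 functions (the default in recent versions); the table and column names are made up for illustration:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    con.execute("""INSERT INTO docs (body) VALUES ('{"user": {"name": "alice"}}')""")

    # Index the JSON expression directly -- no virtual column needed.
    con.execute("CREATE INDEX docs_user_name ON docs (json_extract(body, '$.user.name'))")

    # A query using the same expression can use the index.
    plan = con.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM docs "
        "WHERE json_extract(body, '$.user.name') = 'alice'"
    ).fetchall()
    print(plan)  # the plan should mention USING INDEX docs_user_name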
I was going to ask this, I didn’t understand the point of creating virtual columns.
I think being explicit is better in general. Except maybe if you use virtual columns for compatibility reasons.
Yes, but virtual columns seem like a “syntactic sugar” feature altogether. They just wrap some complexity that you can, of course, implement somewhere else.
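And the sugared version for comparison, a sketch assuming SQLite 3.31+ (generated columns) and the same made-up docs table as in the earlier sketch:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")

    # A VIRTUAL generated column stores nothing; it just names the expression
    # so queries and indexes can refer to it.
    con.execute(
        "ALTER TABLE docs ADD COLUMN user_name TEXT "
        "GENERATED ALWAYS AS (json_extract(body, '$.user.name')) VIRTUAL"
    )
    con.execute("CREATE INDEX docs_user_name ON docs (user_name)")

    con.execute("""INSERT INTO docs (body) VALUES ('{"user": {"name": "alice"}}')""")
    print(con.execute("SELECT id FROM docs WHERE user_name = 'alice'").fetchall())  # [(1,)]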
Agreed. Just thought I’d comment since the article could be misread as saying that indexing JSON properties requires virtual columns.
I definitely misunderstood this. Thank you for clarifying!
I wonder if you could support mutability somehow?
I’m partly imagining torrent websites hosted on bittorrent (because it’s kinda meta) but could be generally useful/interesting perhaps.
There’s a bittorrent protocol extension where you can distribute a public key that points to a mutable torrent, but I don’t know if it has been ported to webtorrent.
The reference implementation for BEP0046 is done with webtorrent; I don’t know if or how it works in the browser, though.
As far as I understand, you can’t use DHT in a web browser, as nodes do not support WebRTC. The Webtorrent project includes a DHT library that works with Node.js (which is used by the desktop application).
Core libraries like glibc (since 2.1) and libstdc++ (since GCC 5) are intending to remain backwards compatible indefinitely.

If you need to distribute a binary built against glibc, you have to build it on a very old distribution so that it runs on whatever distributions your users use (which means you ship less secure binaries, e.g. because of old compiler bugs, newer libraries that won’t compile there, or missing hardening like stack protection). This is because some function symbols carry a version number that may not be available in the older glibc some users have. That is not what I would call backward compatible.
And if you think about musl, then it’s a whole separate world: mixing libraries built for glibc with libraries built for musl will break.
GUI apps built for Windows 95 still work out of the box on Windows 10.

I think the author confuses backward compatibility with forward compatibility. Backward compatibility would mean that apps built for Windows 10 would still work on Windows 95.
Your use is also at odds with how “backward compatibility” is used with, e.g., game consoles.
I got this wrong.
A binary compiled against an earlier version of glibc is forward compatible with more recent versions of glibc. A binary compiled against a recent version of glibc is not backward compatible with earlier versions (but still forward compatible with newer versions).
But glibc itself, by supporting the symbols of the past, is backward compatible. glibc is partially forward compatible, for the symbols that exist presently, so that newer versions are backward compatible. This is the same for operating systems that can run old binaries.
Too bad it is implemented in C++ rather than C, which makes it harder to implement it in C projects that support Lua already.
I wonder where the boundary of the GPL’s definition of a covered work falls here.
Has anyone considered trying an NTFS root filesystem yet? It might be an… interesting alternative to partitioning for dual boots.
I’m fairly certain it’s not possible due to different features between the filesystems - in particular no suid means sudo won’t work. I’m also not sure mapping to different users on Linux works properly, though I haven’t checked in a while.
That can probably be worked around with creative use of extended attributes, if someone really wants to do it.
I’m pretty sure NTFS has something for setuid, since Interix supported it.
NTFS is a lot like BeFS: the folks talking to the filesystem team didn’t provide a good set of requirements early on and so they ended up with something incredibly general. NTFS, like BeFS, is basically a key-value store, with two ways of storing values. Large values can (as with BeFS) be stored in disk blocks, small values are stored in a reserved region that looks a little bit like a FAT filesystem (BeFS stores them in the inode structure for the file).
Everything is layered on top of this. Compression, encryption, and even simple things like directories, are built on top of the same low-level abstraction. This means that you can take a filesystem with encryption enabled and mount it with an old version of NT and it just won’t be able to read some things.
This is also the big problem for anything claiming to ‘support NTFS’. It’s fairly easy to support reading and writing key-value pairs from an NTFS filesystem but full support means understanding what all of the keys mean and what needs updating for each operation. It’s fairly easy to define a key-value pair that means setuid, but if you’re dual booting and Windows is also using the filesystem then you may need to be careful to not accidentally lose that metadata.
I also don’t know how the NTFS driver handles file ownership and permissions. In a typical *NIX filesystem, you have a small integer UID combined with a two-byte bitmap of permissions. You may also have ACLs, but they’re optional. In contrast, NTFS exposes owners as UUIDs (much larger than a uid that any *NIX program understands) and has only ACLs (which are not expressed with the same verbs as NFSv4 or POSIX ACLs), so you need some translation layer and need to be careful that this doesn’t introduce incompatibilities with the Windows system.
You’re probably better off creating a loopback-mounted ext4 filesystem as a file in NTFS and just mounting the Windows home directory, if you want to dual boot and avoid repartitioning.
Note that WSL1 uses NTFS and provides Linux-compatible semantics via a filter driver. If someone wants to reverse engineer how those are stored (wlspath gives the place they live in the UNC filesystem hierarchy) then you could probably have a Linux root FS that uses the same representation as WSL and also uses the same place in the UNC namespace so that Windows tools know that they’re special.

What is used by WSL 2?
WSL2 is almost totally unrelated to WSL, it’s a Linux VM running on Hyper-V (I really wish they’d given WSL2 a different name). Its root FS is an ext4 block device (which is backed by a file on the NTFS file system). Shared folders are exported as 9p-over-VMBus from the host.
This is why the performance characteristics of WSL and WSL2 are almost exactly inverted. WSL1 has slow root FS access because it’s an NTFS filesystem with an extra filter driver adding POSIX semantics but the perf accessing the Windows FS is the same because it’s just another place in the NTFS filesystem namespace. WSL2 has fast access to the root FS because it’s a native Linux FS and the Linux VM layer is caching locally, but has much slower access to the host FS because it gets all of the overhead of NTFS, plus the overhead of serialising to an in-memory 9p transport, plus all of the overhead of the Linux VFS layer on top.
Hopefully at some point WSL will move to doing VirtIO over VMBus instead of 9p. The 9p filesystem semantics are not quite POSIX and the NTFS semantics are not 9p or POSIX, so you have two layers of impedance mismatch. With VirtIO over VMBus, the host could use the WSL interface to the NTFS filesystem and directly forward operations over a protocol that uses POSIX semantics.
There are some fun corner cases in the WSL filesystem view. For example, if you enable developer mode then ln -s in WSL will create an NTFS symbolic link. If you disable developer mode then unprivileged users aren’t allowed to create symbolic links (I have no idea why) and so WSL creates an NTFS junction. Nothing on the system other than WSL knows what to do with a junction that refers to a file (the rest of Windows will only ever create junctions that refer to directories) and so will report the symlink as a corrupted junction. This is actually a pretty good example of the split between being able to store key-value pairs and knowing what they mean in NTFS: both WSL and other Windows tools use the same key to identify a junction but WSL puts a value in that nothing else understands.

Actual Linux filesystems. Because it’s just a Linux kernel, in Hyper-V, with dipping mustards.
Why not go the other way around and boot Windows off of btrfs? :D
But really, why not, you have backups… right? :P
Since the projects I use at work rely on docker-compose, I can’t make the switch. Unfortunately, podman-compose doesn’t support the full syntax and feature set. It is possible to rewrite a compose file as a shell script, but that is not easy to maintain. I have used that alternative for deploying and updating a container on a server, though.

A big advantage of podman on Linux servers is that, unlike Docker, it doesn’t bypass netfilter, which is a pain to get right.
In the case of the ZorinOS Reddit thread, does the GPL cover the whole ISO distribution, or only the individual pieces of software that are part of a larger operating system distribution? I’d say the ZorinOS branding is not GPL-licensed, but I’m not sure (IANAL). Microsoft distributes software coming from various operating system distributions with WSL, but their whole operating system is not yet GPL-ed.
An IP-based rate limit might not be perfect if you only check per IP and not per block, especially with IPv6, where everyone usually gets a /48 or a /64.

A /48 means you have 2^(128-48) possible IP addresses to use, or 1208925819614629174706176.

And Lobsters is accessible over IPv6.
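A sketch of what checking per block rather than per address could look like with Python’s ipaddress module; the /64 granularity and the limit are arbitrary choices for illustration:

    import ipaddress
    from collections import Counter

    # 2^(128-48) addresses in a /48, as stated above:
    print(2 ** 80)  # 1208925819614629174706176

    LIMIT = 100       # requests allowed per prefix per window (arbitrary)
    hits = Counter()

    def rate_limit_key(addr):
        """Collapse an address to the prefix we count against:
        a /64 for IPv6 (roughly one subscriber), the single address for IPv4."""
        ip = ipaddress.ip_address(addr)
        if ip.version == 6:
            return str(ipaddress.ip_network(f"{ip}/64", strict=False))
        return str(ip)

    def allow(addr):
        key = rate_limit_key(addr)
        hits[key] += 1
        return hits[key] <= LIMIT

    # Two addresses from the same /64 share one counter:
    print(rate_limit_key("2001:db8::1") == rate_limit_key("2001:db8::2"))  # True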
Wanted to try it but it fails to build, too bad. At least I got links to similar projects in other comments (thanks for spot-client, a minimal client that works nicely).
For a huge number of cases (dense, two-dimensional, tabular data) CSV is just fine, thank you. Metadata comes in a sidecar file if needed. This file can be read by your granddad, and will be readable by your grandkid.
Lots of programs can process gzipped CSV files directly, taking care of the OMG this dataset is 10 GB problem. You can open up a CSV in an editor and edit it. You can load it into any spreadsheet program and graph it. You can load it into any programming REPL and process it.
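For example, Python’s standard library alone streams a gzipped CSV row by row without unpacking it to disk (data.csv.gz and the amount column are made-up names):

    import csv
    import gzip

    total = 0.0
    # Stream the compressed file row by row; nothing is written to disk uncompressed.
    with gzip.open("data.csv.gz", mode="rt", newline="") as fh:
        for row in csv.DictReader(fh):
            total += float(row["amount"])  # "amount" is a hypothetical column name
    print(total)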
CSV is problematic for unstructured data, often known in the vernacular as raw, uncleaned data. Usually this data comes in messy, often proprietary, often binary formats.
I was also disappointed to see that the article had no actual point. No alternative is proposed and no insight is given.
They do suggest alternatives: Avro, Parquet, Arrow, and similar formats. Yes, that throws away editing with a simple text editor. But I believe the author was more concerned with people who import and export with Excel. Those people will always load the CSV into Excel to edit anyway, so plain text isn’t a key feature.
If the author’s dream comes true and spreadsheets support a richer binary format, that won’t change.
Yes. And the first step: import csv or the equivalent in your language of choice. How is that any different from import other_format?

I feel this argument is equivalent to “programmers are smart and therefore always do the right thing.” But that’s simply untrue. I’ve gotten malformed CSV-format database dumps from smart people with computer science degrees and years of industry experience. Programmers make mistakes all the time, by accident or from plain ignorance. We have type checkers and linters and test suites to deal with human fallibility, why not use data formats that do the same?
CSV is like C. Yes, it has gotten us far, but that doesn’t mean there’s nothing better to aspire to. Are Rust, Go, and Zig pointless endeavors because C is universally supported everywhere already? Of course not.
It also throws away using command-line tools for munging, client-side JS validation via regex or chunking, or all kinds of other things.
Author needs to sell a replacement for CSV for business reasons, but that doesn’t make CSV bad.
FWIW traditional unix command line tools like awk, cut and sed are terrible at all of the above with CSV because they do not understand the quoting mechanism.
I would vastly prefer to be using jq or (if I have to cope with xml) xmlstarlet most of the time.
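To make the quoting problem concrete: a field with an embedded comma is one value to a CSV parser but two to anything that blindly splits on commas, which is roughly what awk -F, or cut -d, do. A small Python illustration:

    import csv
    import io

    line = '1,"Smith, Jane",42\n'

    # Naive splitting (roughly what cut -d, or awk -F, does) breaks the quoted field:
    print(line.strip().split(","))              # ['1', '"Smith', ' Jane"', '42']

    # A real CSV parser keeps it intact:
    print(next(csv.reader(io.StringIO(line))))  # ['1', 'Smith, Jane', '42']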
We’ve got ArrayBuffer and DataView now, we can write JS parsers for complicated file formats. ;)
It’s worse than that. They’re all now locale-aware. Set your locale to French (for example) and now your decimal separator is a comma. Anything using printf / scanf for floats will treat commas as decimal separators and so will combine pairs of adjacent numeric fields into a single value or emit field separators in the middle of numbers.

For personal stuff where I want a tabular format that I can edit in a text editor, I always use TSV instead of CSV. Tabs are not a decimal or thousands separator in any language and they don’t generally show up in the middle of text labels either (they’re also almost impossible to type in things like Excel, that just move to the next cell if you press the tab key, so they don’t show up in the data by accident). All of the standard UNIX tools work really well on them and so do some less-standard ones like ministat.

Tangentially, this reminds me of how incredibly much I hate Microsoft Excel’s CSV parser.
Those tools are terrible, but are totally sufficient for a large chunk of CSV use cases. Like, yes, you’ll get unclean data sometimes, but in a lot of cases it’s no big deal.
re: JS parsing…I’ve done all of those things. I still appreciate the simplicity of CSV where even an intern can bodge something reasonable together for most use cases.
Like, this is all super far into the territory of Worse is Better.
I’d much rather be code reviewing the intern’s janky jq filter, or janky for loop in JavaScript, than their janky awk script. :)
Haha, for sure.
This is the reason an industry app I work on needs to support XLSX: Office usage. We support CSV, XLSX, and Parquet formats in different parts of the app, depending on how the data is uploaded.
My understanding is that this article is a marketing blog post for the services they provide. Though I mostly agree with them that CSV should be replaced with better tools (and many people are already working on it).
Better at what though? CSV is not a “tool”- it’s a universally understood format.
You had better have a fantastically good reason for breaking a universally understood format.
That it is “old” or that it doesn’t work in a very small number of use-cases is not sufficient reason to fragment the effort.
The problem to me is not that it is old. The problem is the exchange and interoperability of large datasets. In particular, how do you stream updates to a CSV from a diff/delta?
If you’re slinging data back and forth within your own organisation (where you control both endpoints), CSV is indeed sub-optimal.
If you’re exchanging data between heterogeneous systems, CSV is the local minimum. You can wish for a better format all you want, but if just one of the systems is gonna emit/accept CSV only, that’s what you’re gonna have to use.
Totally agree. I’ve contributed to a project that works with 100 million line CSV files (OpenAddresses). It works absolutely great for that. Honestly my only complaints with standard CSV are the dialect problems and the row oriented organization.
The dialect problems are real but in practice not a big deal. Common tools like Python’s CSV module or Pandas’ CSV importer can autodetect and generally consume every possible quoting convention. In an ideal world more people would use tab as the separator, or even better 0x1e, the ASCII record separator (which no one uses). But we make commas work.
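For instance, Python’s csv.Sniffer guesses the delimiter and quoting from a sample, which usually copes with the dialect zoo (a sketch; Sniffer can still guess wrong on pathological files):

    import csv
    import io

    sample = 'a;b;c\n1;"x;y";3\n'   # semicolon-delimited, as European Excel exports often are

    dialect = csv.Sniffer().sniff(sample)
    print(dialect.delimiter)                      # ;

    rows = list(csv.reader(io.StringIO(sample), dialect))
    print(rows)                                   # [['a', 'b', 'c'], ['1', 'x;y', '3']]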
The row orientation is a nerdy thing to quibble about, but CSV doesn’t compress as efficiently as it could. Rezipping the data so it’s stored in column order often results in much better compression. But that’s usually an unnatural way to produce or consume the data so fine, rows it is.
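A toy measurement of that claim using only the standard library; the exact numbers depend entirely on the data, but repetitive columns usually gzip better when grouped together:

    import csv
    import gzip
    import io
    import random

    random.seed(0)
    rows = [
        ("2021-06-%02d" % random.randint(1, 30), random.choice("ABC"), random.randint(0, 9))
        for _ in range(10_000)
    ]

    def to_csv(table):
        buf = io.StringIO()
        csv.writer(buf).writerows(table)
        return buf.getvalue().encode()

    row_major = to_csv(rows)
    col_major = to_csv(zip(*rows))  # same values, transposed so each column is contiguous

    print(len(gzip.compress(row_major)), len(gzip.compress(col_major)))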
It seems like the author is angry he had to go through other people’s data dumps and clean them up, even though that’s basically the service he is trying to sell. And he makes bad arguments, like “this data format cannot survive being opened in an editor and manually edited incorrectly”.
tbh this is just a “hey, we need to drum up more sales, can you write an article about something?” type of low-content marketing.
What’s even dumber is when you see all these websites that check that the browser is Chrome/Chromium to enable a feature, whether or not the browser can actually support it. How did this happen? There is everything needed in CSS and JS to check if a feature is supported nowadays.
There are ways to check whether or not JS supports a feature, and the same is true for CSS (see @supports), but the problem is that some browsers lie, with (Mobile) Safari being a prominent example, unfortunately.

Rather than publishing ports on any address using Docker (or the same with podman), I publish to localhost or to a private subnet address (e.g. a virtual network for a virtual machine on a dedicated server, or usually a WireGuard tunnel), then I use a reverse proxy (e.g. nginx or HAProxy) on the front server (if different). This way I’m certain of what to whitelist and rate limit in my stateful firewall setup.
In the case of a database, it doesn’t need to be accessible to the Internet, so I can just bind into a WireGuard tunnel (though I have not yet looked into WireGuard failover which is important for replicated databases).
I like how systemd brings all these features, but I don’t like how this makes things non-portable to other operating systems, as systemd only supports Linux. I know that not all operating systems support all the underlying features systemd needs, but I believe it is a shame to be Linux-centric.

I am not a user of non-Linux-based operating systems myself, but I prefer having common standards.
Personally, I’m completely fine that Systemd-the-init-system is Linux-only. It’s essentially built around cgroups, and I can imagine reimplementing everything cgroups-like on top of whatever FreeBSD offers would be extremely challenging if at all possible. FreeBSD can build its own init system.
…However, I would prefer if systemd didn’t work to get other software to depend on systemd. It definitely sucks that systemd has moved most desktop environments from being truly cross platform to being Linux-only with a hack to make them run on the BSDs. That’s not an issue with the init system being Linux-only though, it’s an issue with the scope and political power of the systemd project.
The issue is that it’s expensive to maintain things like login managers and device notification subsystems, so if the systemds of the world are doing it for free, that’s a huge argument to take advantage of it. No political power involved.
With political power I just meant that Red Hat and Poettering have a lot of leverage. If I, for example, made a login manager that’s just as high quality as logind, I can’t imagine GNOME would switch to supporting my login manager, especially as the only login manager option. (I suppose we’ll get to test that hypothesis though by seeing whether GNOME will ever adopt seatd/libseat as an option.)
It’s great that systemd is providing a good login manager for free, but I can’t shake the feeling that, maybe, it would be possible to provide an equally high quality login daemon without a dependency on a particular Linux-only init system.
I don’t think the “political power” (call it leverage if you disagree with that term) of the systemd project is inherently an issue, but it becomes an issue when projects add a hard dependency on systemd tools which depend on the systemd init system where OS-agnostic alternatives exist and are possible.
Everybody loves code that hasn’t been written yet. I think we need to learn to look realistically at what we have now (for free, btw) instead of insisting on the perfect, platform-agnostic software. https://lobste.rs/s/xxyjxl/avoiding_complexity_with_systemd#c_xviza7
Systemd is built on Linux’s capabilities, so this is really a question of–should people not try to take advantage of platform-specific capabilities? Should they always stay stuck on the lowest-common denominator? This attitude reminds me of people who insist on treating powerful relational databases like dumb key-value stores in the name of portability.
I believe the BSDs can do many of the things listed in the article, but also in their very own ways. A cross-platform system manager would be some sort of a miracle, I believe.
The big difference is that systemd (as well as runit, s6, etc.) stay attached to the process, whereas the BSD systems (and OpenRC, traditional Linux init scripts) expect the program to “daemonize”.
Aside from whatever problems systemd may or may not have, I feel this model is vastly superior in pretty much every single way. It simplifies almost everything, especially for application authors, but also for the init implementation and system as a whole.
daemontools already ran on many different platforms around 2001. I believe many of its spiritual successors do too.
It’s not that hard; like many programs it’s essentially a glorified for loop:
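A minimal cross-platform sketch of that loop in Python, with a made-up convention of one command file per service in a services/ directory, just to show how small the core is:

    import pathlib
    import shlex
    import subprocess

    def get_services(root="services"):
        """One file per service, each containing the command line to run (a made-up convention)."""
        return sorted(pathlib.Path(root).glob("*.service"))

    def start_process(service_file):
        cmd = shlex.split(service_file.read_text())
        return subprocess.Popen(cmd)  # stay attached to the child; no daemonizing

    processes = [start_process(s) for s in get_services()]
    for proc in processes:
        proc.wait()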
Of course, it’s much more involved with restarts, logging, etc. etc. but you can write a very simple cross-platform proof-of-concept service manager in a day.
Yes and no. Socket activation can be done with inetd(8), and on OpenBSD you can at least limit what filesystem paths are available with unveil(2), although that requires system-specific changes to your code. As far as dynamic users, I don’t think there’s a solution for that.

Edit: Also, there’s no real substitute for LoadCredentials, other than using privilege dropping and unveil(2). I guess you could use relayd(8) to do TLS termination and hand off to inetd(8). If you’re doing strictly HTTP, you could probably use a combo of httpd(8) and slowcgi(8) to accomplish something similar.

Then I’m imagining a modular system with different features that can be plugged together, with specifications and different implementations depending on the OS. Somehow a way to go back to having a single piece of software for each feature, but at another level. The issue is how you write these specifications while keeping them implementable on any operating system where they make sense.
Hell, a Docker API implementation for BSD would be a miracle. The last FreeBSD Docker attempt was ancient and has fallen way out of date. Having a daemon that could take OCI containers and run them with ZFS layers in a BSD jail with BSD virtual networks would be a huge advantage for BSD in production environments.
There is an exciting project for an OCI-compatible runtime for FreeBSD: https://github.com/samuelkarp/runj. containerd has burgeoning FreeBSD support as well.
But, are FreeBSD rc.d scripts usable verbatim on, say, OpenBSD or SMF?
SMF is a lot more like systemd than the others.
In fact, aside from the XML, I’d say SMF is the kind of footprint I’d prefer systemd to have: it points to (and reads from) log files instead of subsuming that functionality, handles socket activation, supervises processes/services, and drops privileges. (It can even run zones/jails/containers.)
But to answer the question: yes any of the scripts can be used essentially* verbatim on any other platform.
(There might be differences in pathing, FreeBSD installs everything to /usr/local by default)
I wish SMF was more portable. I actually like it a lot.
Absolutely not. Even though they’re just shell scripts, there are a ton of different concerns that make them non-portable.
I’m gonna ignore the typical non-portable problems with shell scripts (depending on system utils that function differently on different systems (yes, even within the BSDs), different shells) and just focus on the biggest problem: both are written depending on their own shell libraries.
If we look at a typical OpenBSD rc.d script, you’ll notice that all the heavy lifting is done by /etc/rc.d/rc.subr. FreeBSD has an /etc/rc.subr that fulfills the same purpose. These have incredibly different interfaces for configuration; you can just take a look at the manpages: OpenBSD rc.subr(8), FreeBSD rc.subr(8). I don’t have personal experience here, but NetBSD appears to have a differing rc.subr(8) as well.
It’s also important to note that trying to wholesale port rc.subr(8) into your init script to make it compatible across platforms will be quite the task, since they’re written for different shells (OpenBSD ksh vs whatever /bin/sh is on FreeBSD). Moreover, the rc.subr(8) implementations use OS-specific features, so porting them wholesale will definitely not work (just eyeballing the OpenBSD /etc/rc.d/rc.subr, I see getcap(1) and some invocations of route(8) that only work on OpenBSD; FreeBSD’s /etc/rc.subr uses some FreeBSD-specific sysctl(8) MIBs.)

If you’re writing an rc script for a BSD, it’s best to just write it from scratch for each OS, since the respective rc.subr(8) framework gives you a lot of tools to make this easy.

This is notably way better than how I remember the situation on sysvinit Linux, since IIRC there weren’t such complete helper libraries, and writing such a script could take a lot of time and be very error-prone.
Yeah, exactly. The rc scripts aren’t actually portable, so why do people (even in this very thread) expect the systemd scripts (which FWIW are easier to parse programmatically, see halting theory) to be?
Also, thank you for the detailed reply.
I’m completely in agreement with you. I want rc scripts/unit files/SMF manifests to take advantage of system-specific features. It’s nice that an rc script in OpenBSD can allow me to take advantage of having different rtables or that it’s jail-aware in FreeBSD.
I think there are unfortunate parts of this, since I think it’d be non-trivial to adapt the program provided in this example to socket activation in inetd(8) (tbh, maybe I should try when I get a chance). What would be nice is if there were a consistent set of expectations for daemons about socket-activation behavior/features, so it’d be easier to write portable programs and then ship system-specific configs for the various management tools (systemd, SMF, rc/inetd). I wouldn’t be surprised if that ship has sailed, though.

I don’t see why not? They’re just POSIX sh scripts.