Because there’s more to OS development than failed experiments at 90s Bell Labs. It solves actual problems (e.g. plugins in the same address space, updating dependencies without relinking everything, sharing code between multiple address spaces). Now, there are a lot of issues in existing implementations that can be learned from (e.g. the lack of ELF symbol namespacing); I don’t know if Redox will simply slavishly clone dynamic linking as it exists in your typical Linux system, or iterate on the ideas.
I don’t think plugins in the same address space are a good idea in a low-level language. In particular I think PAM and NSS would have been better as daemons not plugin systems. It’s better to do plugins with a higher-level language that supports sandboxing and has libraries that can be tamed.
Sharing code between address spaces is a mixed blessing. It adds a lot of overhead from indirection via hidden global mutable state. ASLR makes the overhead worse, because everything has to be relinked on every exec.
I don’t think plugins in the same address space are a good idea in a low-level language. In particular I think PAM and NSS would have been better as daemons not plugin systems. It’s better to do plugins with a higher-level language that supports sandboxing and has libraries that can be tamed.
Right, conflating libraries and plugins in dynamic linking was a mistake - especially since unloading libraries is basically impossible. Maybe there’s research into that though?
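To make that concrete, here’s a minimal sketch (mine, using the `libloading` crate; the library path and the `plugin_entry` symbol are made up) of what loading a plugin into the host’s address space looks like, and why unloading it again is so fraught:

```rust
use libloading::{Library, Symbol};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Loading runs the plugin's initializers inside *our* address space: a crash
    // or memory-safety bug in the plugin is a crash or memory-safety bug in the host.
    let lib = unsafe { Library::new("./libplugin.so")? };

    // Look up an entry point by name. The signature is unchecked; we just assert it.
    let entry: Symbol<unsafe extern "C" fn() -> i32> =
        unsafe { lib.get(b"plugin_entry\0")? };
    let status = unsafe { entry() };
    println!("plugin returned {status}");

    // Dropping `lib` calls dlclose(), but any threads, callbacks, or TLS destructors
    // the plugin registered may still point into the unmapped code, which is why
    // unloading is "basically impossible" to do safely in the general case.
    Ok(())
}
```

A daemon behind an IPC boundary avoids all of that, at the cost of serialization and a process switch.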
It’s unfortunate Plan 9 is a thought-terminating object. There’s a lot of room in the osdev space, and unfortunately Plan 9 sucks all the oxygen out of the room when it gets mentioned, especially when it’s the “use more violence” approach to Unix’s problems.
It looks like the major thing this does is create some enterprise support for flakes (in their current form).
There’s still a lot of fertile ground for improving flakes, and I bet that upstream Nix will keep that up, but for commercial use having people that can be paid money to guarantee behavior is pretty helpful. Overall, another step forward in getting Nix more palatable to industry. Good work.
Another tired dream-fantasy from another tedious True Believer who has never stopped to question how one goes from “spicy autocomplete” to “deus ex machina”.
LLMs do not think and do not reason, but they do squander vast amounts of energy and resources to do an extremely poor imitation of thinking and reasoning by remixing human-written text that contains thinking and reasoning. Sadly, many people are unable to discern this vital and essential difference, just as they can’t tell bot-generated images from reality.
Very elaborate prompting can embed human-created algorithms, in effect making LLM bots into vastly inefficient interpreters: multiple orders of magnitude less efficient even than the bloated messes of 202x code such as interpreters in Wasm running inside 4 or 5 nested levels of virtualisation.
The fundamentalist believers who have failed to grasp the point in paragraph 2 then use paragraph 3 to convince themselves of transubstantiation: that if you ignore the embedded code in a Sufficiently Complex prompt, then the result is magic.
You can’t lift yourself by your own bootstraps, contrary to Baron Münchhausen. You can’t bootstrap intelligence by remixing text that contains the result of intelligence: you can merely fake it. No matter how good the fake, it never magically becomes real. There is an implicit leap in here which the fundies try to paper over.
Even the best fake gold is not gold. Yes you can make gold, but transmutation is very very hard and the real thing remains cheaper, however precious the result.
The emperor might not have clothes, but the knock-on effects of invisible thread speculators in the marketplace cannot be ignored.
How much of this submission did you read, out of curiosity? Your quip of “never stopped to question” makes me wonder, given that the whole purpose of the series is questioning and speculation.
If there are specific bits and predictions it makes please quote them and question them here instead of giving the exact lame dismissal I warned about when I submitted the story.
I confess that I’ve not yet finished part 2 or even started part 3, but they seem to be getting progressively harder work for me as they descend further into fantasy.
I have spelled out my objection clearly and distinctly.
The entire article is built on a single assumption which is unquestioned: that LLM bots are intelligent, and that this intelligence is steadily increasing, and it’s only a matter of time until they equal and then exceed human intelligence.
I feel strongly that this assumption is false.
They aren’t, it isn’t, and they won’t.
As I said: they can’t think and they can’t reason. They can’t even count. There is no pathway from very clever text prediction (based on arbitrarily vast models, no matter how big) to reasoning.
The effect of speculation? A financial bubble, followed by a collapse, which will destroy a lot of people’s careers and livelihoods, and squander a vast amount of resources.
LLM is not AI and there is no route from LLMs to AGI. But the tech bros don’t care and some will get rich failing to make anything useful.
This is not a victimless crime. All humanity and the whole world are victims.
There is zero questioning of the central dogma. The miracle machines just keep miraculously getting better and better.
There is no consideration of resource limitations, of efficiency, of cost, of input corpus poisoning, of negative feedback loops, of the environmental degradation this would cause, nothing.
All there is is more and better miracle machines, and given the continual cost-free miracles, which have no impediment, no side effects, no cost, all that’s left is what it might do to people… which is mostly positive until it’s too late.
The “and then a miracle occurs” step is never examined. When it’s convenient more miracles occur. Nobody seems to mind much.
There is nothing interesting here, to me or for me.
In essence it’s masturbatory fantasy, ad absurdum.
The emperor never did have any clothes, but he magically never catches cold, never gets sick, and the invisible clothes just keep getting better and better.
Firstly - we start off by listing several ways that AI could be used for various creative pursuits, focused mainly on analysis. Yet, beyond the anecdote of “feeding AI samples of my own work, I can generate style guides that highlight my quirks and recurring themes”, this isn’t pursued any further. I’m certainly curious about this, but the author doesn’t go into any further detail about how to actually use tooling for this.
Unfortunately, after that first section, I land in this regrettably common trope of “person A (who likes AI) tries to convince person B (who doesn’t like AI) that AI is useful, completely missing the issues that person B has with AI”. Personally, my issues are all about the ethics of training models, which the author completely glosses over:
“I have a well-founded and well-tested belief that the training for these models isn’t stealing in a legal sense”
So it’s stealing in a non-legal sense, then? Or what?
It seems obvious to me that the current crop of generative AI is only as impressive as it is because of the breadth and quality of the training set, which in large part (if not totality) comes from uncredited and uncompensated labour from the very artists who the companies selling access to the model are attempting to undercut and obsolete – and the undercutting is already happening.
Call me “the chorus” all you want. Yeah, “the people pushing this stuff early on were assholes”, but the people pushing it now are assholes too.
It seems obvious to me that the current crop of generative AI is only as impressive as it is because of the breadth and quality of the training set, which in large part (if not totality) comes from uncredited and uncompensated labour from the very artists who the companies selling access to the model are attempting to undercut and obsolete – and the undercutting is already happening.
The thing that I’ll point out is that it’s incredibly unlikely that those creators are going to be any better compensated under some other operating regime: the alternative to a stupid anime diffusion for a blog post isn’t that I’m going to commission a piece, and the alternative to a silly little anthropomorphic dhole on a slide deck is usually just gonna be “okay fuck it, stock clipart, whatevs”. Similarly, the people who want to support an artist will both find that art and fund that artist. That isn’t changing.
The thing I think should be more widely understood, using the music industry’s attacks on post-scarcity as a guide, is that there will be a few companies stepping forward to agitate for copyright enforcement, and then coincidentally selling access to those datasets with some kind of “reasonable split” with the authors of the content. You need only look at how well Spotify and similar schemes have worked out for the independent and small musicians, whose main purpose is to provide a veneer of legitimizing outrage.
In my opinion, we’re better off normalizing the use of generative AI, making it a valid option that isn’t just to the benefit of the large companies with legal departments who can exercise regulatory capture, and setting a societal expectation that slop is low art but you can totally pay somebody to make good art instead! Remixes are okay! Sampling is okay! Transformative work and curation is okay!
The alternative, of course, is the ghoulish creatorwashing by distributors and publishers to pretend that they’re sticking up for the little guy while eating all of the licensing fees for these models. It’s so painfully obvious a ploy that I’m genuinely baffled people keep advocating on their behalf.
We don’t really have cautionary tale as a tag, so I did what I could.
I’m posting this because I think the extrapolation (and thank God it’s just extrapolation and that means it can be wrong!) is a good thought experiment, and though the stuff a decade or two out is floppier, the predictions and analyses for this decade seem prescient.
For discussion purposes, I’d suggest posting only to either identify, agree, or disagree with the presented predictions and their underlying assumptions–otherwise, we’re probably just gonna get a bunch of uninformed and poorly articulated “I hate AI”, “sama is a psychopath”, “LLMs can’t code”, “xAI is bad because Musk is bad”, etc.
It was a good read. Definitely some plausible scenarios in there. Yet, I wish the author would drop the USA-tinted glasses (the “machinations” of the CCP, the Chinese “surveillance state” and Xi Thought Tutors, with no mention of the US use of genAI as propaganda and censorship in social media and, by extension, elections). With the current trend of twitter becoming C-SPAN (and social media brokering all media thanks to generative sugar-coating, thereby completing the transformation), the only real difference between the two superpowers will be the colors on the flag.
While I disagree on your last assertion (for historical reasons, not taking a side), I think you’re completely correct that the story does have a lot of the reflexive (and lazy!) Sinocriticism common these days.
So, look. Every single internet-connected thing that involves anything that could even be considered user-generated content, and has lawyers, sooner or later inserts a clause into its terms saying you grant them a royalty-free, non-exclusive, non-revocable (etc. etc.) license to copy and distribute things you, the user, input into it.
This is like the most standard boilerplate-y clause there is for user-generated content. It’s a basic cover-your-ass to prevent someone suing you for copyright violation because, say, they just found out that when you type something in the built-in search box it makes a copy (Illegal! I’ll sue!) and transmits the copy (Illegal! I’ll sue!) to a third party.
But about every six months someone notices one of these clauses, misinterprets it, and runs around panicking and screaming OH MY GOD THEY CLAIM COPYRIGHT OVER EVERYTHING EVERYONE DOES WHY WOULD THEY NEED THAT PANIC PANIC PANIC PANIC PANIC OUTRAGE OUTRAGE PANIC.
And then it sweeps through the internet with huge highly-upvoted threads full of angry comments from people who have absolutely no clue what the terms actually mean but who know from the tone of discussion that they’re supposed to be outraged about it.
After a few days it blows over, but then about six months later someone notices one of these clauses, misinterprets it, and runs around panicking and screaming OH MY GOD THEY CLAIM COPYRIGHT OVER EVERYTHING EVERYONE DOES WHY WOULD THEY NEED THAT PANIC PANIC PANIC PANIC PANIC OUTRAGE OUTRAGE PANIC.
And then…
@pushcx this should not be allowed on lobste.rs. It’s 100% outrage-mob baiting.
That’s the point. GDPR has not been that well tested in court. As long as it hasn’t, people will stick to legal boilerplate to make it as broad as possible. This is why all terms of service look like copypasta.
Saying that everyone else does it does not make it okay.
Putting words in my mouth doesn’t make a counterargument.
What do you think is not OK about this boilerplate CYA clause? Computers by their nature promiscuously copy data. Online systems copy and transmit it. The legal world has settled on clauses like this as an alternative to popping up a request for license every time you type into an online form or upload a file, because even if nobody ever actually would sue they don’t want to trust to that and want an assurance that if someone sues that person will lose, quickly. They’ve settled on this because copy/pasting a standard clause to minimize risk is a win from their perspective.
Why is this evil and bad and wrong from your perspective? Provide evidence.
The system we currently have may be structured in a way which makes clauses like this necessary or expedient in order to do business, but the validity of such a clause for that reason doesn’t excuse the system that created it.
Every single internet-connected thing that involves anything that could even be considered user-generated content, and has lawyers, sooner or later inserts a clause into its terms saying you grant them a royalty-free, non-exclusive, non-revocable (etc. etc.) license to copy and distribute things you, the user, input into it.
But Firefox isn’t a web service. It’s a program that runs on my computer and sends data to websites I choose to visit. Those websites may need such legal language for user generated content, but why does Mozilla need a license to copy anything I type into Firefox?
This. I’ve chatted with a few lawyers in the space and this is literally the first time we’re seeing that interpretation applied to a local program you choose to run that acts as your agent.
Firefox integrates with things that are not purely your “local agent”, including online services and things not owned by Mozilla. And before you decide this means some sort of sinister data-stealing data-selling privacy violation, go back and look at my original example.
When you upload something to the python package index you do so because you intend for the python package index to create copies of it and distribute it, which needs a license.
When you make a comment on a pull request for work, you don’t intend for Mozilla to have anything to do with it. You don’t intend for Mozilla to receive your post. Nor to have any special rights to view it, distribute it, make copies of it, etc. They do not need a license because they shouldn’t be seeing it. Moreover, you don’t even necessarily have the right to grant them said rights - someone else might own the copyright to the material you are legitimately working with.
When you make a comment on a pull request for work, you don’t intend for Mozilla to have anything to do with it.
If you use their integrated search which might send things you type to a third party, Mozilla needs your permission to do that.
If you use their Pocket service which can offer recommendations of articles you might like, Mozilla needs your permission to analyze things you’ve done, which may require things like making copies of data.
If you use their VPN service you’re passing a lot of stuff through their servers to be transmitted onward.
There’s a ton of stuff Mozilla does that could potentially be affected by copyright issues with user-generated/user-submitted content. So they have the standard boilerplate “you let us do the things with that content that are necessary to support the features you’re using” CYA clause.
you grant them a royalty-free, non-exclusive, non-revocable (etc. etc.) license to copy and distribute things you, the user, input into it.
The question for random people reading these clauses is what does that mean? Legalese can be hard for lawyers to understand. It’s much harder for mere mortals.
I think everyone is OK with Firefox (the browser) processing text which you enter into it. This processing includes uploading the text to web sites (which you ask it to, when you ask it to), etc.
What is much more concerning for the average user is believing that the “royalty-free, non-exclusive, non-revocable (etc. etc.) license” is unrestricted.
Let’s say I write the world’s most beautiful poem, and then submit it to an online poem contest via Firefox. Will Mozilla then go “ha ha! Firefox made a copy, and uploaded it to the Mozilla servers. We’re publishing our own book of your work, without paying you royalties. And oh, by the way, you also used Firefox to upload intimate pictures of you and your spouse to a web site, so we’re going to publish those, too!”
The average person doesn’t know. Reading the legalese doesn’t help them, because legalese is written in legalese (an English-adjacent language which isn’t colloquial English). Legalese exists because lawsuits live and die based on minutiae such as the Oxford Comma. So for Mozilla’s protection, they need it, but these needs are in conflict with the user’s need to understand the notices.
The Mozilla blog doesn’t help, because the italicized text at the top says: It does NOT give us ownership of your data or a right to use it for anything other than what is described in the Privacy Notice
OK, what does the Privacy Notice say?
(your) …data stays on your device and is not sent to Mozilla’s servers unless it says otherwise in this Notice
Which doesn’t help. So now the average person has to read pages of legal gobbledygook. And buried in it is the helpful
Identifying, investigating and addressing potential fraudulent activities,
Which is a huge loophole. “We don’t know what’s potentially fraudulent, so we just take all of the data you give to Firefox, upload it to our US-based servers, and give the DoJ / FBI access to it all without a warrant”. A lawyer could make a convincing and possibly winning argument that such use-cases are covered.
The psychological reason for being upset is that they are confused by complicated things which affect them personally, which they don’t understand, and which they have no control over. You can’t address that panic by telling them “don’t panic”.
The psychological reason for being upset is that they are confused by complicated things which affect them personally, which they don’t understand, and which they have no control over. You can’t address that panic by telling them “don’t panic”.
Could you explain why the concern is necessarily born of confusion rather than accurate understanding?
you said the reason for being upset is that they are confused. sorry if I was changing your meaning by adding “necessarily.” why do you say the concern is because of confusion or lack of understanding? what understanding would alleviate the concerns?
I don’t see a lot of difference between confusion and lack of understanding. Their upset is because the subject affects them, and they’re confused about it / don’t understand it, and they have no control over it.
This is entirely normal and expected. Simply being confused isn’t enough.
What would alleviate the concerns is to address all three issues, either singly, or jointly. If people don’t use Firefox, then it doesn’t affect them, and they’re not upset. If they understand what’s going on and make informed decisions, then they’re not upset. And then if they can make informed decisions, they have control over the situation, and they’re not upset.
The solution is a clear message from Mozilla. However, for reasons I noted above, Mozilla has to write their policies in legalese, which then makes it extremely difficult for anyone to understand them.
but who does “they” refer to? are you saying this describes people in general who are concerned about the policy, or are you just supposing that there must be someone somewhere for whom it is true?
what about people who have an accurate layman’s understanding of what the policy means, and are nonetheless concerned?
The psychological reason for being upset is that they are confused by complicated things which affect them personally, which they don’t understand, and which they have no control over. You can’t address that panic by telling them “don’t panic”.
The actual reason for them being upset is that someone told them to be afraid of the supposedly scary thing and told them a pack of lies about what the supposedly scary thing meant.
I propose to deal with that at the source: cut off the outrage-baiting posts that start the whole sordid cycle. Having a thread full of panicked lies at the top of the front page is bad and can be prevented.
And if you really want to comfort the frightened people and resolve their confusion, you should be talking to them, shouldn’t you? The fact that your pushback is against the person debunking the fearmongering says a lot.
The actual reason for them being upset is that someone told them to be afraid of the supposedly scary thing and told them a pack of lies about what the supposedly scary thing meant.
i.e. you completely ignored my long and reasoned explanation as to why people are upset.
which explains clearly just how nefarious and far-reaching the new policy is.
At best that comment points out that a consolidated TOS for Mozilla “services” is confusingly being linked for the browser itself. Nothing has been proven in the slightest about it being “nefarious”, and the fact that you just assert malicious intent as the default assumption is deeply problematic.
So your position is completely unconvincing and I feel no need to address it any further.
But you’re not debunking the fear mongering. You’re conspicuously ignoring any comment that explains why the concern is valid. Don’t hapless readers deserve your protection from such disinformation?
You’re largely describing boilerplate for web services, where the expectation is that users input content, and a service uses that content to provide service.
Firefox is a user agent, where the expectation is that users input content and the agent passes that content through to the intended service or resource.
When you upload or input information through Firefox, you hereby grant us a nonexclusive, royalty-free, worldwide license to use that information
You can call this boilerplate if you like, but it certainly gives Mozilla unambiguous rights relative to what you put into it.
This is like the most standard boilerplate-y clause there is for user-generated content. It’s a basic cover-your-ass to prevent someone suing you for copyright violation because, say, they just found out that when you type something in the built-in search box it makes a copy (Illegal! I’ll sue!) and transmits the copy (Illegal! I’ll sue!) to a third party.
This really does beg the question: Firefox is 20 years old. Why did they only feel the need to add this extremely standard boilerplate-y clause now?
what exactly does that mean? Were they already actively doing this, and the lawyers “won” by updating the TOS to cover that behavior? Or did the lawyers “win” because they were pushing for a business decision to change Firefox’s data gathering activities?
I disagree. I think this is actionable, relevant, and very on-topic. I’d even argue about that with you here, except that you in particular have a very solid history of bad-faith arguing, and I have better things to do.
Anyway, so far 84 of us have upvoted it, vs 7 “off-topic” flags and 8 hides, for a ratio of about 5:1, if we care about user opinions. Your paternalism isn’t a good look. Just hide it, flag it, and move on!
I wouldn’t say that the site rules don’t mean anything–I would say that many users and even admins have disregarded them for political expediency.
The long-term effects of this, of course, are deleterious…but that doesn’t matter when gosh darnit, the outgroup is wrong right now.
In the case of Apple, there’s a weird sort of thing where a release tag covers what is technically marketing. They also are both a large software and hardware vendor and, like it or not, have a large userbase. I’m not saying we should see a constant dripfeed of Apple propaganda, but it isn’t entirely without precedent.
I wouldn’t say that the site rules don’t mean anything–I would say that many users and even admins have disregarded them for political expediency.
Of course. I adopted the parent comment’s hyperbole to avoid getting bogged down in minutia. But there’s nothing wrong with more clarity and precision.
I’m really surprised to see anyone pay even the slightest of attention to this on Lobsters. It’s something my granddad would post to Facebook (example)
Fine is neat, but I’d rather gargle buckshot than let the clowncar of Python ML stuff anywhere near a clean, self-respecting Elixir deployment. I sure as shit don’t want it sharing the same address space as my BEAM VM.
Yeah I mean I love what they did with Nx/Bumblebee but there was no way they were going to keep up with the Python ML ecosystem. For example, a year ago I wanted the YOLO model but it wasn’t available at the time. So I also appreciate that they give us an out with this.
It may hit a bit close to home, but the folks too lazy/sloppy/unskilled to properly handle good commit messages probably are the ones most easily replaced by LLMs.
Well, let’s consider the cases. If the commit message that one would have put was “try” “lol” “ugh try again”, then what did we gain by having a more verbose commit message written by the LLM? Conversely, if the commit message is actually very important, should we really use an LLM to write it for us?
Arguably those provide more context about the “why” than an LLM-provided summary of the change: they convey that the committer is in a hurry, and that this is likely an (attempted) fix to a recent commit by the same author, probably as part of the same merge.
I’m ambivalent tbh. I know amazing devs who write those silly little intermediary commits. I try to organize my commits because I assume someone, one day, might want to review a commit log to figure stuff out, but I don’t have strong feels because I rarely do that myself.
Most of the people I know who write those sorts of commits tend to be basically using them as fixup commits, except that they tend not to be aware of --fixup and probably find their flow more convenient anyway. When their change eventually appears in the main branch, it’s almost always squashed down to a single commit with a single well-written description (usually done via PRs in some git forge or other).
Or in other words, I suspect the people writing those intermediary commits usually see them as temporary, and I don’t think they’d get much out of an LLM describing their changes.
Perhaps, yeah. I tend to clean up my commit history to remove those intermediary ones as well fwiw, but I know a lot of people who don’t. I suppose the answer isn’t “use an LLM” (I know I won’t) but to just learn how to rewrite your history properly.
Or even use tools that make managing your history easier. I do think Git makes it excessively hard to rewrite commits and maintain a history of the rewrites that you’re doing, which leads to people taking the “squash everything in a PR” approach, which is typically a lot easier. But better tooling in this regard can improve things a lot.
That said, this is perhaps getting a bit off-topic from the core idea of LLM-based commit messages.
Sometimes there’s no bigger meaning behind some changes. It’s either obvious why it’s done or it becomes obvious in context of other changes. Writing commit messages for that is just going through the motions which means sometimes it becomes “refactor” or a similar message. In those cases it could be nice to have an automatic summary instead of nothing.
The majority of bugs (quantity, not quality/severity) we have are due to the stupid little corner cases in C that are totally gone in Rust. Things like simple overwrites of memory (not that rust can catch all of these by far), error path cleanups, forgetting to check error values, and use-after-free mistakes. That’s why I’m wanting to see Rust get into the kernel, these types of issues just go away, allowing developers and maintainers more time to focus on the REAL bugs that happen (i.e. logic issues, race conditions, etc.)
This is an extremely strong statement.
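For a concrete sense of what that claim looks like in practice, here’s a minimal sketch (mine, not from the quoted mail) of how ordinary Rust handles the error-path cleanup, unchecked error value, and use-after-free cases it lists:

```rust
use std::fs::File;
use std::io::{self, Read};

fn read_config(path: &str) -> io::Result<String> {
    // Forgetting to check an error value is hard: `?` forces the failure to be
    // propagated, and an ignored Result is at least a compiler warning.
    let mut f = File::open(path)?;
    let mut buf = String::new();
    f.read_to_string(&mut buf)?;
    // Error-path cleanup is automatic: if read_to_string fails, `f` is dropped
    // and the file descriptor closed; there is no `goto out_close` to forget.
    Ok(buf)
}

fn main() {
    let buf = read_config("/etc/hostname").unwrap_or_default();
    let first_line = buf.lines().next(); // borrows from `buf`
    println!("first line: {:?}", first_line);
    // A use-after-free is a compile error rather than silent memory corruption:
    // drop(buf);
    // println!("{:?}", first_line); // error: cannot move out of `buf` because it is borrowed
}
```

None of this helps with the logic bugs and race conditions the quote calls the REAL bugs; that’s exactly the point.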
I think a few things are also interesting:
I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.
I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.
I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.
The Hellwig/Ojeda part of the thread is just frustrating to read because it almost feels like pleading. “We went over this in private” “we discussed this already, why are you bringing it up again?” “Linus said (in private so there’s no record)”, etc., etc.
Dragging discussions out in front of an audience is a pretty decent tactic for dealing with obstinate maintainers. They don’t like to explain their shoddy reasoning in front of people, and would prefer it remain hidden. It isn’t the first tool in the toolbelt but at a certain point there is no convincing people directly.
Dragging discussions out in front of an audience is a pretty decent tactic for dealing with
With quite a few things actually. A friend of mine is contributing to a non-profit, which until recently had this very toxic member (they even attempted a felony). They were driven out of the non-profit very soon after members talked in a thread that was accessible to all members. Obscurity is often one key component of abuse, be it mere stubbornness or criminal behaviour. Shine light, and it often goes away.
IIRC Hintjens noted this quite explicitly as a tactic of bad actors in his works.
It’s amazing how quick people are to recognize folks trying to subvert an org piecemeal via one-off private conversations once everybody can compare notes. It’s equally amazing to see how much the same people beforehand will swear up and down that oh no, that’s a conspiracy theory, such things can’t happen here, until they’ve been burned at least once.
This is an active, unpatched attack vector in most communities.
I’ve found the lowest example of this is even meeting minutes at work. I’ve observed that people tend to act more collaboratively and seek the common good if there are public minutes, as opposed to trying to “privately” win people over to their desires.
I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.
Something I’ve noticed is true in virtually everything I’ve looked deeply at is the majority of work is poor to mediocre and most people are not especially great at their jobs. So it wouldn’t surprise me if Linux is the same. (…and also wouldn’t surprise me if the wonderful Rust rewrite also ends up poor to mediocre.)
yet at the same time, another thing that astonishes me is how much stuff actually does get done and how well things manage to work anyway. And Linux also does a lot and works pretty well. Mediocre over the years can end up pretty good.
After tangentially following the kernel news, I think a lot of churning and death spiraling is happening. I would much rather have a rust-first kernel that isn’t crippled by the old guard of C developers reluctant to adopt new tech.
Take all of this energy into RedoxOS and let Linux stay in antiquity.
I’ve seen some of the R4L people talk on Mastodon, and they all seem to hate this argument.
They want to contribute to Linux because they use it, want to use it, and want to improve the lives of everyone who uses it. The fact that it’s out there and deployed and not a toy is a huge part of the reason why they want to improve it.
Hopping off into their own little projects which may or may not be useful to someone in 5-10 years’ time is not interesting to them. If it was, they’d already be working on Redox.
The most effective thing that could happen is for the Linux foundation, and Linus himself, to formally endorse and run a Rust-based kernel. They can adopt an existing one or make a concerted effort to replace large chunks of Linux’s C with Rust.
IMO the Linux project needs to figure out something pretty quickly because it seems to be bleeding maintainers and Linus isn’t getting any younger.
They may be misunderstanding the idea that others are not necessarily incentivized to do things just because it’s interesting for them (the Mastodon posters).
Redox does have the chains of trying to do new OS things. An ABI-compatible Rust rewrite of the Linux kernel might get further along than expected (even if it only runs in virtual contexts, without hardware support (that would come later.))
Linux developers want to work on Linux, they don’t want to make a new OS. Linux is incredibly important, and companies already have Rust-only drivers for their hardware.
Basically, sure, a new OS project would be neat, but it’s really just completely off topic in the sense that it’s not a solution for Rust for Linux. Because the “Linux” part in that matters.
I read a 25+ year old article [1] from a former Netscape developer that I think applies in part
The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?
Adopting a “rust-first” kernel is throwing the baby out with the bathwater. Linux has been beaten into submission for over 30 years for a reason. It’s the largest collaborative project in human history and over 30 million lines of code. Throwing it out and starting new would be an absolutely herculean effort that would likely take years, if it ever got off the ground.
The idea that old code is better than new code is patently absurd. Old code has stagnated. It was built using substandard, out of date methodologies. No one remembers what’s a bug and what’s a feature, and everyone is too scared to fix anything because of it. It doesn’t acquire new bugs because no one is willing to work on that weird ass bespoke shit you did with your C preprocessor. Au contraire, baby! Is software supposed to never learn? Are we never to adopt new tools? Can we never look at something we’ve built in an old way and wonder if new methodologies would produce something better?
This is what it looks like to say nothing, to beg the question. Numerous empirical claims, where is the justification?
It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?
Like most things in life, the truth is somewhere in the middle. There is a reason the semiconductor industry has the concept of a “mature node”. They accept that something new is needed for each node, but also that the new thing takes time to iron out the kinks and bugs. This is the primary reason you see Apple take on new nodes first, before Nvidia for example, as Nvidia requires much larger die sizes and so needs fewer defects per square mm.
You can see this sometimes in software, for example X11 vs Wayland, where adoption is slow but most definitely progressing, and nowadays most people can see that Wayland is now, or is going to become, the dominant tech in the space.
I don’t think this would qualify as dialectic, it lacks any internal debate and it leans heavily on appeals by analogy and intuition/ emotion. The post itself makes a ton of empirical claims without justification even beyond the quoted bit.
That means we can probably keep a lot of the old trusty Linux code around while making more of the new code safe by writing it in Rust in the first place.
I don’t think that’s a fair assessment of Spolsky’s argument or of CursedSilicon’s application of it to the Linux kernel.
Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in it than new code (and that the older code is, the less likely it is to be buggy).
Secondly, this discussion is mainly around entire codebases, not just existing code. Codebases usually have an entire infrastructure around them for verifying that the behaviour of the codebase has not changed. This is often made up of tests, but it’s also made up of the users who try out a release of a codebase and determine whether it’s working for them. The difference between making a change to an existing codebase and releasing a new project largely comes down to whether this verification (both in terms of automated tests and in terms of users’ ability to use the new release) works for the new code.
Given this difference, if I want to (say) write a new OS completely in Rust, I need to choose: Do I want to make it completely compatible with Linux, and therefore take on the significant challenge of making sure everything behaves truly the same? Or do I make significant breaking changes, write my own OS, and therefore force potential adopters to rebuild their entire Linux workflows in my new OS?
The point is not that either of these options are bad, it is that they represent significant risks to a project. Added to the general risk that is writing new code, this produces a total level of risk that might be considered the baseline risk of doing a rewrite. Now risk is not bad per se! If the benefits of being able to write an OS in a language like Rust outweigh the potential risks, then it still makes sense to perform the rewrite. Or maybe the existing Linux kernel is so difficult to maintain that a new codebase really would be the better option. But the point that CursedSilicon was making by linking the Spolsky piece was, I believe, that the risks for a project like the Linux kernel are very high. There is a lot of existing, old code. And there is a very large ecosystem where either breaking or maintaining compatibility would each come with significant challenges.
Unfortunately, it’s very difficult to measure the risks and benefits here in a quantitative, comparable way, so I think where you fall on the “rewrite vs continuity” spectrum will depend mostly on what sort of examples you’ve seen, and how close you think this case is to those examples. I don’t think there’s any objective way to say whether it makes more sense to have something like R4L, or something like RedoxOS.
Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in it than new code (and that the older code is, the less likely it is to be buggy).
I haven’t read it yet, but I haven’t made an argument about that, I just created a parody of the argument as presented. I’ll be candid: I doubt that the research is going to compel me to believe that newer code is inherently buggier; it may compel me to confirm my existing belief that testing software in the field is one good method to find some classes of bugs.
Secondly, this discussion is mainly around entire codebases, not just existing code.
I guess so, it’s a bit dependent on where we say the discussion starts - three things are relevant; RFL, which is not a wholesale rewrite, a wholesale rewrite of the Linux kernel, and Netscape. RFL is not about replacing the entire Linux kernel, although perhaps “codebase” here refers to some sort of unit, like a driver. Netscape wanted a wholesale rewrite, based on the linked post, so perhaps that’s what’s really “the single worst strategic mistake that any software company can make”, but I wonder what the boundary here is? Also, the article immediately mentions that Microsoft tried to do this with Word but it failed, but that Word didn’t suffer from this because it was still actively developed - I wonder if it really “failed” just because Pyramid didn’t become the new Word? Did Microsoft have some lessons learned, or incorporate some of that code? Dunno.
I think I’m really entirely justified when I say that the post is entirely emotional/ intuitive appeals, rhetoric, and that it makes empirical claims without justification.
There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong. The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming:
This is rhetoric. These are unsubstantiated empirical claims. The article is all of this. It’s fine as an interesting, thought provoking read that gets to the root of our intuitions, but I think anyone can dismiss it pretty easily since it doesn’t really provide much in the form of an argument.
It’s important to remember that when you start from scratch there is absolutely no reason to believe that you are going to do a better job than you did the first time.
Again, totally unsubstantiated. I have MANY reasons to believe that, it is simply question begging to say otherwise.
That’s all this post is. Over and over again making empirical claims with no evidence and question begging.
We can discuss the risks and benefits, I’d advocate for that. This article posted doesn’t advocate for that. It’s rhetoric.
existing code has fewer bugs in it than new code (and that the older code is, the less likely it is to be buggy).
This is a truism. It is survival bias. If the code was buggy, it would eventually be found and fixed. So all things being equal newer code is riskier than old code. But it’s also been empirically shown that using Rust for new code is not “all things being equal”. Google showed that new code in Rust is as reliable as old code in C. Which is good news: you can use old C code from new Rust projects without the risk that comes from new C code.
But it’s also been empirically shown that using Rust for new code is not “all things being equal”.
Yeah, this is what I’ve been saying (not sure if you’d meant to respond to me or the parent, since we agree) - the issue isn’t “new” vs “old” it’s things like “reviewed vs unreviewed” or “released vs unreleased” or “tested well vs not tested well” or “class of bugs is trivial to express vs class of bugs is difficult to express” etc.
I don’t disagree that the rewards can outweigh the risks, and in this case I think there’s a lot of evidence that suggests that memory safety as a default is really important for all sorts of reasons. Let alone the many other PL developments that make Rust a much more suitable language to develop in than C.
It’s a Ship of Theseus—at no point can you call it a “new” codebase, but after a period of time, it could be completely different code. I have a C program I’ve been using and modifying for 25 years. At any given point, it would have been hard to say “this is now a new codebase,” yet not one line of code in the project is the same as when I started (even though it does the same thing as it always has).
I don’t see the point in your question. It’s going to depend on the codebase, and on the nature of the changes; it’s going to be nuanced, and subjective at least to some degree. But the fact that it’s prone to subjectivity doesn’t mean that you get to call an old codebase with a single fixed bug a new codebase, without some heavy qualification which was lacking.
What’s old and new is poorly defined and yet there’s an argument being made that “old” and “new” are good indicators of something. If they’re so poorly defined that we have to bring in all sorts of additional context like the nature of the changes, not just when they happened or the number of lines changed, etc, then it seems to me that we would be just as well served to throw away the “old” and “new” and focus on that context.
I feel like enough people would agree more-or-less on what was an “old” or “new” codebase (i.e. they would agree given particular context) that they remain useful terms in a discussion. The general context used here is apparent (at least to me) given by the discussion so far: an older codebase has been around for a while, has been maintained, has had kinks ironed out.
There’s a really important distinction here though. The point is to argue that new projects will be less stable than old ones, but you’re intuitively (and correctly) bringing in far more important context - maintenance, testing, battle testing, etc. If a new implementation has a higher degree of those properties then it being “new” stops being relevant.
It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?
My point was that this statement requires a definition of “new codebase” that nobody would agree with, at least in the context of the discussion we’re in. Maybe you are attacking the base proposition without applying the surrounding context, which might be valid if this were a formal argument and not a free-for-all discussion.
If a new implementation has a higher degree of those properties
I think that it would be considered no longer new if it had had significant battle-testing, for example.
FWIW the important thing in my view is that every new codebase is a potential old codebase (given time and care), and a rewrite necessarily involves a step backwards. The question should probably not be, which is immediately better?, but, which is better in the longer term (and by how much)? However your point that “new codebase” is not automatically worse is certainly valid. There are other factors than age and “time in the field” that determine quality.
Methodologies don’t matter for quality of code. They could be useful for estimates, cost control, figuring out whom you shall fire etc. But not for the quality of code.
I’ve never observed a programmer become better or worse by switching methodology. Dijkstra wouldn’t have become better if you made him do daily standups or go through code reviews.
There are ways to improve your programming by choosing different approach but these are very individual. Methodology is mostly a beancounting tool.
When I say “methodology” I’m speaking very broadly - simply “the approach one takes”. This isn’t necessarily saying that any methodology is better than any other. The way I approach a task today is better, I think, then the way that I would have approached that task a decade ago - my methodology has changed, the way I think has changed. Perhaps that might mean I write more tests, or I test earlier, but it may mean exactly the opposite, and my methods may only work best for me.
I’m not advocating for “process” or ubiquity, only that the approach one takes may improve over time, which I suspect we would agree on.
It’s the largest collaborative project in human history and over 30 million lines of code.
How many of those lines are part of the core? My understanding was that the overwhelming majority was driver code. There may not be that much core subsystem code to rewrite.
For a previous project, we included a minimal Linux build. It was around 300 KLoC, which included networking and the storage stack, along with virtio drivers.
That’s around the size a single person could manage and quite easy with a motivated team.
If you started with DPDK and SPDK then you’d already have filesystems and a copy of the FreeBSD network stack to run in isolated environments.
Once many drivers share common rust wrappers over core subsystems, you could flip it and write the subsystem in Rust. Then expose C interface for the rest.
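As a rough sketch of what that flip could look like (invented names like `foo_subsys_lookup`, no kernel allocator or locking, so this is just the shape of the boundary rather than real kernel code):

```rust
/// Handle shared with C callers, so its layout must be C-compatible.
#[repr(C)]
pub struct FooHandle {
    id: u64,
    refcount: u32,
}

/// Exported with an unmangled name so existing C code can keep calling it.
#[no_mangle]
pub extern "C" fn foo_subsys_lookup(id: u64) -> *mut FooHandle {
    // The subsystem logic itself lives in safe Rust; only the boundary is raw.
    Box::into_raw(Box::new(FooHandle { id, refcount: 1 }))
}

/// # Safety
/// `handle` must have come from `foo_subsys_lookup` and not already been released.
#[no_mangle]
pub unsafe extern "C" fn foo_subsys_put(handle: *mut FooHandle) {
    if handle.is_null() {
        return;
    }
    let h = &mut *handle;
    h.refcount -= 1;
    if h.refcount == 0 {
        // Ownership returns to Rust, which frees the allocation.
        drop(Box::from_raw(handle));
    }
}
```

The interesting part is that the unsafety gets concentrated at the C boundary instead of being spread through the whole subsystem.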
I see that Drew proposes a new OS in that linked article, but I think a better proposal in the same vein is a fork. You get to keep Linux, but you can start porting logic to Rust unimpeded, and it’s a manageable amount of work to keep porting upstream changes.
Remember when libav forked from ffmpeg? Michael Niedermayer single-handedly ported every single libav commit back into ffmpeg, and eventually, ffmpeg won.
At first there will be extremely high C percentage, low Rust percentage, so porting is trivial, just git merge and there will be no conflicts. As the fork ports more and more C code to Rust, however, you start to have to do porting work by inspecting the C code and determining whether the fixes apply to the corresponding Rust code. However, at that point, it means you should start seeing productivity gains, community gains, and feature gains from using a better language than C. At this point the community growth should be able to keep up with the extra porting work required. And this is when distros will start sniffing around, at first offering variants of the distro that uses the forked kernel, and if they like what they taste, they might even drop the original.
I genuinely think it’s a strong idea, given the momentum and potential amount of labor Rust community has at its disposal.
I think the competition would be great, especially in the domain of making it more contributor friendly to improve the kernel(s) that we use daily.
I certainly don’t think this is impossible, for sure. But the point ultimately still stands: Linux kernel devs don’t want a fork. They want Linux. These folks aren’t interested in competing, they’re interested in making the project they work on better. We’ll see if some others choose the fork route, but it’s still ultimately not the point of this project.
Linux developers want to work on Linux, they don’t want to make a new OS.
While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux. Most of the time I strive for portability, and so abstract myself from the OS whenever I can get away with it. And when I can’t, I have to say Linux’s API isn’t always that great, compared to what the BSDs have to offer (epoll vs kqueue comes to mind). Most annoying though is the lack of documentation for the less used APIs: I’ve recently worked with Netlink sockets, and for the proc stuff so far the best documentation I found was the freaking source code of a third party monitoring program.
I was shocked. Complete documentation of the public API is the minimum bar for a project as serious as the Linux kernel. I can live with an API I don’t like, but lack of documentation is a deal breaker.
While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux.
I think they mean that Linux kernel devs want to work on the Linux kernel. Most (all?) R4L devs are long time Linux kernel devs. Though, maybe some of the people resigning over LKML toxicity will go work on Redox or something…
Re-Implementing the kernel ABI would be a ton of work for little gain if all they wanted was to upstream all the work on new hardware drivers that is already done - and then eventually start re-implementing bits that need to be revised anyway.
If the singular required Rust toolchain didn’t feel like such a ridiculous-to-bootstrap, 500-ton LLVM clown car, I would agree with this statement without reservation.
Zig is easier to implement (and I personally like it as a language) but doesn’t have the same safety guarantees and strong type system that Rust does. It’s a give and take. I actually really like Rust and would like to see a proliferation of toolchain options, such as what’s in progress in GCC land. Overall, it would just be really nice to have an easily bootstrapped toolchain that a normal person can compile from scratch locally, although I don’t think it necessarily needs to be the default, or that using LLVM generally is an issue. However, it might be possible that no matter how you architect it, Rust might just be complicated enough that any sufficiently useful toolchain for the language could just end up being a 500 ton clown car of some kind anyways.
Depends on which parts of GP’s statement you care about: LLVM or bootstrap. Zig still depends on LLVM (for now), but it is no longer bootstrappable in a limited number of steps (because they switched from a bootstrap C++ implementation of the compiler to keeping a compressed WASM build of the compiler as a blob).
Yep, although I would also add it’s unfair to judge Zig in any case on this matter now given it’s such a young project that clearly is going to evolve a lot before the dust begins to settle (Rust is also young, but not nearly as young as Zig). In ten to twenty years, so long as we’re all still typing away on our keyboards, we might have a dozen Zig 1.0 and a half dozen Zig 2.0 implementations!
Yeah, the absurdly low code quality and toxic environment make me think that Linux is ripe for disruption. Not like anyone can produce a production kernel overnight, but maybe a few years of sustained work might see a functional, production-ready Rust kernel for some niche applications and from there it could be expanded gradually. While it would have a lot of catching up to do with respect to Linux, I would expect it to mature much faster because of Rust, because of a lack of cruft/backwards-compatibility promises, and most importantly because it could avoid the pointless drama and toxicity that burn people out and prevent people from contributing in the first place.
From the thread in OP, if you expand the messages, there is wide agreement among the maintainers that all sorts of really badly designed and almost impossible to use (safely) APIs ended up in the kernel over the years because the developers were inexperienced and kind of learning kernel development as they went. In retrospect they would have designed many of the APIs very differently.
It’s based on my forays into the Linux kernel source code. I don’t doubt there’s some quality code lurking around somewhere, but the stuff I’ve come across (largely filesystem and filesystem adjacent) is baffling.
Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry, if nothing else taught me how online discussions are a huge game of Chinese whispers where most participants don’t have a clue of what they are talking about.
I doubt that maintainers are “only caring about their job security and keeping back code” but with all due respect: You’re also just taking arguments out of thin air right now. What I do believe is what we have seen: Pretty toxic responses from some people and a whole lot of issues trying to move forward.
Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry
Huh, I’m not seeing any claim to this end from the GP, or did I not look hard enough? At face value, saying that something has an “absurdly low code quality” does not imply anything about nefarious motives.
Still, in GP’s case the Chinese whispers have reduced “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” to “absurdly low quality”. To which I ask: what is more likely? 1) That 30 million lines of code contain various levels of technical debt of which maintainers are aware; and that said maintainers are worried even about code where the technical debt is real but not causing substantial issues in practice? Or 2) that a piece of software gets to run on literally billions of devices of all sizes and prices just because it’s free and in spite of its “absurdly low quality”?
Linux is not perfect, neither technically nor socially. But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.
GP here: I probably should have said “shockingly” rather than “absurdly”. I didn’t really expect to get lawyered over that one word, but yeah, the idea was that for a software that runs on billions of devices, the code quality is shockingly low.
Of course, this is plainly subjective. If your code quality standards are a lot lower than mine then you might disagree with my assessment.
That said, I suspect adoption is a poor proxy for code quality. Internet Explorer was widely adopted and yet it’s broadly understood to have been poorly written.
But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face
I’m sure self-righteousness could get you to the same place, but in my case I arrived by way of experience. You can relax, I wasn’t attacking Linux—I like Linux—it just has a lot of opportunity for improvement.
I guess I’ve seen the internals of too much proprietary software now to be shocked by anything about Linux per se. I might even argue that the quality of Linux is surprisingly good, considering its origins and development model.
I think I’d lawyer you a tiny bit differently: some of the bugs in the kernel shock me when I consider how many devices run that code and fulfill their purposes despite those bugs.
FWIW, I was not making a dig at open source software, and yes plenty of corporate software is worse. I guess my expectations for Linux are higher because of how often it is touted as exemplary in some form or another. I don’t even dislike Linux, I think it’s the best thing out there for a huge swath of use cases—I just see some pretty big opportunities for improvement.
But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.
Or actual benchmarks: the performance the Linux kernel leaves on the table in some cases is absurd. And sure it’s just one example, but I wouldn’t be surprised if it was representative of a good portion of the kernel.
Well, not quite, but still “considered broken beyond repair by many people related to life time management” - which is definitely worse than “hard to formalize” when “the way ever[y]body does it” seems to vary from user to user.
I love Rust but still, we’re talking of a language which (for good reasons!) considers doubly linked lists unsafe. Take an API that gets a 4 on Rusty Russell’s API design scale (“Follow common convention and you’ll get it right”), but which was designed for a completely different programming language if not paradigm, and it’s not surprising that it can’t easily be transformed into a 9 (“The compiler/linker won’t let you get it wrong”). But at the same time there are a dozen ways in which, according to the same scale, things could actually be worse!
What I dislike is that people are seeing “awareness of complexity” and the message they spread is “absurdly low quality”.
Note that doubly linked lists are not a special case at all in Rust. All the other common data structures like Vec, HashMap etc. also need unsafe code in their implementation.
Implementing these data structures in Rust, and writing unsafe code in general, is indeed roughly a 4. But these are all already implemented in the standard library, with an API that actually is at a 9. And std::collections::LinkedList is constructive proof that you can have a safe Rust abstraction for doubly linked lists.
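A tiny illustration of what that safe surface looks like in practice, using only stable std APIs (the raw-pointer juggling all lives inside the standard library):

```rust
use std::collections::LinkedList;

fn main() {
    // All of this is safe code; the unsafe pointer manipulation is hidden inside std.
    let mut list: LinkedList<u32> = LinkedList::new();
    list.push_back(1);
    list.push_back(2);
    list.push_front(0);

    // Iteration and mutation go through safe methods too.
    for value in list.iter_mut() {
        *value *= 10;
    }

    assert_eq!(list.into_iter().collect::<Vec<_>>(), vec![0, 10, 20]);
}
```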
Yes, the implementation could have bugs, thus making the abstraction leaky. But that’s the case for literally everything, down to the hardware that your code runs on.
You’re absolutely right that you can build abstractions with enough effort.
My point is that if a doubly linked list is (again, for good reasons) hard to make into a 9, a 20-year-old API may very well be even harder. In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition. That’s the conundrum that maintainers face and, if they realize that, it’s a good thing. I would be scared if maintainers handwaved that away.
Yes, the implementation could have bugs, thus making the abstraction leaky.
Bugs happen, but if the abstraction is downright wrong then that’s something I wouldn’t underestimate. A lot of the appeal of Rust in Linux lies exactly in documenting/formalizing these unwritten rules, and wrong documentation can be worse than no documentation (cue the negative parts of the API design scale!); even more so if your documentation is a formal model like a set of Rust types and functions.
That said, the same thing can happen in a Rust-first kernel, which will also have a lot of unsafe code. And it would be much harder to fix it in a Rust-first kernel than in Linux at a time when it’s still just testing the waters.
In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition.
At the same time, it was included almost as like, half a joke, and nobody uses it, so there’s not a lot of pressure to actually finish off the cursor API.
It’s also not the kind of linked list the kernel would use, as they’d want an intrusive one.
And yet, safe to use doubly linked lists written in Rust exist. That the implementation needs unsafe is not a real problem. That’s how we should look at wrapping C code in safe Rust abstractions.
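To make the wrapping pattern concrete, here’s a minimal sketch. The widget_* functions are hypothetical stand-ins for a C API (written as unsafe Rust here so the example is self-contained), not anything taken from the kernel:

```rust
use std::ffi::c_void;

// Hypothetical stand-ins for a C API; in real code these would be
// extern "C" declarations provided by the C side.
unsafe fn widget_new() -> *mut c_void {
    Box::into_raw(Box::new(0u64)) as *mut c_void
}

unsafe fn widget_refresh(w: *mut c_void) -> i32 {
    *(w as *mut u64) += 1;
    0 // 0 means success, as in many C APIs
}

unsafe fn widget_free(w: *mut c_void) {
    drop(Box::from_raw(w as *mut u64));
}

/// Safe wrapper: the only way to get a `Widget` is through `new()`, so the
/// pointer is always valid while the value is alive, and `Drop` frees it
/// exactly once.
pub struct Widget {
    raw: *mut c_void,
}

impl Widget {
    pub fn new() -> Option<Widget> {
        let raw = unsafe { widget_new() };
        if raw.is_null() { None } else { Some(Widget { raw }) }
    }

    /// The unwritten C rule "don't refresh after free" becomes a borrow rule:
    /// calling this requires a live `&mut Widget`.
    pub fn refresh(&mut self) -> Result<(), i32> {
        match unsafe { widget_refresh(self.raw) } {
            0 => Ok(()),
            rc => Err(rc),
        }
    }
}

impl Drop for Widget {
    fn drop(&mut self) {
        unsafe { widget_free(self.raw) };
    }
}

fn main() {
    let mut w = Widget::new().expect("allocation failed");
    w.refresh().unwrap(); // callers only ever touch the safe surface
}
```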
The whole comment you replied to, after the one sentence about linked lists, is about abstractions. And abstractions are rarely going to be easy, and sometimes could be hardly possible.
That’s just a fact. Confusing this fact for something as hyperbolic as “absurdly low quality” is a stunning example of the Dunning-Kruger effect, and frankly insulting as well.
I personally would call Linux low quality because many parts of it are buggy as sin. My GPU stops working properly literally every other time I upgrade Linux.
No one is saying that Linux is low quality because it’s hard or impossible to abstract some subsystems in Rust, they’re saying it’s low quality because a lot of it barely works! I would say that your “Chinese whispers” misrepresents the situation and what people here are actually saying. “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” doesn’t apply if no one can tell you how to use an API, and everyone does it differently.
Actually, the NT kernel of all things seems to have a pretty good reputation, and I wouldn’t dismiss the BSD kernels out of hand. I don’t know which kernel is better, but it seems you do. If you could explain how you came to this conclusion that would be most helpful.
*nod* I haven’t been a Windows person since shortly after the release of Windows XP (i.e. the first online activation DRM’d Windows) but, whenever I see glimpses of what’s going on inside the NT kernel in places like Project Zero: The Definitive Guide on Win32 to NT Path Conversion, it really makes me want to know more.
Article was rather more tolerable than I was expecting, so good on the author.
I will highlight one particular issue: there’s no way to have both an unbiased/non-problematic electric brain and one that respects users’ rights fully. To wit:
it feels self-evident that letting a small group of people control this technology imperils the future of many people.
This logic applies just as much to a minority pushing for (say) trans-friendly norms as it does for a minority pushing for incredibly trans-hostile ones. Replace trans-friendly with degrowth or quiverfull or whatever else strikes your fancy.
Like, we’ve managed to create these little electric brains that can help people “think” thoughts that they’re too stupid, ignorant, or lazy to think by themselves (I know this, having used various electric brains in each of these capacities). There is no morally coherent way–in my opinion!–of saying “okay, here’s an electric brain just for you that respects your prejudices and freedom of association BUT ALSO will only operate within norms established by our company/society/enlightened intellectual class.”
The only sane thing to do is to let people have whatever electric brains they want with whatever biases they deem tolerable or desirable, make sure they are aware of alternatives and can access them, and then hold them responsible for how they use what their electric brains help them with in meatspace. Otherwise, we’re back to trying to police what people think with their exocortices and that tends to lose every time.
This is just free speech absolutism dressed up in science fiction. There are different consequences to different kinds of speech, and these aren’t even “brains”: they’re databases with a clever compression and indexing strategy. Nobody is or should be required to keep horrendous speech in their database, or to serve it to other people as a service.
Nobody is or should be required to keep horrendous speech in their database, or to serve it to other people as a service.
Isn’t that exactly the problem @friendlysock is describing? This is already a reality. One has to abide the American Copilot refusing to complete code which mentions anything about sex or gender and the Chinese Deepseek refusing to say anything about what happened at Tiananmen Square in 1989.
The problem is powerful tech companies (and the governments under which they fall) imposing their morality and worldview on the user. Same is true for social media companies, BTW. You can easily see how awkward this is with the radically changed position of the large tech companies with the new US administration and the difference in values it represents.
It’s not “free speech absolutism” to want to have your own values represented and expressed in the datasets. At least with more distributed systems like Mastodon you get to choose your moderators. Nobody decries this as “free speech absolutism”. It’s actually the opposite - the deal is that you can join a system which shares your values and you will be protected from hearing things you don’t want to hear. Saying it like this, I’m not so sure this is so great, either… you don’t want everyone retreating into their own siloed echo chambers, that’s a recipe for radicalisation and divisiveness.
The problem is not the existence of harmful ideas. The problem is lack of moderation when publishing them.
And yeah, nobody should be required to train models in certain ways. But maybe we should talk about requirements for unchecked outputs? Like when kids ask a chatbot, it shouldn’t try to make them into fascists.
On the other hand, when I ask a chatbot about what’s happening in the US and ask it to compare with e.g. Umberto Eco’s definition of Fascism, it shouldn’t engage in “balanced discussion” just because it’s “political”.
We need authors to have unopinionated tools if we want quality outputs. Imagine your text editor refusing to write certain words.
This is just free speech absolutism dressed up in science fiction.
Ah, I guess? If that bothers you, I think that’s an interesting data point you should reflect on.
Nobody is or should be required to keep horrendous speech in their database, or to serve it to other people as a service.
Sure, but if somebody chooses to do so, they should be permitted. I’m pointing out that the author complains about bias/problematic models, and also complains about centralization of power. The reasonably effective solution (indeed, the democratic one) is to let everybody have their own models–however flawed–and let the marketplace of ideas sort it out.
In case it needs to be explicitly spelled out: there is no way of solving for bias/problematic models that does not also imply the concentration of power in the hands of the few.
One thing that stood out to me is that systemd’s configuration approach is declarative while others are imperative, and this unlocks a lot of benefits, much like how a query planner can execute a declarative query better than most hand-written imperative queries could.
Yea, it’s unreasonably effective, and in general all the components hang together really well.
In my experience the nicest systems to administer have kind of an “alternating layer” approach of imperative - declarative - imperative - declarative - … Papering over too much complexity with declarative causes trouble, and so does forcing (or encouraging, or allowing) too much imperative scripting at any given layer. Systemd (it seems to me, as a user) really nailed the layers to encapsulate declaratively - things you want to happen at boot, that depend on each other. Units are declarative but effectively “call into” imperative commands, which - in turn - ought to have their own declarative-ish configuration files, which - in turn - are best laid down by something programmable at the higher layer (something aware of larger state - what load balancers are up - what endpoints are up - what DNS servers are up, yadda yadda).
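For illustration, a hypothetical unit shows that layering: the unit file itself is purely declarative (description, ordering, dependencies, install target), and it just calls into an imperative script that does the actual work. The service name and script path here are made up for the example:

```ini
# /etc/systemd/system/report-sync.service  (hypothetical example)
[Unit]
Description=Sync nightly reports
# Declarative layer: ordering and dependencies, no control flow.
Wants=network-online.target
After=network-online.target

[Service]
# Imperative layer: the unit simply calls into a script that does the work.
Type=oneshot
ExecStart=/usr/local/bin/sync-reports.sh

[Install]
WantedBy=multi-user.target
```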
As an aside, my main beef with Kubernetes is that it has a bad design - or perhaps a missing design - for that next level up of configuration management. Way way way too much is squashed into a uniform morass of the generalized “object namespace”, and the only way to really get precise behavior out of that is to implement a custom controller, which is almost like taking on the responsibility of a mini-init system, or a mini-network config daemon.
What I want at this layer instead is a framework for writing a simple and declarative-ish DSL that lets me encode my own business domain. Not every daemon is a Deployment object. Not every service is a Service. I think we were just getting good at this in 2014 with systems like Salt and Chef, but K8s came in and sucked all the oxygen out of the room for a decade, not to mention that Salt and Chef did themselves no favors by simply being a metric boatload of not-always-easy-to-install scripting runtimes.
Years ago when I looked to contribute to Salt I was pretty astounded by the bugginess, but it worked for enough people and there was not and still is not a lot of competition in that space, so VMware acquired them.
We have yet to see the next generation of that sort of software. System Initiative may be the sole exception, and I am enthusiastic about the demos they put out. But I do think we’re missing something here.
Re: concentrating power: the “open source” models have definitely made big improvements here. Training a new model is still out of reach for regular people, but fine tuning isn’t. Computers are always getting better, and code is being optimized, so in the future it may be even more accessible.
Ultimately, I want robots to do the things I don’t want to do. I want them to do my dishes and my laundry. I don’t want them to play music instead of me, write code instead of me, write words instead of me.
We arguably already have robots that do both dishes and laundry. They just don’t do the step of putting the items in the machine.
I don’t want them to play music instead of me, write code instead of me, write words instead of me.
Pushed further: we already have beat matching (helpful for DJs), assemblers and linkers and treeshakers and SAT solvers (helpful for programmers), and spellcheckers (helpful for writers).
And yet, we also still have chamber orchestras, demoscene, and pen and paper.
Technology lets people pick and choose what part of the process they want to spend their life force on–I don’t think that’s bad at all.
Over-centralization left the movement vulnerable as well. A more diverse space would not have felt the same impact of Sunlight’s internal instability, or Code for America’s shift away from funding brigades.
One could with grim amusement note that this is a microcosm of the same issue which at the national level got us into the mess we’re in today and rendered blue team ineffectual–but, that’s strictly politics and not super helpful to look into here.
I hope most of these civic technologists will find that they can continue their work at the state or local level, but it is likely many will not be able to.
My take, having watched this space and been super tangentially involved in civic hackathons and things, is that this is most useful for all the boring local and state touchpoints. Being able to access usable datasets, being able to give people in local government good tooling to do their jobs more efficiently…all of this is good. That said, even with good tools the sheer combination of inertia, incompetence, and motivated self-interest one encounters can make it incredibly hard to do something even as simple as keep maps up to date–and that particular problem is even more pernicious given the staking out of datasets by folks like Esri.
If I had my druthers, we’d standardize at a state level on forms for all the normal shit cities do and provide a state-level depot for that stuff–including hosting and software development–and offer a supported “toolbox” of very simple tools that cities could compose to match their needs (do you have a lot of oil and gas? okay, here’s the stuff for registering wells or whatever. do you have a library? okay, here’s basic circulation desk stuff. do you have a water treatment plant? your own police force? fire department? …etc.) Ditto for useful data and querying.
Then, periodically, there’d be like quarterly or yearly conferences so different state bureaus could share what tools they’ve made and lessons they’ve learned and bubble those up to a federal repository of same.
The problems of course are numerous–an incomplete accounting here:
Every city is going to think it’s a special snowflake with its own needs…and this is even true!
Every city is going to have civil servants that do not want to change how they do anything because they’re scared of getting in trouble, having to learn a new thing, or risking their gravy train.
Even without the current atmosphere, long-term maintenance of development and infrastructure projects is…dicey…at every level of government.
This sort of thing directly cuts into the rackets of the usual beltway bandits and firms who specialize in extracting maximum shareholder value from unlucky or ignorant government officials.
At least in the US, this sort of work would be suspect (“the gubberment can’t do anything worthwhile, private profit motives will save us, muh reaganz”) and an easy target for folks looking to pander to their bases.
A good chunk of the current generation of developers have been brain-rotted by the ZIRP years or are young and full of that New Shiny energy; this makes them ill-suited to the task of boring, durable, tiny, maintainable software.
I’d love to see this fixed, but I’m not enough of a fool to expect it nor enough of a masochist to attempt it.
I don’t really want to go “skill issue” here–not least because, in spite of writing an obnoxious amount of bash over the last few years, I still fuck it up regularly–but I do think it’s important to note there’s a difference in kind for these complaints.
Having nonstandard shells on different machines is the result of not just using a common shell–you’re almost always (yes, yes, containers are a thing, sit down there in the back) gonna have either sh or bash if you want it (and yes at least for a while the default bash on OSX was old and you needed to upgrade it to get things like associative arrays). There is a reason I put my skill points into bash instead of zsh, fish, oilsh, ksh, csh, or even powershell–all better shells, none as ubiquitous. But, anyways, that complaint is at least about shells.
The shortcuts and terminal-itself complaints about things like keyboard shortcuts and copy-paste behavior are unique to the terminal environment, and vary across using a browser window, one of any of the dozen X11 terminals, PuTTY (remember that lol), and so on and so forth–but again, that’s neither the shell nor the programs.
And then finally, issues around discoverability and arguments and whatnot are squarely the problem of the utilities. Some utilities are nice and straightforward (albeit this is more a BSD thing than a GNU thing) and have a single command with some flags. Some utilities are relatively consistent but simply so large as to defy easy reference (I’m looking at you FFMPEG, where the problem domain is simply so large it can’t be wrapped up neatly). Some tools have some warts (git), and some tools seemed like they deliberately eschewed reasonable and legible convention (seriously, fuck the constellation of nix and nixos CLI tooling).
Anyways, I think it’s worth remembering that these frustrations are really probably clustered around different layers of the stack, and to approach it that way instead of by vertical slice (though we do tend to think in terms of “I just want to look at cat pictures on the terminal and get street cred” instead of “okay so I’m pretty sure urxvt supports color sixel and I think I remember how to get curl to use redirection to aalib in bash”).
FWIW Oils is unique among these, in that it has OSH, which is POSIX and bash-compatible. (YSH is the incompatible shell)
OSH is actually more POSIX-compatible than dash, the default /bin/sh on Debian.
It’s also the most bash-compatible shell.
Right now you will get better error messages, and it’s faster in some cases, but eventually we will add some more distinguishing features, like perhaps profiling of processes.
There is also a developing “OSH standard library” for task files and testing, which runs under bash too.
Definitely agree about the layers. Can you imagine a similarly open-ended survey about frustrations with GUIs? But the lack (or mere locality) of standards and conventions is a real problem in any human interface, even apart from the inadequacies of the standards that do exist.
That worse-is-better, least-common-denominator thing about bash, though: that’s a wicked problem. Not a huge one, but still. It’s the reason I gave up on fish, despite liking basically everything about it better than any other shell I’ve tried. At least Oils has a well-thought-out compatibility story that could enable systematic upgrades… but that’s still a lot of work, against (or any way across) stiff currents of cultural inertia.
Neat work–maybe use segment anything (or something similar) and do better classification on the detected objects? Captioning is tricky, and given that it’s basically a preprocessing step I wouldn’t sweat using commercial models.
Once I realized that, I started open sourcing everything: pretty much every piece of code I’ve written in the past six years I’ve open sourced, purely as a defense against losing access to that code in the future.
Having led open-sourcing efforts at a couple of companies now, for similarly self-interested reasons:
If the company doesn’t have a policy, that can be good! Try something!
Go for an Apache or similar license, don’t be clever, don’t deal with legal if you can.
Give a clear strategy for how open-sourcing stuff will be a benefit to the company (interoperation, recruiting, etc.).
Unless it’s in the clear business model of the company, make sure that the open-source work doesn’t create additional constraints on the developers–free as in puppies is a reasonable model (“Here’s the code, it’s licensed, if you want to fork it and do more with it please do so! We don’t take issues or PRs” is a very low bar that still is useful).
Do not put the company as the maintaining org of the project in your ecosystem of choice; let other folks handle packaging and whatever else–potentially including employees in their free time. If you don’t do this, you make it easier for company political issues later to leak into the dependency tree.
This seems to be related to the problem of “why do people adopt worse systems with better docs and discussion instead of better systems without those things”.
I’ve seen at least one community seeming to shift out of discord and Slack to publicly-crawlable fora for exactly this reason, to make it easier for knowledge to get hoovered up by the bots–and I kinda don’t hate it.
There’s so many forms of this. I’ve dealt with a team at $bigcorp adopting a worse system without docs, over a better system with docs, and then rebuilding (in a worse way) the features of the better system slowly. They picked it because the better system was written by a department they saw as competing with them.
The point is that if the decisions are based on no technical factors, then whether or not LLMs support the thing well is also not going to factor in.
This. I don’t think this title makes a ton of sense tbh - innovation doesn’t necessarily mean using the new thing, it can also mean old things adding new things because they get so popular they get the resources to do it.
Not just “worse systems with better docs” but also “systems with more libraries”. For better or worse LLMs increase the increasing returns to scale that benefit larger programming and technology communities.
Which probably means we’re in for an even longer period in which innovation is not actually that innovative - or perhaps highly incremental - while we wait for new, disruptive innovations that are so powerful they overcome all the disadvantages of novelty. But really that’s just life - as we eat more of the low hanging intellectual and technological fruit, we get more to a state like any other field of equilibrium periods punctuated by revolutions. In some ways this is good - if you want to do something new you have to do something really new and figure out how to launch it.
Over in Elixir, some of the projects have done this–now, it could simply be that they were sick of losing things to Slack backscroll, but I recall somebody mentioning they wanted to have their framework better supported. I don’t intend this to be fake news, of course, but memory is a fallible thing.
Because there’s more to OS development than failed experiments at 90s Bell Labs. It solves actual problems (i.e. plugins in same address space, updating dependencies without relinking everything, sharing code between multiple address spaces). Now, there’s a lot of issues in implementations that can be learned from (i.e. the lack of ELF symbol namespacing); I don’t know if Redox will simply slavishly clone dynamic linking as it exists in your typical Linux system, or iterate on the ideas.
I don’t think plugins in the same address space are a good idea in a low-level language. In particular I think PAM and NSS would have been better as daemons not plugin systems. It’s better to do plugins with a higher-level language that supports sandboxing and has libraries that can be tamed.
Sharing code between address spaces is a mixed blessing. It adds a lot of overhead from indirection via hidden global mutable state. ASLR makes the overhead worse, because everything has to be relinked on every exec.
Right, conflating libraries and plugins in dynamic linking was a mistake - especially since unloading libraries is basically impossible. Maybe there’s research into that though?
…but arguably not, like, a lot more.
It’s unfortunate Plan 9 is a thought-terminating object. There’s a lot of room in the osdev space, and unfortunately Plan 9 sucks all the oxygen out of the room when it gets mentioned, especially when it’s the “use more violence” approach to Unix’s problems.
How about its own successor, Inferno?
Inferno and Limbo probably mostly live on in Golang, of all things. Rob Pike is like the fucking babadook of tech.
Er. What is a “babadook”? Google tells me it’s a horror film, which isn’t very helpful…
It looks like the major thing this does is create some enterprise support for flakes (in their current form).
There’s still a lot of fertile ground for improving flakes, and I bet that upstream Nix will keep that up, but for commercial use having people that can be paid money to guarantee behavior is pretty helpful. Overall, another step forward in getting Nix more palatable to industry. Good work.
Another tired dream-fantasy from another tedious True Believer who has never stopped to question how one goes from “spicy autocomplete” to “deus ex machina”.
LLMs do not think and do not reason, but they do squander vast amounts of energy and resources to do an extremely poor imitation of thinking and reasonin by remixing human-written text that contains thinking and reasoning. Sadly, many people are unable to discern this vital and essential difference, just as they can’t tell bot-generated images from reality.
Very elaborate prompting can embed human-created algorithms, in effect making LLM bots into vastly inefficient interpreters: multiple orders of magnitude less efficient even than the bloated messes of 202x code such as interpreters in Wasm running inside 4 or 5 nested levels of virtualisation.
The fundamentalist believers who have failed to grasp the point in paragraph 2 then use paragraph 3 to convince themselves of transubstantiation: that if you ignore the embedded code in a Sufficiently Complex prompt, then the result is magic.
(I am intentionally referencing Clarke’s Third Law.)
You can’t lift yourself by your own bootstraps, contrary to Baron Munchäusen. You can’t bootstrap intelligence by remixing text that contains the result of intelligence: you can merely fake it. No matter how good the fake, it never magically becomes real. There is an implicit leap in here which the fundies try to paper over.
This is the “and then a miracle occurs” step.
The emperor has no clothes.
Even the best fake gold is not gold. Yes you can make gold, but transmutation is very very hard and the real thing remains cheaper, however precious the result.
FWIW I find stories like this a welcome counterpoint:
https://www.wheresyoured.at/wheres-the-money/
The emperor might not have clothes, but the knock-on effects of invisible thread speculators in the marketplace cannot be ignored.
How much of this submission did you read, out of curiosity? Your quip of “never stopped to question” makes me wonder–given that the whole purpose of the series is questioning and speculation.
If there are specific bits and predictions it makes please quote them and question them here instead of giving the exact lame dismissal I warned about when I submitted the story.
I read the entire thing, of course.
I confess that I’ve not yet finished part 2 or even started part 3, but they seem to be becoming progressively harder work for me as they descend further into fantasy.
I have spelled out my objection clearly and distinctly.
The entire article is built on a single assumption which is unquestioned: that LLM bots are intelligent, and that this intelligence is steadily increasing, and it’s only a matter of time until they equal and then exceed human intelligence.
I feel strongly that this assumption is false.
They aren’t, it isn’t, and they won’t.
As I said: they can’t think and they can’t reason. They can’t even count. There is no pathway from very clever text prediction (based on arbitrarily vast models, no matter how big) to reasoning.
The effect of speculation? A financial bubble, followed by a collapse, which will destroy a lot of people’s careers and livelihoods, and squander a vast amount of resources.
LLM is not AI and there is no route from LLMs to AGI. But the tech bros don’t care and some will get rich failing to make anything useful.
This is not a victimless crime. All humanity and the whole world are victims.
I have now finished all of parts 2 and 3.
It was not time well spent.
There is zero questioning of the central dogma. The miracle machines just keep miraculously getting better and better.
There is no consideration of resource limitations, of efficiency, of cost, of input corpus poisoning, of negative feedback loops, of the environmental degradation this would cause, nothing.
All there is is more and better miracle machines, and, given continual cost-free miracles with no impediments, no side effects, and no cost, speculation about what they might do to people… which is mostly positive until it’s too late.
The “and then a miracle occurs” step is never examined. When it’s convenient more miracles occur. Nobody seems to mind much.
There is nothing interesting here, to me or for me.
In essence it’s masturbatory fantasy, ad absurdum.
The emperor never did have any clothes, but he magically never catches cold, never gets sick, and the invisible clothes just keep getting better and better.
I have a few problems with this post.
Firstly - we start off by listing several ways that AI could be used for various creative pursuits, focused mainly on analysis. Yet, beyond the anecdote of “feeding AI samples of my own work, I can generate style guides that highlight my quirks and recurring themes”, this isn’t pursued any further. I’m certainly curious about this, but the author doesn’t go into any further detail about how to actually use tooling for this.
Unfortunately, after that first section, I land in this regrettably common trope of “person A (who likes AI) tries to convince person B (who doesn’t like AI) that AI is useful, completely missing the issues that person B has with AI”. Personally, my issues are all about the ethics of training models, which the author completely glosses over:
So it’s stealing in a non-legal sense, then? Or what?
It seems obvious to me that the current crop of generative AI is only as impressive as it is because of the breadth and quality of the training set, which in large part (if not totality) comes from uncredited and uncompensated labour from the very artists who the companies selling access to the model are attempting to undercut and obsolete – and the undercutting is already happening.
Call me “the chorus” all you want. Yeah, “the people pushing this stuff early on were assholes”, but the people pushing it now are assholes too.
The thing that I’ll point out is that it’s incredibly unlikely that those creators are going to be any better compensated under some other operating regime: the alternative to a stupid anime diffusion for a blog post isn’t that I’m going to commission a piece, and the alternative to a silly little anthropomorphic dhole on a slide deck is usually just gonna be “okay fuck it, stock clipart, whatevs”. Similarly, the people who want to support an artist will both find that art and fund that artist. That isn’t changing.
The thing I think should be more widely understood, using the music industry’s attacks on post-scarcity as a guide, is that there will be a few companies stepping forward to agitate for copyright enforcement, and then coincidentally selling access to those datasets with some kind of “reasonable split” with the authors of the content. You need only look at how well Spotify and similar schemes have worked out for the independent and small musicians, whose main purpose is to provide a veneer of legitimizing outrage.
In my opinion, we’re better off normalizing the use of generative AI, making it a valid option that isn’t just to the benefit of the large companies with legal departments who can exercise regulatory capture, and setting a societal expectation that slop is low art but you can totally pay somebody to make good art instead! Remixes are okay! Sampling is okay! Transformative work and curation is okay!
The alternative, of course, is the ghoulish creatorwashing by distributors and publishers to pretend that they’re sticking up for the little guy while eating all of the licensing fees for these models. It’s so painfully obvious a ploy that I’m genuinely baffled people keep advocating on their behalf.
Well-founded and well-tested? A little early for that. Still waiting on the verdicts of some big lawsuits.
We don’t really have “cautionary tale” as a tag, so I did what I could.
I’m posting this because I think the extrapolation (and thank God it’s just extrapolation and that means it can be wrong!) is a good thought experiment, and though the stuff a decade or two out is floppier, the predictions and analyses for this decade seem prescient.
For discussion purposes, I’d suggest posting only to either identify, agree, or disagree with the presented predictions and their underlying assumptions–otherwise, we’re just probably gonna get a bunch of uninformed and poorly articulated “I hate AI”, “sama is a psychopath”, “LLMs can’t code”, “xAI is bad because Musk is bad”, etc.
It was a good read. Definitely some plausible scenarios in there. Yet, I wish the author would drop the USA-tinted glasses (the “machinations” of the CCP, the Chinese “surveillance state” and Xi Thought Tutors, with no mention of the US use of genAI as propaganda and censorship in social media and, by extension, elections). With the current trend of twitter becoming C-SPAN (and social media brokering all media thanks to generative sugar-coating, thereby completing the transformation), the only real difference between the two superpowers will be the colors on the flag.
While I disagree on your last assertion (for historical reasons, not taking a side), I think you’re completely correct that the story does have a lot of the reflexive (and lazy!) Sinocriticism common these days.
Oh, it’s this again.
So, look. Every single internet-connected thing that involves anything that could even be considered user-generated content, and has lawyers, sooner or later inserts a clause into its terms saying you grant them a royalty-free, non-exclusive, non-revocable (etc. etc.) license to copy and distribute things you, the user, input into it.
This is like the most standard boilerplate-y clause there is for user-generated content. It’s a basic cover-your-ass to prevent someone suing you for copyright violation because, say, they just found out that when you type something in the built-in search box it makes a copy (Illegal! I’ll sue!) and transmits the copy (Illegal! I’ll sue!) to a third party.
But about every six months someone notices one of these clauses, misinterprets it, and runs around panicking and screaming OH MY GOD THEY CLAIM COPYRIGHT OVER EVERYTHING EVERYONE DOES WHY WOULD THEY NEED THAT PANIC PANIC PANIC PANIC PANIC OUTRAGE OUTRAGE PANIC.
And then it sweeps through the internet with huge highly-upvoted threads full of angry comments from people who have absolutely no clue what the terms actually mean but who know from the tone of discussion that they’re supposed to be outraged about it.
After a few days it blows over, but then about six months later someone notices one of these clauses, misinterprets it, and runs around panicking and screaming OH MY GOD THEY CLAIM COPYRIGHT OVER EVERYTHING EVERYONE DOES WHY WOULD THEY NEED THAT PANIC PANIC PANIC PANIC PANIC OUTRAGE OUTRAGE PANIC.
And then…
@pushcx this should not be allowed on lobste.rs. It’s 100% outrage-mob baiting.
Saying that everyone else does it does not make it okay. Are there court cases or articles describing the limits you say are implicit?
If you are as right as you think you are, then you could be educating instead of complaining to moderators.
That’s the point. GDPR has not been that well tested in court. As long as it hasn’t, people will stick to legal boilerplate to make it as broad as possible. This is why all terms of services look like copypasta.
Putting words in my mouth doesn’t make a counterargument.
What do you think is not OK about this boilerplate CYA clause? Computers by their nature promiscuously copy data. Online systems copy and transmit it. The legal world has settled on clauses like this as an alternative to popping up a request for license every time you type into an online form or upload a file, because even if nobody ever actually would sue they don’t want to trust to that and want an assurance that if someone sues that person will lose, quickly. They’ve settled on this because copy/pasting a standard clause to minimize risk is a win from their perspective.
Why is this evil and bad and wrong from your perspective? Provide evidence.
The system we currently have may be structured in a way which makes clauses like this necessary or expedient in order to do business, but the validity of such a clause for that reason doesn’t excuse the system that created it.
But Firefox isn’t a web service. It’s a program that runs on my computer and sends data to websites I choose to visit. Those websites may need such legal language for user generated content, but why does Mozilla need a license to copy anything I type into Firefox?
This. I’ve chatted with a few lawyers in the space and this is literally the first time we’re seeing that interpretation applied to a local program you choose to run that is your agent.
Firefox integrates with things that are not purely your “local agent”, including online services and things not owned by Mozilla. And before you decide this means some sort of sinister data-stealing data-selling privacy violation, go back and look at my original example.
So clearly rejecting their TOS should just toggle off all of those services, right?
None of these are activities falling under copyright, so a license is meaningless.
The list of data subprocessors is short and well documented: https://support.mozilla.org/en-US/kb/firefox-subprocessor-list
So it also can’t be an issue of “let’s be blanket because we can’t give you the list”.
The Python Package Index has almost exactly the same clause in its terms of service for things you voluntarily choose to send to them.
I guess their legal advisers are just bad or something. Maybe you could go see about getting hired to replace them.
When you upload something to the python package index you do so because you intend for the python package index to create copies of it and distribute it, which needs a license.
When you make a comment on a pull request for work, you don’t intend for Mozilla to have anything to do with that. You don’t intend for Mozilla to receive your post. Nor to have any special rights to view it, distribute it, make copies of it, etc. They do not need a license because they shouldn’t be seeing it. Moreover you don’t even necessarily have the right to grant them said rights - someone else might own the copyright to the material you are legitimately working with.
These scenarios are not even remotely similar.
If you use their integrated search which might send things you type to a third party, Mozilla needs your permission to do that.
If you use their Pocket service which can offer recommendations of articles you might like, Mozilla needs your permission to analyze things you’ve done, which may require things like making copies of data.
If you use their VPN service you’re passing a lot of stuff through their servers to be transmitted onward.
There’s a ton of stuff Mozilla does that could potentially be affected by copyright issues with user-generated/user-submitted content. So they have the standard boilerplate “you let us do the things with that content that are necessary to support the features you’re using” CYA clause.
More specifically, their recommendations are at odds with the interests of users.
The question for random people reading these clauses is what does that mean? Legalese can be hard for lawyers to understand. It’s much harder for mere mortals.
I think everyone is OK with Firefox (the browser) processing text which you enter into it. This processing includes uploading the text to web sites (which you ask it to, when you ask it to), etc.
What is much more concerning for the average user is believing that the “royalty-free, non-exclusive, non-revocable (etc. etc.) license” is unrestricted.
Let’s say I write the world’s most beautiful poem, and then submit it to an online poem contest via Firefox. Will Mozilla then go “ha ha! Firefox made a copy, and uploaded it to the Mozilla servers. We’re publishing our own book of your work, without paying you royalties. And oh, by the way, you also used Firefox to upload intimate pictures of you and your spouse to a web site, so we’re going to publish those, too!”
The average person doesn’t know. Reading the legalese doesn’t help them, because legalese is written in legalese (an English-adjacent language which isn’t colloquial English). Legalese exists because lawsuits live and die based on minutiae such as the Oxford Comma. So for Mozilla’s protection, they need it, but these needs are in conflict with the user’s need to understand the notices.
The Mozilla blog doesn’t help, because the italicized text at the top says: It does NOT give us ownership of your data or a right to use it for anything other than what is described in the Privacy Notice
OK, what does the Privacy Notice say?
(your) …data stays on your device and is not sent to Mozilla’s servers unless it says otherwise in this Notice
Which doesn’t help. So now the average person has to read pages of legal gobbledygook. And buried in it is the helpful
Identifying, investigating and addressing potential fraudulent activities,
Which is a huge loophole. “We don’t know what’s potentially fraudulent, so we just take all of the data you give to Firefox, upload it to our US-based servers, and give the DoJ / FBI access to it all without a warrant”. A lawyer could make a convincing and possibly winning argument that such use-cases are covered.
The psychological reason for being upset is that they are confused by complicated things which affect them personally, which they don’t understand, and which they have no control over. You can’t address that panic by telling them “don’t panic”.
Could you explain why the concern is necessarily born of confusion rather than accurate understanding?
I didn’t say the concern is necessarily born of confusion. I said that the concern was because they didn’t understand the issues.
you said the reason for being upset is that they are confused. sorry if I was changing your meaning by adding “necessarily.” why do you say the concern is because of confusion or lack of understanding? what understanding would alleviate the concerns?
I don’t see a lot of difference between confusion and lack of understanding. Their upset is because the subject affects them, and they’re confused about it / don’t understand it, and they have no control over it.
This is entirely normal and expected. Simply being confused isn’t enough.
What would alleviate the concerns is to address all three issues, either singly, or jointly. If people don’t use Firefox, then it doesn’t affect them, and they’re not upset. If they understand what’s going on and make informed decisions, then they’re not upset. And then if they can make informed decisions, they have control over the situation, and they’re not upset.
The solution is a clear message from Mozilla. However, for reasons I noted above, Mozilla has to write their policies in legalese, when then makes it extremely difficult for anyone to understand them.
but who does “they” refer to? are you saying this describes people in general who are concerned about the policy, or are you just supposing that there must be someone somewhere for whom it is true?
what about people who have an accurate layman’s understanding of what the policy means, and are nonetheless concerned?
The actual reason for them being upset is that someone told them to be afraid of the supposedly scary thing and told them a pack of lies about what the supposedly scary thing meant.
I propose to deal with that at the source: cut off the outrage-baiting posts that start the whole sordid cycle. Having a thread full of panicked lies at the top of the front page is bad and can be prevented.
And if you really want to comfort the frightened people and resolve their confusion, you should be talking to them, shouldn’t you? The fact that your pushback is against the person debunking the fearmongering says a lot.
i.e. you completely ignored my long and reasoned explanation as to why people are upset.
Alternatively, you could look at the comment above in https://lobste.rs/s/de2ab1/firefox_adds_terms_use#c_yws3nv, which explains clearly just how nefarious and far-reaching the new policy is.
I haven’t seen you debunk anything. In order to “debunk” my argument, you would have to address it. Instead, you simply re-stated your position.
I explained why your position wasn’t convincing. If you’re not going to address those arguments, I don’t need to respond to your “debunking”.
At best that comment points out that a consolidated TOS for Mozilla “services” is confusingly being linked for the browser itself. Nothing has been proven in the slightest about it being “nefarious”, and the fact that you just assert malicious intent as the default assumption is deeply problematic.
So your position is completely unconvincing and I feel no need to address it any further.
But you’re not debunking the fear mongering. You’re conspicuously ignoring any comment that explains why the concern is valid. Don’t hapless readers deserve your protection from such disinformation?
You’re largely describing boilerplate for web services, where the expectation is that users input content, and a service uses that content to provide service.
Firefox is a user agent, where the expectation is that users input content and the agent passes that content through to the intended service or resource.
You can call this boilerplate if you like, but it certainly gives Mozilla unambiguous rights relative to what you put into it.
This really does beg the question: Firefox is 20 years old. Why did they only feel the need to add this extremely standard boilerplate-y clause now?
Their lawyers won the debate this time.
why though?
what exactly does that mean? Were they already actively doing this, and the lawyers “won” by updating the TOS to cover that behavior? Or did the lawyers “win” because they were pushing for a business decision to change Firefox’s data gathering activities?
Please, if you could reflect for a moment on the comment you have written, could you determine whether it comes off as outraged?
I am incredibly tired of this sort of thing sparking ignorant outrage on a regular basis. It should not be permitted on this site.
There’s a “hide” button just for you. You can be the ninth lobster to click it!
This post is
Many much more mild examples have been removed on this site without hesitation. This one has to be, too, if the site rules mean anything.
I disagree. I think this is actionable, relevant, and very on-topic. I’d even argue about that with you here, except that you in particular have a very solid history of bad-faith arguing, and I have better things to do.
Anyway, so far 84 of us have upvoted it, vs 7 “off-topic” flags and 8 hides, for a ratio of about 5:1, if we care about user opinions. Your paternalism isn’t a good look. Just hide it, flag it, and move on!
I will note that we have both a privacy tag and a law tag, which are explicit carveouts for this sort of content.
Now, whether or not we should retire those is a bigger question.
We already know the site rules don’t mean anything. The same rules are regularly violated for Apple marketing presentations.
What would a post that is not meant to whip up outrage look like? Presumably the blog author did their best to write such a post.
I wouldn’t say that the site rules don’t mean anything–I would say that many users and even admins have disregarded them for political expediency.
The long-term effects of this, of course, are deleterious…but that doesn’t matter when gosh darnit, the outgroup is wrong right now.
In the case of Apple, there’s a weird sort of thing where a release tag covers what is technically marketing. They also are both a large software and hardware vendor and, like it or not, have a large userbase. I’m not saying we should see a constant dripfeed of Apple propaganda, but it isn’t entirely without precedent.
Of course. I adopted the parent comment’s hyperbole to avoid getting bogged down in minutiae. But there’s nothing wrong with more clarity and precision.
then don’t express the ignorant outrage?
I’m really surprised to see anyone pay even the slightest of attention to this on Lobsters. It’s something my granddad would post to Facebook (example)
Such an ad-hominem argument is something my grandma would post on Instagram.
It’s not an ad hominem. I’m not attacking anyone instead of their argument.
I’m honestly at a loss for how to tag this. But uh, here you go. The repo for the monstrosity purports to be here.
We need a “hacks” tag
Holy crap https://github.com/MichiganTypeScript/typescript-types-only-wasm-runtime/blob/master/packages/ts-type-math/shift.ts
Fine is neat, but I’d rather gargle buckshot than let the clowncar of Python ML stuff anywhere near a clean, self-respecting Elixir deployment. I sure as shit don’t want it sharing the same address space as my BEAM VM.
I’m glad they’re trying stuff though.
I think we all understand the implicit subtext of “this is fine”.
Does the page load for you? I just get an ERR_SSL_PROTOCOL_ERROR from dashbit.co.
Yeah I mean I love what they did with Nx/Bumblebee but there was no way they’re going to keep up with the Python ML ecosystem. For example, a year ago I wanted the YOLO model but it wasn’t available at the time. So I also appreciate that they give us an out with this.
Source here, in lisp, no less.
Is writing a 150 character commit message so difficult that it warrants invoking an omniscient-AI big-hammer LLM? I am not sure what this is for.
It’s funny, I think the answer is probably “yes” given how many commit messages are “try”, “lol”, “ugh try again”.
It may hit a bit close to home, but the folks too lazy/sloppy/unskilled to properly handle good commit messages probably are the ones most easily replaced by LLMs.
Imagine if some people use this to create more descriptive commit messages for their personal projects. Crazy, I know.
Not my experience at all.
This right here. Spicy autocomplete is pretty good for writing a sentence when you’re mentally casting about to find the words.
That is very true :-)
As long as the LLM prompt is not “try” or “lol” or “ugh try again”
Well, let’s consider the cases. If the commit message that one would have put was “try” “lol” “ugh try again”, then what did we gain by having a more verbose commit message written by the LLM? Conversely, if the commit message is actually very important, should we really use an LLM to write it for us?
A very fair point, although I was being a bit tongue in cheek with my comment. I agree with you.
Arguably those provide more context about the “why” than an LLM-provided summary of the change: they convey that the committer is in a hurry, and that this is likely an (attempted) fix to a recent commit by the same author, probably as part of the same merge.
I’m ambivalent tbh. I know amazing devs who write those silly little intermediary commits. I try to organize my commits because I assume someone, one day, might want to review a commit log to figure stuff out, but I don’t have strong feels because I rarely do that myself.
Most of the people I know who write those sorts of commits tend to be basically using them as fixup commits, except that they tend not to be aware of --fixup and probably find their flow more convenient anyway. When their change eventually appears in the main branch, it’s almost always squashed down to a single commit with a single well-written description (usually done via PRs in some git forge or other).
Or in other words, I suspect the people writing those intermediary commits usually see them as temporary, and I don’t think they’d get much out of an LLM describing their changes.
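For reference, the --fixup flow they’re skipping looks roughly like this (the hash is just a placeholder):

```sh
# Record a quick fix as belonging to an earlier commit (abc1234 is a placeholder).
git commit --fixup=abc1234

# Later, fold every fixup commit into its target in one pass.
git rebase -i --autosquash main
```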
Perhaps, yeah. I tend to clean up my commit history to remove those intermediary ones as well fwiw, but I know a lot of people who don’t. I suppose the answer isn’t “use an LLM” (I know I won’t) but to just learn how to rewrite your history properly.
Or even use tools that make managing your history easier. I do think Git makes it excessively hard to rewrite commits and maintain a history of the rewrites that you’re doing, which leads to people taking the “squash everything in a PR” approach, which is typically a lot easier. But better tooling in this regard can improve things a lot.
That said, this is perhaps getting a bit off-topic from the core idea of LLM-based commit messages.
Most simple use cases of LLMs are not that much more than an admission of the user’s own incompetence.
Sometimes there’s no bigger meaning behind some changes. It’s either obvious why it’s done or it becomes obvious in context of other changes. Writing commit messages for that is just going through the motions which means sometimes it becomes “refactor” or a similar message. In those cases it could be nice to have an automatic summary instead of nothing.
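If someone did want that, the plumbing is simple enough: git’s prepare-commit-msg hook can prepend a generated summary of the staged diff. This is only a sketch; llm-summarize is a made-up placeholder for whatever local model or CLI you would actually use:

```sh
#!/bin/sh
# .git/hooks/prepare-commit-msg (sketch)
# "llm-summarize" is a hypothetical command standing in for your tool of choice.
MSG_FILE="$1"

# Summarize the staged changes and prepend the result to the message template.
SUMMARY=$(git diff --cached | llm-summarize --max-chars 72)
ORIGINAL=$(cat "$MSG_FILE")
printf '%s\n\n%s\n' "$SUMMARY" "$ORIGINAL" > "$MSG_FILE"
```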
This is an extremely strong statement.
I think a few things are also interesting:
I think people are realizing how low quality the Linux kernel code is, how haphazard development is, how much burnout and misery is involved, etc.
I think people are realizing how insanely not in the open kernel dev is, how much is private conversations that a few are privy to, how much is politics, etc.
The Hellwig/Ojeda part of the thread is just frustrating to read because it almost feels like pleading. “We went over this in private” “we discussed this already, why are you bringing it up again?” “Linus said (in private so there’s no record)”, etc., etc.
Dragging discussions out in front of an audience is a pretty decent tactic for dealing with obstinate maintainers. They don’t like to explain their shoddy reasoning in front of people, and would prefer it remain hidden. It isn’t the first tool in the toolbelt but at a certain point there is no convincing people directly.
With quite a few things actually. A friend of mine is contributing to a non-profit, which until recently had this very toxic member (they had even attempted a felony). They were driven out of the non-profit very soon after members talked in a thread that was accessible to all members. Obscurity is often one key component of abuse, be it mere stubbornness or criminal behaviour. Shine light, and it often goes away.
IIRC Hintjens noted this quite explicitly as a tactic of bad actors in his works.
It’s amazing how quick people are to recognize folks trying to subvert an org piecemeal via one-off private conversations once everybody can compare notes. It’s equally amazing to see how much the same people beforehand will swear up and down that oh no, that’s a conspiracy theory, such things can’t happen here, until they’ve been burned at least once.
This is an active, unpatched attack vector in most communities.
I’ve found this holds even at the lowest level: meeting minutes at work. I’ve observed that people tend to act more collaboratively and seek the common good if there are public minutes, as opposed to trying to “privately” win people over to their desires.
There is something to be said for keeping things between people with skin in the game.
It’s flipped over here, though, because more people want to contribute. The question is whether it’ll be stable long-term.
Something I’ve noticed is true in virtually everything I’ve looked deeply at is the majority of work is poor to mediocre and most people are not especially great at their jobs. So it wouldn’t surprise me if Linux is the same. (…and also wouldn’t surprise me if the wonderful Rust rewrite also ends up poor to mediocre.)
yet at the same time, another thing that astonishes me is how much stuff actually does get done and how well things manage to work anyway. And Linux also does a lot and works pretty well. Mediocre over the years can end up pretty good.
After tangentially following the kernel news, I think a lot of churning and death spiraling is happening. I would much rather have a rust-first kernel that isn’t crippled by the old guard of C developers reluctant to adopt new tech.
Take all of this energy into RedoxOS and let Linux stay in antiquity.
I’ve seen some of the R4L people talk on Mastodon, and they all seem to hate this argument.
They want to contribute to Linux because they use it, want to use it, and want to improve the lives of everyone who uses it. The fact that it’s out there and deployed and not a toy is a huge part of the reason why they want to improve it.
Hopping off into their own little projects which may or may not be useful to someone in 5-10 years’ time is not interesting to them. If it was, they’d already be working on Redox.
The most effective thing that could happen is for the Linux foundation, and Linus himself, to formally endorse and run a Rust-based kernel. They can adopt an existing one or make a concerted effort to replace large chunks of Linux’s C with Rust.
IMO the Linux project needs to figure out something pretty quickly because it seems to be bleeding maintainers and Linus isn’t getting any younger.
They may be misunderstanding the idea that others are not necessarily incentivized to do things just because it’s interesting for them (the Mastodon posters).
Yep, I made a similar remark upthread. A Rust-first kernel would have a lot of benefits over Linux, assuming a competent group of maintainers.
along similar lines: https://drewdevault.com/2024/08/30/2024-08-30-Rust-in-Linux-revisited.html
Redox does have the chains of trying to do new OS things. An ABI-compatible Rust rewrite of the Linux kernel might get further along than expected, even if at first it only runs in virtual contexts without hardware support (that would come later).
Linux developers want to work on Linux, they don’t want to make a new OS. Linux is incredibly important, and companies already have Rust-only drivers for their hardware.
Basically, sure, a new OS project would be neat, but it’s really just completely off topic in the sense that it’s not a solution for Rust for Linux. Because the “Linux” part in that matters.
I read a 25+ year old article [1] from a former Netscape developer that I think applies in part
The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they’ve been fixed. There’s nothing wrong with it. It doesn’t acquire bugs just by sitting around on your hard drive. Au contraire, baby! Is software supposed to be like an old Dodge Dart, that rusts just sitting in the garage? Is software like a teddy bear that’s kind of gross if it’s not made out of all new material?

Adopting a “rust-first” kernel is throwing the baby out with the bathwater. Linux has been beaten into submission for over 30 years for a reason. It’s the largest collaborative project in human history and over 30 million lines of code. Throwing it out and starting new would be an absolutely herculean effort that would likely take years, if it ever got off the ground.
[1] https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/
The idea that old code is better than new code is patently absurd. Old code has stagnated. It was built using substandard, out of date methodologies. No one remembers what’s a bug and what’s a feature, and everyone is too scared to fix anything because of it. It doesn’t acquire new bugs because no one is willing to work on that weird ass bespoke shit you did with your C preprocessor. Au contraire, baby! Is software supposed to never learn? Are we never to adopt new tools? Can we never look at something we’ve built in an old way and wonder if new methodologies would produce something better?
This is what it looks like to say nothing, to beg the question. Numerous empirical claims, where is the justification?
It’s also self defeating on its face. I take an old codebase, I fix a bug, the codebase is now new. Which one is better?
Like most things in life, the truth is somewhere in the middle. There is a reason the concept of a “mature node” exists in the semiconductor industry. They accept that new is needed for each node, but also that the new thing takes time to iron out the kinks and bugs. This is the primary reason you see Apple take on new nodes first, before Nvidia, for example: Nvidia requires much larger die sizes, and so needs far fewer defects per square mm.
You can see this sometimes in software, for example X11 vs Wayland, where adoption is slow but most definitely progressing, and nowadays most people can see that Wayland is now, or is going to become, the dominant tech in the space.
The truth lies where it lies. Maybe the middle, maybe elsewhere. I just don’t think we’ll get to the truth with rhetoric.
Aren’t the arguments above more dialectic than rhetoric?
I don’t think this would qualify as dialectic; it lacks any internal debate and leans heavily on appeals to analogy and intuition/emotion. The post itself makes a ton of empirical claims without justification, even beyond the quoted bit.
fair enough, I can see how one would make that argument.
“Good” is subjective, but there is real evidence that older code does contain fewer vulnerabilities: https://www.usenix.org/conference/usenixsecurity22/presentation/alexopoulos
That means we can probably keep a lot of the old trusty Linux code around while making more of the new code safe by writing it in Rust in the first place.
I don’t think that’s a fair assessment of Spolsky’s argument or of CursedSilicon’s application of it to the Linux kernel.
Firstly, someone has already pointed out the research that suggests that existing code has fewer bugs in than new code (and that the older code is, the less likely it is to be buggy).
Secondly, this discussion is mainly around entire codebases, not just existing code. Codebases usually have an entire infrastructure around them for verifying that the behaviour of the codebase has not changed. This is often made up of tests, but it’s also made up of the users who try out a release of a codebase and determine whether it’s working for them. The difference between making a change to an existing codebase and releasing a new project largely comes down to whether this verification (both in terms of automated tests and in terms of users’ ability to use the new release) works for the new code.
Given this difference, if I want to (say) write a new OS completely in Rust, I need to choose: Do I want to make it completely compatible with Linux, and therefore take on the significant challenge of making sure everything behaves truly the same? Or do I make significant breaking changes, write my own OS, and therefore force potential adopters to rebuild their entire Linux workflows in my new OS?
The point is not that either of these options are bad, it is that they represent significant risks to a project. Added to the general risk that is writing new code, this produces a total level of risk that might be considered the baseline risk of doing a rewrite. Now risk is not bad per se! If the benefits of being able to write an OS in a language like Rust outweigh the potential risks, then it still makes sense to perform the rewrite. Or maybe the existing Linux kernel is so difficult to maintain that a new codebase really would be the better option. But the point that CursedSilicon was making by linking the Spolsky piece was, I believe, that the risks for a project like the Linux kernel are very high. There is a lot of existing, old code. And there is a very large ecosystem where either breaking or maintaining compatibility would each come with significant challenges.
Unfortunately, it’s very difficult to measure the risks and benefits here in a quantitative, comparable way, so I think where you fall on the “rewrite vs continuity” spectrum will depend mostly on what sort of examples you’ve seen, and how close you think this case is to those examples. I don’t think there’s any objective way to say whether it makes more sense to have something like R4L, or something like RedoxOS.
I haven’t read it yet, but I haven’t made an argument about that; I just created a parody of the argument as presented. I’ll be candid: I doubt that the research is going to compel me to believe that newer code is inherently buggier; it may compel me to confirm my existing belief that testing software in the field is one good method to find some classes of bugs.
I guess so; it’s a bit dependent on where we say the discussion starts. Three things are relevant: RFL, which is not a wholesale rewrite; a wholesale rewrite of the Linux kernel; and Netscape. RFL is not about replacing the entire Linux kernel, although perhaps “codebase” here refers to some sort of unit, like a driver. Netscape wanted a wholesale rewrite, based on the linked post, so perhaps that’s what’s really “the single worst strategic mistake that any software company can make”, but I wonder where the boundary is. Also, the article immediately mentions that Microsoft tried to do this with Word but it failed, yet Word didn’t suffer because it was still actively developed. I wonder if it really “failed” just because Pyramid didn’t become the new Word? Did Microsoft have some lessons learned, or incorporate some of that code? Dunno.
I think I’m entirely justified in saying that the post is all emotional/intuitive appeals and rhetoric, and that it makes empirical claims without justification.
This is rhetoric. These are unsubstantiated empirical claims. The article is all of this. It’s fine as an interesting, thought provoking read that gets to the root of our intuitions, but I think anyone can dismiss it pretty easily since it doesn’t really provide much in the form of an argument.
Again, totally unsubstantiated. I have MANY reasons to believe that, it is simply question begging to say otherwise.
That’s all this post is. Over and over again making empirical claims with no evidence and begging the question.
We can discuss the risks and benefits, I’d advocate for that. This article posted doesn’t advocate for that. It’s rhetoric.
This is a truism. It is survivorship bias: if the code was buggy, the bugs would eventually have been found and fixed. So, all things being equal, newer code is riskier than old code. But it has also been empirically shown that using Rust for new code is not “all things being equal”. Google showed that new code in Rust is as reliable as old code in C. Which is good news: you can use old C code from new Rust projects without the risk that comes from new C code.
Yeah, this is what I’ve been saying (not sure if you’d meant to respond to me or the parent, since we agree) - the issue isn’t “new” vs “old” it’s things like “reviewed vs unreviewed” or “released vs unreleased” or “tested well vs not tested well” or “class of bugs is trivial to express vs class of bugs is difficult to express” etc.
Was restating your thesis in the hopes of making it clearer.
I don’t disagree that the rewards can outweigh the risks, and in this case I think there’s a lot of evidence that suggests that memory safety as a default is really important for all sorts of reasons. Let alone the many other PL developments that make Rust a much more suitable language to develop in than C.
That doesn’t mean the risks don’t exist, though.
Nobody would call an old codebase with a handful of fixes a new codebase, at least not in the contexts in which those terms have been used here.
How many lines then?
It’s a Ship of Theseus: at no point can you call it a “new” codebase, but after a period of time, it could be completely different code. I have a C program I’ve been using and modifying for 25 years. At any given point, it would have been hard to say “this is now a new codebase”, yet not one line of code in the project is the same as when I started (even though it does the same thing as it always has).
I don’t see the point in your question. It’s going to depend on the codebase, and on the nature of the changes; it’s going to be nuanced, and subjective at least to some degree. But the fact that it’s prone to subjectivity doesn’t mean that you get to call an old codebase with a single fixed bug a new codebase, without some heavy qualification which was lacking.
If it requires all of that nuance and context maybe the issue isn’t what’s “old” and what’s “new”.
I don’t follow, to me that seems like a non-sequitur.
What’s old and new is poorly defined and yet there’s an argument being made that “old” and “new” are good indicators of something. If they’re so poorly defined that we have to bring in all sorts of additional context like the nature of the changes, not just when they happened or the number of lines changed, etc, then it seems to me that we would be just as well served to throw away the “old” and “new” and focus on that context.
I feel like enough people would agree more-or-less on what was an “old” or “new” codebase (i.e. they would agree given particular context) that they remain useful terms in a discussion. The general context used here is apparent (at least to me) given by the discussion so far: an older codebase has been around for a while, has been maintained, has had kinks ironed out.
There’s a really important distinction here though. The point is to argue that new projects will be less stable than old ones, but you’re intuitively (and correctly) bringing in far more important context - maintenance, testing, battle testing, etc. If a new implementation has a higher degree of those properties then it being “new” stops being relevant.
Ok, but:
My point was that this statement requires a definition of “new codebase” that nobody would agree with, at least in the context of the discussion we’re in. Maybe you are attacking the base proposition without applying the surrounding context, which might be valid if this were a formal argument and not a free-for-all discussion.
I think that it would be considered no longer new if it had had significant battle-testing, for example.
FWIW the important thing in my view is that every new codebase is a potential old codebase (given time and care), and a rewrite necessarily involves a step backwards. The question should probably not be, which is immediately better?, but, which is better in the longer term (and by how much)? However your point that “new codebase” is not automatically worse is certainly valid. There are other factors than age and “time in the field” that determine quality.
Methodologies don’t matter for quality of code. They could be useful for estimates, cost control, figuring out whom you shall fire etc. But not for the quality of code.
You’re suggesting that the way you approach programming has no bearing on the quality of the produced program?
I’ve never observed a programmer become better or worse by switching methodology. Dijkstra would not have become better if you made him do daily standups or go through code reviews.
There are ways to improve your programming by choosing different approach but these are very individual. Methodology is mostly a beancounting tool.
When I say “methodology” I’m speaking very broadly - simply “the approach one takes”. This isn’t necessarily saying that any methodology is better than any other. The way I approach a task today is better, I think, than the way I would have approached that task a decade ago - my methodology has changed, the way I think has changed. Perhaps that might mean I write more tests, or I test earlier, but it may mean exactly the opposite, and my methods may only work best for me.
I’m not advocating for “process” or ubiquity, only that the approach one takes may improve over time, which I suspect we would agree on.
If you take this logic to its end, you should never create new things.
At one point in time, Linux was also the new kid on the block.
The best time to plant a tree is 30 years ago. The second best time is now.
I don’t think Joel Spolsky was ever a Netscape developer. He was a Microsoft developer who worked on Excel.
My mistake! The article contained a bit about Netscape and I misremembered it
How many of those lines are part of the core? My understanding was that the overwhelming majority was driver code. There may not be that much core subsystem code to rewrite.
For a previous project, we included a minimal Linux build. It was around 300 KLoC, which included networking and the storage stack, along with virtio drivers.
That’s around the size a single person could manage and quite easy with a motivated team.
If you started with DPDK and SPDK then you’d already have filesystems and a copy of the FreeBSD network stack to run in isolated environments.
Once many drivers share common Rust wrappers over core subsystems, you could flip it and write the subsystem in Rust. Then expose a C interface for the rest.
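As a rough sketch of what that flip could look like (all names here are made up for illustration - this is not actual kernel code or any real kernel API), the subsystem logic lives in safe Rust and only a thin shim speaks the C ABI:

```rust
// Illustrative only: a toy "subsystem" implemented in Rust that
// exports a C-callable interface, so existing C callers keep working.
// None of these names correspond to real Linux kernel APIs.

#[repr(C)]
pub struct DeviceStats {
    pub reads: u64,
    pub writes: u64,
}

// The actual logic is ordinary safe Rust.
fn lookup_stats(dev_id: u32) -> Option<DeviceStats> {
    // ...query Rust-owned internal state here...
    if dev_id == 0 {
        None
    } else {
        Some(DeviceStats { reads: 0, writes: 0 })
    }
}

// Thin C-ABI shim: C code sees a plain function returning 0 or -1.
#[no_mangle]
pub extern "C" fn subsys_get_stats(dev_id: u32, out: *mut DeviceStats) -> i32 {
    if out.is_null() {
        return -1;
    }
    match lookup_stats(dev_id) {
        Some(stats) => {
            // SAFETY: the caller promises `out` points to valid, writable memory.
            unsafe { *out = stats };
            0
        }
        None => -1,
    }
}
```

The unsafe pointer handling stays confined to that shim; everything behind it is ordinary compiler-checked Rust.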
Oh sure, that would be my plan as well. And I bet some subsystem maintainers see this coming, and resist it for reasons that aren’t entirely selfless.
That’s pretty far into the future, both from a maintainer acceptance PoV and from a rustc_codegen_gcc and/or gccrs maturity PoV.
Sure. But I doubt I’ll be running a different kernel 10 years from now.
And like us, those maintainers are not getting any younger, and if they need a hand, I am confident I’ll get up to speed faster with a strict type checker.
I am also confident nobody in our office would be able to help out with C at all.
This cannot possibly be true.
It’s the largest collaborative open-source OS kernel project in human history
It’s been described as such based purely on the number of unique human contributions to it
I would expect Wikipedia to be bigger 🤔
I see that Drew proposes a new OS in that linked article, but I think a better proposal in the same vein is a fork. You get to keep Linux, but you can start porting logic to Rust unimpeded, and it’s a manageable amount of work to keep porting upstream changes.
Remember when libav forked from ffmpeg? Michael Niedermayer single-handedly ported every single libav commit back into ffmpeg, and eventually, ffmpeg won.
At first there will be extremely high C percentage, low Rust percentage, so porting is trivial, just git merge and there will be no conflicts. As the fork ports more and more C code to Rust, however, you start to have to do porting work by inspecting the C code and determining whether the fixes apply to the corresponding Rust code. However, at that point, it means you should start seeing productivity gains, community gains, and feature gains from using a better language than C. At this point the community growth should be able to keep up with the extra porting work required. And this is when distros will start sniffing around, at first offering variants of the distro that uses the forked kernel, and if they like what they taste, they might even drop the original.
I genuinely think it’s a strong idea, given the momentum and potential amount of labor Rust community has at its disposal.
I think the competition would be great, especially in the domain of making it more contributor friendly to improve the kernel(s) that we use daily.
I certainly don’t think this is impossible, for sure. But the point ultimately still stands: Linux kernel devs don’t want a fork. They want Linux. These folks aren’t interested in competing, they’re interested in making the project they work on better. We’ll see if some others choose the fork route, but it’s still ultimately not the point of this project.
While I don’t personally want to make a new OS, I’m not sure I actually want to work on Linux. Most of the time I strive for portability, and so abstract myself from the OS whenever I can get away with it. And when I can’t, I have to say Linux’s API isn’t always that great, compared to what the BSDs have to offer (epoll vs kqueue comes to mind). Most annoying though is the lack of documentation for the less used APIs: I’ve recently worked with Netlink sockets, and for the proc stuff so far the best documentation I found was the freaking source code of a third party monitoring program.
I was shocked. Complete documentation of the public API is the minimum bar for a project as serious as the Linux kernel. I can live with an API I don’t like, but lack of documentation is a deal breaker.
I think they mean that Linux kernel devs want to work on the Linux kernel. Most (all?) R4L devs are long time Linux kernel devs. Though, maybe some of the people resigning over LKML toxicity will go work on Redox or something…
That’s is what I was saying, yes.
I’m talking about the people who develop the Linux kernel, not people who write userland programs for Linux.
Re-Implementing the kernel ABI would be a ton of work for little gain if all they wanted was to upstream all the work on new hardware drivers that is already done - and then eventually start re-implementing bits that need to be revised anyway.
If the singular required Rust toolchain didn’t feel like such a ridiculous-to-bootstrap, 500-ton LLVM clown car, I would agree with this statement without reservation.
Would zig be a better starting place?
Zig is easier to implement (and I personally like it as a language) but doesn’t have the same safety guarantees and strong type system that Rust does. It’s a give and take. I actually really like Rust and would like to see a proliferation of toolchain options, such as what’s in progress in GCC land. Overall, it would just be really nice to have an easily bootstrapped toolchain that a normal person can compile from scratch locally, although I don’t think it necessarily needs to be the default, or that using LLVM generally is an issue. However, it might be possible that no matter how you architect it, Rust might just be complicated enough that any sufficiently useful toolchain for the language could just end up being a 500 ton clown car of some kind anyways.
Depends on which parts of GP’s statement you care about: LLVM or bootstrap. Zig still depends on LLVM (for now), but it is no longer bootstrappable in a limited number of steps (because they switched from a bootstrap C++ implementation of the compiler to keeping a compressed WASM build of the compiler as a blob).
Yep, although I would also add it’s unfair to judge Zig in any case on this matter now given it’s such a young project that clearly is going to evolve a lot before the dust begins to settle (Rust is also young, but not nearly as young as Zig). In ten to twenty years, so long as we’re all still typing away on our keyboards, we might have a dozen Zig 1.0 and a half dozen Zig 2.0 implementations!
Yeah, the absurdly low code quality and toxic environment make me think that Linux is ripe for disruption. Not like anyone can produce a production kernel overnight, but maybe a few years of sustained work might see a functional, production-ready Rust kernel for some niche applications and from there it could be expanded gradually. While it would have a lot of catching up to do with respect to Linux, I would expect it to mature much faster because of Rust, because of a lack of cruft/backwards-compatibility promises, and most importantly because it could avoid the pointless drama and toxicity that burn people out and prevent people from contributing in the first place.
What is this, some kind of a new meme? Where did you hear it first?
From the thread in OP, if you expand the messages, there is wide agreement among the maintainers that all sorts of really badly designed and almost impossible to use (safely) APIs ended up in the kernel over the years because the developers were inexperienced and kind of learning kernel development as they went. In retrospect they would have designed many of the APIs very differently.
Someone should compile everything to help future OS developers avoid those traps! There are a lot of existing non-POSIX experiments, though.
It’s based on my forays into the Linux kernel source code. I don’t doubt there’s some quality code lurking around somewhere, but the stuff I’ve come across (largely filesystem and filesystem adjacent) is baffling.
Seeing how many people are confidently incorrect about Linux maintainers only caring about their job security and keeping code bad to make it a barrier to entry, if nothing else taught me how online discussions are a huge game of Chinese whispers where most participants don’t have a clue of what they are talking about.
I doubt that maintainers are “only caring about their job security and keeping back code” but with all due respect: You’re also just taking arguments out of thin air right now. What I do believe is what we have seen: Pretty toxic responses from some people and a whole lot of issues trying to move forward.
Huh, I’m not seeing any claim to this end from the GP, or did I not look hard enough? At face value, saying that something has an “absurdly low code quality” does not imply anything about nefarious motives.
I can personally attest to having never made that specific claim.
Indeed that remark wasn’t directly referring to GP’s comment, but rather to the range of confidently incorrect comments that I read in the previous episodes, and to the “gatekeeping greybeards” theme that can be seen elsewhere on this page. First occurrence, found just by searching for “old”: Linux is apparently “crippled by the old guard of C developers reluctant to adopt new tech”, to which GP replied in agreement in fact. Another one, maintainers don’t want to “do the hard work”.
Still, in GP’s case the Chinese whispers have reduced “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” to “absurdly low quality”. To which I ask, what is more likely. 1) That 30-million lines of code contain various levels of technical debt of which maintainers are aware; and that said maintainers are worried even of code where the technical debt is real but not causing substantial issue in practice? Or 2) that a piece of software gets to run on literally billions of devices of all sizes and prices just because it’s free and in spite of its “absurdly low quality”?
Linux is not perfect, neither technically nor socially. But it sure takes a lot of entitlement and self-righteousness to declare it “of absurdly low quality” with a straight face.
GP here: I probably should have said “shockingly” rather than “absurdly”. I didn’t really expect to get lawyered over that one word, but yeah, the idea was that for a software that runs on billions of devices, the code quality is shockingly low.
Of course, this is plainly subjective. If your code quality standards are a lot lower than mine then you might disagree with my assessment.
That said, I suspect adoption is a poor proxy for code quality. Internet Explorer was widely adopted and yet it’s broadly understood to have been poorly written.
I’m sure self-righteousness could get you to the same place, but in my case I arrived by way of experience. You can relax, I wasn’t attacking Linux—I like Linux—it just has a lot of opportunity for improvement.
I guess I’ve seen the internals of too much proprietary software now to be shocked by anything about Linux per se. I might even argue that the quality of Linux is surprisingly good, considering its origins and development model.
I think I’d lawyer you a tiny bit differently: some of the bugs in the kernel shock me when I consider how many devices run that code and fulfill their purposes despite those bugs.
FWIW, I was not making a dig at open source software, and yes plenty of corporate software is worse. I guess my expectations for Linux are higher because of how often it is touted as exemplary in some form or another. I don’t even dislike Linux, I think it’s the best thing out there for a huge swath of use cases—I just see some pretty big opportunities for improvement.
Or actual benchmarks: the performance the Linux kernel leaves on the table in some cases is absurd. And sure it’s just one example, but I wouldn’t be surprised if it was representative of a good portion of the kernel.
Well not quite but still “considered broken beyond repair by many people related to life time management” - which is definitely worse than “hard to formalize” when “the way ever[y]body does it” seems to vary between each user.
I love Rust but still, we’re talking of a language which (for good reasons!) considers doubly linked lists unsafe. Take an API that gets a 4 on Rusty Russell’s API design scale (“Follow common convention and you’ll get it right”), but which was designed for a completely different programming language if not paradigm, and it’s not surprising that it can’t easily be transformed into a 9 (“The compiler/linker won’t let you get it wrong”). But at the same time there are a dozen ways in which, according to the same scale, things could actually be worse!
What I dislike is that people are seeing “awareness of complexity” and the message they spread is “absurdly low quality”.
Note that doubly linked lists are not a special case at all in Rust. All the other common data structures like Vec, HashMap, etc. also need unsafe code in their implementation.

Implementing these data structures in Rust, and writing unsafe code in general, is indeed roughly a 4. But these are all already implemented in the standard library, with an API that actually is at a 9. And std::collections::LinkedList is constructive proof that you can have a safe Rust abstraction for doubly linked lists.

Yes, the implementation could have bugs, thus making the abstraction leaky. But that’s the case for literally everything, down to the hardware that your code runs on.
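Purely as an illustration of that point (a toy example, nothing to do with the kernel): code using the safe API never touches the unsafe internals, and the borrow checker still applies to it:

```rust
use std::collections::LinkedList;

fn main() {
    // LinkedList is implemented with unsafe code internally, but none
    // of that leaks into this caller: no raw pointers, no manual
    // lifetime management.
    let mut list: LinkedList<&str> = LinkedList::new();
    list.push_back("b");
    list.push_front("a");
    list.push_back("c");

    for item in &list {
        println!("{item}");
    }

    assert_eq!(list.pop_front(), Some("a"));
    assert_eq!(list.len(), 2);
}
```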
You’re absolutely right that you can build abstractions with enough effort.
My point is that if a doubly linked list is (again, for good reasons) hard to make into a 9, a 20-year-old API may very well be even harder. In fact, std::collections::LinkedList is safe but still not great (for example the cursor API is still unstable); and being in std, it was designed/reviewed by some of the most knowledgeable Rust developers, sort of by definition. That’s the conundrum that maintainers face and, if they realize that, it’s a good thing. I would be scared if maintainers handwaved that away.

Bugs happen, but if the abstraction is downright wrong then that’s something I wouldn’t underestimate. A lot of the appeal of Rust in Linux lies exactly in documenting/formalizing these unwritten rules, and wrong documentation can be worse than no documentation (cue the negative parts of the API design scale!); even more so if your documentation is a formal model like a set of Rust types and functions.
That said, the same thing can happen in a Rust-first kernel, which will also have a lot of unsafe code. And it would be much harder to fix it in a Rust-first kernel, than in Linux at a time when it’s just feeling the waters.
At the same time, it was included almost as like, half a joke, and nobody uses it, so there’s not a lot of pressure to actually finish off the cursor API.
It’s also not the kind of linked list the kernel would use, as they’d want an intrusive one.
And yet, safe to use doubly linked lists written in Rust exist. That the implementation needs unsafe is not a real problem. That’s how we should look at wrapping C code in safe Rust abstractions.
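A minimal sketch of the general shape, with an entirely hypothetical C API (foo_alloc/foo_free/foo_poke are invented for illustration, not real kernel functions): the unsafe calls are confined to one wrapper type, and its public interface makes use-after-free and double-free unreachable from safe code:

```rust
use core::ffi::c_int;

// Hypothetical C API being wrapped (not a real kernel interface):
//   struct foo *foo_alloc(void);
//   void foo_free(struct foo *);
//   int  foo_poke(struct foo *, unsigned int);
#[repr(C)]
struct FooRaw {
    _private: [u8; 0], // opaque to Rust
}

extern "C" {
    fn foo_alloc() -> *mut FooRaw;
    fn foo_free(p: *mut FooRaw);
    fn foo_poke(p: *mut FooRaw, v: u32) -> c_int;
}

/// Safe owner: a `Foo` can only be obtained via `Foo::new`, so the
/// pointer inside is always valid, and `Drop` frees it exactly once.
pub struct Foo {
    ptr: *mut FooRaw,
}

impl Foo {
    pub fn new() -> Option<Foo> {
        // SAFETY: foo_alloc returns either a valid pointer or null.
        let ptr = unsafe { foo_alloc() };
        if ptr.is_null() {
            None
        } else {
            Some(Foo { ptr })
        }
    }

    pub fn poke(&mut self, v: u32) -> Result<(), c_int> {
        // SAFETY: `self.ptr` stays valid for as long as `self` exists.
        let rc = unsafe { foo_poke(self.ptr, v) };
        if rc == 0 {
            Ok(())
        } else {
            Err(rc)
        }
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        // SAFETY: the pointer came from foo_alloc and hasn't been freed yet.
        unsafe { foo_free(self.ptr) };
    }
}
```

Whether a real 20-year-old kernel contract can be captured this cleanly is exactly the hard part being debated; this only shows the shape of the technique.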
The whole comment you replied to, after the one sentence about linked lists, is about abstractions. And abstractions are rarely going to be easy, and sometimes could be hardly possible.
That’s just a fact. Mistaking this fact for something as hyperbolic as “absurdly low quality” is a stunning example of the Dunning-Kruger effect, and frankly insulting as well.
I personally would call Linux low quality because many parts of it are buggy as sin. My GPU stops working properly literally every other time I upgrade Linux.
No one is saying that Linux is low quality because it’s hard or impossible to abstract some subsystems in Rust, they’re saying it’s low quality because a lot of it barely works! I would say that your “Chinese whispers” misrepresents the situation and what people here are actually saying. “the safety of this API is hard to formalize and you pretty much have to use it the way everybody does it” doesn’t apply if no one can tell you how to use an API, and everyone does it differently.
I agree, Linux is the worst of all kernels.
Except for all the others.
Actually, the NT kernel of all things seems to have a pretty good reputation, and I wouldn’t dismiss the BSD kernels out of hand. I don’t know which kernel is better, but it seems you do. If you could explain how you came to this conclusion that would be most helpful.
NT gets a bad rap because of the OS on top of it, not because it’s actually bad. NT itself is a very well-designed kernel.
*nod* I haven’t been a Windows person since shortly after the release of Windows XP (i.e. the first online activation DRM’d Windows) but, whenever I see glimpses of what’s going on inside the NT kernel in places like Project Zero: The Definitive Guide on Win32 to NT Path Conversion, it really makes me want to know more.
More likely a fork that gets rusted from the inside out
Somewhere else it was mentioned that most developers in the kernel could just not be bothered with checking for basic things.
Nobody is forcing any of these people to do this.
Article was rather more tolerable than I was expecting, so good on the author.
I will highlight one particular issue: there’s no way to have both an unbiased/non-problematic electric brain and one that respects users’ rights fully. To wit:
This logic applies just as much to a minority pushing for (say) trans-friendly norms as it does for a minority pushing for incredibly trans-hostile ones. Replace trans-friendly with degrowth or quiverfull or whatever else strikes your fancy.
Like, we’ve managed to create these little electric brains that can help people “think” thoughts that they’re too stupid, ignorant, or lazy to think by themselves (I know this, having used various electric brains in each of these capacities). There is no morally coherent way - in my opinion! - of saying “okay, here’s an electric brain just for you that respects your prejudices and freedom of association BUT ALSO will only operate within norms established by our company/society/enlightened intellectual class.”
The only sane thing to do is to let people have whatever electric brains they want with whatever biases they deem tolerable or desirable, make sure they are aware of alternatives and can access them, and then hold them responsible for how they use what their electric brains help them with in meatspace. Otherwise, we’re back to trying to police what people think with their exocortices and that tends to lose every time.
This is just free speech absolutism dressed up in science fiction. There are different consequences to different kinds of speech, and these aren’t even “brains”: they’re databases with a clever compression and indexing strategy. Nobody is or should be required to keep horrendous speech in their database, or to serve it to other people as a service.
Isn’t that exactly the problem @friendlysock is describing? This is already a reality. One has to abide the American Copilot refusing to complete code which mentions anything about sex or gender, and the Chinese Deepseek refusing to say anything about what happened at Tiananmen Square in 1989.
The problem is powerful tech companies (and the governments under which they fall) imposing their morality and worldview on the user. Same is true for social media companies, BTW. You can easily see how awkward this is with the radically changed position of the large tech companies with the new US administration and the difference in values it represents.
It’s not “free speech absolutism” to want to have your own values represented and expressed in the datasets. At least with more distributed systems like Mastodon you get to choose your moderators. Nobody decries this as “free speech absolutism”. It’s actually the opposite - the deal is that you can join a system which shares your values and you will be protected from hearing things you don’t want to hear. Saying it like this, I’m not so sure this is so great, either… you don’t want everyone retreating into their own siloed echo chambers, that’s a recipe for radicalisation and divisiveness.
What if you want to write a movie villain?
The problem is not the existence of harmful ideas. The problem is lack of moderation when publishing them.
And yeah, nobody should be required to train models in certain ways. But maybe we should talk about requirements for unchecked outputs? Like when kids ask a chatbot, it shouldn’t try to make them into fascists.
On the other hand, when I ask a chatbot about what’s happening in the US and ask it to compare with e.g. Umberto Eco’s definition of Fascism, it shouldn’t engage in “balanced discussion” just because it’s “political”.
We need authors to have unopinionated tools if we want quality outputs. Imagine your text editor refusing to write certain words.
Ah, I guess? If that bothers you, I think that’s an interesting data point you should reflect on.
Sure, but if somebody chooses to do so, they should be permitted. I’m pointing out that the author complains about bias/problematic models, and also complains about centralization of power. The reasonably effective solution (indeed, the democratic one) is to let everybody have their own models–however flawed–and let the marketplace of ideas sort it out.
In case it needs to be explicitly spelled out: there is no way of solving for bias/problematic models that does not also imply the concentration of power in the hands of the few.
I’m not claiming this is feasible at this point, but is “delete all the models and research and stop all R&D in ML” a counterexample to this claim?
One thing that stood out to me is that systemd’s configuration approach is declarative while others are imperative, and this unlocks a lot of benefits, much like how a query planner can execute a declarative query better than most hand-written imperative queries could.
It pairs very well with NixOS for this reason.
Yea, it’s unreasonably effective, and in general all the components hang together really well.
In my experience the nicest systems to administer have kind of an “alternating layer” approach of imperative - declarative - imperative - declarative - … Papering over too much complexity with declarative causes trouble, and so does forcing (or encouraging, or allowing) too much imperative scripting at any given layer. Systemd (it seems to me, as a user) really nailed the layers to encapsulate declaratively - things you want to happen at boot, that depend on each other. Units are declarative but effectively “call into” imperative commands, which - in turn - ought to have their own declarative-ish configuration files, which - in turn - are best laid down by something programmable at the higher layer (something aware of larger state - what load balancers are up - what endpoints are up - what DNS servers are up, yadda yadda).
As an aside, my main beef with Kubernetes is that it has a bad design - or perhaps a missing design - for that next level up of configuration management. Way way way too much is squashed into a uniform morass of the generalized “object namespace”, and the only way to really get precise behavior out of that is to implement a custom controller, which is almost like taking on the responsibility of a mini-init system, or a mini-network config daemon.
What I want at this layer instead is a framework for writing a simple and declarative-ish DSL that lets me encode my own business domain. Not every daemon is a Deployment object. Not every service is a Service. I think we were just getting good at this in 2014 with systems like Salt and Chef, but K8s came in and sucked all the oxygen out of the room for a decade, not to mention that Salt and Chef did themselves no favors by simply being a metric boatload of not-always-easy-to-install scripting runtimes.
Years ago when I looked to contribute to Salt I was pretty astounded by the bugginess, but it worked for enough people and there was not and still is not a lot of competition in that space, so VMware acquired them.
We have yet to see the next generation of that sort of software. System Initiative may be the sole exception, and I am enthusiastic about the demos they put out. But I do think we’re missing something here.
Re: concentrating power: the “open source” models have definitely made big improvements here. Training a new model is still out of reach for regular people, but fine tuning isn’t. Computers are always getting better, and code is being optimized, so in the future it may be even more accessible.
We arguably already have robots that do both dishes and laundry. They just don’t do the step of putting the items in the machine.
Pushed further: we already have beat matching (helpful for DJs), assemblers and linkers and treeshakers and SAT solvers (helpful for programmers), and spellcheckers (helpful for writers).
And yet, we also still have chamber orchestras, demoscene, and pen and paper.
Technology lets people pick and choose what part of the process they want to spend their life force on–I don’t think that’s bad at all.
One could with grim amusement note that this is a microcosm of the same issue which at the national level got us into the mess we’re in today and rendered blue team ineffectual–but, that’s strictly politics and not super helpful to look into here.
My take, having watched this space and been super tangentially involved in civic hackathons and things, is that this is most useful for all the boring local and state touchpoints. Being able to access usable datasets, being able to give people in local government good tooling to do their jobs more efficiently…all of this is good. That said, even with good tools, the sheer combination of inertia, incompetence, and motivated self-interest one encounters can make it incredibly hard to do something even as simple as keeping maps up to date - and that particular problem is even more pernicious given the staking out of datasets by folks like Esri.
If I had my druthers, we’d standardize at a state level on forms for all the normal shit cities do and provide a state-level depot for that stuff–including hosting and software development–and offer a supported “toolbox” of very simple tools that cities could compose to match their needs (do you have a lot of oil and gas? okay, here’s the stuff for registering wells or whatever. do you have a library? okay, here’s basic circulation desk stuff. do you have a water treatment plant? your own police force? fire department? …etc.) Ditto for useful data and querying.
Then, periodically, there’d be like quarterly or yearly conferences so different state bureaus could share what tools they’ve made and lessons they’ve learned, and bubble those up to a federal repository of same.
The problems of course are numerous–an incomplete accounting here:
I’d love to see this fixed, but I’m not enough of a fool to expect it nor enough of a masochist to attempt it.
I don’t really want to go “skill issue” here–least of all because in spite of writing an obnoxious amount of bash over the last few years I still fuck it up regularly–but I do think it’s important to note there’s a difference in kind for these complaints.
Having nonstandard shells on different machines is the result of not just using a common shell–you’re almost always (yes, yes, containers are a thing, sit down there in the back) gonna have either sh or bash if you want it (and yes at least for a while the default bash on OSX was old and you needed to upgrade it to get things like associative arrays). There is a reason I put my skill points into bash instead of zsh, fish, oilsh, ksh, csh, or even powershell–all better shells, none as ubiquitous. But, anyways, that complaint is at least about shells.
The shortcuts and terminal-itself complaints about things like keyboard shortcuts and copy-paste behavior are unique to the terminal environment, and vary across using a browser window, one of any of the dozen X11 terminals, PuTTY (remember that lol), and so on and so forth–but again, that’s neither the shell nor the programs.
And then finally, issues around discoverability and arguments and whatnot are squarely the problem of the utilities. Some utilities are nice and straightforward (albeit this is more a BSD thing than a GNU thing) and have a single command with some flags. Some utilities are relatively consistent but simply so large as to defy easy reference (I’m looking at you FFMPEG, where the problem domain is simply so large it can’t be wrapped up neatly). Some tools have some warts (git), and some tools seemed like they deliberately eschewed reasonable and legible convention (seriously, fuck the constellation of nix and nixos CLI tooling).
Anyways, I think it’s worth remembering that these frustrations are really probably clustered around different layers of the stack, and to approach it that way instead of by vertical slice (though we do tend to think in terms of “I just want to look at cat pictures on the terminal and get street cred” instead of “okay so I’m pretty sure urxvt supports color sixel and I think I remember how to get curl to use redirection to aalib in bash”).
FWIW Oils is unique among these, in that it has OSH, which is POSIX and bash-compatible. (YSH is the incompatible shell)
OSH is actually more POSIX compatible than dash, the default /bin/sh on Debian. It’s also the most bash-compatible shell.
Right now you will get better error messages, and it’s faster in some cases, but eventually we will add some more distinguishing features, like perhaps profiling of processes
There is also a developing “OSH standard library” for task files and testing, which runs under bash too
Definitely agree about the layers. Can you imagine a similarly open-ended survey about frustrations with GUIs? But the lack (or mere locality) of standards and conventions is a real problem in any human interface, even apart from the inadequacies of the standards that do exist.
That worse-is-better, least-common-denominator thing about bash, though: that’s a wicked problem. Not a huge one, but still. It’s the reason I gave up on fish, despite liking basically everything about it better than any other shell I’ve tried. At least Oils has a well-thought-out compatibility story that could enable systematic upgrades… but that’s still a lot of work, against (or any way across) stiff currents of cultural inertia.
You ever see something so deeply cursed that it wraps around again to become amazing? This is now my benchmark for that feeling.
https://www.microsoft.com/en-us/research/blog/lambda-the-ultimatae-excel-worksheet-function/
Neat work–maybe use segment anything (or something similar) and do better classification on the detected objects? Captioning is tricky, and given that it’s basically a preprocessing step I wouldn’t sweat using commercial models.
Also, please watch the self-promotion. Maybe post some other stuff.
Having led open-sourcing efforts at a couple of companies now, for similarly self-interested reasons:
This seems to be related to the problem of “why do people adopt worse systems with better docs and discussion instead of better systems without those things”.
I’ve seen at least one community seeming to shift out of discord and Slack to publicly-crawlable fora for exactly this reason, to make it easier for knowledge to get hoovered up by the bots–and I kinda don’t hate it.
There’s so many forms of this. I’ve dealt with a team at $bigcorp adopting a worse system without docs, over a better system with docs, and then rebuilding (in a worse way) the features of the better system slowly. They picked it because the better system was written by a department they saw as competing with them.
The point is that if the decisions are based on no technical factors, then whether or not LLMs support the thing well is also not going to factor in.
This. I don’t think this title makes a ton of sense tbh - innovation doesn’t necessarily mean using the new thing, it can also mean old things adding new things because they get so popular they get the resources to do it.
Not just “worse systems with better docs” but also “systems with more libraries”. For better or worse, LLMs amplify the increasing returns to scale that benefit larger programming and technology communities.
Which probably means we’re in for an even longer period in which innovation is not actually that innovative - or perhaps highly incremental - while we wait for new, disruptive innovations that are so powerful they overcome all the disadvantages of novelty. But really that’s just life - as we eat more of the low hanging intellectual and technological fruit, we get more to a state like any other field of equilibrium periods punctuated by revolutions. In some ways this is good - if you want to do something new you have to do something really new and figure out how to launch it.
What community was that?
Over in Elixir, some of the projects–now, it could simply be they were sick of losing things to Slack backscroll, but I recall somebody mentioning they wanted to have their framework better supported. I don’t intend this to be fake news, of course, but memory is a fallible thing.
Interesting. I’m curious for an example.