for what it’s worth the translation feature in firefox is terrible https://i.ibb.co/spNhv92Q/translation.png
i don’t know why they’d ship something this bad
It’s fully offline. The language models are a few megabytes. THAT’S AMAZING. Sure the translations are far from fluent, but I use it all the time to get the gist of articles in French, German, Spanish, etc., and I don’t need them to be fluent, just understandable.
If you want online LLM stuff, I mean, they have you covered, but again that’s online and LLM.
i get it, but please have a look at the screenshot. it’s not even just the translation that’s bad, it’s also (presumably) parsing the html incorrectly, doubling up the text in various parts (e.g. altano’s username and the flag button), and adding nonsensical text (e.g. “si tratta di un’azienda” == “it’s about a company”???). it’s so odd that the translation built into a major browser cannot parse html correctly
but of course, the main issue is that the translation is bad. the sentences sound weird, words have incorrect articles/endings (which are absolutely trivial to get right in that case). it even translated “gaslight” into “austriare” which is a completely made up word that makes zero sense
and this was english to italian, which shouldn’t be as hard as chinese to turkish or finnish to japanese
i’m sorry for complaining about someone’s hard work that they provided for free. it’s better than nothing, and it’s better than automatically translating things literally word for word, but it’s just… really not much better than that…
truth be told I’ve only been using it with English on the target end and it’s been fine when translating a stray site in French, German, Russian, Chinese, etc. I never translate text to my native language; before LLMs, the translators were all terrible.
Came here to comment the same. The quality of the translations isn’t even the biggest issue though. The UX Chrome has is just way better. You can force translate and when the page updates dynamically it kinda “just works”. With Firefox it is impossible to do any real tasks on a government/bank website. The fact that there is no option in the context menu is also baffling to me.
One day per line sucks if you wanna track more than 2 or 3 events per day.
some other obvious ux issues:
the events are almost completely unstructured, i.e. you can’t add arbitrary content without defining a format on top of it
requires lots of manual maintenance
adding an event spanning multiple days requires adding it to every single day
same for recurring events
events don’t have an id so deleting/modifying a previously added one requires finding all the other instances of the same event over multiple days/weeks/months by guesstimating if something is the same event or not
the week in every line seems of questionable utility
not a standard format so no existing email/calendar software can use it
many new and exciting ways to fuck up your schedule with a single typo that no other calendar software has ever provided
some of these can be addressed with tooling, but then wouldn’t you just rather use something without all the limitations?
Tbh, I don’t see why that limit is introduced. I just add extra lines for each new event, and his whole workflow still works perfectly fine
Grepability, maybe? With one day per line, searching for an event always gets the date of the event, too. (You could copy the event date for each line, but maybe that has some other problem?)
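For example, with a made-up file in that one-day-per-line layout:

$ grep dentist calendar.txt
2025-03-14 fri w11: dentist 15:00, water the plants

One hit gives you the event, its date, and even the week number from the line prefix.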
A markdown based format with a h2 for each day would be better imho. (h3 for each event)
Perhaps a naive question, but why would you ever want to statically link your system’s libc? Surely that’s the one dynamic library you can guarantee is there.
On Linux, not really: different distros have different, incompatible glibc versions. It’s particularly a problem when running software built on a normal distro on some LTS Red Hat-like enterprise distro. I doubt this is a problem in OpenBSD land though.
There’s also a small performance difference with static linking: less indirection, link-time optimization. At the cost of not sharing memory pages with other processes. Maybe that’s why they’re doing it?
Heck, forget different versions of glibc, you might find yourself on a Linux that uses a different libc altogether, like musl (Alpine Linux) or bionic (Android). And then there are weirdos like NixOS where /usr/lib doesn’t even exist and any copies of glibc you might find are at weird random paths with hashes in them. Static linking means you just run fine on all of these.
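A quick way to check what you actually shipped (a sketch with a hypothetical hello.c; exact output wording from memory):

cc -static hello.c -o hello
file ./hello    # reports "statically linked"
ldd ./hello     # reports "not a dynamic executable"

No dependency on the host’s libc at all, which is why the same binary runs across all of these.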
On OpenBSD, I think the situation is the opposite: libc is stable, the syscall interface is not. By statically linking libc, unless you’re a program like ls shipped with OpenBSD itself, you’re just inviting breakage for no good reason.
The main reason I see (in the case of OpenBSD) is security.
Statically linking against libc means that all symbols are stored in your binary, which means you cannot use e.g. LD_PRELOAD to make programs run malicious code (there’s a toy demo of this below).
Another reason is to guarantee that your binaries are self-contained and don’t depend on any external files, in case of a system recovery scenario where, for example, not all your partitions are mounted.
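Here’s the toy LD_PRELOAD demo mentioned above (file names made up; it interposes getuid() just to show injected code running inside the target):

cat > evil.c <<'EOF'
#include <stdio.h>
/* any code in a preloaded library runs inside the target process */
unsigned getuid(void) { fprintf(stderr, "injected!\n"); return 0; }
EOF
cc -shared -fPIC evil.c -o evil.so
LD_PRELOAD=$PWD/evil.so id    # dynamically linked id(1) prints "injected!" and claims uid 0

A statically linked binary resolved getuid at build time, so the same preload does nothing to it.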
In every scenario in which an attacker can set a malicious LD_PRELOAD, your system is already fully compromised
Right, that’s a fair point.
Thanks to your comment I started looking up what would be the reasons and context for all this static libc.a linking.
From what I could find, the complaint seems to come from Go, which used to make direct syscalls to the kernel rather than going through libc (up until 1.16). OpenBSD released a pinsyscall(2) syscall, which somehow forced Go to comply, and as Go (usually) creates static binaries, it is then forced to statically link against libc.a.
“It rather involved being on the other side of this airtight hatchway”
i worked on debuggers and i use them in sometimes weird creative ways. for instance i recently turned rr into a bash profiler, by recording the execution of the process and then injecting calls to dump the current line and function name at various points in the recording. this technique of recording the process and extracting a profile from the recording was invented by a dear friend i met when working on a proprietary debugger. it’s a really powerful idea because the program can run unmodified and it doesn’t require support from the system/interpreter/runtime, thus collecting the profile adds basically zero overhead. (the overhead of the debugger isn’t necessarily zero, but it doesn’t modify the state of bash, unlike the overhead of printing this information from within bash, which can be very significant in certain optimised programs, rendering the profile meaningless)
here’s the version for doing that with rr https://gist.github.com/izabera/ac954b99e7022ddd157a5034dfbd04c1 and it’s trivially adaptable to other languages
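the core of the trick is tiny. a rough sketch (assumes a bash built with debug symbols; line_number is bash’s internal global behind $LINENO and read_command runs once per command, both from memory. the gist has the robust version):

rr record -- bash ./slow-script.sh
rr replay -- -batch \
  -ex 'dprintf read_command, "%d\n", line_number' \
  -ex continue | sort -n | uniq -c | sort -rn

dprintf prints and auto-continues on every hit, so a single replay dumps the line reached for each command, and sort/uniq turns that into a poor man’s line profile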
$1700 is quite a large budget. If the total cost were halved, that would still be a sizeable budget. I feel like tech writers these days are forgetting what the phrase “on a budget” implies.
I agree, it is a big sum of money. I interpret “on a budget” as “relatively cheap”, not as “nearly free”. I think it is pretty cheap compared to what one normally needs to pay for that amount of VRAM. To me, the term is more justified here than in a post where someone buys a second-hand Apple laptop for $1100 and claims it is a cheap solution to browse the web.
I really hope AMD catches up and prices come down, because AI-capable hardware is not nearly as accessible as it should be.
Maybe it would be more accurate to say “on a budget” is a form of weasel word. Its interpretation depends on your familiarity with current prices and your socioeconomic status.
From my very subjective (and probably outdated) PoV…
$1000 is a fancy high-end laptop
$2000 buys a laptop only extremely well-paid people can justify
$800 buys a high-end desktop
You can imagine the surprise I felt (or was that shame?) when seeing a $1700 price tag on a “budget” desktop PC that can do AI.
“Building a personal, private AI computer for $1700” would communicate the intent a little better, without suggesting to the reader anything about their ability to afford it.
Btw, I don’t mean to imply any wrong was committed. I’m just pointing out that the wording on the post had some unintended effects on at least this reader. To a large degree that is unavoidable, no matter what a person publishes on the web.
A few weeks ago I saw a reference to someone on ex-Twitter speccing an LLM workstation for $6,000, so $1,700 is on a budget compared to that.
They get 5 tokens per second on the 70b models
Agreed. I’m running local models with an RTX 3060 12 GB that costs about $330 on NewEgg or 320€ new at ebay.de, and it’s actually useful. The context sizes must be kept tiny but even then it can provide basic code completion and short chats.
The code they write is riddled with subtle bugs but making my computer program itself seems to never get old. Luckily they also make it quicker to write throwaway unit tests. The small chat models are useful for language tasks such as translating unstructured article title + URL snippets to clean markdown links. They also act as a thesaurus, very useful for naming things, and can unblock progress when you’re really stuck with some piece of code (rubberduck debugging). Usually the model just tells me to “double-check you didn’t make any mistakes” though :)
On the software side I use ollama for running the models, continue.dev for programming (it’s really janky), and the PageAssist Firefox extension for chat.
Apparently the Commodore Amiga 500 was introduced at 699 USD in 1987 - just shy of 2,000 USD inflation-adjusted.
Guess that says more about how much prices for computers have come down than anything.
If you look at “modern” gaming graphics cards, that is so cheap I was actually surprised. (even compared to a 30xx from some years ago)
If the median price of a thing is high, then absolute values don’t matter. A new car for under 10k EUR would still be “on a budget”, even if it’s a lot of money.
I posted some notes here - Gemini 2.0 Flash Lite is half the price of GPT-4o mini and Gemini 2.0 Flash is a little cheaper than GPT-4o mini too: https://simonwillison.net/2025/Feb/5/gemini-2/
I also got the three new models to draw me pelicans on bicycles.
I can’t tell what the pelican test is supposed to show. Which of the results is better? Is this an improvement over the previous generations? Is this a meaningful indicator of intelligence? For instance if I drew a shitty pelican on a bike, would that mean I didn’t pass this modified version of the Turing test?
I talked about that on a podcast recently. Here’s the transcript. Short version: I think it’s amusing to get hyped technology to draw crap pictures - but there does also seem to be a weird correlation between how good a model is and how well it can draw a pelican on a bicycle.
nice, thanks :)
Eventually people are going to start training on the synthetic pelican datasets to juice their Simon benchmark scores
So it sounds like there was a battery concern? To avoid physical issues they severely capped the max charge on one of the two battery models used in these devices. However it seems that they wanted to keep this quiet, so they got some engineer to do a hack build rather than use the usual CI?
Some obvious questions:
They are harming these users who they sold a defective device to. It is out of warranty but the right thing to do here would be a recall.
They aren’t being open about this, the honest thing would be to share information about whatever risk they are trying to mitigate. This could be important if these batteries are being used with third-party OSes or have been repurposed for uses outside of the original phone.
How can a random engineer sign and ship an update outside of the regular CI process? Shouldn’t this signing key be very locked down?
How can a random engineer sign and ship an update outside of the regular CI process? Shouldn’t this signing key be very locked down?
It’s a little more complicated than that. Apparently, only the kernel build was from a random engineer. The OS as a whole went through normal release processes. Apparently the kernel builds being separate and then getting vendored into the OS build is normal.
That… doesn’t make me feel much better 😅
Yes, it’s terrible practice. Not going through CI usually means no control over what actually went into the binary because the environment it’s built in is not controlled. It’s hard to reproduce this build and there is an endless source of potential errors, like embedding the wrong version numbers (because CI usually takes care of version numbering), using the wrong branch, using the wrong compiler, using the wrong build flags, etc. There’s a reason why this stuff is automated and under strict version control.
It makes me feel marginally better. At least that indicates the signing key isn’t available to just anyone - presumably the config to include the hand-built kernel had to go through normal review and CI processes.
But it doesn’t make me feel that much better. The whole thing is still sketchy as hell.
Yeah, I’m sure that binary kernel they checked in was well reviewed. I guess it is at least traceable to a human.
Going from 4.45V to 3.95V is a massive jump. For context: most of the usable energy of a lithium-ion battery is between its max charge voltage (e.g. 4.2V) and about 3.4V. Below that there is only a tiny bit of capacity and the voltage plummets quickly (look up “lithium ion discharge curves”); the exact choice of low cutoff voltage (2.6-3V are common) is a bit arbitrary and only grants you a few more % at best.
It could be a well-researched change with actual data behind it. But I vote corporate laziness instead: Google probably just doesn’t care and put pressure on the person to do something quickly and cheaply.
i don’t know if this was the case elsewhere, but in the uk google offered free battery swaps at EE stores or £40 for each device (i think regardless of the condition), which is probably fair for a 5 year old phone?
Posted this due to this gem of a statement and recent discussions both of how to combat AI scrapers and of the in-band signaling attacks that prompt injection opens up:
To catch attempts at subverting Operator with jailbreaks and prompt injections, which might hypothetically be embedded in websites that the AI model browses, OpenAI says it has implemented real-time moderation and detection systems. OpenAI reports the system recognized all but one case of prompt injection attempts during an early internal red-teaming session.
Just waiting for big websites to start happy-pathing the AI’s interaction with it, not the human’s.
that’s called an api
The system performs tasks by viewing and interacting with on-screen elements like buttons and text fields similar to how a human would.
It isn’t looking for APIs, it is trying to navigate like a normal user. So I imagine people will start to figure out how to, for example, get the bot to navigate through a referral link. Interestingly, this appears to be all visual-based, not scraping the content, so just hiding the prompt injection in HTML comments won’t suffice.
Also, what do they consider prompt injection here? How much information disclosure about the previous step on another website do they consider OK?
so i filled the ofcom form to assess the risks with what i think applies to lobsters. like for basically every other website on the internet, the result is that every feature of lobste.rs can be used to commit terrorism and child exploitation in 37 different ways. posting urls? clearly urls to child porn. writing comments? that’s what a terrorist would do. search function? it’s for searching for terrorism isn’t it. does not block minors? this just screams grooming. can see other people’s profiles? of course it’s for human trafficking. does not ask for an id? probably because lobste.rs is full of crime
The FOSDEM organizers, insofar as they are FOSDEM organizers, exist for the purpose of organizing FOSDEM. In that role, they have decided that there will be such and such a speaker at such and such a time at such and such a place.
You can disagree with that decision. You can register your disagreement through channels they’ve provided. If that’s proven ineffective, you can disengage with the whole affair. If you don’t like that option, you can register your disagreement by non-obstructive protest, which clearly indicates friendly criticism. Or if you don’t like that option, you can register your disagreement via obstructive protest. But in that case you have declared yourself to desire to obstruct FOSDEM’s reason for existence; it’s not surprising that FOSDEM will in turn assert its will to proceed with itself.
There’s a bizarre trend among vaguely-radical Westerners whereby they expect to be able to disrupt the operations of an organization and not suffer any opposition for their own disruption because they call the disruption “protest”. This is a very confused understanding of human relations. If you are intentionally disrupting the operations of an organization, at least in that moment, the org is your enemy and you are theirs. Of course your enemy is not going to roll over and let you attack it. Own your enemyship.
FOSDEM is volunteer-run, by people who are involved in F/LOSS themselves. It exists thanks to a lot of goodwill from the ULB. Protestors are not fighting the cops or some big corporations, they’re causing potential grief for normal working-class people like themselves.
While the ULB campus always gives me the impression it’s not averse to a bit of political activism, it would be an own goal if some tone deaf protestors were to jeopardise the possibility of future FOSDEM conferences.
Dorsey won’t care, he’ll take his same carefully rehearsed speech and deliver it again at any of the hundreds of for-profit tech conferences. They’ll be delighted to have him. But there’s really only one volunteer-run event of FOSDEM’s scale in the EU.
By all means, boo away Dorsey, but be considerate of the position of the people running this.
Protestors are not fighting the cops or some big corporations, they’re causing potential grief for normal working-class people like themselves.
As a former and future conference organizer who has taken a tiny paycheck from three of the ~dozen conferences I’ve organized but never a salary °, this is 100% true. I’ve been fortunate that my conferences have had no real controversy, but two that did have a bit of a tempest in a teapot went in two different directions. One, the controversy was forgotten in a week, aside from a couple of people who just couldn’t let it go on Twitter. It took about a month for their attention to swing elsewhere. In retrospect, almost a decade later now, we’d have made a different decision, and the controversy wouldn’t have happened. Unfortunately, the other was life-altering for a few of the organizers because of poor assumptions and unclear communication on our part, and a handful of attendees who felt that it was reasonable to expect intra-day turnaround on questions, leading to a hostile inquiry two weeks before a $1M event for 1,500 people put on by eight volunteers spread way too thin.
I’ve also had to kick out attendees who were causing a disruption. No, man, you can’t come into the conference and start doing like political polling, petitioning, and signature collection, even if I agree with you, and might have considered setting aside a booth for you if you’d arranged for it ahead of time.
As conference organizers, we have a duty to platform ideas worth being heard and balance that with the person presenting them. The most effective way to protest a person or a presentation is not to attend, and the second most is to occupy space in silence with a to-the-point message on an unobtrusive sign or article of clothing. Anything more disruptive, and you’re creating a scene that will get you kicked out according to the customs of the organizer team, the terms of attendance, and the laws of the venue, if the organizers enforce their code of conduct. I’ve never physically thrown someone out of an event in my 24 years of event organization, but I’ve gotten temptingly close and been fortunate that someone with a cooler head yet more formidable stature intervened (and I was 6’2” 250 lbs at the time!).
° A tenet of my conf organizer org is “pay people for their work.”
As conference organizers, we have a duty to platform ideas worth being heard
Or you can just acknowledge that a person has done enough damage in their life that you won’t let them shout out any other weird takes.
It’s not like FOSDEM is mainstream enough that it needs to have people who are more well-known outside of FLOSS circles to keynote. There are enough figureheads (often people who spent decades of their life doing good things) who would be more well-suited. This is not some random enterprise conference where you may invite any random startup CEO to shill their stuff. FOSDEM should do better. (I remember phk, and it was great)
As conference organizers, we have a duty to platform ideas worth being heard and balance that with the person presenting them.
that’s cool but literally nobody had heard of block’s involvement in open source until this was announced, so i don’t know what ideas you’re referring to
Respectfully, I disagree with this. Peaceful protest can be non-disruptive but still effective. If Jack gives his whole talk with people standing on stage around him in protest, I think a lot of attendees will inevitably read Drew DeVault’s article and understand his argument.
I was going to point this out, but then I realized the concern is more likely the “presumably” in “presumably being platformed”. In other words, they’re not saying that Block’s sponsorship is a lie, they’re saying FOSDEM did not accept a bribe: Block’s sponsorship is not the reason Dorsey got a keynote as DeVault alleges.
Yes, that could be it. If so, a bit disappointing to see such misconstrual or misrepresentation of DeVault’s clear statement of presumption as a statement of fact. (There’s also a lot of grey area between “bribe” and “total neutrality”. Patronage is a thing.)
Would you agree then with the statement that “presumably DeVault lied when he construed Block’s main sponsorship as the reason Dorsey got the keynote selection”? Would “presumably” have made @talideon’s comment acceptable?
The linked article makes it clear that patronage is not in play for this event.
To be clear, in our 25 year history, we have always had the hard rule that sponsorship does not give you preferential treatment for talk selection; this policy has always applied, it applied in this particular case, and it will continue to apply in the future. Any claims that any talk was allowed for sponsorship reasons are false.
Not trying to be adversarial, just trying to highlight how others are reading DeVault’s statement in light of the clear answer from FOSDEM that Block’s sponsorship had no role in his keynote. I don’t care one way or the other about the keynote, have no feelings either way about DeVault (who seems to be the most polarizing figure on this site), and will not be at FOSDEM. But when I read DeVault’s wording, I generally understood he believes FOSDEM accepted Block’s sponsorship in return for a keynote address and “presumably” is there to avoid any legal issues from making such a claim.
Would you agree then with the statement that “presumably DeVault lied when he construed Block’s main sponsorship as the reason Dorsey got the keynote selection”?
No, because there’s no need to presume anything when you can just go look at his words. None of it gets anywhere near “lie”, to me. It strikes me as easy and reasonable to take his words as a true statement of his belief.
[FOSDEM’s very clear statement snipped]
Thanks for pointing that out.
But when I read DeVault’s wording, I generally understood he believes FOSDEM accepted Block’s sponsorship in return for a keynote address and “presumably” is there to avoid any legal issues from making such a claim.
We’re on roughly the same page here. To me, he’s definitely casting aspersions on the integrity of the selection process, though I wouldn’t go so far as to say there’s a clear belief of quid pro quo: this is the bit where the grey area is.
To me, he’s definitely casting aspersions on the integrity of the selection process, though I wouldn’t go so far as to say there’s a clear belief of quid pro quo: this is the bit where the grey area is.
Protesting is the act of clearly communicating that you don’t like something. Effective protests are those where the specific ideas being communicated are convincing enough, or the people doing the protest are important enough or widespread enough, that you take the communication seriously.
Communication takes many forms and some of them are more disruptive than others. But it is a popular fiction that disrupting and shouting down events because you don’t like the speakers is an effective form of protest - in fact there could be nothing more ineffective than associating your side of the argument with something that would make an ordinary attendee annoyed. Only if an ordinary attendee would side against the speaker by default would this be a good strategy.
But when employed, usually this is not about actually communicating that you protest against the thing, it’s about attempting to get your way by force, just with a 1A-tinted veneer. Protesting is allowed as long as it is actually protesting, instead of trying to take control.
This is just another example of the black and white thinking responsible for a large part of the awfulness in the world.
Effectiveness isn’t binary. The allowed form of protesting is probably not as effective as it could be in a different form than what is allowed. That doesn’t make it ineffective.
Protesting is making your disagreement and numbers visible. If you’re physically stopping somebody from doing whatever it is they’re planning and you’re against, it is not a protest, it’s just disrupting. And by implication it means you have the power to stop it and are not the oppressed underdog you are likely proclaiming to be.
‘Protesting is allowed as long as it is ineffective’ is an appeal to the moral valence of the particular action of protesting. If you change the meaning of the word to refer to a completely different thing, you cannot keep the moral valence. Physically stopping something from occurring is not something anyone has to tolerate just because you used a certain set of Latin glyphs/mouth-noises to identify it.
Could be, I’m not very pleased with how I worded that so I might not be very clear about what I meant exactly, but it seems disingenuous to go back and edit it, given I’m not sure it’d end up better anyway. I’m trying to point at a meaningful difference between bringing attention to an issue and the size of the cohort in agreement with you, versus unilaterally acting to stop people doing something simply because you wish they wouldn’t. Of course everybody thinks that they’re in the right, therefore their actions are justified, but that can’t really be the case all that often, since everybody thinks it.
I’m trying to point at a meaningful difference between bringing attention to an issue and the size of the cohort in agreement with you, versus unilaterally acting to stop people doing something simply because you wish they wouldn’t.
The latter sounds like direct action. It is widely regarded as a kind of protest. Protests typically involve a small minority of the population deliberately causing a disruption to force a response. Often, a majority of the contemporary population disagree with the aims of protest, even protests that we consider good and/or effective in retrospect.
note that unset FLY_REGIONS["a100-40gb"] is incorrectly quoted: it gets expanded as a glob (like ls files.[ch]pp). for instance it will match a file called FLY_REGIONSa if it exists, so you’ll end up unsetting the variable FLY_REGIONSa instead of index a100-40gb of FLY_REGIONS
you want something like unset "FLY_REGIONS[a100-40gb]"
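a quick demo of the failure mode (the touched file is just there to give the glob something to match):

declare -A FLY_REGIONS=([a100-40gb]=ord)
touch FLY_REGIONSa                        # [a100-40gb] is a character class; 'a' matches
unset FLY_REGIONS["a100-40gb"]            # silently runs: unset FLY_REGIONSa
echo "${FLY_REGIONS[a100-40gb]:-gone}"    # prints ord: the element survived
unset "FLY_REGIONS[a100-40gb]"            # quoted, so no glob expansion
echo "${FLY_REGIONS[a100-40gb]:-gone}"    # prints gone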
Associative arrays have also changed a bunch over the bash 4.x releases – there are a bunch more footguns.
Koichi Murase (who is also a bash contributor) has recently fixed a ton of issues in the OSH implementation, so I’m excited that we will be extremely compatible in this regard.
On the other hand, YSH also has nestable dicts and lists, not just bash-style data structures. In bash, you can’t have an array as the value of an associative array, or vice versa. They can only be strings.
In YSH you can arbitrarily nest, which is a lot more powerful.
In particular, bash assoc arrays can’t express JSON, but YSH dicts and lists can.
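For example, trying to nest in bash fails outright (error text from memory):

declare -A cfg
cfg[regions]=(ord lhr)    # bash: cfg[regions]: cannot assign list to array member
cfg[regions]='ord lhr'    # allowed, but it's just a flat string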
ShellCheck will warn about this. If anybody is writing non-trivial amounts of shell, I highly recommend using it.
unset FLY_REGIONS["a100-40gb"]
^-- SC2184 (warning): Quote arguments to unset so they're not glob expanded.
It also has a detailed page about every possible warning explaining why it is there and what to do to address the problem. For this example, here’s the page describing warning SC2184.
I introduced a Dynamic Lookahead Insertion strategy, which works as follows:
Simulate Insertions: For each remaining city, simulate inserting it between every edge in the current cycle.
Estimate Total Distance: For each simulated insertion, estimate the total distance of the complete TSP path if that city were inserted at that position.
Choose the Best Insertion: Select the insertion that results in the least total distance and update the cycle accordingly.
Repeat: Continue this process iteratively for all cities and for all edges until the path is complete.
this is literally just the cheapest insertion: it greedily inserts at every step and never backtracks to find better solutions. i don’t know why they called it dynamic lookahead since it’s clearly fixed to 1
i am somewhat surprised that it does significantly better than the existing solvers
i am somewhat surprised that it does significantly better than the existing solvers
I don’t see any evidence that it does any better than classic 2/3-opt (both by looking at the algorithm description and noticing the lack of citation), let alone the usual benchmark primal heuristic solver, http://webhotel4.ruc.dk/~keld/research/LKH/
EDIT: also “random” instances can be misleading. A lot of NP-complete problems are hard only around a phase transition from trivially feasible to obviously infeasible instances. I’d look at instance sets like http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
Yeah, this is actually very similar to—but substantially simpler than!—the Sparse A* search algorithm, which is the kind of thing a naïve recent graduate with effectively no knowledge of CS can implement quickly and easily. (That’s not a hand-wave statement; it’s a description of me a bit over 15 years ago!) It’s surprising it does as well as it does on these cases, but, like you and @pkhuong’s sibling comment, it would be extremely surprising to me if this were what it claims to be. Indeed: I would be rather surprised, given the history of the problem, if it hadn’t been tried before, likely by precocious undergrads! (I have been surprised before, but even so: suspicion is appropriate.)
I did not realize MSVC was so far behind on C++23. I guess even in 2024, for “portable” code it might be best to stick with C++20 unless one can use gcc/clang on Windows.
Seems reasonable. I mean, if no one wants it and no one wants to work on it, don’t do it. Personally, 99% of curl usage for me is curl | sh and reduces to “is the host compromised” and “is tls safe”, not “is curl memory safe” so while it’d be nice to answer that last one with “yes” it’s not something I’m overly concerned with.
It’s probably used by a ton of scripts that come installed by default with your distro, some of which can run in privileged contexts. It’s not just curl|sh either, the library is used by a lot of projects because it’s so featureful and reliable. A memory safe curl would have reduced the attack surface for a lot of people, and it’s sad to see this end.
Even in the case where it’s used by scripts, how often is it talking to an untrusted resource? Sometimes but I don’t expect it to be very often for things like setting up a docker image, etc.
Having curl be memory safe would be good and cool, in case it’s not clear that I feel that way. But if no one wants to work on it and users aren’t pushing hard for it it’s not surprising that it isn’t happening.
I don’t think that’s true at all. If I curl an HTTPS site with no redirects etc I feel fine reducing my security to “the site is compromised” or “TLS is compromised”. Neither is a memory safety issue unless something else has already broken.
Except that you’re receiving and processing bytes from unknown third parties long before you have any idea if the TLS is valid. Once the TLS validates then sure, after that point, if you trust the certificate and the site (both very big ifs!) then you might treat it as trusted from that point. But there’s a lot of space for issues before that
If I do curl https://hackedsite.com/, and curl is memory safe, then at worst it can write unintended data to stdout or a file
If I do curl https://hackedsite.com/, and curl is not, and the attacker knows how to exploit it, then they may be able to execute arbitrary code on my machine. Like plant a rootkit, advanced persistent threats, etc.
I think it is better to look at it from the ecosystem perspective than a personal perspective.
Something that is exposed to arbitrary input and does a lot of parsing, which curl is a prime example of, should definitely be memory safe. (Whether it’s written in Rust is a slightly different question, but that was the motivation here)
At the very least, the command line tool should be sandboxed. I wouldn’t be surprised if it is on OpenBSD, e.g. with pledge(), but I have not heard about that on other Unixes.
After doing all the setup, and before receiving input over the network, it should drop privileges so it can only write to the requested file / descriptor.
Even if curl had most components in Rust, it should still be sandboxed, for defense in depth. It is very widely used, like OpenSSH, and we just saw what a juicy target that was. It’s easy to imagine millions of dollars of resources on the other side.
At the very least, the command line tool should be sandboxed. I wouldn’t be surprised if it is on OpenBSD, e.g. with pledge(), but I have not heard about that on other Unixes.
I was curious and had a look at OpenBSD, since I too half-expect it by now. But! I don’t see any evidence of pledge being used, at least from the outside; no patches applied in the OpenBSD ports tree, and no such call in the upstream. Somehow I’m not convinced and want to spin up a VM just to ktrace it …
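The plan is roughly this (flags from memory, untested as of this comment):

ktrace -t c curl -s https://example.com/ > /dev/null
kdump | grep -w pledge    # no output would mean curl never pledges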
Hm, if I had time I would try to find an old CVE and exploit it … the double frees and such are very reliably caught by an ASAN build IME. And then it would be interesting to know if pledge() can stop it in production before that. I guess it might be hard to turn the CVE into a real exploit, but there are probably some repros out there
I’m not an OpenBSD user but I have been meaning to give it a whirl, after the xz / SSH / systemd issue, which affected both Debian and Red Hat …
Feeling a bit tired after six straight hours of code; going to try the VM install now, just for funsies! Will report back later. I don’t expect a sneaky pledge found its way in, but I’ve been more surprised before. I would expect sanitizers to be used by default, too, given their security-over-speed posture.
It’s a very nice system. I haven’t had the sense with any other OS that I could feasibly fit the entire thing in my head, notwithstanding the rate of change (or churn, less charitably) of most others. OpenBSD feels like a rock. A really good rock. You can turn it over in your hand for days and feel like you know every nook and cranny of it, and that’s quite a nice feeling to have for something you intend to expose to the Internet.
edit: I confirm there is no pledge made by the curl pkg on OpenBSD 7.6. Fwiw, the “native” such tool in OpenBSD is called ftp(1), which does of course pledge (“stdio rpath dns tty inet”, when tested with an HTTPS URL).
https://i.ibb.co/QD4HHzD/curldeps.png this is a dependency tree generated on my ubuntu with debtree --rdeps-depth=50 --condense libcurl4t64. i do not know if it’s complete but it should help convey my point.
libcurl is incredibly widely used by applications as an HTTP client. It’s the go-to HTTP library for applications written in C and a few other languages.
I guess maybe it’s not clear from my post but I was talking specifically about why my use of curl doesn’t change much whether it’s memory safe or not. I’m very aware of what curl is and what libcurl is.
The sad fact is that pretty much no software has been “working fine for decades”. Rather, curl (and pretty much all software, this isn’t special to curl) has been “sort of working for decades, but if an anonymous hacker from the internet pokes it the wrong way it gives them complete access to your system”, for decades. That’s not fine.
Curl does an unusually good job at presenting information about this, here’s a chart.
I’ll cheer on pretty much anyone working to replace C code, but it’s true there are a handful of projects (numbering in the single digits) that have demonstrated the ability to meet the extraordinarily rigorous high bar needed to successfully develop and deploy nontrivial C code, and curl is one of them. Not that they never make mistakes, but that the mistakes are relatively rare, and they own up to them and learn from them.
I would certainly not generalize from this because I don’t think it’s realistic for others to replicate the success of curl.
Dovecot. They use an incredibly stylised C, which uses a parallel data stack for returning large things by value and giving arena-like properties, which makes most lifetime issues trivial to reason about.
SQLite and Lua are the main other ones I can think of off the top of my head. I’ve heard others say qemu belongs on this list but I can’t vouch for it myself. Potentially also openssh?
Note that both SQLite and Lua have a relatively closed development process where they don’t like to accept patches from outside the maintainer team unless they are specifically bugfixes; curl is honestly the only C project I know that welcomes outside contributions but still somehow manages to maintain a very high level of code quality.
“sort of working for decades…” I think is a bit too strong. If we follow your rationale my car has been sort of working for a decade, but if someone pokes it the wrong way with a big iron spike the whole thing falls apart.
I think this points to the various ways someone can define what software is “working fine”. If it accomplishes its duty for decades with few incidents, is that working fine? Sure, it is not bulletproof and we all agree that ideally it would be impervious to attacks and free of bugs, but what software is?
It’s not just the car “falls apart”. It’s more like “the car explodes”. It doesn’t just not work, it damages things. Like putting the gasoline tank somewhere where it explodes during typical car crashes.
The other is that people are actively trying to attack it. The internet is a hostile and largely lawless environment, unlike a typical western city. If I’m forced to stretch the analogy to fit it’s taking that car that explodes really easily and driving it into a warzone where you are a target. It’s not fit for that purpose.
There is software that doesn’t operate in that sort of hostile environment (like say the code that generates sqlites parser from a DSL), but most software, including curl (and sqlite proper), operates in a hostile environment where security is a requirement.
Or to put it another way, “working fine” includes being able to respond to contingencies. My car isn’t “working fine” if the airbag is broken, even though I’ve never had to use the airbag. The fact that the happy path is working fine isn’t sufficient.
but what software is
Really gets at the core of what a lot of software development improvements are aiming at. We haven’t achieved this. We don’t know how to make software that “works fine” at scale, and this comes at an incredibly high cost to society. Billions or trillions of dollars lost to security incidents. Many people’s lives upended. Far more of both if you count non-security bugs as well. Projects like Rust are an imperfect attempt at improving the situation, but we all know that we haven’t solved it yet.
I’m fine rewriting something to be memory safe, I’d like for this to have worked as well, I’m just more concerned with the TLS layer than the HTTP layer when it comes to curl.
But the problem is Rust is treated by some like a magic bullet when we still see memory errors in these programs. There are other options that offer similar safety but don’t slap it all over their marketing. It’s like the crowd that noticed the “X written in Rust” tagline became a meme, then all just switched to “Memory-safe X”.
Ada is probably the most mainstream and in the same vein as Rust, and that’s ignoring things like Zig and Hare because they have slightly different safety properties. A bit farther afield are ATS and FStar (esp. via KaraMeL), which have really interesting applications. There are a bunch of other research and academic languages in the space as well.
I do agree that it’s limited, but it isn’t the sole operator in the space either.
and that’s ignoring things like Zig and Hare because they have slightly different safety properties.
Which is to say, strictly worse. Hare doesn’t consider memory safety a feature and is betting on spatially-memory-safe-by-default APIs being enough to make up for completely unsafe temporal memory handling.
I don’t recall Zig specifically but I recall it being somewhat similar, perhaps with hardened allocator support being encouraged.
Yeah, I agree with Ada maybe (I don’t know Ada’s safety properties when you manage memory yourself). I just wanted to point out that Zig and Hare definitely aren’t memory safe in the way that one might expect when comparing to Rust.
it should provide most of the same mechanisms Rust does, in a different style. Cyclone was similar, but you would use Regions to track allocations and had various pointers with different semantics. it’s definitely a rich space.
I do agree that Zig and especially Hare aren’t quite where Rust are tho! I suspect that’s roughly ok based on what they’re aiming for.
there’s also Cyclone, which is sadly dead, but which felt decidedly more accessible to me
I find this remark interesting because a major idea of Rust’s lifetime system was to be a more accessible (not necessarily in the same sense) version of Cyclone.
Well, admittedly I’ve written much more of other languages, but I didn’t find it difficult at the time; I do know that the Cyclone folks haven’t restarted their work since Rust has come along, since they feel it has completed what they set out to do.
You can’t even use pointers in Ada without getting UB if you do something wrong. SPARK solved this recently by adding borrow checking, but that’s the same approach as Rust.
That’s the big thing that unifies most (all?) languages from before Rust: being way more restrictive. Rust’s innovation was combining safety and convenient features in one language.
I apologize, you may know this better than I, but in order to hit that UB, you have to reach for the Unchecked Conversions for Access Types, right? Like an Unconstrained Array to Pointer via Unchecked Conversion. But yes, agreed, and I believe that’s generally because Access Types were meant to be something you reached for only after other mechanisms. To your point however, this is a trade-off on restrictiveness!
I do agree tho, Rust’s Big Idea was to make things much more accessibly safe and we are better for it, but I was originally attempting to answer what else could fill the niche, not putting Rust down in any way. Rust absolutely has done great things, but there are other safer languages out there as well. Like what the FStar people have accomplished with KaraMeL is interesting too!
Ada is not mainstream, despite being regularly mentioned when discussing Rust safety.
It ranks very low or below threshold in most usage metrics. Can you name any Ada software I might want on my laptop or server? Any device I might buy with Ada firmware? AFAICT, Ada has been used for some military contracts and that’s it.
For better or worse, Rust has succeeded where Ada hasn’t: being a reasonable and desirable language choice for normal projects.
Well, saying “the most mainstream” was not meant to convey “popular, well used, &c.” but rather “I wasn’t picking some language that no one has heard of.”
I don’t disagree tho, Ada is not incredibly popular, but it’s one that has a community around it; of the languages I mentioned, it’s the only one people know well enough to argue about, and it certainly has enough staying power that there’s a community to support it.
Wrt Ada projects, I think the one I’m most interested in would be Ironclad currently, but I’m sure there’s an Awesome Ada list with other things there. Whilst I understand that Ada isn’t the most popular, the question was “name anything else that fills the niche,” which is why it came to mind.
Can you name any Ada software I might want on my laptop or server
I wrote and use septum on my laptop. I wrote it in my spare time, and used it at work on large software projects, but once people heard it was written in Ada there was an instant response of “Ewww! Why’d you write it in Ada?” Works just fine for me, even though I never optimized it, sifting through massive projects (10+ million lines of code).
MISRA disallows malloc(), so hopefully you know exactly how much memory your program will need (and by extension, no flexible array members in a structure). In fact, a lot of the standard C functions are excluded from use (no use of stdio.h or time.h, nor the functions abort(), getenv(), bsearch() or qsort()). Each function must have one return statement, so no early returns either. No recursion, even if you can prove a bounded worst case (think balanced trees). No unions, no variable-argument functions. MISRA is a very restrictive standard and I doubt it comes close to how Rust is used.
By now I’m honestly not sure if the unreflecting fanboys or the anti-fanboys that need to bring up how bad the fanboys are at every opportunity, even if not really related to the topic at hand, are the more annoying crowd. (I.e. if you think this complaint is on-topic here, do you think Daniel took this project on because he believed it to be a magic bullet? Or maybe because he thought it was a worthwhile thing to try?)
man doesn’t, but it’s based on troff which is a fully-featured typesetting language, with a similar amount of power as TeX or PostScript.
The modern mdoc macros are rich enough that I don’t remember needing to drop down to low-level troff directives in my man pages, but the old man macros are comparatively feeble. Sadly the mandoc renderer needs to implement large amounts of troff for compatibility with the weird tricks that man pages use to fill gaps in the man macros.
does the macro expansion really need to run on every invocation? can’t you just ship pre expanded man pages that just need minor reformatting (for alignment or whatever) and printing out?
The macros would be used to adapt to different widths and scenarios, so preformatting them does kinda miss the point. That’s a trade-off individual systems can make.
Groff is fairly slow because every macro invocation expands into raw troff(7) requests. Mandoc is faster because it natively understands the man(7) and mdoc(7) languages and doesn’t lower everything to troff. For example, on the 31,000-line gcc manual page, mandoc takes 87 ms and groff 261 ms on my machine.
Not really a reason to preformat, unless you have a 1980s VAX.
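(For reference, the measurement is essentially just the following; paths and flags from memory:)

time mandoc -T ascii gcc.1 > /dev/null
time groff -T ascii -man gcc.1 > /dev/null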
this just looks extremely cursed tbh
it’s not even a terminal anymore, it’s a browser that happens to understand ansi escapes…
for what it’s worth the translation feature in firefox is terrible https://i.ibb.co/spNhv92Q/translation.png
i don’t know why they’d ship something this bad
It’s fully offline. The language models are a few megabytes. THAT’S AMAZING. Sure the translations are far from fluent, but I use it all the time to get the gist of articles in French, German, Spanish etc., and I don’t need them to be fluent, just understandable.
If you want online LLM stuff I mean they have you covered but again that’s online and LLM.
i get it, but please have a look at the screenshot. it’s not even just the translation that’s bad, it’s also (presumably) parsing the html incorrectly, doubling up the text in various parts (e.g. altano’s username and the flag button), and adding nonsensical text (e.g. “si tratta di un’azienda” == “it’s about a company”???). it’s so odd that the translation built into a major browser cannot parse html correctly
but of course, the main issue is that the translation is bad. the sentences sound weird, words have incorrect articles/endings (which are absolutely trivial to get right in that case). it even translated “gaslight” into “austriare” which is a completely made up word that makes zero sense
and this was english to italian, which shouldn’t be as hard as chinese to turkish or finnish to japanese
i’m sorry for complaining about someone’s hard work that they provided for free. it’s better than nothing, and it’s better than automatically translating things literally word for word, but it’s just… really not much better than that…
truth be told I’ve only been using it with English on the target end and it’s been fine when translating a stray site in French, German, Russian, Chinese, etc. I never translate text to my native language, before LLMs the translators all were terrible.
Came here to comment the same. The quality of the translations isn’t even the biggest issue though. The UX Chrome has is just way better. You can force translate and when the page updates dynamically it kinda “just works”. With Firefox it is impossible to do any real tasks on a government/bank website. The fact that there is no option in the context menu is also baffling to me.
One day per line sucks if you wanna track more than 2 or 3 events per day.
some other obvious ux issues:
some of these can be addressed with tooling, but then wouldn’t you just rather use something without all the limitations?
Tbh, I don’t see why that limit is introduced. I just add extra lines for each new event, and his whole workflow still works perfectly fine
Grepability, maybe? With one day per line, searching for an event always gets the date of the event, too. (You could copy the event date for each line, but maybe that has some other problem?)
A markdown based format with a h2 for each day would be better imho. (h3 for each event)
Perhaps a naiive question, but why would you ever want to statically link your systems libc? Surely that’s the one dynamic library you can guarantee is there.
On Linux, not really, different distros have different non compatible glibc versions. Particularly a problem when running software built using a normal distro on some LTS red hat like Enterprise distro. I doubt this is a problem in open BSD land though.
There’s also a small performance difference with static linking. Less indirection, link time optimization. At the cost of not sharing memory pages with other processes. Maybe that’s why they’re doing it?
Heck, forget different versions of glibc, you might find yourself on a Linux that uses a different libc altogether, like musl (Alpine Linux) or bionic (Android). And then there are weirdos like NixOS where
/usr/libdoesn’t even exist and any copies of glibc you might find are at weird random paths with hashes in them. Static linking means you just run fine on all of these.On OpenBSD, I think the situation is the opposite: libc is stable, the syscall interface is not. By statically linking libc, unless you’re a program like
lsshipped with OpenBSD itself, you’re just inviting breakage for no good reason.The main reason I see (in the case of OPEN SD) is security. Statically linking against libc means that all symbols are stored in your binary, which means you cannot use eg.
LD_PRELOADto make programs run malicious code.Another reason is to guarantee that your binaries are self contained, and to not depend on any external file in case of a system recovery scenario where all your partitions are not mounted for example.
In every scenario in which an attacker can set a malicious LD_PRELOAD, your system is already fully compromised
Right, that’s a fair point.
Thanks to your comment I started looking up what would be the reasons and context for all this static libc.a linking.
From what I could find, the complain seems to come from Go that used to make direct syscalls to the kernel, rather than doing it through libc (up until 1.16). OpenBSD released a
pinsycall(2)syscall, which somehow forced Go to comply, and as Go (usually) creates static binaries, it is then forced to statically link against libc.a.“It rather involved being on the other side of this airtight hatchway”
i worked on debuggers and i use them in sometimes
weirdcreative ways. for instance i recently turned rr into a bash profiler, by recording the execution of the process and then injecting calls to dump the current line and function name at various points in the recording. this technique of recording the process and extracting a profile from the recording was invented by a dear friend i met when working on a proprietary debugger. it’s a really powerful idea because the program can run unmodified and it doesn’t require support from the system/interpreter/runtime, thus collecting the profile adds basically zero overhead (the overhead of the debugger isn’t necessarily zero, but it doesn’t modify the state of bash, unlike the overhead of printing this information from within bash which can be very significant in certain optimised programs, rendering the profile meaningless)here’s the version for doing that with rr https://gist.github.com/izabera/ac954b99e7022ddd157a5034dfbd04c1 and it’s trivially adaptable to other languages
$1700 is quite a large budget. If the total cost were halved, that would still be a sizeable budget. I feel like tech writers these days are forgetting what the phrase “on a budget” implies.
I agree, it is a big sum of money. I interpret “on a budget” as “relatively cheap”, not as “nearly free”. I think it is pretty cheap compared to what one normally needs to pay for that amount of VRAM. To me, the term is more justified here than in a post where someone buys a second-hand Apple laptop for $1100 and claims it is a cheap solution to browse the web.
I really hope AMD catches up and prices come down, because AI-capable hardware is not nearly as accessible as it should be.
Maybe it would be more accurate to say “on a budget” is a form of weasel word. Its interpretation depends on your familiarity with current prices and your socioeconomic status.
From my very subjective (and probably outdated) PoV…
You can imagine the surprise I felt (or was that shame?) when seeing a $1700 price tag on a “budget” desktop PC that can do AI.
“Building a personal, private AI computer for $1700” would communicate the intent a little better, without suggesting to the reader anything about their ability to afford it.
Btw, I don’t mean to imply any wrong was committed. I’m just pointing out that the wording on the post had some unintended effects on at least this reader. To a large degree that is unavoidable, no matter what a person publishes on the web.
A few weeks ago I saw a reference to someone on ex-Twitter speccing an LLM workstation for $6,000, so $1,700 is a on a budget compared to that.
They get 5 tokens per second on the 70b models
Agreed. I’m running local models with an RTX 3060 12 GB that costs about $330 on NewEgg or 320€ new at ebay.de, and it’s actually useful. The context sizes must be kept tiny but even then it can provide basic code completion and short chats.
The code they write is riddled with subtle bugs but making my computer program itself seems to never get old. Luckily they also make it quicker to write throwaway unit tests. The small chat models are useful for language tasks such as translating unstructured article title + URL snippets to clean markdown links. They also act as a thesaurus, very useful for naming things, and can unblock progress when you’re really stuck with some piece of code (rubberduck debugging). Usually the model just tells me to “double-check you didn’t make any mistakes” though :)
On the software side I use ollama for running the models, continue.dev for programming (it’s really janky), and the PageAssist Firefox extension for chat.
Apparently the Commodore Amiga 500 was introduced at 699 USD in 1987 - just shy of 2 000 USD inflation adjusted.
Guess that says more about how much prices for computers have come down, than anything.
If you look at “modern” gaming graphics cards, that is so cheap I was actually surprised. (even compared to a 30xx from some years ago)
If the median price of a thing is high, then absolute values don’t matter. A new car for under 10k EUR would still be “on a budget”, even if it’s a lot of money.
I posted some notes here - Gemini 2.0 Flash Lite is half the price of GPT-4o mini and Gemini 2.0 Flash is a little cheaper than GPT-4o mini too: https://simonwillison.net/2025/Feb/5/gemini-2/
I also got the three new models to draw me pelicans on bicycles.
I can’t tell what the pelican test is supposed to show. Which of the results is better? Is this an improvement over the previous generations? Is this a meaningful indicator of intelligence? For instance if I drew a shitty pelican on a bike, would that mean I didn’t pass this modified version of the Turing test?
I talked about that on a podcast recently. Here’s the transcript. Short version: I think it’s amusing to get hyped technology to draw crap pictures - but there does also seem to be a weird correlation between how good a model is and how well it can draw a pelican on a bicycle.
nice, thanks :)
Eventually people are going to start training on the synthetic pelican datasets to juice their Simon benchmark scores
So it sounds like there was a battery concern? To avoid physical issues they severely capped the max charge on one of the two battery models used in these devices. However it seems that they wanted to keep this quiet, so they got some engineer to do a hack build rather than the usual CI?
Some obvious questions:
It’s a little more complicated than that. Apparently, only the kernel build was from a random engineer. The OS as a whole went through normal release processes. Apparently the kernel builds being separate and then getting vendored into the OS build is normal.
That… doesn’t make me feel much better 😅
Yes, it’s terrible practice. Not going through CI usually means no control over what actually went into the binary because the environment it’s built in is not controlled. It’s hard to reproduce this build and there is an endless source of potential errors, like embedding the wrong version numbers (because CI usually takes care of version numbering), using the wrong branch, using the wrong compiler, using the wrong build flags, etc. There’s a reason why this stuff is automated and under strict version control.
It makes me feel marginally better. At least that indicates the signing key isn’t available to just anyone - presumably the config to include the hand-built kernel had to go through normal review and CI processes.
But it doesn’t make me feel that much better. The whole thing is still sketchy as hell.
Yeah, I’m sure that binary kernel they checked in was well reviewed. I guess it is at least traceable to a human.
Going from 4.45V to 3.95V is a massive jump. For context: most of the usable energy of a lithium ion battery is between its max charge voltage (eg 4.2V) and about 3.4V. Below that there is only a tiny bit of capacity and the voltage plummets quickly (look up “lithium ion discharge curves”); the exact choice of low cutoff voltage (2.6-3V are common) is a bit arbitrary and only grants you a few more % at best.
It could be a well researched change with actual data behind it. But I vote corporate laziness instead, Google probably just doesn’t care and put pressure on the person to do something quickly and cheaply.
i don’t know if this was the case elsewhere, but in the uk google offered free battery swaps at EE stores or £40 for each device (i think regardless of the condition), which is probably fair for a 5 year old phone?
Posted this due to this gem of a statement and recent discussions both of how to combat AI scrapers and the in-band signaling attacks that prompt injection opens up:
Just waiting for big websites to start happy-pathing the AI’s interaction with it, not the human’s.
that’s called an api
It isn’t looking for APIs, it is trying to navigate like a normal user. So I imagine people will start to figure out how to, for example, get the bot to navigate through a referral link. Interestingly, this appears to be all visual-based, not scraping the content, so just hiding the prompt injection in HTML comments won’t suffice.
Also, what do they consider prompt injection here? How much information disclosure about the previous step on another website they consider OK?
so i filled the ofcom form to assess the risks with what i think applies to lobsters. like for basically every other website on the internet, the result is that every feature of lobste.rs can be used to commit terrorism and child exploitation in 37 different ways. posting urls? clearly urls to child porn. writing comments? that’s what a terrorist would do. search function? it’s for searching for terrorism isn’t it. does not block minors? this just screams grooming. can see other people’s profiles? of course it’s for human trafficking. does not ask for an id? probably because lobste.rs is full of crime
https://www.ofcom.org.uk/os-toolkit/assessment-tool/?question=playback&stfid=&previousId=189445&type=UserToUser&id=7994658c-5d85-4717-90af-72adeb81a451
This is a weird one for multiple reasons that will become obvious as soon as you start reading https://github.com/shinh/elvm
relevant resources: https://www.youtube.com/watch?v=PH9q0HNBjT4 and https://www.youtube.com/watch?v=yPfagLeUa7k
“Protesting is allowed as long as it is ineffective”.
The FOSDEM organizers, insofar as they are FOSDEM organizers, exist for the purpose of organizing FOSDEM. In that role, they have decided that there will be such and such a speaker at such and such a time at such and such a place.
You can disagree with that decision. You can register your disagreement through channels they’ve provided. If that’s proven ineffective, you can disengage with the whole affair. If you don’t like that option, you can register your disagreement by non-obstructive protest, which clearly indicates friendly criticism. Or if you don’t like that option, you can register your disagreement via obstructive protest. But in that case you have declared yourself to desire to obstruct FOSDEM’s reason for existence; it’s not surprising that FOSDEM will in turn assert its will to proceed with itself.
There’s a bizarre trend among vaguely-radical Westerners whereby they expect to be able to disrupt the operations of an organization and not suffer any opposition for their own disruption because they call the disruption “protest”. This is a very confused understanding of human relations. If you are intentionally disrupting the operations of an organization, at least in that moment, the org is your enemy and you are theirs. Of course your enemy is not going to roll over and let you attack it. Own your enemyship.
FOSDEM is volunteer-run, by people who are involved in F/LOSS themselves. It exists thanks to a lot of goodwill from the ULB. Protestors are not fighting the cops or some big corporations, they’re causing potential grief for normal working-class people like themselves.
While the ULB campus always gives me the impression it’s not averse to a bit of political activism, it would be an own goal if some tone deaf protestors were to jeopardise the possibility of future FOSDEM conferences.
Dorsey won’t care, he’ll take his same carefully rehearsed speech and deliver it again at any of the hundreds of for-profit tech conferences. They’ll be delighted to have him. But there’s really only one volunteer-run event of FOSDEM’s scale in the EU.
By all means, boo away Dorsey, but be considerate of the position of the people running this.
As a former and future conference organizer who has taken a tiny paycheck from three of the ~dozen conferences I’ve organized but never a salary °, this is 100% true. I’ve been fortunate that my conferences have had no real controversy, but two that did have a bit of a tempest in a teapot went in two different directions. In one, the controversy was forgotten in a week, aside from a couple of people who just couldn’t let it go on Twitter; it took about a month for their attention to swing elsewhere. In retrospect, almost a decade later now, we’d have made a different decision, and the controversy wouldn’t have happened. Unfortunately, the other was life-altering for a few of the organizers, because of poor assumptions and unclear communication on our part, and a handful of attendees who felt it was reasonable to expect intra-day turnaround on questions, leading to a hostile inquiry two weeks before a $1M event for 1,500 people put on by eight volunteers spread way too thin.
I’ve also had to kick out attendees who were causing a disruption. No, man, you can’t come into the conference and start doing like political polling, petitioning, and signature collection, even if I agree with you, and might have considered setting aside a booth for you if you’d arranged for it ahead of time.
As conference organizers, we have a duty to platform ideas worth being heard and balance that with the person presenting them. The most effective way to protest a person or a presentation is not to attend, and the second most is to occupy space in silence with a to-the-point message on an unobtrusive sign or article of clothing. Anything more disruptive, and you’re creating a scene that will get you kicked out according to the customs of the organizer team, the terms of attendance, and the laws of the venue, if the organizers enforce their code of conduct. I’ve never physically thrown someone out of an event in my 24 years of event organization, but I’ve gotten temptingly close, and been fortunate that someone with a cooler head yet more formidable stature intervened (and I was 6’2” 250 lbs at the time!).
° A tenet of my conf organizer org is “pay people for their work.”
Or you can just acknowledge that a person has done enough damage in their life that you won’t let them shout out any other weird takes.
It’s not like FOSDEM is mainstream enough that it needs to have people who are more well-known outside of FLOSS circles to keynote. There are enough figureheads (often people who spent decades of their life doing good things) who would be more well-suited. This is not some random enterprise conference where you may invite any random startup CEO to shill their stuff. FOSDEM should do better. (I remember phk, and it was great)
that’s cool but literally nobody had heard of block’s involvement in open source until this was announced, so i don’t know what ideas you’re referring to
Respectfully I disagree with this. Peaceful protest can be non-disruptive but still effective. If Jack is talking the whole time with people on the stage in protest around him, I think a lot of attendees will inevitably read Drew DeVault’s article and understand his argument.
Drew lied about FOSDEM taking money from Dorsey for the keynote. You should take anything he says with a large grain of salt.
Where did he say that? I don’t see that claim anywhere in the article.
There you go.
What? How is that a “lie”? Dorsey’s blockchain bullshit company is a main sponsor of FOSDEM this year.
I was going to point this out, but then I realized the concern is more likely the “presumably” in “presumably being platformed”. In other words, they’re not saying that Block’s sponsorship is a lie, they’re saying FOSDEM did not accept a bribe: Block’s sponsorship is not the reason Dorsey got a keynote as DeVault alleges.
Yes, that could be it. If so, a bit disappointing to see such misconstrual or misrepresentation of DeVault’s clear statement of presumption as a statement of fact. (There’s also a lot of grey area between “bribe” and “total neutrality”. Patronage is a thing.)
Would you agree then with the statement that “presumably DeVault lied when he construed Block’s main sponsorship as the reason Dorsey got the keynote selection”? Would “presumably” have made @talideon’s comment acceptable?
The linked article makes it clear that patronage is not in play for this event.
Not trying to be adversarial, just trying to highlight how others are reading DeVault’s statement in light of the clear answer from FOSDEM that Block’s sponsorship had no role in his keynote. I don’t care one way or the other about the keynote, have no feelings either way about DeVault (who seems to be the most polarizing figure on this site), and will not be at FOSDEM. But when I read DeVault’s wording, I generally understood he believes FOSDEM accepted Block’s sponsorship in return for a keynote address and “presumably” is there to avoid any legal issues from making such a claim.
No, because there’s no need to presume anything when you can just go look at his words. None of it gets anywhere near “lie”, to me. It strikes me as easy and reasonable to take his words as a true statement of his belief.
Thanks for pointing that out.
We’re on roughly the same page here. To me, he’s definitely casting aspersions on the integrity of the selection process, though I wouldn’t go so far as to say there’s a clear belief of quid pro quo: this is the bit where the grey area is.
I like the fact that they demoted Mr Dorsey to an ordinary main track! Free speech but no billionaire privilege!
That’s fair. Have a wonderful day!
Protesting is the act of clearly communicating that you don’t like something. Effective protests are those where the specific ideas being communicated are convincing enough, or the people doing the protest are important enough or widespread enough, that you take the communication seriously.
Communication takes many forms and some of them are more disruptive than others. But it is a popular fiction that disrupting and shouting down events because you don’t like the speakers is an effective form of protest - in fact there could be nothing more ineffective than associating your side of the argument with something that would make an ordinary attendee annoyed. Only if an ordinary attendee would side against the speaker by default would this be a good strategy.
But when employed, usually this is not about actually communicating that you protest against the thing, it’s about attempting to get your way by force, just with a 1A-tinted veneer. Protesting is allowed as long as it is actually protesting, instead of trying to take control.
This is just another example of the black and white thinking responsible for a large part of the awfulness in the world.
Effectiveness isn’t binary. The allowed form of protesting is probably not as effective as it could be in a different, disallowed form. That doesn’t make it ineffective.
Protesting is making your disagreement and numbers visible. If you’re physically stopping somebody from doing whatever it is they’re planning and you’re against, it is not a protest, it’s just disrupting. And by implication it means you have the power to stop it and are not the oppressed underdog you are likely proclaiming to be.
That’s a very narrow definition of protest which - afaict - isn’t in line with how that word is used by the rest of the English-speaking world.
‘Protesting is allowed as long as it is ineffective’ is an appeal to the moral valence of the particular action of protesting. If you change the meaning of the word to refer to a completely different thing, you cannot keep the moral valence. Physically stopping something from occurring is not something anyone has to tolerate just because you used a certain set of Latin glyphs/mouth-noises to identify it.
Could be. I’m not very pleased with how I worded that, so I might not be very clear about what I meant exactly, but it seems disingenuous to go back and edit it, given I’m not sure it’d end up better anyway. I’m trying to point at a meaningful difference between bringing attention to an issue and the size of the cohort in agreement with you, versus unilaterally acting to stop people doing something simply because you wish they wouldn’t. Of course everybody thinks that they’re in the right and therefore their actions are justified, but that can’t really be the case all that often, since everybody thinks it.
The latter sounds like direct action. It is widely regarded as a kind of protest. Protests typically involve a small minority of the population deliberately causing a disruption to force a response. Often, a majority of the contemporary population disagree with the aims of protest, even protests that we consider good and/or effective in retrospect.
Further reading:
this comment didn’t age well
note that unset FLY_REGIONS["a100-40gb"] is incorrectly quoted: it gets expanded as a glob (like ls files.[ch]pp). for instance it will match a file called FLY_REGIONSa if it exists, so you’ll end up unsetting the variable FLY_REGIONSa instead of index a100-40gb of FLY_REGIONS. you want something like unset "FLY_REGIONS[a100-40gb]"
Never change, bash, never change.
Yeah also the bit about ${!A[@]} for keys and ${A[*]} for values is wrong. It is ${A[@]} for values. Example - add a space to the values:
This is probably not what you want:
This is what you want – there are 2 values in the container:
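A minimal sketch of the difference, with a hypothetical associative array A holding two values that each contain a space (note that the iteration order of bash associative arrays is unspecified):

```
declare -A A=([x]='a b' [y]='c d')

# "${A[*]}" joins all values into one single word
printf '<%s>\n' "${A[*]}"    # one line, e.g. <a b c d>

# "${A[@]}" expands to one word per value: 2 values in the container
printf '<%s>\n' "${A[@]}"    # two lines, e.g. <a b> and <c d>
```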
This is the same lesson from Thirteen Incorrect Ways and Two Awkward Ways to Use Arrays (2016)
Associative arrays have also changed a bunch during bash 4 – there are a bunch more footguns.
Koichi Murase (who is also a bash contributor) has recently fixed a ton of issues in the OSH implementation, so I’m excited that we will be extremely compatible in this regard.
On the other hand, YSH also has nestable dicts and lists, not just bash-style data structures. In bash, you can’t have an array as the value of an associative array, or vice versa. They can only be strings.
In YSH you can arbitrarily nest, which is a lot more powerful.
In particular, bash assoc arrays can’t express JSON, but YSH dicts and lists can.
Might as well post how you do it in YSH:
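Something like this, as a sketch pieced together from the YSH docs rather than the original post (the dict contents here are invented):

```
# a dict keyed by GPU type, like the article's FLY_REGIONS (values invented)
var FLY_REGIONS = {'a100-40gb': 'ord', 'a100-80gb': 'mia'}

for k, v in (FLY_REGIONS) {
  echo "$k => $v"
}

# and unlike bash assoc arrays, dicts serialize straight to JSON
json write (FLY_REGIONS)
```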
Output
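Approximately, for the sketch above:

```
a100-40gb => ord
a100-80gb => mia
{
  "a100-40gb": "ord",
  "a100-80gb": "mia"
}
```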
Main differences from Python/JS:
() goes around the variable, as in (FLY_REGIONS), because you can also iterate over shell “words”
for k, v does the “right thing”
And of course in YSH, all the good parts of shell work! Starting processes, pipelines, redirects, etc.
OSH should run the bash code exactly as is, but YSH has a cleaner syntax, and more powerful data structures, because they can be nested.
ShellCheck will warn about this. If anybody is writing non-trivial amounts of shell, I highly recommend using it.
It also has a detailed page about every possible warning explaining why it is there and what to do to address the problem. For this example, here’s the page describing warning SC2184.
Each of my utility scripts contains:
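For example (a sketch; the actual snippet could differ, assuming shellcheck is on PATH and "$0" resolves to the script itself):

```
# lint this very script with ShellCheck before doing anything else
if ! shellcheck "$0"; then
    exit 1
fi
```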
… which causes the script to check itself, and exit upon errors.
this is literally just the cheapest insertion: it greedily inserts at every step and never backtracks to find better solutions. i don’t know why they called it dynamic lookahead since it’s clearly fixed to 1
i am somewhat surprised that it does significantly better than the existing solvers
I don’t see any evidence that it does any better than classic 2/3-opt (both by looking at the algorithm description and noticing the lack of citation), let alone the usual benchmark primal heuristic solver, http://webhotel4.ruc.dk/~keld/research/LKH/
EDIT: also “random” instances can be misleading. A lot of NP-complete problems are hard only around a phase transition from trivially feasible to obviously infeasible instances. I’d look at instance sets like http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/
Can you expand what you mean by this?
https://cstheory.stackexchange.com/questions/33550/how-common-is-phase-transition-in-np-complete-problems
Yeah, this is actually very similar to—but substantially simpler than!—the Sparse A* search algorithm, which is the kind of thing a naïve recent graduate with effectively no knowledge of CS can implement quickly and easily. (That’s not a hand-wave statement; it’s a description of me a bit over 15 years ago!) It’s surprising it does as well as it does on these cases, but like you and @pkhuong’s sibling comment, it would be extremely surprising to me if this were what it claims to be. Indeed: I would be rather surprised, given the history of the problem, if it hadn’t been tried before, likely by precocious undergrads! (I have been surprised before, but even so: suspicion is appropriate.)
what the fuck
the docs look excellent, good work on that!
https://en.cppreference.com/w/cpp/compiler_support
I did not realize msvc was so far behind on c++23. I guess even in 2024 for “portable” code it might be best to stick with C++20 unless one can use gcc/clang on Windows.
Seems reasonable. I mean, if no one wants it and no one wants to work on it, don’t do it. Personally, 99% of curl usage for me is curl | sh, and reduces to “is the host compromised” and “is tls safe”, not “is curl memory safe”, so while it’d be nice to answer that last one with “yes” it’s not something I’m overly concerned with.
It’s probably used by a ton of scripts that come installed by default with your distro, some of which can run in privileged contexts. It’s not just curl | sh either: the library is used by a lot of projects because it’s so featureful and reliable. A memory safe curl would have reduced the attack surface for a lot of people, and it’s sad to see this end.
Even in the case where it’s used by scripts, how often is it talking to an untrusted resource? Sometimes, but I don’t expect it to be very often for things like setting up a docker image, etc.
Having curl be memory safe would be good and cool, in case it’s not clear that I feel that way. But if no one wants to work on it and users aren’t pushing hard for it it’s not surprising that it isn’t happening.
Unless you’re curling localhost you’re using it to access an untrusted resource. So, basically always
I don’t think that’s true at all. If I curl an HTTPS site with no redirects etc I feel fine reducing my security to “the site is compromised” or “TLS is compromised”. Neither is a memory safety issue unless something else has already broken.
Except that you’re receiving and processing bytes from unknown third parties long before you have any idea if the TLS is valid. Once the TLS validates then sure, after that point, if you trust the certificate and the site (both very big ifs!) then you might treat it as trusted from that point. But there’s a lot of space for issues before that
In curl? Where? Do you mean at the TLS layer? That can use rust already and wouldn’t be covered by curl anyways.
Space at the HTTP layer? I mean, some, but curl does HTTP.
The attacks could be worse though:
If you curl https://hackedsite.com/, and curl is memory safe, then at worst it can write unintended data to stdout or a file.
If you curl https://hackedsite.com/, and curl is not, and the attacker knows how to exploit it, then they may be able to execute arbitrary code on your machine. Like plant a root kit, advanced persistent threats, etc.
I think it is better to look at it from the ecosystem perspective than a personal perspective.
Something that is exposed to arbitrary input and does a lot of parsing, which curl is a prime example of, should definitely be memory safe. (Whether it’s written in Rust is a slightly different question, but that was the motivation here)
Yes, of course, I would like curl to be memory safe.
Yeah although now that I look at https://curl.se/, it’s a project to aspire to:
Although when I look at a sample bug, there does seem to be a ridiculously large surface area to curl, like https://curl.se/docs/CVE-2022-42915.html
i.e. seemingly infinite combinations of features.
At the very least, the command line tool should be sandboxed. I wouldn’t be surprised if it is on OpenBSD, e.g. with pledge(), but I have not heard about that on other Unixes.
After doing all the setup, and before receiving input over the network, it should drop privileges so it can only write to the requested file / descriptor.
Even if curl had most components in Rust, it should still be sandboxed, for defense in depth. It is very widely used, like OpenSSH, and we just saw what a juicy target that was. It’s easy to imagine millions of dollars of resources on the other side.
I was curious and had a look at OpenBSD, since I too half-expect it by now. But! I don’t see any evidence of pledge being used, at least from the outside; no patches applied in the OpenBSD ports tree, and no such call in the upstream. Somehow I’m not convinced and want to spin up a VM just to ktrace it …
Hm, if I had time I would try to find an old CVE and exploit it … the double frees and such are very reliably caught by an ASAN build IME. And then it would be interesting to know if pledge() can stop it in production before that. I guess it might be hard to turn the CVE into a real exploit, but there are probably some repros out there.
I’m not an OpenBSD user but I have been meaning to give it a whirl, after the xz / SSH / systemd issue, which affected both Debian and Red Hat …
Feeling a bit tired after six straight hours of code; going to try the VM install now, just for funsies! Will report back later. I don’t expect a sneaky pledge found its way in, but I’ve been more surprised before. I would expect sanitizers to be used by default, too, given their security-over-speed posture.
It’s a very nice system. I haven’t had the sense with any other OS that I could feasibly fit the entire thing in my head, notwithstanding the rate of change (or churn, less charitably) of most others. OpenBSD feels like a rock. A really good rock. You can turn it over in your hand for days and feel like you know every nook and cranny of it, and that’s quite a nice feeling to have for something you intend to expose to the Internet.
edit: I confirm there is no pledge made by the curl pkg on OpenBSD 7.6. Fwiw, the “native” such tool in OpenBSD is called ftp(1), which does of course pledge("stdio rpath dns tty inet") when tested with an HTTPS URL.
Which scripts, which distros?
https://i.ibb.co/QD4HHzD/curldeps.png this is a dependency tree generated on my ubuntu with debtree --rdeps-depth=50 --condense libcurl4t64. i do not know if it’s complete but it should help convey my point.
left = things that depend on libcurl
right = things libcurl depends on
libcurl is incredibly widely used by applications as an HTTP client. It’s the go-to HTTP library for applications written in C and a few other languages.
I guess maybe it’s not clear from my post but I was talking specifically about why my use of curl doesn’t change much whether it’s memory safe or not. I’m very aware of what curl is and what libcurl is.
“Memory-safe X” is like a red flag to me at this point, signaling a rewrite of something that has been working fine for decades.
The sad fact is that pretty much no software has been “working fine for decades”. Rather, curl (and pretty much all software, this isn’t special to curl) has been “sort of working for decades, but if an anonymous hacker from the internet pokes it the wrong way it gives them complete access to your system”, for decades. That’s not fine.
Curl does an unusually good job at presenting information about this, here’s a chart.
I’ll cheer on pretty much anyone working to replace C code, but it’s true there are a handful of projects (numbering in the single digits) that have demonstrated the ability to meet the extraordinarily high bar needed to successfully develop and deploy nontrivial C code, and curl is one of them. Not that they never make mistakes, but the mistakes are relatively rare, and they own up to them and learn from them.
I would certainly not generalize from this because I don’t think it’s realistic for others to replicate the success of curl.
What are some other projects in this list?
Dovecot. They use an incredibly stylised C, which uses a parallel data stack for returning large things by value and giving arena-like properties, which makes most lifetime issues trivial to reason about.
Sqlite probably
SQLite and Lua are the main other ones I can think of off the top of my head. I’ve heard others say qemu belongs on this list but I can’t vouch for it myself. Potentially also openssh?
Note that both SQLite and Lua have a relatively closed development process where they don’t like to accept patches from outside the maintainer team unless they are specifically bugfixes; curl is honestly the only C project I know that welcomes outside contributions but still somehow manages to maintain a very high level of code quality.
As a QEMU developer, no not really. We do our best but we’re totally not on the level of curl or SQLite.
Well, thanks for your honesty!
“sort of working for decades…” I think is a bit too strong. If we follow your rationale my car has been sort of working for a decade, but if someone pokes it the wrong way with a big iron spike the whole thing falls apart.
I think this points to the various ways someone can define software that’s “working fine”. If it accomplishes its duty for decades with few incidents, is that working fine? Sure, it is not bullet proof, and we all agree that ideally it would be impervious to attacks and free of bugs, but what software is?
I think that analogy has multiple issues.
It’s not just the car “falls apart”. It’s more like “the car explodes”. It doesn’t just not work, it damages things. Like putting the gasoline tank somewhere where it explodes during typical car crashes.
The other is that people are actively trying to attack it. The internet is a hostile and largely lawless environment, unlike a typical western city. If I’m forced to stretch the analogy to fit it’s taking that car that explodes really easily and driving it into a warzone where you are a target. It’s not fit for that purpose.
There is software that doesn’t operate in that sort of hostile environment (like say the code that generates sqlites parser from a DSL), but most software, including curl (and sqlite proper), operates in a hostile environment where security is a requirement.
Or to put it another way, “working fine” includes being able to respond to contingencies. My car isn’t “working fine” if the airbag is broken, even though I’ve never had to use the airbag. The fact that the happy path is working fine isn’t sufficient.
This really gets at the core of what a lot of software development improvements are chasing. We haven’t achieved this. We don’t know how to make software that “works fine” at scale, and this comes at an incredibly high cost to society. Billions or trillions of dollars lost to security incidents. Many people’s lives upended. Far more of both if you count non-security bugs as well. Projects like Rust are an imperfect attempt at improving the situation, but we all know that we haven’t solved it yet.
I’m fine rewriting something to be memory safe, I’d like for this to have worked as well, I’m just more concerned with the TLS layer than the HTTP layer when it comes to curl.
Curl even supports a Rust-based TLS backend, rustls.
But the problem is Rust is treated by some like a magic bullet when we still see memory errors in these programs. There are other options that offer similar safety but don’t slap it all over their marketing. It’s like the crowd that noticed the “X written in Rust” tagline became a meme then all just switched to “Memory-safe X”.
Name literally anything that fills Rust’s niche.
Ada is probably the most mainstream and in the same vein as Rust, and that’s ignoring things like Zig and Hare because they have slightly different safety properties. Things slightly further afield are ATS and FStar (esp via KaraMeL), which have really interesting applications. There are a bunch of other research and academic languages in the space as well.
I do agree that it’s limited, but it isn’t the sole operator in the space either.
Which is to say, strictly worse. Hare doesn’t consider memory safety a feature and is betting on spatially memory-safe default APIs being enough to make up for having no temporal memory safety at all.
I don’t recall Zig specifically but I recall it being somewhat similar, perhaps with hardened allocator support being encouraged.
Well, and also to avoid comments that aren’t about Ada, ATS, or the like haha.
But yes, they don’t have the exact same properties; Zig is closest, but certainly not at the same level.
Honestly, there’s also Cyclone, which is sadly dead, but which felt decidedly more accessible to me. Alas, we left that direction.
Yeah, I agree with Ada maybe (I don’t know Ada’s safety properties when you manage memory yourself). I just wanted to point out that Zig and Hare definitely aren’t memory safe in the way that one might expect when comparing to Rust.
Generally, and I’m not a hardcore Ada user, you:
It should provide most of the same mechanisms Rust does, in a different style. Cyclone was similar, but you would use Regions to track allocations and had various pointers with different semantics. It’s definitely a rich space.
I do agree that Zig and especially Hare aren’t quite where Rust is tho! I suspect that’s roughly ok based on what they’re aiming for.
I find this remark interesting because a major idea of Rust’s lifetime system was to be a more accessible (not necessarily in the same sense) version of Cyclone.
Well, admittedly I’ve written much more of other languages, but I didn’t find it difficult at the time; I do know that the Cyclone folks haven’t restarted their work since Rust has come along, since they feel it has completed what they set out to do.
You can’t even use pointers in Ada without getting UB if you do something wrong. SPARK solved this recently by adding borrow checking, but that’s the same approach as Rust.
That’s the big thing that unifies most (all?) languages from before Rust: being way more restrictive. Rust’s innovation was combining safety and convenient features in one language.
I apologize, you may know this better than I, but in order to hit that UB, you have to reach for the Unchecked Conversions for Access Types, right? Like an Unconstrained Array to Pointer via Unchecked Conversions. But yes, agreed, and I believe that’s generally because Access Types were meant to be something you reached for after trying other mechanisms first. To your point however, this is a trade-off on restrictiveness!
I do agree tho, Rust’s Big Idea was to make things much more accessibly safe and we are better for it, but I was originally attempting to answer what else could fill the niche, not putting Rust down in any way. Rust absolutely has done great things, but there are other safer languages out there as well. Like what the FStar people have accomplished with KaraMeL is interesting too!
Ada is not mainstream, despite being regularly mentioned when discussing Rust safety.
It ranks very low or below threshold in most usage metrics. Can you name any Ada software I might want on my laptop or server? Any device I might buy with Ada firmware? AFAICT, Ada has been used for some military contracts and that’s it.
For better or worse, Rust has succeeded where Ada hasn’t: being a reasonable and desirable language choice for normal projects.
Well, saying “the most mainstream” was not meant to convey “popular, well used, &c.” but rather “I wasn’t picking some language that no one has heard of.”
I don’t disagree tho, Ada is not incredibly popular, but it’s one that has a community around it; of the languages I mentioned, it’s the only one people know well enough to argue about, and it certainly has enough staying power that there’s a community to support it.
Wrt Ada projects, I think the one I’m most interested in would be Ironclad currently, but I’m sure there’s an Awesome Ada list with other things there. Whilst I understand that Ada isn’t the most popular, the question was “name anything else that fills the niche,” which is why it came to mind.
I wrote and use septum on my laptop. I wrote it in my spare time, and used it at work on large software projects, but once people heard it was written in Ada there was an instant response of “Ewww! Why’d you write it in Ada?” Works just fine for me, even though I never optimized it, sifting through massive projects (10+ million lines of code).
Jeremy Grosser has done a lot of work with RP2040
What about the subset of C defined by MISRA and similar rules?
MISRA disallows malloc(), so hopefully you know exactly how much memory your program will need to work (and by extension, no flexible array members in a structure). In fact, a lot of the standard C functions are excluded from use (no use of stdio.h or time.h, nor the functions abort(), getenv(), bsearch() or qsort()). Each function must have one return statement, so no early returns either. No recursion, even if you can prove a bounded worst case (think balanced trees). No unions, no variable argument functions. MISRA is a very restrictive standard and I doubt it comes close to how Rust is used.
By now I’m honestly not sure if the unreflecting fanboys, or the anti-fanboys who need to bring up how bad the fanboys are at every opportunity even if not really related to the topic at hand, are the more annoying crowd. (I.e. if you think this complaint is on-topic here, do you think Daniel took this project on because he believed it to be a magic bullet? Or maybe because he thought it was a worthwhile thing to try?)
(and I say that as a primarily C++ dev)
I would worry a lot less about “the crowd”. People are idiots, it’s not really a Rust issue.
why does man need the ability to compute things at all
to fly to the moon
A question asked by many a schoolboy faced with a difficult arithmetic problem.
man doesn’t, but it’s based on troff which is a fully-featured typesetting language, with a similar amount of power as TeX or PostScript.
The modern mdoc macros are rich enough that I don’t remember needing to drop down to low-level troff directives in my man pages, but the old man macros are comparatively feeble. Sadly the mandoc renderer needs to implement large amounts of troff for compatibility with the weird tricks that man pages use to fill gaps in the man macros.
does the macro expansion really need to run on every invocation? can’t you just ship pre expanded man pages that just need minor reformatting (for alignment or whatever) and printing out?
Yeah, most systems have catman directories of preformatted man pages.
The macros would be used to adapt to different widths and scenarios, so preformatting them does kind of miss the point. That’s a trade-off individual systems can make.
Groff is fairly slow because every macro invocation is expanded down to raw troff(7) requests. Mandoc is faster because it natively understands the man(7) and mdoc(7) languages and doesn’t lower everything to troff. For example, on the 31,000 line gcc manual page, mandoc requires 87 ms and groff 261 ms, on my machine.
Not really a reason to preformat, unless you have a 1980s VAX.
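If you want to reproduce that kind of measurement, something like this should work (a sketch; paths and flags vary by system, and the page source is often gzipped):

```
page=$(man -w gcc)               # locate the gcc man page source
zcat -f "$page" > /tmp/gcc.1     # -f passes uncompressed files through as-is
time mandoc /tmp/gcc.1 > /dev/null
time groff -man -Tutf8 /tmp/gcc.1 > /dev/null
```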