NetBSD has blocklistd, which does away with log parsing; instead, it requires services to be patched to pass the client fd to the blocklistd daemon, which then does rule matching on the client fd (e.g. remote address), and offending clients are blocked by updating the firewall (npf). To aid in patching services, it comes with a helper library, libblocklist. I think it has been ported to FreeBSD as well.
I’m waiting to have a few consecutive spare hours to upgrade the server in my garage from 13 to 14 (in case anything goes wrong). But this makes me more impatient to do so…
Now… the article provides a lot of benchmarks, but no details about why 14 is faster than 13. Does anyone have a summary of what has changed to explain the better performance?
That’s par for the course for Phoronix. Their highlights include:
Concluding gcc was much faster than clang because they benchmarked gcc at -O2 against clang at -O0
Concluding that an upcoming FreeBSD release would be slower than the previous one because they benchmarked the newer one with all of the debug features in the kernel enabled (they’re turned off for the releases).
Some of it is probably due to a newer version of clang in the base system, so everything is compiled with that. Some of it due to ifunc things in libc and the kernel selecting CPU-specific optimised variants of hot functions. Some of it is due to work done to optimise locking in the VM subsystem recently. It looks like there’s a bigger speedup on AMD systems. These introduced a broadcast TLB invalidate, which avoids an IPI on a load of hot code paths for the VM subsystem. I’m not sure if the support for that made it into 14 (it definitely isn’t in 13) but that would likely give a difference of several percentage points.
Phoronix benchmarks are not very insightful, though the site itself is useful as an OSS news aggregator. lmbench and will-it-scale are a better set of benchmarks to run across releases to find out more about the actual improvements/regressions.
This looks like E with a fancy syntax. Unsurprising given his involvement with E/Joule. E itself is a fine dynamic language with async and object capabilities baked in, but the implementation(s) are sadly bitrotting.
If there is any demand for classic E, I can set up a Nix flake which bundles a JRE and starts a shell. However, it’s not a very interesting language as implemented. I’m currently working on a Nix flake for Monte, which is still incomplete but at least able to do some basic networking and filesystem access.
Read this during COVID. Then stumbled upon Herb Gross’s Calculus Revisited course from OCW (filmed in the 70s). His smile was infectious! Great course to go with this book. Remember reading a comment or two from him, in his 90s, on YouTube, offering encouragement to the viewer. Passed away that year.
This requires that all elements in the array are pointers, right? In other words, using Go’s syntax, this would be a []*T rather than a []T? If I’m correct, I think this would be improved by supporting the []T case which allows for storing either pointers or values. You would have to store the interval between items in the backing array–I think this is called the “stride” and I’m not sure how to get that in C.
Yes, it stores only pointers to objects. Like you said, you can store values as well by tracking the size of the object to be stored and doing a byte-wise copy of the passed-in object into the array slots. Hanson’s “C Interfaces and Implementations” details the design and implementation of a container library. Highly recommended if you are new to this.
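In C the stride is just sizeof(T), carried around at runtime. A toy Python analogue of the byte-wise-copy idea (the class name and layout are my own illustration, not Hanson’s API):

```python
import struct

class ValueArray:
    """By-value dynamic array: elements are copied byte-wise into one
    backing buffer, using a fixed per-element size (the "stride")."""
    def __init__(self, fmt):
        self.fmt = fmt                      # element layout, e.g. "ii"
        self.stride = struct.calcsize(fmt)  # bytes per element
        self.buf = bytearray()

    def append(self, *value):
        self.buf += struct.pack(self.fmt, *value)  # byte-wise copy in

    def __getitem__(self, i):
        return struct.unpack_from(self.fmt, self.buf, i * self.stride)

    def __len__(self):
        return len(self.buf) // self.stride
```

Reads copy the bytes back out, so the array owns its elements rather than pointing at them.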
I downloaded the same PDF a couple weeks ago, and I’ve been meaning to read through it :) Someone who knows this stuff should really write an article like “here are some programs you can write in Rust but not Hylo and vice versa”.
The literal rule for what you can express in Hylo is: any Rust function that, if you made its lifetimes explicit, has at most one lifetime parameter. If that lifetime parameter is used in the return type, you need a “subscript”; otherwise a regular function will do. Oh, and references can only appear at the top level; that is very important. Vice versa is simpler: Rust is strictly more expressive.
But that doesn’t quite answer “what programs you can write”, because there’s the question of writing “the same thing” a different way.
A Rust function without references can obviously be written in Hylo. If it has references only on the inputs, that can be expressed in Hylo:
fn foo(arg1: &T, arg2: &mut T) -> T // works in Hylo
Every Rust function with references only in its inputs can be written with a single lifetime parameter. Put differently, lifetime parameters are only ever needed when there’s an output reference. Thus the above function can be written with only one lifetime parameter:
fn foo<'a>(arg1: &'a T, arg2: &'a mut T) -> T
If a Rust function has references on both its inputs and output, then it’s expressible in Hylo only if there’s at most one lifetime parameter when you write out the lifetimes. So:
fn foo<'a>(&'a mut self, &'a T) -> &'a T // works in Hylo, needs to be implemented with a "subscript"
fn foo<'a, 'b>(&'a mut self, &'b T) -> &'a T // does not work in Hylo
The other, bigger, restriction is that references have to be top-level. You can’t nest them. So no Option<&str>, or HashMap<i32, &User>. And no iterators over references! No such thing as split_words() -> impl Iterator<Item = &str>!
Instead of a function returning an Option<&str>, the function would take a callback and invoke it only if the option is populated. And iterators need to be internal iterators: to iterate over a collection, you pass a closure to the collection and the collection invokes that closure once per element.
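The shape of that API change is language-agnostic; here is a hypothetical Python sketch of both patterns (continuation-passing instead of returning an optional reference, and an internal iterator where the collection drives the loop — function names are mine, not Hylo’s):

```python
def first_word(text, callback):
    # Instead of returning an Option<&str>-like value, invoke the
    # callback only if there is something to hand out.
    words = text.split()
    if words:
        callback(words[0])

def for_each_word(text, callback):
    # Internal iterator: the "collection" calls the closure once per
    # element, so no reference ever escapes the call.
    for word in text.split():
        callback(word)
```

The caller passes the rest of its computation in, rather than receiving a borrowed value out.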
Please ping me
I’ve added a file to my blog post ideas folder, and it says to ping you, so I’ll definitely ping you if I write it. Though no promises, as that ideas folder grows more than it shrinks…
This was the earliest published work that I’m aware of on the language that became Objective-C. For a long time there were no publicly-available copies of this document (I had a paid-for copy for years).
Be grateful for what the syntax became by the time it became popular.
Brad Cox wrote a lot of things around this time that I found insightful 20 years later. In particular, his view on languages for implementing components versus languages for building systems out of components is something that we keep kind-of building accidentally.
He also had a proposal for blocks in Objective-C that had syntax that I liked a lot more than the C-like version Apple eventually went with.
Wow, nice find. Yeah, this syntax is pretty different. I wonder how the syntax evolved into what it is today. Adding syntax to C and C++ seems pretty challenging, yet they succeeded.
This is a great article, the gap buffer is surprisingly efficient compared to the fancier data structures.
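For readers who haven’t seen one: a minimal gap buffer sketch (my own toy Python, not the article’s implementation). The text sits in one array with a movable hole at the cursor; inserting at the cursor is O(1), and moving the cursor shifts characters across the gap:

```python
class GapBuffer:
    """Text in one list with a gap at buf[gap_start:gap_end]."""
    def __init__(self, text=""):
        self.buf = list(text)
        self.gap_start = len(self.buf)  # cursor position
        self.gap_end = len(self.buf)    # empty gap at the end

    def move_to(self, pos):
        while self.gap_start > pos:       # shift chars rightward over the gap
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:       # shift chars leftward over the gap
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, ch):
        if self.gap_start == self.gap_end:    # gap is full: grow it
            grow = 16
            self.buf[self.gap_start:self.gap_start] = [None] * grow
            self.gap_end += grow
        self.buf[self.gap_start] = ch
        self.gap_start += 1

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

The cost model discussed in the article falls out directly: typing at one spot never copies, and only cursor jumps pay for data movement.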
I would have loved to see my favorite text data structure measured as well, namely the array of lines. It is briefly mentioned at the beginning, but I think including it would have given interesting results. It works surprisingly well because in most text (especially code) lines are relatively small, which means you get a pretty balanced rope for free. Many text operations are very fast because it matches what people do with text:
Inserting/removing lines is O(n) in the number of lines, but a very fast O(n) because you are just shifting line pointers around.
Editing is O(n) in the length of the line being edited, which is very fast for typical lines.
Finding a text location is O(1) (if addressed by line/byte pairs).
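The operations listed above can be sketched in a few lines (my own toy Python illustration, not kakoune’s actual code):

```python
class LineArray:
    """Array-of-lines buffer: a list of line strings. Line inserts
    shift only pointers; character edits touch a single line."""
    def __init__(self, text):
        self.lines = text.split("\n")

    def insert_line(self, row, line):
        self.lines.insert(row, line)     # O(lines), but only pointer moves

    def remove_line(self, row):
        del self.lines[row]              # likewise only pointer moves

    def insert_text(self, row, col, s):
        l = self.lines[row]
        self.lines[row] = l[:col] + s + l[col:]  # O(len(line))

    def text(self):
        return "\n".join(self.lines)
```

Addressing by (line, byte) pairs makes location lookup a plain double index, which is the O(1) case mentioned above.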
I was surprised that there’s a single buffer in the gap buffer case. I’d have thought that this approach would work better with more than one. If you start with, say, 8 gaps scattered throughout the file and you move the closest one, possibly with some bias against the least-recently used one, you’ll end up copying a lot less data. If you keep the gaps sorted, finding the next available one is cheap. The worst-time complexity is the same (assuming a fixed number of holes), but I’d expect the average latency to be better. It would also be easier to extend to allow concurrent mutation with fine-grained locking (probably unimportant for a text editor, but who knows with EMACS).
I’m also curious what happens if you use Linux’s mremap for the moves above a threshold size (e.g. 128 MiB or possibly L3 cache size). If you’re only ever shunting around data within a couple of pages, but are shuffling pages in the page table, this might be faster in the worst cases.
On FreeBSD, these buffers would almost certainly benefit from transparent superpage promotion. On Linux, you probably get a benefit from requesting superpages explicitly.
I was thinking that when a file gets big enough that it might be worth considering mmap shenanigans, instead shard the file into several smaller gap buffers. That would have a similar effect to using multiple gaps, with the restriction that they have to be spaced out a certain amount.
That will cost an extra layer of indirection to find the shard, but might be worth it. Iteration (as in the search) becomes a nested loop, but if the inner loop is over a few MiBs then that’s probably noise.
Reading your comment made me think: Are there any text editors that have two models, and the editor picks the best one when the file is opened?
Say that at file read time you quickly check whether the file looks like a multi-megabyte log file, or whether it looks like a JSON file without whitespace/newlines, or whether it looks like a standard code file. Then you pick a representation that optimizes for the kinds of operations most likely given that file (searching a big file, editing a code file, etc)
With a language like Rust it’s probably not even terribly verbose to put it behind an abstraction so the rest of the editor logic is unaware of the concrete implementation.
Reading the article made me think of a data structure in a similar vein: a gap buffer of gap buffers. I’d expect an array of lines to behave poorly when inserting new lines in the middle, but a gap buffer of lines would alleviate that issue without much overhead.
I wonder how much of an advantage the contiguous memory of a single gap buffer is in practice. Is regex’s slice API required by its inner workings, or would it perform the same given an iterator?
Anyway, I suppose the difference between ropey in helix and the array of lines in kakoune is the reason why kakoune chugs on humongous files while helix just stops completely.
That’s an interesting benchmark. Adding line numbers in kakoune is actually pretty good (%<a-s>ghi<c-r>#), however adding an empty line after each line (%<a-s>o) is burning a lot of CPU and still running.
Helix just eats the whole CPU at the sight of inserting a character.
Adding an empty line after each line is unfortunately not implemented in the most efficient way: we insert the first line, shift everything down by one, insert the second line, shift everything down again… The data structure itself would support a much more efficient implementation where we shift each line to its final location once, but it’s a pretty chunky refactoring of the codebase to make this possible.
Also, when profiling here most of the time when doing that operation is actually taken by the indentation hooks, not the buffer modification.
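The two strategies being contrasted, as a toy sketch (plain Python lists standing in for the real buffer; this is my own illustration, not the editor’s code):

```python
def interleave_slow(lines, blank=""):
    """One insert per line, each shifting the tail: O(n^2) overall."""
    out = list(lines)
    i = 0
    while i < len(out):
        out.insert(i + 1, blank)  # shifts everything after i
        i += 2
    return out

def interleave_fast(lines, blank=""):
    """Place each line in its final slot once: O(n) overall."""
    out = []
    for line in lines:
        out.append(line)
        out.append(blank)
    return out
```

Both produce the same result; only the amount of shifting differs.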
Is regex’s slice API required by its inner workings, or would it perform the same given an iterator?
It needs to be reworked to handle the iterator or streaming case. There is some discussion about it here. The regex author was doubtful it could ever be made as performant as the slice version, but we will have to wait and see. Contiguous memory is a strong advantage for the search use case.
I would be surprised if ropey were the bottleneck in helix; ropes should handle large files very well.
I thought emacs used an array/list of lines instead of a big gap buffer. There was a time when both emacs and vim struggled with lots of long lines (log files mainly), but that is not a common use case. nvi, which is a beast of an editor when it comes to pathological input, also uses a list of lines backed by a tree of pages inside a real database (a cut-down copy of BDB).
The article mentions storing the tracking information in a tree. This could be stored in another parallel gap buffer as well. There is a neat trick, which IBM (still?) has a patent on, where the positions in the run-array entries after the gap are switched to refer from the end of the document. The book “Unicode Demystified” explains this in detail. The SWT StyledTextCtrl makes use of this trick as well (possibly because it was written by the patent holders).
This could be stored in another parallel gap buffer as well. There is a neat trick, which IBM (still?) has a patent on, where the positions in the run-array entries after the gap are switched to refer from the end of the document.
So if you moved the gap from the front to the back, would that mean you need to go through and update every item to point to the front?
Yes, if the gap moves then the text buffer contents change, so the corresponding style/line-start gap buffer needs to be updated as well. But these are run entries, so in practice updating them is not expensive.
Is zfs perfect? Of course not, and to my mind one of the most shocking things is that nothing else has even bothered to try and come close.
Although no other filesystem tempts me away from ZFS, I question this claim. Linux may in the foreseeable future try to come close with bcachefs, and DragonFly BSD may have come close already with its HAMMER2.
Btrfs also tried to come close (and, in some ways, to be better).
I think a big part of the problem is that ZFS works well because of how the whole system interacts. Until you have a complete ZFS implementation, you don’t get most of the benefits. That makes it quite hard to develop in an environment that wants to do incremental improvements.
This is part of the reason that I think bcachefs is the most likely to succeed from the replacements. It didn’t start trying to solve the same problems, it started trying to solve a problem that is independently useful and which can then be used as a building block for something ZFS-like.
Yes, those are the first 2 projects that came to my mind, too, and I agree.
OTOH, ZFS is here, mature, stable, and runs on half a dozen OSes. It’s in Solaris, Illumos and derivatives, FreeBSD, Ubuntu, Void Linux, NixOS, Arch, and optionally other distros.
On NetBSD, Windows, and macOS too. The portability story is just crazy. bcachefs will succeed, of course, as Linux has this emergent effect caused by the extraordinary churn… suddenly pieces fit, APIs/UIs happen around it, and it takes over.
It also helped that the license was weak copyleft, so it was compatible with everything except an incompatible strong copyleft kernel. Permissively licensed and proprietary kernels could all use it. Apple got quite far but apparently were worried about Oracle patents and dropped it (they got as far as announcing it as a feature in their next OS release before dropping it, which was a shame, though APFS feels a lot like what ZFS would be if it had been created primarily for single-disk machines and not massive storage servers).
Sadly bcachefs is GPL’d and so is unlikely to be merged anywhere else (though may end up as an external module for other things).
Do you have any source on Apple being worried about Oracle? From what I remember, they are/were on quite friendly terms, and ZFS has a quite permissive license. It’s just Linus that has issues with it (even Ubuntu is fine with including it).
That’s what I heard from Apple folks back around then: the lawyers were worried Oracle would do something. Back then, Apple had the XServe and XSAN line and a solid ZFS offering from them would have competed in the server space with Oracle, using Oracle’s technology. I’m not sure exactly what they were worried about, possibly Oracle patenting things in their ZFS version and not open sourcing them so the Apple version couldn’t keep up. Or possibly the reciprocal terms in the CDDL made them nervous of giving up rights to Oracle.
Was the Solaris portability layer developed on FreeBSD during the initial port? Also a question for ZFS experts: is the DMU layer usable as a KV store? Came across some old posts talking about an API to enable that but can’t see anything. Closest is openebs’s cstor engine which, last I checked a few years back, added a network layer on top of the ztest (zfs in userspace) to provide volumes for k8s containers.
Was the Solaris portability layer developed on FreeBSD during the initial port?
I think so, yes.
is the DMU layer usable as a KV store?
I’ve never seen it exposed to userspace as anything other than ZVOLs or via the ZFS POSIX Layer (i.e. as a filesystem). I wondered a few times whether it would be possible to port SQLite into the kernel and have a SQL interface that used the transactional storage from ZFS as the storage back end.
For a small fee, IIRC. That was how I had set up OmniOS on a dedicated server 6 years ago. It ran without a hiccup until Hetzner suspended my account 2 years ago without any explanation. Happy with Vultr and worldstream.nl now.
QtCreator is actually a pretty capable IDE on its own, even if one doesn’t do Qt development at all. It’s blazingly fast and at some point had better “intellisense” than Eclipse’s CDT (before clangd came out at least).
It has support for CMake, works with Conan, integrates with vcpkg, and has an integrated pretty capable FakeVim mode.
I think that naming it like it’s named hurts the project, because people assume it’s for Qt development and nothing else, which is not true.
Agreed on all points. CDT, however, has one distinguishing feature: it can parse build logs and pick up include paths and definitions, which makes it quite useful for projects with non-standard build systems. The CDT indexer is also flexible in that you can just throw a tree of sources at it and get reasonable code navigation without configuring and building the project first.
freetype. ’nuff said.
Happy to see the term “100-year X” catching on: http://len.falken.directory/100-year-programs.txt - I feel Zig isn’t there, but could become a 100-year language with the right moves.
It’s interesting; I feel Hare checks all the boxes that Standard ML has checked for me:
How are those three properties related to longevity? In particular, the size of binaries seems unrelated to longevity.
Which Standard ML implementation do you use? I, honestly, had to wonder today why I am not using it…
I use Poly/ML for my tiny projects. SML# and MLKit are also actively maintained, in addition to MLton.
F# is a joy. If only .NET (Core) ran on any of the BSDs.
End of an era for Mercurial? I feel a twinge of sadness/nostalgia because I used Mercurial for my personal projects before Git (c.2006-2010?) and even wrote a Mac GUI client (Murky). I remember Git seeming super awkward when I started using it for work, but now I wouldn’t go back — staging files for commit, and switching branches in a single checkout, are so useful. (Or maybe Mercurial has added those in the meantime?)
As for GitHub — jeez guys, it’s a distributed VCS, so this is a nonissue as far as code management goes. OK, it’s a proprietary bug tracker, code review, and CI system, but I have yet to come across an open source bug tracker or code reviewer that isn’t shit, so…
The era ended when Google and Facebook stopped contributing to Mercurial development. After that the writing was on the wall.
Anyone know what happened with that experiment?
Isn’t there also a Mercurial in Rust effort?
Google still uses hg for now - we don’t even teach g4 anymore to new hires. The team is still sending occasional work upstream, but for the most part it’s “done” and working. A Googler is working on https://github.com/martinvonz/jj which looks like a promising future, and has a lot of hg feel to it (and a native git backend.)
Facebook had an unfortunate falling out with the hg community and is doing their own thing.
There are parts of hg being written in Rust, but I don’t think it’s going to be “hg implemented in Rust” — more a minimal subset of hg that’s useful for shell/build-system integrations, I think? I’m a little out of touch; I’ve been doing more Rust work than source control work for a couple of years now.
Oh I didn’t know they use hg now, that’s interesting! I saw something about an experiment a long time ago but haven’t heard about it since.
It seems like Mercurial could have a “who uses it?” page like many open source projects
OK I see it now, couldn’t find it from the home page:
https://www.mercurial-scm.org/who
https://wiki.mercurial-scm.org/ProjectsUsingMercurial
Yeah, Mercurial as a project has always been kind of weak on publicity.
(BTW: I doubt you remember me, but we briefly worked together on Google Code…small world.)
Oh yes I didn’t recognize your handle, but I googled for it :-)
Those were the days when I was using Subversion and then Mercurial! Sadly, Google Code being shut down pushed my personal projects onto git. Glad to see that there’s still some choice and diversity though.
Gitea has come a long way; unlike GitLab, it isn’t crap on mobile either. You can check out Codeberg, which is Gitea.
I went the other way for my personal projects mainly because of hg absorb. Bundles, rollbacks, and the built-in webserver have all come in handy. The bumbling idiot in me doesn’t miss git one bit :)
hg absorb sounds great, but someone has already ported it to git.
That’s inspired by hg absorb, but their description of the algorithm makes me confident in saying it’s not going to produce similar results (I ported the core absorb algorithm from C to Python for inclusion in hg.)
How does it work in mercurial then? I would have expected something similar.
It constructs a data structure inspired by an SCCS weave, using an algorithm learned from BitKeeper. That lets you do very fast blames and some other things, but then we abuse it in absorb by putting the current history into even-numbered revs, interpolating the edits into the odd-numbered ones, and then replaying the odd-numbered revisions into the new history.
It’s hard to write it up briefly. The implementation is at least somewhat comprehensible - I tried to link to the most relevant section of the file.
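Not the weave algorithm described above (which I won’t pretend to reproduce), but a toy sketch of the effect absorb computes: fold each uncommitted line edit into the stack commit that last touched that line. Here commits are full same-line-count snapshots, so the per-line blame is a trivial scan — all names and simplifications are mine:

```python
def blame(stack):
    """owners[i] = index of the latest commit that changed line i.
    `stack` is a list of snapshots with identical line counts."""
    owners = [0] * len(stack[0])
    for rev in range(1, len(stack)):
        for i, (old, new) in enumerate(zip(stack[rev - 1], stack[rev])):
            if old != new:
                owners[i] = rev
    return owners

def absorb(stack, edited):
    """Fold working-copy line edits back into their owning commits."""
    owners = blame(stack)
    new_stack = [list(snapshot) for snapshot in stack]
    for i, line in enumerate(edited):
        if line != stack[-1][i]:
            # rewrite the owning commit and every snapshot above it
            for rev in range(owners[i], len(new_stack)):
                new_stack[rev][i] = line
    return new_stack
```

The real implementation has to handle insertions, deletions, and ambiguous ownership, which is where the weave comes in.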
None of the git absorb replacements I’ve used work perfectly. For example, they need to handle fixing all the branch pointers after the absorb because with git every commit might be a stacked branch.
@arxanas didn’t implement absorb into git-branchless because Sapling has it, but Sapling doesn’t seem to be healthy.
What do you mean by “healthy”? The folks on the Sapling Discord seem to be responsive if you’re having issues. Possibly the builds on GitHub are not in a good state, but people seem to be using Sapling fine despite that 🙂.
A friend’s review (the same one that told me about git-branchless) was that it was pretty good but you could get stuck in bad states (he didn’t get it resolved on Discord), and also that the word on the street was that some things (I think it was reviewstack) got hit by the layoffs and were ownerless.
A lot of the issues on GitHub have no comments.
I was about to say “that and they stopped doing releases last May after doing more than 1 release a month” but they just posted a pre-release to github 1 hour ago. That and this thread about the lack of releases has a reasonable response now: https://github.com/facebook/sapling/issues/724#issuecomment-1792973633. So maybe I jumped the gun on my assessment. My apologies!
This is ultimately business news. While they have a lot of geek cred, I don’t see how this post is particularly relevant here.
Although it is news from a business, hence business news, there’s a lot of technical commentary inside too. IMO it’s not filled with buzzwords and gives enough for us to discuss their offering.
What it gives us, beyond a couple tidbits about fans and simpler electrical design, is mostly just a business pitch dressed up to feel attractive to developers.
There are interesting articles linked off of it, yes–but I think precedent here is typically to submit those articles directly (see also why we don’t really do newsletter submissions…better to submit the stories directly).
I have tremendous respect for the Oxide folks and what they’ve accomplished; make no mistake, though, that this is marketing and just a press release.
To me, this is on the bubble. There’s almost enough information about the hardware in their rack to count as a Lobsters story, but I think it’s maybe just a hair shy. But maybe someone else will have a different opinion.
The golden rule: does reading it make people better programmers? I think some of the discussion here about how to think about hardware and software in tandem feels like this is on-topic (But I’m biased, cuz I listen to their podcast, am bought into the idea, and so know more context than what someone just reading this might get).
The entire software stack seems to be open source. That alone makes it relevant here…no? I personally have a ton of questions just skimming through a couple of repos there… I’d say this is wonderful stuff!
Just the awareness that there exists a product that is like an “on-prem, turn-key cloud” is informative, IMO.
This requires that all elements in the array are pointers, right? In other words, using Go’s syntax, this would be a []*T rather than a []T? If I’m correct, I think this would be improved by supporting the []T case, which allows for storing either pointers or values. You would have to store the interval between items in the backing array (I think this is called the “stride”) and I’m not sure how to get that in C.
Yes, it stores only pointers to objects. Like you said, you can store values as well by tracking the size of the object to be stored and doing a byte-wise copy of the passed-in object into the array slots. Hanson’s “C Interfaces and Implementations” details the design and implementation of a container library. Highly recommended if you are new to this.
The code from the book is here: https://github.com/drh/cii/
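The value-storing variant discussed above (tracking the element size, or “stride”, and byte-copying values into slots, as Hanson’s C library does) might be sketched like this in Rust; RawArray and its methods are invented names for illustration, not Hanson’s API:

```rust
use std::mem;

// A type-erased array that stores values (not pointers) by tracking
// the element size in bytes and copying values byte-wise into slots.
struct RawArray {
    data: Vec<u8>,
    stride: usize, // size of one element in bytes
    len: usize,
}

impl RawArray {
    fn new(stride: usize) -> Self {
        RawArray { data: Vec::new(), stride, len: 0 }
    }

    // Byte-wise copy of `value` into the next slot, like memcpy in C.
    fn push<T: Copy>(&mut self, value: T) {
        assert_eq!(mem::size_of::<T>(), self.stride);
        let bytes = unsafe {
            std::slice::from_raw_parts(&value as *const T as *const u8, self.stride)
        };
        self.data.extend_from_slice(bytes);
        self.len += 1;
    }

    // Read the i-th element back out as a byte-wise copy.
    fn get<T: Copy>(&self, i: usize) -> T {
        assert_eq!(mem::size_of::<T>(), self.stride);
        let start = i * self.stride;
        unsafe { std::ptr::read_unaligned(self.data[start..].as_ptr() as *const T) }
    }
}
```

In C the stride is just sizeof(T) passed in at construction time; Rust can recover it with std::mem::size_of, which is what the assertions check.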
any thoughts on mutable value semantics?
https://www.jot.fm/issues/issue_2022_02/article2.pdf
https://www.youtube.com/watch?v=QthAU-t3PQ4
I downloaded the same PDF a couple weeks ago, and I’ve been meaning to read through it :) Someone who knows this stuff should really write an article like “here are some programs you can write in Rust but not Hylo and vice versa”.
The literal rule for what you can’t express in Hylo is: any Rust function that, if you made its lifetimes explicit, has at most one lifetime parameter. If that lifetime parameter is used in the return type, you need a “subscript”, otherwise a regular function will do. Oh, and references can only appear at the top level, that is very important. Vice-versa is simpler: Rust is strictly more expressive.
But that doesn’t quite answer “what programs you can write” b.c. there’s the question of writing “the same thing” a different way.
I’ll consider writing this blog post.
I want to make sure I understand the literal rule: Is “at most” a typo?
Yes I’d love to read that post. Please ping me on Twitter/GMail/whatever (same handle everywhere) if you write it!
Gah, sorry, yes, I wrote that backwards.
A Rust function without references can obviously be written in Hylo. If it has references only on the inputs, that can be expressed in Hylo:
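For example (my own illustration; any function whose borrows appear only in the inputs would do):

```rust
// References appear only in the inputs; the return type owns its data.
fn longest(a: &str, b: &str) -> String {
    if a.len() >= b.len() { a.to_string() } else { b.to_string() }
}
```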
Every Rust function with references only in its inputs can be written with a single lifetime parameter. Put differently, lifetime parameters are only ever needed when there’s an output reference. Thus the above function can be written with only one lifetime parameter:
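A sketch of such a function with the lifetimes written out (my own illustration):

```rust
// One lifetime parameter covers every input reference, since none of
// them flows into the return type.
fn longest<'a>(a: &'a str, b: &'a str) -> String {
    if a.len() >= b.len() { a.to_string() } else { b.to_string() }
}
```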
If a Rust function has references on both its inputs and output, then it’s expressible in Hylo only if there’s at most one lifetime parameter when you write out the lifetimes. So:
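A hypothetical pair of illustrations (my own, not from the original comment):

```rust
// One lifetime parameter links input and output: expressible in Hylo
// (as a "subscript", since the lifetime appears in the return type).
fn first_word<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Two lifetime parameters: not expressible in Hylo under this rule.
fn pick<'a, 'b>(a: &'a str, _b: &'b str) -> &'a str {
    a
}
```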
The other, bigger, restriction is that references have to be top-level. You can’t nest them. So no Option<&str>, or HashMap<i32, &User>. And no iterators over references! No such thing as split_words() -> impl Iterator<Item = &str>!
Instead of a function returning an Option<&str>, the function would take a callback and invoke it only if the option is populated. And iterators need to be internal iterators: to iterate over a collection, you pass a closure to the collection and the collection invokes that closure once per element.
I’ve added a file to my blog post ideas folder, and it says to ping you, so I’ll definitely ping you if I write it. Though no promises, as that ideas folder grows more than it shrinks…
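The callback and internal-iterator patterns described above might look like this in Rust (a hypothetical sketch; User, with_name, and for_each_word are invented names):

```rust
struct User { name: Option<String> }

impl User {
    // Instead of returning Option<&str>, take a callback that is
    // invoked only when the value is present, so no reference escapes.
    fn with_name(&self, f: impl FnOnce(&str)) {
        if let Some(n) = &self.name {
            f(n);
        }
    }
}

// An internal iterator: the collection drives the loop and hands each
// element to the closure, instead of returning an iterator over &str.
fn for_each_word(text: &str, mut f: impl FnMut(&str)) {
    for w in text.split_whitespace() {
        f(w);
    }
}
```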
Pretty cool. The only other game I know of using Lisp is Abuse https://github.com/videogamepreservation/abuse
coughs
And the Autumn 2023 one is going on right now!
Don’t forget Naughty Dog’s games like Jak and Daxter and Crash Bandicoot.
shinmera makes games (and supporting libs) in CL.
https://kandria.com/ https://github.com/Shirakumo/kandria
This was the earliest published work that I’m aware of on the language that became Objective-C. For a long time there were no publicly-available copies of this document (I had a paid-for copy for years).
Be grateful for what the syntax became by the time it became popular.
Brad Cox wrote a lot of things around this time that I found insightful 20 years later. In particular, his view on languages for implementing components versus languages for building systems out of components is something that we keep kind-of building accidentally.
He also had a proposal for blocks in Objective-C that had syntax that I liked a lot more than the C-like version Apple eventually went with.
His book Object Oriented Programming: An Evolutionary Approach presents most of the ideas in good detail.
I have a first edition copy of that, with the original ICpak examples! It’s super cool!
Wow - nice find. Yeah, this syntax is pretty different. I wonder how the syntax evolved into what it is today. Adding syntax to C and C++ seems pretty challenging, yet they succeeded.
The Atkinson Hyperlegible font is exactly what I needed. Thank you! :grinning-like-a-cheshire-cat:
This is a great article, the gap buffer is surprisingly efficient compared to the fancier data structures.
I would have loved to see my favorite text data structure measured as well, namely the array of lines. It is briefly mentioned at the beginning, but I think including it would have given interesting results. It works surprisingly well because in most text (especially code) lines are relatively small, which means you get a pretty balanced rope for free. Many text operations are very fast because it matches what people do with text:
I was surprised that there’s a single buffer in the gap buffer case. I’d have thought that this approach would work better with more than one. If you start with, say, 8 gaps scattered throughout the file and you move the closest one, possibly with some bias against the least-recently used one, you’ll end up copying a lot less data. If you keep the gaps sorted, finding the next available one is cheap. The worst-time complexity is the same (assuming a fixed number of holes), but I’d expect the average latency to be better. It would also be easier to extend to allow concurrent mutation with fine-grained locking (probably unimportant for a text editor, but who knows with EMACS).
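For context, a minimal single-gap buffer might be sketched like this (an illustrative sketch, not code from the article); move_gap shows the copy cost proportional to cursor distance, which is exactly what extra gaps would amortize:

```rust
// Minimal gap buffer: the "gap" is the unused region [gap_start, gap_end).
struct GapBuffer {
    buf: Vec<u8>,
    gap_start: usize,
    gap_end: usize,
}

impl GapBuffer {
    fn new(capacity: usize) -> Self {
        GapBuffer { buf: vec![0; capacity], gap_start: 0, gap_end: capacity }
    }

    // Moving the gap copies every byte between the old and new position;
    // this per-move copy is the cost that multiple gaps would reduce.
    fn move_gap(&mut self, pos: usize) {
        while self.gap_start > pos {
            self.gap_start -= 1;
            self.gap_end -= 1;
            self.buf[self.gap_end] = self.buf[self.gap_start];
        }
        while self.gap_start < pos {
            self.buf[self.gap_start] = self.buf[self.gap_end];
            self.gap_start += 1;
            self.gap_end += 1;
        }
    }

    fn insert(&mut self, pos: usize, byte: u8) {
        self.move_gap(pos);
        // A real implementation would grow the buffer when the gap fills.
        assert!(self.gap_start < self.gap_end, "gap is full");
        self.buf[self.gap_start] = byte;
        self.gap_start += 1;
    }

    fn to_string(&self) -> String {
        let mut s = self.buf[..self.gap_start].to_vec();
        s.extend_from_slice(&self.buf[self.gap_end..]);
        String::from_utf8(s).unwrap()
    }
}
```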
I’m also curious what happens if you use Linux’s mremap for the moves above a threshold size (e.g. 128 MiB, or possibly L3 cache size). If you’re only ever shunting around data within a couple of pages, but are shuffling pages in the page table, this might be faster in the worst cases.
On FreeBSD, these buffers would almost certainly benefit from transparent superpage promotion. On Linux, you probably get a benefit from requesting superpages explicitly.
I was thinking that when a file gets big enough that it might be worth considering mmap shenanigans, instead shard the file into several smaller gap buffers. That would have a similar effect to using multiple gaps, with the restriction that they have to be spaced out a certain amount.
That will cost an extra layer of indirection to find the shard, but might be worth it. Iteration (as in the search) becomes a nested loop, but if the inner loop is over a few MiBs then that’s probably noise.
Reading your comment made me think: Are there any text editors that have two models, and the editor picks the best one when the file is opened?
Say that at file read time you quickly check whether the file looks like a multi-megabyte log file, or whether it looks like a JSON file without whitespace/newlines, or whether it looks like a standard code file. Then you pick a representation that optimizes for the kinds of operations most likely given that file (searching a big file, editing a code file, etc)
With a language like Rust it’s probably not even terribly verbose to put it behind an abstraction so the rest of the editor logic is unaware of the concrete implementation.
Reading the article made me think of a data structure in a similar vein: a gap buffer of gap buffers. I’d expect an array of lines to behave poorly when inserting new lines in the middle, but a gap buffer of lines would alleviate that issue without much overhead. I wonder how much of an advantage the contiguous memory of a single gap buffer is in practice. Is regex’s slice API required by its inner workings, or would it perform the same given an iterator? Anyway, I suppose the difference of ropey in helix and array of lines in kakoune is the reason why kakoune chugs on humongous files, but helix just stops completely.
A list of lines works fine even for editing at the top of the sqlite3.c file (which contains almost 240k lines and is around 8 MB).
Inserting in the middle (or even at the start) of a line string is also fine, even for lines that span the entire screen (with wrapping).
The first version of my editor used this data structure and I could easily open and edit MB-sized text files.
That’s an interesting benchmark. Adding line numbers in kakoune is actually pretty good (%<a-s>ghi<c-r>#), however adding an empty line after each line (%<a-s>o) is burning a lot of CPU and still running.
Helix just eats the whole CPU at the sight of inserting a character.
Adding an empty line after each line is unfortunately not implemented in the most efficient way: we insert the first line, shift everything down by one, insert the second line, shift everything down again… The data structure itself would support a much more efficient implementation where we shift each line to its final location once, but it’s a pretty chunky refactoring of the codebase to make this possible.
Also, when profiling here most of the time when doing that operation is actually taken by the indentation hooks, not the buffer modification.
It needs to be reworked to handle the iterator or streaming case. There is some discussion about it here. The regex author was doubtful it could ever be made as performant as the slice version, but we will have to wait and see. Contiguous memory is a strong advantage for the search use case.
I would be surprised if ropey was the bottleneck in helix. Ropes should handle large files very well.
I thought Emacs used an array/list of lines instead of a big gap buffer. There was a time when both Emacs and vim struggled with lots of long lines (log files mainly), but that is not a common use case. nvi, which is a beast of an editor when it comes to pathological input, also uses a list of lines backed by a tree of pages inside a real database (a cut-down copy of BDB).
The article mentions storing the tracking information in a tree. This could be stored in another, parallel gap buffer as well. There is a neat trick, which IBM (still?) has a patent on, where the positions in the run-array entries after the gap are switched to refer from the end of the document. The book “Unicode Demystified” explains this in detail. The SWT StyledTextCtrl makes use of this trick as well (possibly because it was written by the patent holders).
Looks like patents in the US last 20 years, and the “Unicode Demystified” book was published in 2002, so it’s probably expired by now!
So if you moved the gap from the front to the back, would that mean you need to go through and update every item to point to the front?
Yes, if the gap moves then the text buffer contents change, so the corresponding style/line-start gap buffer needs to be updated as well. But these are run entries, so in practice updating them is not expensive.
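If I understand the trick correctly, it can be sketched like this (my own reconstruction from the description above; the names are hypothetical):

```rust
// Line-start positions are split at the gap. Entries before the gap are
// offsets from the start of the text; entries after the gap are offsets
// measured back from the END of the text. Inserting at the gap changes
// the text length but not any distance-from-end, so entries after the
// gap stay valid without being touched. Only when the gap itself moves
// do entries crossing it need converting between the two encodings.
struct Positions {
    before: Vec<usize>, // document-order offsets from the front
    after: Vec<usize>,  // document-order offsets back from the end
    text_len: usize,
}

impl Positions {
    // Recover the absolute offset of the i-th tracked position.
    fn absolute(&self, i: usize) -> usize {
        if i < self.before.len() {
            self.before[i]
        } else {
            self.text_len - self.after[i - self.before.len()]
        }
    }

    // Inserting n bytes at the gap: only the total length changes; no
    // per-entry updates are needed for the entries after the gap.
    fn insert_at_gap(&mut self, n: usize) {
        self.text_len += n;
    }
}
```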
Although no other filesystem tempts me away from ZFS, I question this claim. Linux may in the foreseeable future try to come close with bcachefs, and DragonFly BSD may have come close already with its HAMMER2.
Btrfs also tried to come close (and, in some ways, to be better).
I think a big part of the problem is that ZFS works well because of how the whole system interacts. Until you have a complete ZFS implementation, you don’t get most of the benefits. That makes it quite hard to develop in an environment that wants to do incremental improvements.
This is part of the reason that I think bcachefs is the most likely to succeed from the replacements. It didn’t start trying to solve the same problems, it started trying to solve a problem that is independently useful and which can then be used as a building block for something ZFS-like.
Yes, those are the first 2 projects that came to my mind, too, and I agree.
OTOH, ZFS is here, mature, stable, and runs on half a dozen OSes. It’s in Solaris, Illumos and derivatives, FreeBSD, Ubuntu, Void Linux, NixOS, Arch, and optionally other distros.
On NetBSD, Windows, and macOS too. The portability story is just crazy. bcachefs will succeed, of course, as Linux has this emergent effect caused by the extraordinary churn… suddenly pieces fit, APIs/UIs happen around it, and it takes over.
It also helped that the license was weak copyleft, so it was compatible with everything except an incompatible strong copyleft kernel. Permissively licensed and proprietary kernels could all use it. Apple got quite far but apparently were worried about Oracle patents and dropped it (they got as far as announcing it as a feature in their next OS release before dropping it, which was a shame, though APFS feels a lot like what ZFS would be if it had been created primarily for single-disk machines and not massive storage servers).
Sadly bcachefs is GPL’d and so is unlikely to be merged anywhere else (though may end up as an external module for other things).
Do you have any source on Apple being worried about Oracle? From what I remember, they are/were on quite friendly terms, and ZFS has a quite permissive license. It’s just Linus that has issues with it (even Ubuntu is fine with including it).
That’s what I heard from Apple folks back around then: the lawyers were worried Oracle would do something. Back then, Apple had the XServe and XSAN line and a solid ZFS offering from them would have competed in the server space with Oracle, using Oracle’s technology. I’m not sure exactly what they were worried about, possibly Oracle patenting things in their ZFS version and not open sourcing them so the Apple version couldn’t keep up. Or possibly the reciprocal terms in the CDDL made them nervous of giving up rights to Oracle.
Was the Solaris portability layer developed on FreeBSD during the initial port? Also a question for ZFS experts: is the DMU layer usable as a KV store? Came across some old posts talking about an API to enable that but can’t see anything. Closest is openebs’s cstor engine which, last I checked a few years back, added a network layer on top of the ztest (zfs in userspace) to provide volumes for k8s containers.
I think so, yes.
I’ve never seen it exposed to userspace as anything other than ZVOLs or via the ZFS POSIX Layer (i.e. as a filesystem). I wondered a few times whether it would be possible to port SQLite into the kernel and have a SQL interface that used the transactional storage from ZFS as the storage back end.
Btrfs is the one I think about. And HAMMER2 is such a cool thing.
Hetzner will also attach a KVM console and plug in a flash drive loaded with an ISO of your choosing, for a 3-hour window.
For a small fee, IIRC. That was how I set up OmniOS on a dedicated server 6 years ago. It ran without a hiccup until Hetzner suspended my account 2 years ago without any explanation. Happy with Vultr and worldstream.nl now.
I think there was a fee in the past but I’ve used it recently and haven’t seen any related fees added to invoices.
IIRC, it’s currently free for the first 3 hours, and an extra fee if you want to reserve it for a longer period.
Long overdue. I only use the KJ toolkit, and the v2 API proposals look very promising.
QtCreator is actually a pretty capable IDE on its own, even if one doesn’t do Qt development at all. It’s blazingly fast and at some point had better “intellisense” than Eclipse’s CDT (before clangd came out at least).
It has support for CMake, works with Conan, integrates with vcpkg, and has a pretty capable integrated FakeVim mode.
I think that naming it like it’s named hurts the project, because people assume it’s for Qt development and nothing else, which is not true.
Agreed on all points. CDT, however, has one distinguishing feature: it can parse build logs and pick up include paths and definitions, which makes it quite useful for projects with non-standard build systems. The CDT indexer is also flexible in that you can just throw a tree of sources at it and get reasonable code navigation without configuring and building the project first.
And he then proceeded to write a very nice actor implementation for Squeak.