Transparent superpage support was added in FreeBSD 7.0 (2008) and has been enabled by default since then, without the performance issues that the Linux version seems to have had. I am not sure why the Linux version has had so many problems, but it looks as if they're not demoting pages back after promoting them. For example, as I recall, if you fork in FreeBSD and then take a CoW fault in a page, the vm layer will instruct the pmap to fragment the page in the child and then copy a single 4 KiB page, so you don't end up copying the whole 2 MiB. There's also support for defragmentation via the pager, which can help recombine pages later, though I don't think there's anything for memory that has not been swapped out.
Huge page page faults are absolute hogs. The fault handler emits TLB shootdowns for every single 4k page in the huge page region, which takes about 200us per fault for a 2MB HP.
Because of this, THP are usually bad news for latency-critical applications, as this behavior causes absurd tail latencies. It's even worse when using them with allocators that do not natively support them. Even those that supposedly do (e.g. jemalloc) show iffy tail behavior.
From my experience, the best use case for HP is ring buffers (either SW/SW or SW/HW) where the capacity is known in advance and the pages can be pre-faulted. But that's a very tailored situation that doesn't broadly apply.
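For the curious, here is a minimal sketch of that setup on Linux in Rust, assuming the libc crate; RING_BYTES and map_ring are illustrative names, and the hugetlb pool has to be reserved beforehand (e.g. via /proc/sys/vm/nr_hugepages):

use std::ptr;

// Purely illustrative capacity: four 2MB huge pages.
const RING_BYTES: usize = 4 * 2 * 1024 * 1024;

fn map_ring() -> *mut u8 {
    // MAP_HUGETLB requests 2MB pages from the reserved hugetlb pool, and
    // MAP_POPULATE pre-faults all of them at mmap time, so the fault (and
    // shootdown) cost described above is paid once at startup rather than
    // on the data path.
    let p = unsafe {
        libc::mmap(
            ptr::null_mut(),
            RING_BYTES,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_HUGETLB | libc::MAP_POPULATE,
            -1,
            0,
        )
    };
    assert_ne!(p, libc::MAP_FAILED, "mmap failed; is the hugetlb pool reserved?");
    p.cast::<u8>()
}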
Huge page page faults are absolute hogs. The fault handler emits TLB shootdowns for every single 4k page in the huge page region, which takes about 200us per fault for a 2MB HP.
I'm not sure I understand why you need to do all of the shootdowns, but at least in FreeBSD situations that need to shoot down more than one page are batched. The set of addresses is written to a shared buffer and then the pmap calls smp_rendezvous on the set of cores that is using this pmap and does an INVLPG on each one. Hyper-V also has a hypercall that does the same sort of batched shootdown. I'm not sure how this changes on AMD Milan with the broadcast TLB shootdown.
FreeBSD's superpage support on x86 takes advantage of the fact that all AMD, Intel, and Centaur CPUs handle conflicts in the TLB, even if the architecture says that they don't. This means that promotion does not have to do shootdowns; it just installs the new page table entries. As I recall (and I'm probably mixing up Intel and AMD here), on Intel the two entries coexist in different TLBs, on AMD the newer one evicts the older, and on Centaur the cores detect the conflict, invalidate both, and walk the page table again.
I did not dig any deeper and my root-cause analysis could be wrong. Here is a relevant ftrace if you are curious: https://gist.github.com/xguerin/c9d97ef50701bd247a219191cb37ec8a. Total latency is 271us. Largest cost centers are: 1/ get_page_from_freelist takes a whopping 120us; 2/ clear_huge_page takes another 135us (admittedly 2/ is not strictly required as part of the overall operation).
Desk. I use the iPad with the Planck as a mobile workstation.
GATs are pretty huge right? I feel like I've seen "we could do X if we had GATs" all over the place.
It will allow us to specialize our callbacks with their owning type and therefore rely on static dispatch instead of dynamic dispatch.
Callback evaluation for asynchronous I/O through a bespoke I/O runtime. Think something like Socket<Delegate>, where Delegate is your callback-handling trait, in places where you have an HttpProtocol that specializes on Socket and needs to self-register as the Delegate. Impossible without HKT, but GATs enable this with a little trait tomfoolery.
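Roughly this shape, as a sketch; Socket, Delegate, and HttpProtocol are the names from the description above, everything else is hypothetical rather than a real crate API:

// Callbacks resolve at compile time (monomorphization), so polling a
// socket is a direct call rather than a vtable lookup.
trait Delegate {
    fn on_readable(&mut self, bytes: &[u8]);
}

struct Socket<D: Delegate> {
    delegate: D,
}

impl<D: Delegate> Socket<D> {
    fn new(delegate: D) -> Self {
        Socket { delegate }
    }
    fn poll(&mut self, input: &[u8]) {
        self.delegate.on_readable(input); // static dispatch
    }
}

// The GAT: the protocol names the delegate type it hands to a socket, and
// that type can borrow the protocol itself, which a plain (pre-GAT)
// associated type could not express.
trait Protocol {
    type Handler<'a>: Delegate where Self: 'a;
    fn handler(&mut self) -> Self::Handler<'_>;
}

struct HttpProtocol { /* parser state, routes, ... */ }

struct HttpHandler<'a> {
    proto: &'a mut HttpProtocol,
}

impl<'a> Delegate for HttpHandler<'a> {
    fn on_readable(&mut self, bytes: &[u8]) {
        // feed `bytes` to the HTTP parser owned by `self.proto`
        let _ = (&mut *self.proto, bytes);
    }
}

impl Protocol for HttpProtocol {
    type Handler<'a> = HttpHandler<'a> where Self: 'a;
    fn handler(&mut self) -> Self::Handler<'_> {
        HttpHandler { proto: self }
    }
}

The runtime can then build Socket::new(proto.handler()), and every on_readable ends up as a direct, inlinable call instead of going through a Box<dyn Delegate>.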
The funny thing is at my last job I needed GATs to do something tricky. Now for the life of me I can't remember the details, but it's a pretty big deal to have associated types that are easily generic. Just the lending iterator alone allows things that are rather simple in scripting languages but restricted in earlier versions of Rust.
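The lending iterator in question, as a minimal sketch (Windows is a made-up example type):

// The item borrows from the iterator itself, which plain Iterator cannot
// express because its Item has no access to the lifetime of &mut self.
trait LendingIterator {
    type Item<'a> where Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>>;
}

// Overlapping windows over a slice, no copying.
struct Windows<'s, T> {
    slice: &'s [T],
    size: usize,
    pos: usize,
}

impl<'s, T> LendingIterator for Windows<'s, T> {
    type Item<'a> = &'a [T] where Self: 'a;
    fn next(&mut self) -> Option<Self::Item<'_>> {
        let end = self.pos + self.size;
        if end <= self.slice.len() {
            let w = &self.slice[self.pos..end];
            self.pos += 1;
            Some(w)
        } else {
            None
        }
    }
}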
This is the client I'm talking about: https://github.com/nooberfsh/prusto
I find building something not only alone but for one's own benefit to be an incredible source of creative freedom. Free from external constraints, one can really explore designs and architectures that fit one's requirements and expectations. I find myself doing that a lot when healing from bouts of burnout from time to time, and it has always been of tremendous help.
No need to make the rules optional. It's an incentive to not understand them in the first place. Just remember that the rules are made for Man, not Man for the rules.
Or… don't perform a generative stabilization loop in your destructor. This action is sufficiently meaningful to deserve its own method.
[[tangential]] What's with all the swearing nowadays? Has it become impossible to drive a technical point without swearing? I'm maybe becoming one hell of a grizzly bear but I find that to be a major turn-off.
There's no swearing here by typical standards. There is a Bowdlerized expletive in the title, but that word has explicitly been neutered. The author committed a crime against words in order to avoid swearing in front of you.
If someone comes to you and asks you to make things sloppier, your answer should be no.
Words to live by.
I am recurrently trying to build a minimalist LISP. I use it as a platform to try a few things like continuation-passing style, GC vs ARC, async vs MT, etc. It's also a place of my own where I can be as anal about the code as I want to, some kind of engineering-oriented mental bachelor pad.
Discussing the future of operating systems is fine, but not while ignoring the past. There had been at least 20 years of OS research before Linux, with a lot of interesting ideas wildly different from the "everything is a file" world view.
For instance, z/OS. It has been powering large mainframes for 40 years. Every user runs in their own VM (z/VM) with access to a full OS. Everything is not a file but a database. It's the grand-daddy of cloud OSes.
Symbolic operating systems, like those on Lisp machines, are also a lot of fun to read about.
That's VM/CMS (CMS, Conversational Monitor System, being the per-user VM). z/OS is historically a batch processing system, still used for that role.
Server-side, I would bet that in this day and age the POWER fleet is probably the largest after x86/ARM. Z might not weigh much in terms of raw population count, but it's powering a lot of large, critical systems. In the deep-embedded world, STM chips are pretty widespread.
The larger STM32, yes. Not the 8-bit/16-bit microcontrollers. That being said, the generations I was used to are now marked "legacy" (ST7/ST10), so I might very well be wrong.
I get the impression new designs aren't using the 8 or especially 16-bit MCUs anymore. Cortex-M0 (and soon RISC-V) ate their lunch. Of course, they'll be around forever, as embedded designs tend to be.
Racket:
(for/fold ([p #f]
           [cnt 0]
           [counts null]
           #:result (cdr (reverse (cons `(,p ,cnt) counts))))
          ([c (in-string "aaaabbbcca")])
  (if (equal? p c)
      (values c (add1 cnt) counts)
      (values c 1 (cons `(,p ,cnt) counts))))
Also using fold, with mnm.l:
(def challenge (IN)
  (foldr (\ (C ACC)
           (let ((((V . N) . TL) . ACC))
             (if (= C V)
               (cons (cons C (+ N 1)) TL)
               (cons (cons C 1) ACC))))
    IN NIL))
Racket's group-by is wonderful but I usually want to group consecutive equal items into clumps as they arise rather than a single monolithic group.
(define (group xs)
  (match xs
    [(cons x _)
     (define-values (ys zs) (splitf-at xs (curry equal? x)))
     (cons ys (group zs))]
    [_ null]))

(define (encode xs)
  (for/list ([x xs])
    (list (first x) (length x))))

(encode (group (string->list "aaaabbbcca")))
I never understood the value proposition of ReasonML. Was it really to make ML more palatable to a corps of engineers broken to imperative-style languages (and more specifically JS)?
The slogan I heard was "React as a language", which makes sense.
React is a framework that encourages use of a functional style for your apps. And React is often written in dynamically typed JS. So it makes sense to write React-style apps in a statically typed functional programming language.
OCaml is arguably small (it's definitely not easy to find a job for it), but I've been using it for web development, hardware development, and even to play around with some graph algebra. I love ML in general, but I am particularly fond of OCaml for the above-average quality of the ecosystem (and PPX extensions!).
At 10 years in, I find myself agreeing with everything. It's sufficiently rare an occasion to warrant this shamelessly me-too-ing comment.
There is no point in using std::vector if the size remains constant. There is std::array for that, which has a (mostly) compile-time interface that is comparable to primitive C arrays.
Two things to note though are that you have to know the size at compile time, and std::array has automatic storage duration, so depending on size and environment that may be prohibitive.
I think the intersection between needing a large fixed-size array and not being able to pay the overhead of storing the one extra pointer in a std::vector is fairly small.
The year of the mainframe, as a computing model if not as a technology: modern personal computers as dumb terminals, large batched jobs, large data sets stored remotely.
It's the same architecture as the InfiniBand verbs (WR+CQ, a primer), one of the (if not the) best asynchronous I/O interfaces I've ever used. It definitely looks like a promising piece of work.
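As a toy sketch of that shape (made-up types, not the actual verbs API): submission and completion travel through two decoupled queues, so the caller posts work and harvests results without ever blocking in the I/O path.

use std::collections::VecDeque;

struct WorkRequest { id: u64, buf: Vec<u8> }
struct Completion { id: u64, result: isize }

#[derive(Default)]
struct QueuePair {
    sq: VecDeque<WorkRequest>, // work requests, posted by the caller
    cq: VecDeque<Completion>,  // completions, harvested by the caller
}

impl QueuePair {
    fn post(&mut self, wr: WorkRequest) {
        self.sq.push_back(wr);
    }
    // In real verbs the NIC consumes the SQ and fills the CQ; this fake
    // processor only exists to show the control flow.
    fn process(&mut self) {
        while let Some(wr) = self.sq.pop_front() {
            self.cq.push_back(Completion { id: wr.id, result: wr.buf.len() as isize });
        }
    }
    fn poll(&mut self) -> Option<Completion> {
        self.cq.pop_front()
    }
}

The real interface adds doorbells, memory registration, and kernel bypass on top, but this post/poll split is the core of the model.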
Helix, with almost default config, just some custom keybindings:
But a confession: I use IntelliJ for work projects :P
Which terminal emulators do you use? On various computers/platforms?
On macOS, I use Kitty, mainly because of the split and tab functions.
I also use the same Helix config on GitHub Codespaces. It works really well.
I find it works really well in Windows Terminal.
Yep, pretty much like me, and for the opposite reason of @eBPF using Neovim above: because it really does not need any plugins for my use cases.
Same config, give or take a line.