VSCode is amazing and scary at the same time, i'm sure i've already been hacked through some extension XD
just shooting my mouth off, but yea hardware is looking so attractive compared to a $70k/mo cloud bill. the thing that depresses me is that it would mean going back to old school ops practices. all we got out of this cloud era is garbage tools like kubernetes & docker. nobody in their right mind would run kubernetes on their own HW (which is probably by design), it's terribly designed. fortunately tools like Rust & even Python's poetry have really started fixing code isolation, I would 100% feel safe running Rust apps on a server without docker (90% safe for Python XD). but man the orchestration… what happens if I need to upgrade the kernel or add a new HDD? etc etc etc
Hmm, I feel like there's too much to respond to here. I'll choose the second/third sentence.
the thing that depresses me … all we got out of this cloud era …
I guess if you are asserting you don't like the current era of tools, you would rewind history back to where you liked it. So if we roughly went (pardon the reduction in parens):
Then you can rewind time to an approach you like. But then you are asserting that we made a mistake somewhere. I see some people going all the way back to dumb terminals even now, in a way. There's cloud gaming (screen painting) and rumors of Windows 12 being cloud-only. It's not all or nothing, but the pendulum of control vs flexibility, or economies of scale, is I think fairly pure. You want to centralize for control, but then your costs are very high (surprise! that's what #1 -> #2 was).
So if you rewind time to #3 and use ansible, you probably aren't going to saturate your servers. This wasn't entirely the point of era #4, but there was some sales pitch along those lines at the time: "Don't ssh in to manage your servers! Use chef/puppet/ansible! No pets!". So if you rewind past cattle, you have pets, and you're saying pets are ok for you. And mixed in here are many other things I can't fold into this layer, like where does cloud fit in? Cloud sort of forces your hand to use more software definitions. The vendor probably has an API or some kind of definition tool. It's interesting that this didn't happen in the same way around #3, because you weren't entering their domain to rent their servers, where you are a guest that must conform because they have many customers and you are just one.
This mainframe vs app mesh debate is very current in the HPC world. I think many have settled on hybrid. This isn't surprising to me. I personally err on the side of hybrid, always. I guess I'm a hybrid absolutist (I guess I'm an absolutist?). Get your infrastructure to the point where you can make an infrastructure purchase decision in your own colo or the cloud. Have the connectivity/tools/skills/culture/money ready to do either at any time. Mix and match. Of course, there are always caveats/trade-offs.
Idk how to unpack the bits about poetry, kernel, HDD without writing a ton more.
There are several forks/flavors of Kubernetes, like minikube, KinD, and k3s, which can be run at home on commodity hardware. I used to run k3s at home, and probably will again soon.
i played with io_uring through tokio in rust (… a year ago or more?), shit was so fast. i kept thinking something was broken in my benchmarking code. i still don't believe it tbh
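for anyone curious what that looks like, here's roughly the shape of it from memory (a sketch only, assuming the tokio-uring crate and a local hello.txt, not my original benchmark code): the buffer is handed to the kernel by ownership and given back when the read completes.

use tokio_uring::fs::File;

fn main() {
    // tokio-uring drives a current-thread runtime on top of io_uring.
    tokio_uring::start(async {
        let file = File::open("hello.txt").await.unwrap();

        // The buffer is passed by ownership, submitted to the kernel,
        // and handed back alongside the result when the op completes.
        let buf = vec![0u8; 4096];
        let (res, buf) = file.read_at(buf, 0).await;
        let n = res.unwrap();

        println!("read {} bytes: {:?}", n, &buf[..n]);
    });
}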
voice control systems need less delay, I wish they were virtually instant. The pause between speaking & waiting to see if it did what you wanted means it just can't beat how instantaneous the keyboard is for me
Agreed, especially if it misinterprets what you say. You then have to wait for it to be wrong on top of everything else.
I once asked a voice assistant to refer to me as Hawk, it replied: Did you say HOK?
love me some VHDL, it looks so lovely. i always wanna do more hardware, and always think reconfigurable computing is a missed opportunity - GOTTA GO FAST
i love the sentiment, i hate python, but i write a lot of it. so posts like this make me feel like people are working on improving my life UwU
For many of us, Python is an integral part of our career. I have yet to work somewhere that did not have a Django or Flask application.
How do we make sure that the unit vm only executes N instructions before it suspends itself? A pretty lightweight solution that requires very little setup is to insert a check that we haven't exceeded the number of operations every K ops and after every label (in the unit VM you can only jump to labels, so we catch all loops, branches and sketchy stuff with this).
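For concreteness, here's a minimal toy sketch of that scheme (my own pseudo-VM, not the article's actual unit VM or its opcodes): count executed ops and do the cheap compare at every label, so any loop has to pass through a check on each iteration.

enum Outcome {
    Finished,
    Suspended { pc: usize }, // budget exhausted; can resume from pc later
}

#[derive(Clone, Copy)]
enum Op {
    Inc(usize),                  // regs[r] += 1
    Label,                       // jump target; jumps may only land here
    JumpIfLt(usize, u64, usize), // if regs[r] < n, jump to ops[target]
    Halt,
}

fn run(ops: &[Op], regs: &mut [u64], mut pc: usize, budget: u64) -> Outcome {
    let mut executed: u64 = 0;
    while let Some(op) = ops.get(pc) {
        executed += 1;
        match *op {
            Op::Inc(r) => regs[r] += 1,
            // The CHECK_LIMITS-style compare lives on labels, so every
            // loop iteration passes through it at least once.
            Op::Label => {
                if executed > budget {
                    return Outcome::Suspended { pc };
                }
            }
            Op::JumpIfLt(r, n, target) => {
                if regs[r] < n {
                    pc = target; // target always points at a Label
                    continue;
                }
            }
            Op::Halt => return Outcome::Finished,
        }
        pc += 1;
    }
    Outcome::Finished
}

fn main() {
    // r0 tries to count to 1000, but we only allow ~100 executed instructions.
    let ops = [Op::Label, Op::Inc(0), Op::JumpIfLt(0, 1000, 0), Op::Halt];
    let mut regs = [0u64; 1];
    match run(&ops, &mut regs, 0, 100) {
        Outcome::Finished => println!("finished, r0 = {}", regs[0]),
        Outcome::Suspended { pc } => {
            println!("suspended at pc = {}, r0 = {}", pc, regs[0])
        }
    }
}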
Love the post! very succinct & easy to read
Why not use interrupts? Seems like CHECK_LIMITS adds a lot of overhead if it's required after every label, especially in the small looping example. I've only used interrupts on AVRs (really easy to use!), so this may be a really naive question. But I'd imagine you may also be able to use it to estimate instructions run :O
A question to ask in return: how do you raise an interrupt (or a signal, for that matter; I will use the terms interchangeably) after a certain number of VM instructions have executed?
Assuming you get access to an interrupt, it will probably be raised after some time duration has passed or after the VM host has executed a certain number of native instructions. If I understand correctly, the author wants to check against a maximum of guest instructions executed. It will be difficult to trigger on that limit, except with an explicitly coded check. Thus I think that CHECK_LIMITS fits well into the solution chosen.
A good thing is that most of the time the check will pass, allowing the branch to be predicted accurately for the dominating number of VM instructions executed.
Depending on the precision required, the limits could be checked less often, e.g. only every 5 VM instructions. Special attention then needs to be paid to the basic blocks of the VM code, otherwise e.g. a short loop could completely bypass the check if it happens to fall between two checks.
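A rough sketch of that trade-off as an instrumentation pass (the Op/Check names here are mine, not the article's): insert a check after every label, so no loop can dodge it, and after every k straight-line ops, so long label-free stretches stay bounded.

#[derive(Debug, Clone)]
enum Op {
    Label(String),
    Work,          // stand-in for any ordinary VM instruction
    Jump(String),
    Check,         // the CHECK_LIMITS-style compare against the budget
}

fn insert_checks(ops: &[Op], k: usize) -> Vec<Op> {
    let mut out = Vec::with_capacity(ops.len() * 2);
    let mut since_check = 0;
    for op in ops {
        let is_label = matches!(op, Op::Label(_));
        out.push(op.clone());
        since_check += 1;
        // A label always forces a check; otherwise check every k ops.
        if is_label || since_check >= k {
            out.push(Op::Check);
            since_check = 0;
        }
    }
    out
}

fn main() {
    let ops = vec![
        Op::Label("loop".into()),
        Op::Work,
        Op::Work,
        Op::Jump("loop".into()),
    ];
    // Even with k = 5, the label forces a check, so this short loop
    // cannot slip between two checks.
    for op in insert_checks(&ops, 5) {
        println!("{:?}", op);
    }
}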
If I understand correctly, the author wants to check against a maximum of guest instructions executed.
I think instrumenting is the best way to do that, to calculate accurate statistics. but I assumed the author wanted the code to run AS FAST AS POSSIBLE, bc they were saying they wanted to run a lot of sims simultaneously, but then wanted to stop runaway bad code & report on it - so I believe interrupts would be the best way to do that
but then again, it's an interesting technical question - but how does this fit into the game they're developing XD
I'm actually not completely sure how interrupts work in x86 at all, so I cannot really answer that, but I believe that would require OS context switches, since I don't think you can have interrupt handlers in userspace. I guess your idea would be to still increase rax and then from time to time execute an interrupt that checks how we are doing with the counter, right? I believe that works a lot better on microcontrollers, since the amount of time an execution takes is much better defined (due to not having concurrency and multiprocessing). So, and I may be very wrong here, we would have to run the interrupt every X time (with X being very, very small, somewhat smaller than the smallest possible instruction limit?), so the amount of context switching may be more overhead than the cmp && jl combination of most checks. On AVR there is almost no context-switch cost, so that may be a better solution there.
Also, checking with perf, it seems like the branch predictor works pretty well with this program (<0.5% branch misses), so most of the time the cost of the check is fairly small since it's properly pipelined.
Let me know if my (very uninformed) answer makes sense, and if someone else with more context can add info, even better!
Thanks for the kind words!
I don't think you can have [hardware] interrupt handlers in userspace
That sounds right. After researching a bit, it seems like POSIX offers a way to software-interrupt your program with alarm or setitimer, which I think the OS triggers from underlying hardware interrupts. And then you could pull out virtual time to estimate the cycles. Or see if software profilers are doing something more sophisticated to estimate time.
https://linux.die.net/man/2/alarm
sleep(3) may be implemented using SIGALRM;
https://pubs.opengroup.org/onlinepubs/007904875/functions/setitimer.html
Intuitively I think this would improve performance quite a bit. But measuring a real workload with & without instrumentation should tell how much performance could be gained.
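Something like this is what I had in mind for the setitimer route (a sketch only, assuming Linux and Rust's libc crate; nothing here is from the original post). The handler only flips an atomic flag, since, as the comment below about signal baggage points out, you can't safely do much else inside a handler.

use std::sync::atomic::{AtomicBool, Ordering};

static TIMER_FIRED: AtomicBool = AtomicBool::new(false);

extern "C" fn on_sigvtalrm(_sig: libc::c_int) {
    // Only async-signal-safe work here: flip a flag and return.
    TIMER_FIRED.store(true, Ordering::Relaxed);
}

fn main() {
    unsafe {
        let handler: extern "C" fn(libc::c_int) = on_sigvtalrm;
        libc::signal(libc::SIGVTALRM, handler as libc::sighandler_t);

        // ITIMER_VIRTUAL counts user CPU time; fire roughly every 10ms of it.
        let tick = libc::itimerval {
            it_interval: libc::timeval { tv_sec: 0, tv_usec: 10_000 },
            it_value: libc::timeval { tv_sec: 0, tv_usec: 10_000 },
        };
        libc::setitimer(libc::ITIMER_VIRTUAL, &tick, std::ptr::null_mut());
    }

    let mut guest_ops: u64 = 0;
    loop {
        guest_ops += 1; // stand-in for "execute one VM instruction"

        // Fast path is just an atomic load; reporting only happens when
        // the timer signal has fired since the last look.
        if TIMER_FIRED.load(Ordering::Relaxed) {
            TIMER_FIRED.store(false, Ordering::Relaxed);
            println!("another ~10ms of CPU, {} ops so far", guest_ops);
            if guest_ops > 1_000_000_000 {
                break; // treat as a runaway guest and stop it
            }
        }
    }
}

Note you still end up polling a flag in the interpreter loop, which is part of why the explicit counter compare is hard to beat.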
the main idea is a Real-time strategy game (RTS)
Sounds very cool!
Signals have a lot of troublesome baggage, though. It varies by OS, but it's often very dangerous to do things in signal handlers because the thread might have been interrupted at any point, like in the middle of a malloc call, so the heap isn't in a valid state. On Mac or iOS you can't do anything in a handler that might call malloc or free for this reason, which leaves very little you can do!
If your handler tries to do something like longjmp and switch contexts, it might leave a dangling locked mutex, causing a deadlock.
As far as I know, most interpreted-language thread implementations use periodic checks for preemption, for this reason.
They're also really slow. They need to find the signal stack for the thread, spill the entire register frame context there, and update a load of thread state. Taking a signal is often tens of thousands of cycles, which is fine for very infrequent things but absolutely not what you'd want on a fast path.
cat flo.txt | grep 'bw=' | awk '{ print $4 }' | sed -e 's/K.*//' -e 's/\.//'
An independent 3rd party professional benchmarking firm was stripping the unit, and deleting the decimal point
https://skylab.org/~ryanm/screenshot/battlestation_2022.png
Paste this in your JavaScript console if you wanna quickly see all the images! (except the ones that go to HTML pages :P)
[...document.querySelectorAll('.comment_text a')]
.map(x => x.href)
.filter(x => x.match(/png|jpe?g/i))
.forEach(x => document.write(`
<a href="${encodeURI(x)}">
<img src="${encodeURI(x)}" style="max-width: 32%; float:left">
</a>
`))
Sounds like they're trying to figure out async semantics for some WIP language called Encore [1]. async in this case meaning waiting for some concurrent computation (not IO). Seems kind of cool, the syntax looks awful though. Julia syntax looks nicer for this type of thing [2].
I haven't really played with these types of distributed semantics too much, but I'd imagine a majority of my effort would be messing around with data locality & movement.
I also find the async computation semantics a bit odd, because the concurrent pieces are already encoded in the logic. Example:

[x * 2 | x <- [1..1000]]

The x * 2 can already be seen as concurrent - so is it faster to distribute out the x data, compute x * 2 concurrently, and bring the results back? Think of all the fun you can have figuring that out!

The other piece sort of hanging out in the back of the room is incremental computation. Great, I wrote the most optimized way to concurrently compute some data - now do it again with just a bit more added. It also seems like a piece of the data locality issue: if later on a mutated copy of the data can be computed concurrently, only moving the changes to do that seems more efficient (but maybe not!).
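As a tiny illustration of that "the parallelism is already in the expression" point, here's what the data-parallel version looks like with rayon in Rust (my choice of example, nothing to do with Encore or Julia):

use rayon::prelude::*;

fn main() {
    // Sequential: [x * 2 | x <- [1..1000]]
    let seq: Vec<u64> = (1..1001u64).map(|x| x * 2).collect();

    // The same expression, split across a thread pool by the library.
    let par: Vec<u64> = (1..1001u64).into_par_iter().map(|x| x * 2).collect();

    assert_eq!(seq, par); // order is preserved for indexed parallel iterators
    println!("both produced {} results", par.len());
}

The distribution questions above are exactly what the library is deciding for you here (how to split the work across threads), which is also why it stops being this easy once the data no longer fits on one machine.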
Looking at stuff like this makes me feel like the gap between where I'd like programming to be and where we're at is massive. So FWIW these days I just use a massive machine with 112 vCPUs and lots of ram, so I don't have to deal with data locality.
[1] https://stw.gitbooks.io/the-encore-programming-language/content/
[2] https://docs.julialang.org/en/v1/manual/distributed-computing/#Multi-processing-and-Distributed-Computing