Threads for peter-leonov

    1. 7

      So, in fact, Go is semantically a sort of faster JS (given the duck-typed interfaces and reflection) when compared to C/Rust, at least on a large, rather conservative (no evals, method_missing and friends) codebase.

      1. 2

        This is especially useful in bigger functions that tend to undergo refactorings. And the try/catch gotcha is, imo, the worst understood part of JS as of today, leading to almost one bug per 1000 LOC in my experience.

        1. 3

          If I correctly understand Rust’s history and the deliberate move away from green threads, one of the reasons was that supporting both native threading and another type of threading in the same codebase turned out to be too costly, complexity- and performance-wise.

          Surely, async Rust is not the same as the green-threaded Rust of the past, but I sense some similarities in the new challenges it introduces.

          Does anyone else see this?

          Disclaimer: I love async Rust :) I’m just not sure it has to be mixed with OS threads by default in popular libraries.

          1. 1

            I thought they didn’t want to ship a specific runtime and green threads would require one, so they went with a runtime-agnostic model.

            1. 1

              It also would’ve added overhead to interacting with C code.

          2. 2

            Interesting.

            So what is actually wrong with a program that has an Rc live across wait points? As in, if we disabled the static checks and just ran the program (like a C compiler would let us), could there ever be thread contention and undefined behaviour? It seems like only one thread would use the Rc at a time… so an Arc seems like overkill at first blush.

            Can the cores of a CPU really be so out of sync that the refcount set before the work stealing is not reflected correctly in the core that picks up the work? How would that happen? Maybe if it was stealing it back and had an earlier version of the future in cache? If so, you’d have similar issues with the other plain old data in the future… so it can’t be that.

            I can’t tell if this is a compiler false positive (completeness issue) or saving us from actual UB.

            1. 4

              The compiler simply enforces what the type promises, and in this case Rc says that it isn’t safe to send elsewhere. You could have some kind of RcSendable type constructible from an Rc with a refcount of one, or you could have some kind of structure containing Rcs that guarantees that they can’t be leaked to some other part of the program, and have them be Send, but making Rc itself Send in a limited set of circumstances would be difficult, for questionable gain.
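
              As a minimal sketch of that “refcount of one” idea, here is what you can already do today with the existing Rc::try_unwrap, rather than a hypothetical RcSendable: if you hold the sole reference, you can unwrap the value, move the plain value across threads, and re-wrap it on the other side.

              use std::rc::Rc;
              use std::thread;

              fn main() {
                  let rc = Rc::new(String::from("hello"));

                  // Succeeds only when the refcount is exactly one, which is
                  // the same "fresh Rc" condition discussed above.
                  match Rc::try_unwrap(rc) {
                      Ok(value) => {
                          // The plain String is Send, so it may cross threads;
                          // the other thread re-wraps it in a brand new Rc.
                          thread::spawn(move || {
                              let rc_again = Rc::new(value);
                              println!("{rc_again}");
                          })
                          .join()
                          .unwrap();
                      }
                      Err(rc) => println!("refcount > 1, cannot move: {rc}"),
                  }
              }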

              Keep in mind that it’s impossible to make a compiler that allows all correct programs but rejects all incorrect programs. So since Rust wants to reject all programs that have UB, it must also reject some programs that don’t have UB. Efforts are ongoing to increase the number of correct programs that Rust allows, but adding special logic to the compiler to allow fresh Rcs to be Send seems not worth it.

              1. 4

                So what is actually wrong with a program that has an Rc live across wait points?

                The problem is elsewhere.

                An Rc automatically frees its contents. It uses a refcount which is adjusted when the Rc is cloned or dropped. If Rc were Send then you could clone it into multiple threads. The refcount adjustments don’t use atomic instructions so they are likely to go wrong and cause use-after-free errors.
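
                Here is a sketch of how non-atomic refcount updates lose increments. To keep the demo itself free of UB, it emulates Rc’s plain load/add/store with two separate relaxed atomic operations instead of a single fetch_add; real Rc uses an ordinary Cell<usize>, which interleaves the same way.

                use std::sync::Arc;
                use std::sync::atomic::{AtomicUsize, Ordering};
                use std::thread;

                fn main() {
                    let count = Arc::new(AtomicUsize::new(0));
                    let handles: Vec<_> = (0..4)
                        .map(|_| {
                            let count = Arc::clone(&count);
                            thread::spawn(move || {
                                for _ in 0..100_000 {
                                    // Non-atomic increment: load, add, store.
                                    // Two threads can read the same value and
                                    // one increment is silently lost.
                                    let c = count.load(Ordering::Relaxed);
                                    count.store(c + 1, Ordering::Relaxed);
                                }
                            })
                        })
                        .collect();
                    for h in handles {
                        h.join().unwrap();
                    }
                    // Almost always prints well under the expected 400000.
                    // For a refcount, a lost increment means a premature free.
                    println!("{}", count.load(Ordering::Relaxed));
                }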

                1. 1

                  If I send an Rc that I own to another thread, won’t it be moved (neither cloned nor dropped, so the refcount stays constant)?

                  (And then my question was if every last clone of a given Rc was sent/moved as part of a single owned value, in this case a future, to another thread, mightn’t that be technically valid?)

                  1. 5

                    You can’t statically prove it’s the last reference though, so the type system has to disallow the general case where it might not be the last reference. If you only need the one reference, perhaps don’t use an Rc?

                    1. 6

                      As an aside, it would be nice to have functions on Rc<T>/Arc<T> to go between the two types, but only if their reference count is 1.

                      impl<T> Rc<T> {
                          fn into_arc(self) -> Result<Arc<T>, Self> { ... }
                      }
                      
                      impl<T> Arc<T> {
                          fn into_rc(self) -> Result<Rc<T>, Self> { ... }
                      }
                      

                      That would avoid the need to reallocate the inner value. It seems like their current implementations have exactly the same memory representation (with the small exception that AtomicUsize can have more stringent alignment requirements than usize, although it should be trivial to make sure the RcInner struct is aligned properly).

                      1. 4

                        I believe system allocators frequently align to at least 8 bytes, which means that in practice the RcInner should already end up aligned suitably for ArcInner.

                        Given that, you could implement this yourself. This assumes of course that RcInner and ArcInner don’t ever change layouts (or if they do, that the layouts stay identical).

                        use std::rc::Rc;
                        use std::sync::Arc;
                        use std::sync::atomic::AtomicUsize;

                        fn rc_to_arc<T>(mut rc: Rc<T>) -> Result<Arc<T>, Rc<T>> {
                            // first, check to make sure we have unique ownership
                            if Rc::get_mut(&mut rc).is_none() {
                                return Err(rc);
                            }
                            // next, grab the raw pointer value
                            let p = Rc::into_raw(rc);
                            // check to make sure it's aligned for AtomicUsize.
                            // this pointer points to the value, not the header,
                            // but the header's size is a multiple of the AtomicUsize
                            // alignment and so if the pointer is aligned, so is
                            // the header.
                            if (p as *const AtomicUsize).is_aligned() {
                                // the memory layout of RcInner and ArcInner is identical.
                                Ok(unsafe { Arc::from_raw(p) })
                            } else {
                                Err(unsafe { Rc::from_raw(p) })
                            }
                        }
                        

                        That said, this is a very niche use-case.
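
                        If you ever did want it, a hypothetical caller of the rc_to_arc sketch above (assuming its imports are in scope) might look like:

                        fn main() {
                            let rc = Rc::new(vec![1, 2, 3]);
                            match rc_to_arc(rc) {
                                // The Vec itself was never copied or reallocated;
                                // only the refcount header is reinterpreted as atomic.
                                Ok(arc) => println!("upgraded in place: {arc:?}"),
                                Err(rc) => println!("shared or misaligned: {rc:?}"),
                            }
                        }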

                    2. 2

                      It’s pessimistic because it can’t prove at compile time that those things are safe at run time.

                      Moving an Rc across threads isn’t necessarily the problem; it’s what happens to the Rc before and after, and how it is shared.

                  2. 3

                    You definitely can have “memory value is V0, CPU 1 writes V1, CPU 2 reads V0”, and you’re exactly right that applies to any memory location.

                    If you want to ensure writes made by one CPU are visible to another with certainty, you need to issue instructions for that. Otherwise your write may be sitting in a cache or store buffer not visible to the other CPU, or your read may be served from a stale cache.
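
                    A small sketch of the kind of instructions meant here, in Rust’s release/acquire terms: the Release store publishes the earlier data write, and the paired Acquire load guarantees the other CPU sees it rather than a stale value.

                    use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
                    use std::thread;

                    static DATA: AtomicU64 = AtomicU64::new(0);
                    static READY: AtomicBool = AtomicBool::new(false);

                    fn main() {
                        let writer = thread::spawn(|| {
                            DATA.store(42, Ordering::Relaxed);
                            // Release publishes everything written before it...
                            READY.store(true, Ordering::Release);
                        });
                        let reader = thread::spawn(|| {
                            // ...and Acquire on the same flag pairs with it.
                            while !READY.load(Ordering::Acquire) {}
                            // Guaranteed to observe 42, not a stale 0.
                            assert_eq!(DATA.load(Ordering::Relaxed), 42);
                        });
                        writer.join().unwrap();
                        reader.join().unwrap();
                    }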

                    1. 2

                      As far as I remember, Rust targets some abstract common denominator of existing memory models. This allows the compiler to make valid choices while checking the higher level code (what LLVM does on the lower level is highly platform specific though). That memory model is quite conservative, thus the errors. What would happen in reality for the potentially racy construct is sort of UB as it’s naturally not defined :)

                      1. 1

                        So any value that is Send and that is moved to another thread might have some instructions added so that it is read coherently?

                        That sounds fine in this specific case, so long as the same instructions were applied to RcInner. But that is pointed to with a NonNull, which the compiler doesn’t want to mess with and which isn’t Send.

                        Am I on the right track?

                        1. 6

                          You make this sound like it’s automatic. The guts of Rc use raw pointers, which aren’t Send, and therefore Rc is not Send. If you take something like Arc, the guts are also not Send, but Arc (unsafely) implements Send explicitly as a manual promise. Arc’s methods are manually coded so that the overall type behaves in an atomic/coherent way.

                          The whole point of Rc is that it doesn’t go to all that trouble, which has a cost as well, but it still allows an object to be referenced from multiple locations which are accessible to only one thread. (Including objects of types which themselves are not Send.) Yes you could modify Rc to implement Send. Congratulations, you implemented Arc.

                          Many primitive types such as i32 are also Send, and so are types derived from them. Not because the compiler inserts any special instructions or something, but because the mechanism by which it’s moved from one thread to another is assumed to be safe (e.g. channel, mutex, etc. - safe here usually meaning it uses a memory barrier of some kind), and there’s nothing about the type itself that needs special treatment.
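
                          As a minimal illustration of that manual promise, here is a hypothetical MyBox type: its raw-pointer guts make it !Send by default, and the one unsafe line is the explicit opt-in, the same mechanism Arc uses.

                          use std::ptr::NonNull;
                          use std::thread;

                          struct MyBox<T> {
                              ptr: NonNull<T>,
                          }

                          impl<T> MyBox<T> {
                              fn new(value: T) -> Self {
                                  // SAFETY: Box::into_raw never returns null.
                                  let ptr = unsafe { NonNull::new_unchecked(Box::into_raw(Box::new(value))) };
                                  Self { ptr }
                              }
                          }

                          impl<T> Drop for MyBox<T> {
                              fn drop(&mut self) {
                                  // SAFETY: ptr came from Box::into_raw and is freed once.
                                  unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
                              }
                          }

                          // The raw pointer makes MyBox !Send by default. This line is
                          // the manual, unchecked promise that moving it to another
                          // thread is fine, because MyBox owns its allocation exclusively.
                          unsafe impl<T: Send> Send for MyBox<T> {}

                          fn main() {
                              let b = MyBox::new(41);
                              // Compiles only because of the unsafe impl above.
                              thread::spawn(move || drop(b)).join().unwrap();
                          }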

                    2. 3

                      One is lucky to have distinct hats like these. In my operational modes, either I’m a three-headed beast wearing the three most diagonally distant hats, or I’m at least also wearing a different hoodie, different pants, and different socks. All on the same day!

                      1. 8

                        I wanted to use the best technologies available

                        […]

                        I built the frontend using React and TypeScript

                        I was furious for a couple of seconds.

                        1. 1

                          I feel you, but the React stack nowadays is so deep and diverse that it feels like it will soon be affecting browsers’ architecture. They literally solved all the hard problems in a rather elegant way. Compare that to my days with PHP3 and jQuery 🙂

                          1. 4

                            While my blog is in PHP, I really enjoy React, actually. Also, I very much like this component library: https://mantine.dev/

                            1. 2

                              I don’t think it’s elegant by any means, in practice.

                              1. 3

                                It basically made functional UIs mainstream, which greatly improved testability and correctness.

                                I do remember the millions of websites/small GUIs where you could easily end up in inconsistent states (like a checkbox not being in sync with another related state), and while UI bugs are by no means “over”, I personally experience fewer bugs of this kind.

                                (Unfortunately, API calls are still a frequent source of errors and those are often not handled properly by UIs)

                                1. 1

                                  Why not? Any points against? What would you use for complex web apps?

                                  1. 1

                                    I mean React itself, not your particular pick of options inside that stack.

                                    1. 1

                                      React itself is also cool.

                            2. 1

                              It actually means that performing the query and getting the thread scheduled again took 20 milliseconds

                              Normally, there would be a distributed trace showing that the client span took 20ms while the actual query took 1ms. If such a margin shows up a lot, some service-mesh-aware tools might raise an alarm and help the devs prioritize looking into the app server’s performance.

                              In some cases there might be network congestion involved (and let’s admit, you’ve heard “it’s the network!” a lot when it wasn’t), but that’s unlikely to explain a whopping 19ms at P50. This type of discrepancy would quickly get demystified on a highly loaded app, IMO.

                              Also, if the telemetry is mature enough the traces should also show some system health metrics for the machine that ran the client span where we would see that the CPU consumption was near 100%. That’s a good sign that we cannot trust the time on that machine much, thus the client span is not reliable either. The next step would be running a profiler on the busy node to find out what’s going on. Likely there will be something about spending a lot of time in some JSON module. So, again not much space for mystery.

                              If the overall CPU consumption is low then yeah we’re in trouble with the app’s performance. But if it fits the SLOs and does not lead to any incidents or an excessive cloud bill, then… let it be?

                              1. 2

                                JavaScript is stuck with the same API for almost 30 years

                                Given how much the JS language, the JS engines, and the overall JS tooling have progressed over the last decade, what do you think took Date so long to get fixed?

                                My guess is that we’re finally done migrating desktop apps into the cloud and can focus on making the Web a global platform once again. And what’s more global than properly localized time?

                                1. 2

                                  This looks real cool, thanks for sharing.

                                  One thing that confuses me in the string literal type magic is that we literally have to re-implement the underlying logic in another functional language that is not itself verified. Is it time for TypeScript for TypeScript? Or maybe a new imperative-TS to type-level-TS compiler?

                                  Jokes aside, I’ve recently tried to verify type-rich code produced by one generator with another generator that would sort of fuzz-test the first while itself being validated by the TS compiler. The rules for the unit-test generator quickly started looking like a language of their own :D Awesome times!

                                  1. 4

                                    There are two things that make me feel hopeful for the non-scripting programming languages world: Rust and WASM. Both share the common denominator of being secure by default; I guess we’ve longed for this motto long enough :)

                                    1. 2

                                      A pedantic, “pretends to have studied visual design” point of view: now “this” is even more visible, being the boldest characters on screen with the most saturated spot of pixels. Such a ligature might even trick the eye’s motion recognition and peripheral-vision pattern recognition, leading to slower reading times than with the old boring same-weight “this”.

                                      1. 1

                                        I misunderstood the title as “writing a tiny Linux”, meaning the author wrote a minimal Linux-compatible kernel. Which raises the question: at a basic level, is Linux in any way distinguishable from any other POSIX Unix with ELF binaries?

                                        1. 2

                                          Spectre all over again? Next time I add any kind of cache, I’ll be thinking about what kind of info is going to leak through it. What content is more popular? Where are the most active users located? How often do different parts of the app get deployed, when, and how many active teams might this identify?