1. 39
  1.  

    1. 8

      I dunno, my intuition was “this is a trick question, can it just be 1?” and then worked up from there.

      1. 1

        Yeah, my guess was 1 too. I’m imagining a case where the both functions are running on two threads and the temp = n +1 and n = temp statements run roughly at the same time, setting n to 1 in both threads. I’m actually surprised the minimum is 2!

        1. 3

          I think the accesses to n have similar semantics as relaxed atomic reads and stores in the C++ memory model, so the accesses have a globally consistent order.

      2. 5

        Got nerd sniped.

        It can be reproduced in Python, by writing a kind of “scheduler” forcing the two coroutines to schedule according to the pattern given by the op, (and yes it gives 2 at the end). It also make it easy to try (or brute force) other patterns:

        n = 0
        
        class Cooperate:
            """Used to mark explicit "scheduling" points inside P and Q."""
            def __await__(self):
                yield
        
        
        async def P():
            global n
            for _ in range(10):
                await Cooperate()
                temp = n + 1
                print(f"P sets {temp=}")
                await Cooperate()
                print(f"P sets {n=}")
                n = temp
        
        
        async def Q():
            global n
            for _ in range(10):
                await Cooperate()
                temp = n + 1
                print(f"Q sets {temp=}")
                await Cooperate()
                print(f"Q sets {n=}")
                n = temp
        
        
        def exhaust(coro):
            """Just run the given coroutine until the end."""
            while True:
                try:
                    coro.send(None)
                except StopIteration:
                    break
        
        
        def run(p, q, pattern):
            for procedure in pattern:
                if procedure == 'p':
                    p.send(None)
                if procedure == 'q':
                    q.send(None)
            # Then run both coroutines to their end.
            exhaust(p)
            exhaust(q)
        
        run(P(), Q(), "p" * 2 + "q" * 19 + "pq")
        print(f"At the end, {n=}")
        

        Running run(P(), Q(), "") gives n=20 because P runs to the end, then Q runs to the end.

        1. 4

          It’s not safe for multiple goroutines to mutate shared state without synchronization, even if GOMAXPROCS is set to 1. In the example Go program, neither reads nor writes against n are guaranteed to be atomic operations, and so can be interleaved across goroutines by the scheduler in a way that can produce data races, and invalid state.

          1. 3

            My initial expectation was that the minimum was INT_MIN because a program that contains a data race like this in C is undefined behaviour and the compiler is allowed to replace it with anything. More practically, the compiler is likely to replace the loop with a non-atomic increment of 10, so I’d expect a real implementation to end with 10 or 20 as the two plausible values if compiled with optimisation.

            Then I saw that I was wrong because the test program was Go, where data races are undefined behaviour and the compiler is free to replace them with anything, but the name of the INT_MIN constant is different.

            1. 4

              Yeah, the author clearly intended the memory model for this question to be sequential interleaving only. That should have been stated up-front I think. And the Go program does not provide these semantics, as @peterbourgon remarked in a different comment.

              If the memory says these are data races, it’s of course UB, which is boring. But I wonder what the answer would be if they’re all relaxed atomic loads and stores? Can you get n = 1?

            2. 2

              Very silly CSS for the images. They scale to a small % of the width of the viewport regardless of zoom level, so the code snippets are hard to read with my browser window sized to fill half the screen. (If I make the window even narrower, eventually it switches into mobile layout and makes the image 100% wide.)

              1. 2

                Anyway, I found it not only counterintuitive but fascinating, even though we may not encounter it in practice.

                Actually, I think you could encounter this in practice, at least on weak enough hardware. Something as bad as the two threads being that hilariously out of step with each other? Maybe not, but I feel like a 9 or an 8 is realistically possible.

                1. 1

                  I wonder if you could get this behavior from some scheduler which has prioritization, and one or more additional tasks with different priority and varying time to yield. But my brain is not in a state to do more than wonder this week.

                  1. 1

                    It’s always nice to see formal methods in use, even if I agree with hcs that I didn’t find the answer all that surprising.

                    I did find the pseudocode somewhat unclear here; real languages tend to have much more semantics. (Is this C-like undefined behavior, Python’s ad-hoc concurrency guarantees, are our caches even coherent, …)

                    1. 6

                      I think what makes this interesting is that even if we assume the best possible case of no weaker language semantics, implicit data races, etc and that every single line is an atomic instruction, we still get this extreme outlier race condition.