1. 10

  2. 4

    It’s apparently time for me to try to really grok coroutines; I’ve learned how they work several times in my life, and every time I completely fail to figure out any actual use for them. A couple of years go by, I forget the details, and if I’m reminded of them again I try learning them again and go through the same pattern. So by now I usually shrug and say “guess it wasn’t that important”.

    But someone recently mentioned using them for implementing animations in video games, which is something that is otherwise a fiddly little swarm of irritating interlocking state machines. So I was inspired to take another stab and boy are they cool for animations. I was referred to this article on using them for I/O as well, which I am still trying to absorb but is pretty neat.

    The gist seems to be that you can have different functions but the same API for blocking vs non-blocking I/O. Their example uses database queries:

    function get_links(user_id)
      local users = query("select * from users where id = ? limit 1", user_id)
      local profiles = query("select * from profiles where user_id = ? limit 1", users[1].id)
      return query("select * from profile_links where profile_id = ?", profiles[1].id)
    end
    

    So in their unit tests, query() is blocking. When it’s called, the get_links() function stops what it’s doing and the OS puts the calling thread to sleep until the results are available. But in production, query() uses nonblocking I/O while still presenting a blocking API. When called, the get_links() function stops what it’s doing… and the event loop in the calling thread (and the database thread?) goes on about its business doing other stuff. When the database call has data to return, epoll() or whatever lets it know, it looks up which coroutine created that request, and it just calls coroutine.resume(calling_coroutine, the_returned_values). …I think? The mechanics still don’t quite jibe in my head; I’ll probably have to write my own version someday.
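    To convince myself those mechanics can work, here’s a toy sketch in Ruby (standing in for Lua, since Ruby’s fibers are coroutines too). Everything here (the EventLoop class, its methods, the fake instant completion) is invented for illustration; it is not the article’s actual implementation:

```ruby
# Toy event loop: query() looks blocking to the caller, but under the
# hood it suspends the calling fiber, and the loop resumes it later
# with the "I/O result" -- the coroutine.resume trick described above.
class EventLoop
  def initialize
    @pending = []  # [fiber, result] pairs whose I/O has "completed"
  end

  # Pretend to kick off nonblocking I/O: remember which fiber asked,
  # then suspend it. A real loop would register with epoll/kqueue here.
  def query(sql)
    @pending << [Fiber.current, "rows for: #{sql}"]  # fake instant completion
    Fiber.yield                                      # suspend the caller
  end

  def spawn(&block)
    Fiber.new(&block).resume  # run until the first query() suspends it
  end

  def run
    until @pending.empty?
      fiber, result = @pending.shift
      fiber.resume(result)  # resume delivers the result as yield's return value
    end
  end
end

ev = EventLoop.new
results = []
ev.spawn do
  rows = ev.query("select * from users")  # reads as blocking; isn't
  results << rows
end
ev.run
# results == ["rows for: select * from users"]
```

    The key move is that yield/resume pass values in both directions, so the calling code never sees a callback or a future.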

    1. 2

      I come from Ruby, and it’s weird to see coroutines as the hot thing now, because we had them in Ruby 1.8 and everyone complained about them: green threads. Essentially it’s a thread that’s scheduled by the program and not the OS.

      Ruby 1.9 introduced native thread support (though the GVL/GIL limits their parallelism to I/O), and at some point green threads were rebranded as fibers. A fiber is a coroutine: you can pause execution and switch between fibers at will from within the coroutine.
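      In Ruby terms, a minimal fiber looks like this (a toy example; note that values pass in both directions through yield/resume):

```ruby
# Fiber.yield pauses the fiber and hands a value out;
# the next #resume hands a value back in and continues.
fib = Fiber.new do |greeting|
  reply = Fiber.yield("#{greeting}, who's there?")  # pause here
  "#{reply} who?"                                   # fiber's final value
end

first  = fib.resume("Knock knock")  # runs until the first yield
second = fib.resume("Coroutine")    # resumes; runs to the end
# first  == "Knock knock, who's there?"
# second == "Coroutine who?"
```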

      The main benefit of fibers is that they’re switched in user space, within the same process, so they don’t incur the overhead of pausing a CPU executing a thread and storing its state while loading another thread and dealing with a cold CPU cache. Also, there are no hard system limits on the number you can have (whereas native threads do have hard system limits).

      NodeJS uses an event loop instead of threads for scheduling (and I believe it’s a similar execution structure that allows for this).

      In Ruby, fibers only run one at a time, so they aren’t hugely useful unless you also have a scheduler and an event loop. The Async gem provides an event loop implementation, and Ruby 3.0 introduced a bring-your-own-scheduler API (Fiber.set_scheduler) that allows fibers to be switched automatically on I/O (similar to how native threads already behave in Ruby).
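      A scheduler doesn’t have to be fancy, either. Here’s a toy round-robin one (nothing like the Async gem’s real API, just an illustration of cooperative switching):

```ruby
# Toy round-robin scheduler: each fiber runs until it yields,
# then the next fiber in the queue gets a turn.
def round_robin(*blocks)
  # Wrap each block so a finished fiber returns :done from its last resume.
  queue = blocks.map { |b| Fiber.new { b.call; :done } }
  until queue.empty?
    fiber = queue.shift
    queue << fiber unless fiber.resume == :done  # re-queue unless finished
  end
end

log = []
round_robin(
  -> { log << "a1"; Fiber.yield; log << "a2" },
  -> { log << "b1"; Fiber.yield; log << "b2" }
)
# log == ["a1", "b1", "a2", "b2"]: execution interleaves at each yield
```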

      The main downside is that, because the OS can’t preempt execution, if you mess up your program one fiber can block the event loop for a long time. This is a common perf pitfall to watch for in Node.

      That’s about everything I know about coroutines. From that limited perspective it doesn’t seem like coroutines by themselves do all that much; rather, you need some kind of reactor/scheduler. So it sounds like this lang, and the other ones where coroutines are hugely popular (Golang), come with one out of the box, whereas in languages like Rust or Ruby you get the flexibility to bring your own runtime scheduler but can’t always rely on having a reactor to schedule work into.

      So in the example you gave, query doesn’t just schedule the I/O; it schedules AND pauses the current coroutine’s execution (or maybe it waits until the developer tries to access users; either way, the transfer is automatic). When the I/O is done, it signals to the reactor, “I’m done, put me back in coach, I’ve got data for my parent coroutine.” Then, when the currently executing fiber/coroutine suspends, it can jump back in and the CPU stays saturated.

      Yes, each query must happen in order and wait for the previous one, but the main benefit is that the transfer of execution allows another coroutine/fiber to run while we wait. If not, the event loop is blocked and the CPU sits idle when it could be doing other stuff for us.

      Versus with async/await, if you mess up your callback wrapping, you can block the reactor. They’re touting that you can’t do that in this lang.

      I think it’s a little reductive to say this all came from “coroutines being a first-class feature”. Rather, for coroutines to be a first-class feature, they should:

      • come with a scheduler/reactor by default
      • auto switch execution on IO

      It’s still a little unclear if it switches execution on I/O or when you try to access the variable you got from I/O. Kind of an auto-future-unwrapping feature (which would be pretty neat TBH).

      Either way, there’s no special “future” object or reference to the asynchronous execution, whereas in other languages you have to call a method like future.value() or something to finalize execution and get the value (like thread.join), which buys us:

      • no special syntax needed for working with async values.

      If it just switches based on I/O and not on variable access, then the only difference between this and Ruby’s support is that it’s got a reactor out of the box.
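      For contrast, Ruby’s explicit-handle style looks like this: the caller holds a Thread object (playing the role of a future) and has to remember to finalize it:

```ruby
# Explicit handle: the async work is reified as an object the caller
# must join, unlike the transparent coroutine style above.
handle = Thread.new { 21 * 2 }  # kick off work in the background
answer = handle.value           # join and fetch the result, like future.value()
# answer == 42
```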

      1. 2

        I think the problem is the example given above (reading from a DB) is only a good demonstration of coroutines if you’re assuming you’re working in a system that doesn’t have preemptive multitasking. If you were working in (say) Ruby 1.8, where you have a choice between green threads and coroutines, green threads would be a better fit.

        But there are lots of other use cases where coroutines are a great fit and preemptive multitasking wouldn’t help at all. For instance, writing a conversation system for a game where you want the code to read in a linear way, but you are constantly yielding to the user for input, or running a REPL that yields when it wants to read input.
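        A sketch of that in Ruby, with a made-up two-line dialog (the point is the linear, top-to-bottom reading order):

```ruby
# The conversation is written as straight-line code; each Fiber.yield
# hands a line to the game loop and waits for the player's answer.
conversation = Fiber.new do
  name = Fiber.yield("Guard: Halt! Who goes there?")
  if name == "friend"
    Fiber.yield("Guard: Pass, friend.")
  else
    Fiber.yield("Guard: Then turn back, #{name}!")
  end
end

lines = []
lines << conversation.resume            # run to the first question
lines << conversation.resume("friend")  # player input resumes the script
# lines == ["Guard: Halt! Who goes there?", "Guard: Pass, friend."]
```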

        You can do neat things with them to handle I/O in certain contexts, but they’re not for I/O.

        1. 2

          Hey! We worked together for a time at Heroku.

          > where you have a choice between green threads and coroutines, green threads would be a better fit

          If I’m understanding correctly the difference between the two is the scheduler? As in green threads (as opposed to fibers) will preempt but coroutines will not, they must yield from within.

          Did I get that right? That seems like an important distinction; thanks for pointing that out.

          > writing a conversation system for a game

          I’ve not heard that example before, but I like it since in a conversation, even if there are multiple people you (should) wait your turn to speak and only one person speaks at a time.

          1. 2

            Hey, long time no see. =)

            > If I’m understanding correctly the difference between the two is the scheduler? As in green threads (as opposed to fibers) will preempt but coroutines will not, they must yield from within.

            Yep! You could do a diagram:

                           |   preemptive   | not preemptive
            ---------------+----------------+---------------
            concurrent     | native threads | programs on different computers I guess?
            not concurrent | green threads  | coroutines
            

            (In this context, “concurrent” is used in its original sense of “actually happening at the same time” rather than the recent redefinition that has become more commonly-used in certain programming communities.)

            > I’ve not heard that example before, but I like it since in a conversation, even if there are multiple people you (should) wait your turn to speak and only one person speaks at a time.

            Here’s some code from a game I made a few years ago that reads linearly but uses functions which yield and resume according to gameplay events: https://p.hagelb.org/adam.fnl.html It’s a little twisty because of inherent complexity in the branching nature of the conversation, but the code can focus on the inherent complexity of branching etc rather than the details of how to interact with the dialog system. The system itself is at http://p.hagelb.org/dialog.fnl.html