1. 35
  1.  

  2. 5

    This is quite good. It compares favorably to the Joy of Clojure’s coverage of similar material. I also didn’t know that Clojure had a futures library built in. Coming from Scala, it makes me a little nervous that Clojure is happy to just let you make a new thread for every future, although presumably the pool it uses is configurable. The sections on promises + futures might be clearer to readers who haven’t learned promises or futures before if you point out how futures are sort of like special cases of promises.

    1. 7

      Future means something a little different in Clojure than in most Java libraries–e.g. Netty Futures are more akin to Clojure promises plus (kind of) core.async channels. For queued (as opposed to directly threaded) execution in Clojure, one typically reaches for agents, which have both an unbounded IO pool and a bounded CPU pool. Or just fire up a ThreadPoolExecutor or similar.

      One of the reasons futures are unbounded, by the way, is that bounded threadpools running tasks with arbitrary dependencies on other tasks can lead to deadlocks. Clojure opts for safety in that respect. In practice the tradeoff seems to work well.
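
      For the executor route, here’s a minimal sketch (the pool size of 4 is arbitrary): Clojure fns implement Callable, so you can hand them to any j.u.c. ExecutorService directly.

      (import '[java.util.concurrent Executors ExecutorService Callable])

      ;; A fixed pool of four worker threads; tasks queue up when all four are busy.
      (def pool (Executors/newFixedThreadPool 4))

      ;; Clojure fns are Callables; the hints just pick the Callable overload of
      ;; .submit without reflection.
      (def result (.submit ^ExecutorService pool ^Callable (fn [] (+ 1 2))))

      (.get result)     ; => 3, blocks until the task completes
      (.shutdown pool)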

      1. 4

        Since I’ve never used Clojure futures before, this may be completely wrong. Please let me know if I’m 100% off base.

        Most of my work with futures is with Scala futures or Twitter futures. Promises are pretty similar to Clojure promises in both cases. Here is an example of how you might use a promise with Twitter futures.

        import com.twitter.util.{Future, Promise}

        // Run f on a fresh thread; the returned Future is satisfied when f finishes.
        def apply[A](f: => A): Future[A] = {
          val p = Promise[A]()
          val t = new Thread(new Runnable() {
            def run() {
              // Deliver f's result (or its exception) to the promise.
              try p.setValue(f)
              catch { case e: Throwable => p.setException(e) }
            }
          })
          t.start()
          p
        }
        

        However, Futures just represent asynchronous values–fundamentally, they are a poor man’s dataflow variables. Promises are special cases of Futures in Scala land.

        In Clojure, it seems a little different. Futures spin up a new thread and aren’t very flexible. They could be implemented with promises in the background, but they aren’t exactly promises (they can’t be fulfilled except by their own thread), and promises aren’t exactly futures (they don’t spin up a new thread, and anyone can satisfy them). I guess my statement that “futures are sort of like special cases of promises” is strictly wrong, in that futures aren’t promises. However, it’s easy to think of futures as promises where only someone else can control delivery.
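
        For concreteness, here’s a small Clojure sketch of that distinction as I understand it (the values are arbitrary):

        ;; A promise is an empty slot that anyone can fill exactly once.
        (def p (promise))
        (deliver p 42)   ; any thread may satisfy it
        @p               ; => 42, deref blocks until a value is delivered

        ;; A future spawns its own thread to fill its slot; callers can only read.
        (def f (future (+ 1 2)))
        @f               ; => 3, blocks until the body has run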

        With respect to threadpools, I think it’s a case of picking your poison. If you have something CPU-intensive to do and you stick it in a future, you’re going to lose something to context-switching unless you explicitly limit your maximum concurrent tasks to roughly the number of logical cores on your machine. With bounded thread pools, you’ll simply increase your latency, which is typically a pretty nice form of backpressure. With unbounded pools it’s also easy to imagine thread leaks, or cases where you keep spawning work and end up incidentally adding latency just by having way too many threads for your scheduler to deal with.

        The deadlock case is definitely valid, and I have personally found deadlock bugs very difficult to track down (maybe I’m just dumb, though), so in general I would say that bending over backward to avoid deadlocks is probably worth it. Maybe it would make more sense to have a flexible bounded threadpool which can grow (and log many error messages) in case of deadlocks.
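
        As a sketch of the latency-as-backpressure idea on the JVM (the names and sizes here are arbitrary): a fixed pool with a bounded queue and CallerRunsPolicy makes producers slow down once the queue fills, instead of leaking threads or queueing forever.

        (import '[java.util.concurrent ThreadPoolExecutor ThreadPoolExecutor$CallerRunsPolicy
                                       ArrayBlockingQueue TimeUnit])

        ;; Four workers and at most 128 queued tasks. When the queue is full, the
        ;; submitting thread runs the task itself, which throttles producers.
        (def bounded-pool
          (ThreadPoolExecutor. 4 4 0 TimeUnit/MILLISECONDS
                               (ArrayBlockingQueue. 128)
                               (ThreadPoolExecutor$CallerRunsPolicy.)))

        (.execute bounded-pool (fn [] (println "hello from a bounded pool")))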

        Edit: I work mostly in soft real-time distributed systems, so I phrased everything in terms of requests and responses. Most of my points can be rephrased in terms of non-distributed systems, like processing a lot of data you’re scraping from somewhere, or similar.

        1. 5

          You’re correct; spawning unlimited threads can be problematic, which is why Clojure programs tend not to spawn unlimited futures. Just think of it like this: in Clojure, “future” means “thread plus promise”.
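
          A rough sketch of that equivalence (poor-mans-future is just a made-up name; real Clojure futures run on a shared unbounded pool and also implement the j.u.c. Future interface, but the shape is the same):

          ;; "Thread plus promise": start a thread that delivers its result into
          ;; a promise, and hand the promise back to the caller.
          (defn poor-mans-future [f]
            (let [p (promise)]
              (.start (Thread. #(deliver p (f))))
              p))

          @(poor-mans-future (fn [] (+ 1 2))) ; => 3, just like @(future (+ 1 2))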

          Twitter and Netty futures, on the other hand, are, as you’ve keenly observed, dataflow variables for chaining together asynchronous side effects. In Clojure, there are a few different paths to async dataflow.

          1. Agents provide state plus queued functions to transform that state, executed by either a bounded (CPU) or unbounded (IO) threadpool. (send agent fun arg1 arg2 …) calls (fun current-agent-value arg1 arg2 …) to obtain a new value for the agent. We use (send) for the CPU-bound pool, and (send-off) for the IO pool. This provides asynchronous, sequential state transformation–like actors, but purely reactive; no autonomous behavior. Agents are integrated with the ref transaction system (sends issued inside a transaction are held until it commits), and you can attach watch functions to agents which are called whenever an agent’s value changes, just like you can attach callbacks to Java Futures. You can actually register watches and validators on vars, atoms, and refs too, by the way. (There’s a short agent sketch after this list.)

          2. Lamina, one of Zach Tellman’s libraries, provides dynamically reconfigurable async channel topologies for dataflow/asynchronous programming. You can attach arbitrary callbacks to Lamina channels as well as apply higher-order channel combinators. Lamina channels also integrate error propagation.

          3. Core.async also provides dataflow/asynchronous channels, but they are simpler–they don’t offer callbacks or error propagation. Instead, core.async follows more of a goroutine approach: channels are synchronization points between producers and consumers, supporting either bounded or unbounded depth, with or without blocking reads, and with or without backpressure. Core.async uses code transforms to rewrite sequential-looking functional code into asynchronous finite state machines, executed on a bounded threadpool, neatly solving both the value-synchronization and thread-starvation deadlock problems. (Also sketched below.)
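
          To make 1 and 3 concrete, two tiny sketches (values and names are arbitrary, nothing book-specific). Agents first:

          ;; An agent holds a value; send/send-off queue functions to transform it.
          (def counter (agent 0))
          (send counter + 5)        ; (+ 0 5) on the bounded CPU pool
          (send-off counter + 10)   ; (+ 5 10) on the unbounded IO pool
          (await counter)           ; block until the queued actions have run
          @counter                  ; => 15

          ;; Watches are the observer functions: called on every change.
          (add-watch counter :log
                     (fn [k ref old new] (println "counter:" old "->" new)))

          And a minimal core.async example:

          (require '[clojure.core.async :refer [chan go >! <!!]])

          (def c (chan 10))       ; channel with a fixed buffer of depth 10
          (go (>! c (* 6 7)))     ; the go block parks on >! instead of blocking a thread
          (<!! c)                 ; => 42, blocking take from ordinary (non-go) code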

          There are other options too, depending on what kind of concurrency pattern you need. Kilim and Jetlang are readily accessible, as is all the j.u.c. tooling. I tend to mix-and-match depending on the problem. All this stuff will be introduced later in the book; it’s just a bit much to throw at new programmers right out of the gate. ;-)