1. 12

A couple years back, I was reading a blog post by Raganwald, where I read this quote:

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. “Google uses Bayesian filtering the way Microsoft uses the if statement,” he said. —Joel Spolsky, Microsoft Jet

That got me thinking very literally. What would it look like if we have probability statements to use natively like we have “if” statements? How would that change how we code? That would mean we could make decisions not just on the information we have on hand, but the prior information we saw before.

However, when implementing any machine learning code, you have to count and manipulate the samples yourself. I wrote Prolly so you can express probabilities that you want directly in code. So P(Color = blue) would be written as

Pv.rv(color: :blue).prob

Probabilities are generally easy, but can get harder as you write entropy and information gain. H(Color | Texture = rough) would be written as

Pv.rv(:color).given(texture: :rough).entropy

I implemented a decision tree with it, but will implement naive bayes and an HMM with it later.


  2. 3

    I think this is a cool idea. Do you have any experience with Prolog, which I believe has the capability to express probabilities.

    1. 1

      Not too much. I’ve only read code samples for intros to Prolog, but have never actually written any Prolog. Do you have any links?

    2. [Comment removed by author]

        1. 1


        2. 2

          This is awesome, you’re awesome, everything about or around this is awesome. I’m going to build so many neat things with this. Holy crap, this is awesome.

          I’ve been wanting, for many years, to rebuild something I built for a project in college that revolved around HMM’s for Music Generation, and this just made the barrier to entry drop significantly. I’m so excited now.

          1. 3

            I can’t tell if it’s sarcastic, but assuming it’s not–thanks. :)

            I haven’t built an example for HMM yet, but will do so.

            Be warned that I didn’t do any optimizations at all to make it fast, which I’ll do so later. Right now I’ve only implemented two stores: RubyList and Mongodb. RubyList is fast to insert but slow to query, since it uses Array’s count(). Mongodb is slower on insert, but faster on query. But once again, I didn’t optimize it at all with indicies, so it’s just using a BasicCursor to count, so its query can be much faster.

            In addition, right now, I’m just storing the entire joint distribution. In the future, I can split it up into different probability tables to balance between storage and speed of query.

            I’ll implement stores for mysql and postgres in a bit. I’d like to have a store for redis, but I haven’t yet figure out how to store the data and query it for a key value store.

            1. 1

              Not sarcastic at all, seriously, this is awesome. I’m really happy to see this as it removes some of the difficulty, it’s much easier for me to just manage it all with prolly than trying to map it myself, and the persistence you’re talking about is really awesome.

              If work weren’t so mindsearingly busy right now, I’d probably dig in and try to help out w/ that Redis store, I’ve used redis quite a bit (it’s pretty easy to use). The redis bindings for ruby are pretty straightforward, tbh I wouldn’t be surprised if you have an easier time getting it to work w/ redis than w/ a RDBMS. There is, as I recall, a “SET” structure in redis w/ a fast cardinality call. I’ve used it before to build a straightforward counting-set impl. There are also some interesting probabilistic datastructures which might be easy to build using what you already have + a memory store backend, that could easy building the others – I recall starring a Rust repo w/ impls recently, I’ll see if I can find it.

              1. 1

                The rust repo I was talking about: https://github.com/codahale/sketchy

                1. 1

                  Awesome, I’ll take a look at it. I did look at hyperloglog in redis, but it counts the union of sets, not the intersection of sets. I had heard of bloom filters as well, but I haven’t seen the other ones before, so I’ll defn take a look.

          2. 1

            I got this comment from one of my friends, that I thought would be interesting as a link:

            One formalism you might want to check out is weighted finite state transducers. FSTs can be used to encode decision trees, hidden markov models, n-grams, and more, and you can build up complex models from simple parts with the same finite state operators you learned in undergrad, like union, intersection, composition, determinization, and minimization.

            They’re widely used in speech recognition and machine translation. Google’s implementation is open source: http://www.openfst.org/twiki/bin/view/FST/WebHome

            And this is one of the canonical papers on how they’re used: http://repository.upenn.edu/cgi/viewcontent.cgi...