1. 14
  1.  

  2. 11

    My hard-earned response to this is: just don’t do it. There is a minefield of gotchas under Mnesia, and they will maim you and your production system. Mnesia was built for configuration management, not for OLTP. You wouldn’t suggest using Apache ZooKeeper as a production database, so why suggest Mnesia?

    1. 4

      I’d be interested in some examples of gotchas, documentation references, and the like, if you have specific references handy. I know of a few basic ones, like the differences between disc_copies, ram_copies, and disc_only_copies tables, but I haven’t used it enough to encounter other gotchas. Most of what I know comes from reading books or documentation.

      1. 4
        1. Two-phase commit.

        2. “Since Mnesia detects deadlocks, a transaction can be restarted any number of times. This function will attempt a restart as specified in Retries. Retries must be an integer greater than 0 or the atom infinity. Default is infinity.”

        This is from: http://www1.erlang.org/documentation/doc-5.1/lib/mnesia-4.0/doc/html/mnesia.html
        In practice this means that a big transaction can be preempted in perpetuity by an onslaught of smaller transactions on (a subset of) the same data.

        3. When you get “Mnesia is overloaded” warnings in production. At 4 am.

        4. Bad performance on sync transactions -> you move to async -> then move to async_dirty. At that point you could simply have been optimistically !-ing to other nodes’ ets-owning processes, without the headaches of Mnesia cluster setup.

        Most old-time Erlangers have good Mnesia stories; talk to them and be amazed :)
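
        To make the restart point concrete: one way to keep a large transaction from being restarted forever is to pass a finite Retries value instead of relying on the default of infinity. The following is only a minimal sketch. The module name, the `counter` table, and its `{counter, Key, Value}` record shape are hypothetical, and the table is assumed to have been created elsewhere.

        ```erlang
        -module(bounded_txn).
        -export([update_counter/2]).

        %% Assumption: cap on how many times Mnesia may restart the transaction.
        -define(MAX_RETRIES, 5).

        %% Hypothetical table `counter` holding {counter, Key, Value} records.
        update_counter(Key, Delta) ->
            Fun = fun() ->
                      %% Take a write lock while reading, since we update the row next.
                      case mnesia:read(counter, Key, write) of
                          [{counter, Key, Value}] ->
                              mnesia:write({counter, Key, Value + Delta});
                          [] ->
                              mnesia:write({counter, Key, Delta})
                      end
                  end,
            %% transaction/2 with an integer second argument bounds the restarts;
            %% the default (infinity) lets the transaction be restarted indefinitely.
            case mnesia:transaction(Fun, ?MAX_RETRIES) of
                {atomic, ok} -> ok;
                {aborted, Reason} -> {error, Reason}
            end.
        ```

        With a bound in place, the caller gets an `{error, Reason}` back when the transaction cannot complete, and can decide whether to back off, split the work into smaller transactions, or give up. It does not remove the starvation itself.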

      2. 2

        This is a fascinating assertion. Thanks for chiming in, I’ll read up a bit more.

        1. 2

          This discussion is timely for the thing I’m currently building. I had already come to the conclusion that Mnesia had too many gotchas for me to handle, but I’m still hesitating between using lbm_kv and going the riak_core+partisan route. Both options seem to be built on top of Mnesia; IIRC Riak uses a patched version of Mnesia.

          I have thousands of long-running processes, each updating its state every few seconds. This will soon take too much memory (because of per-process heap size), and later on it will have to be distributed anyway. My idea is to store state as native Erlang terms (to preserve read/write performance and stay responsive enough), and to have far fewer processes than one per “state”: a pool of workers would update the store, then be released back to the pool to move on to the next task. I also think this will make it easier to move to a distributed setup later on.

          Do you have thoughts on this?
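
          For illustration, the design described above might look roughly like the sketch below on a single node. Everything here is an assumption rather than an existing library: the module name, the `state_store` table name, and the choice to route each key to a fixed worker by hash so that updates to one key stay ordered.

          ```erlang
          -module(state_pool).
          -export([start/1, update/3, lookup/1]).

          %% Hypothetical public ETS table holding {Key, State} as native terms.
          -define(TABLE, state_store).

          %% Creates the table and a fixed pool of workers. The calling process
          %% owns the table, so the table disappears if that process exits.
          start(PoolSize) ->
              ?TABLE = ets:new(?TABLE, [named_table, public, set,
                                        {write_concurrency, true}]),
              Workers = [spawn_link(fun worker_loop/0) || _ <- lists:seq(1, PoolSize)],
              list_to_tuple(Workers).

          %% Routes by key hash so all updates for one key go through one worker.
          update(Pool, Key, Fun) ->
              Index = erlang:phash2(Key, tuple_size(Pool)) + 1,
              element(Index, Pool) ! {update, Key, Fun},
              ok.

          %% Reads go straight to ETS without involving a worker.
          lookup(Key) ->
              case ets:lookup(?TABLE, Key) of
                  [{Key, State}] -> {ok, State};
                  [] -> not_found
              end.

          worker_loop() ->
              receive
                  {update, Key, Fun} ->
                      Old = case ets:lookup(?TABLE, Key) of
                                [{Key, State}] -> State;
                                [] -> undefined
                            end,
                      %% Fun receives the previous state (or undefined) and
                      %% returns the new one.
                      true = ets:insert(?TABLE, {Key, Fun(Old)}),
                      worker_loop()
              end.
          ```

          Usage would be along the lines of `Pool = state_pool:start(8)`, then `state_pool:update(Pool, device_42, fun(_Old) -> #{temp => 21} end)`. Updates are asynchronous, so a `lookup/1` issued immediately afterwards may still see the old value. Distribution, supervision, and persistence are all left out of this sketch.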

          1. 2

            Riak did not use Mnesia. Neither does partisan’s version of riak_core, IIRC.

            1. 1

              I think the WhatsApp scaling video has some details on this. Not sure if it is 100% relevant to you, but it is worth a try.

              https://www.youtube.com/watch?v=FJQyv26tFZ8

              1. 2

                I rewatched it last week and it’s not that relevant IMO, but thanks anyway :)

                The thing is, my current challenge is going from a single node to distributed, whereas their challenge was to overcome the practical limits on the maximum number of nodes in a cluster (fully connected mesh), so basically going from a ~1000-node cluster to a >10k-node cluster. What I’m working on is too specific to ever come close to 1000 nodes, but unfortunately it is still a bit too big to stay on a single node (or, to be more precise: we could probably scale up, but one machine with loads of RAM is more expensive than a few smaller machines).

                Still a great talk for people interested in pushing things to the extreme!

                1. 2

                  Yes, sorry I could not be more useful. I think riak_pg might still be useful to you when going from 1 node to N. Or maybe you are not trying to go this route either. Anyway, if you write a blog post about your experience solving this problem, I would be happy to read it. I need to jump through this hoop soon with my pet project, so it is relevant to me.

                  https://github.com/cmeiklejohn/riak_pg

          2. 3

            Love Mnesia. The story behind its name is droll: upper management at Altel/Ericsson said, ‘there’s no way in heck we are calling this DB Amnesia!’ So the core Erlang team renamed it to what it is today.

            Small niggle, about this statement:

            “Another important thing to note is that Mnesia stores all of the Erlang data types natively…”

            I believe, under the covers, DETS (or when Mnesia is used to write to disk) converts atoms to strings, which has a performance penalty. Some info on that here: https://www.slideshare.net/ErlangSolutionsLtd/erlang-meetup-19-september-2017

            1. 1

              I think Mnesia is one of the killer apps of the Erlang ecosystem; I wish I had something like it in all my other languages. I haven’t learned Elixir yet, but this has increased my interest.

              1. 3

                I kinda wish Elixir would wrap Erlang stuff more. I know you can use it, but it feels different and just slightly unidiomatic enough (often Erlang stuff doesn’t work nicely with |> because of the position of the arguments) that it doesn’t feel great to use.