1. 20

  2. 25

    Somebody trying to do joins in MongoDB should just be using a relational database. If there are enough well-defined fields that it’s meaningful to do joins, then the data clearly isn’t schema-less, and it’s probably wasting a bunch of space and losing a lot of performance using Mongo.

    A PostgreSQL table with a JSON field or two is almost certainly faster, cleaner, and better tested than trying to do it in Mongo.

    1. 40

      To be fair, someone trying to use MongoDB at all should probably just be using a relational database.

      1. 1

        Are there relational databases that you can easily configure to automatically scale up/down by adding/removing shards?

        1. 15

          There’s nothing easily configurable or automatically scalable about MongoDB.

          1. 12

            I think that a lot of mental energy is spent on worrying about scaling the database when huge, huge wins in application code are much easier to address. There are very few workloads that a large machine running Postgres is not sufficient to handle, and very few applications that cannot afford the minutes of downtime that switching to a warm replica costs.

            1. 6

              Yes. I admin many postgres databases larger than 10TB. There are definitely use cases that postgres isn’t great at, and the single-threaded querying is a significant drawback, but I shudder to think how much hardware would need to be thrown at a similarly-sized Mongo install. Not to mention the data consistency issues, and admin time and effort.

            2. 1

              When you reach the limits of vertical scalability, you can use something like Vitess (the sharding system for MySQL designed and used by YouTube), but it’s not without some drawbacks.

          2. 1

            Maybe it was okay to build without joins at first, but as an application scales, requirements evolve, and needs grow, it would be easier to just stick with Mongo if that’s what you were using first, instead of rebuilding with SQL.

            1. 12

              I find that often in tech engineering “easier” quickly becomes the same as “delaying pain.” Technical debt is an interesting term used here, but I think delayed pain is more relatable and accurate. At best one can shrug it off as “that’s for the next guy to deal with” and it’s true, but that’s just an admission of wrongdoing that we are pretending isn’t a confession.

              tl;dr just bite the bullet and rebuild it with sql if the time comes

              1. 2

                Well, instead of “easier”, maybe better words would be “faster” or “less costly”. Depending on how entrenched the application was in Mongo, you’d never know how much you were breaking by switching out the database, or how much the application would have to change to encompass the refactor, if I may better clarify what I was trying to say.

                1. 2

                  Props for clarifying. I still think it’s a bad business move in any long-term view. I understand your perspective in that it is sometimes very hard to convert such an integral tech point. I also get that management is often loath to spend money on the effort (this is something my current day-job is struggling with, and we’re running on an old Zope2 stack). These trends are still, in my managerial experience anyway, short-sighted and ultimately incorrect. Our collective confirmation bias about how these things normally go down probably isn’t helping.

                  In the end this may be a difference of opinion for a while. I think what little research we have done in this area (beyond tech blog anecdotes and my OWN anecdotes) will hand-wavingly support my position, but I think it’s still pretty weak and I’d like more data before I can say you are actually wrong rather than I think you are wrong. I’m willing to wait for that data and differ with you as a colleague before that happens. If we HAVE had more data and I just missed it, I’m up for that as well.

          3. 6

            if, as the article suggests, this could push people towards slamdata as a solution, it would not be to mongodb’s benefit. one of the things on slamdata’s feature list is

            “Avoid lock-in to the MongoDB query APIs by staying with industry-standard SQL.”

            and on their front page they list a bunch of other backends in beta access. so if this is mongo’s attempt to lock people into their enterprise edition, this is one clear mode in which it might well backfire.
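
For readers who want to see what the “relational table with a JSON field or two” suggestion from earlier in the thread looks like in practice, here is a minimal sketch. It is not anyone’s production schema: the `users` and `orders` tables, the `attrs` column, and the `orders_with_users` helper are hypothetical names made up for illustration, and SQLite (from the Python standard library) stands in for PostgreSQL so the example runs without a server. In PostgreSQL the `attrs` column would more likely be a `JSONB` column than plain text.

```python
import json
import sqlite3


def orders_with_users():
    """Join a relational table against one that carries a free-form JSON column."""
    # SQLite is only a stand-in for PostgreSQL here (assumption for runnability).
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            attrs   TEXT NOT NULL  -- schema-less part, stored as a JSON document
        );
        """
    )
    conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
    conn.execute(
        "INSERT INTO orders (id, user_id, attrs) VALUES (?, ?, ?)",
        (10, 1, json.dumps({"gift_wrap": True})),
    )

    # The well-defined fields are joined in SQL; the JSON stays opaque to the join.
    rows = conn.execute(
        """
        SELECT u.name, o.attrs
        FROM orders AS o
        JOIN users AS u ON u.id = o.user_id
        ORDER BY o.id
        """
    ).fetchall()
    conn.close()

    # Decode the JSON column on the application side.
    return [(name, json.loads(attrs)) for name, attrs in rows]


if __name__ == "__main__":
    print(orders_with_users())
```

The point of the sketch is the split: fields that are stable enough to join on live in ordinary columns, and only the truly variable part goes into the JSON field.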