1. 96

  2. 39

    For anybody needing to flee the sinking ship migrate from Mongo to PG, especially if you need to run systems in parallel as you’re cutting over, having good tools is super helpful.

    Two that we’ve used to amazing success:

    • Variety – You point this at a Mongo collection and it’ll figure out all the keys and value types, and more importantly, how many documents have them. It’s super useful for figuring exactly how much shit you have to migrate and build tables for, and also for distinguishing one-off malformed/legacy docs from core things that have to be migrated. Invaluable.

    • MoreSQL – This tails the Mongo oplog and can write into tables and remap data, and if you have more advanced tricks you use “jump” tables with insert/update triggers alongside functions/stored procedures to do more interesting transforms than the tool itself supports.

    1. 10

      Foreign Data Wrappers could also help; you can use MongoDB as storage backend through PostgreSQL until you’re fully migrated. And many other NoSQL solutions as well.

      1. 1

        Sounds like the nosql hell is just as fully realized as I suspected, since these tools acutally exist. I fortunately dodged almost all of it during my career (so far…).

      2. 18

        I have made plenty of terrible technical decisions in my 25 years, but I am so, so, so happy to have completely disassociated myself from web stuff before all this node/Mongo business took off.

        1. 7

          I have a counter-point: I worked on some terrific MongoDB/Node apps. All those had one thing in common “easy developer setup” was never their reason to choose them. One of them was a high-performance system that made great use of capped collections and crash safety was not their concern: if the system crashed, all data would be “old” anyways and not needed anymore. Node doesn’t get as much credit as it deserves: It’s the PHP of the new age and both have undeniably moved the expected level of technology forward.

          The biggest problems of both systems is that they have become kind of a FOSS Oracle: no one ever gets fired for choosing them.

          1. 6

            “It’s the PHP of the new age” - apparently you mean that as praise?

            1. 16

              I actually do. PHP did bring web development with easy deployment to the masses.

              There’s a lot to not like about the language, but I also think its often undersold.

              1. 12

                There is a lot to like also. A lot of people jumped on the hate bandwagon just because it’s popular, but…

                • Closures were there all along. It’s basically Scheme (not really of course :D)
                • Nice integration of async/await into the whole thing (especially compared to the mess with event loops in Python 3)
                  • most APIs were always asynchronous (with callbacks), so there’s no “we have to reinvent the whole world in async” problem
                • Template literals are nice (examples: SQL on the server, HTML in the browser)
                • I like the syntax in general, it’s pretty clean — automatic semicolon insertion, powerful object literals, no dollar signs…
                • Multiple great runtimes with fast JIT compilers, quick start time, relatively small memory footprint and decent portability

                As for the downsides… well, there’s nothing nearly as offensive as PHP’s “errors are not exceptions” and “there’s an operator that silences errors but you can override that in global settings but you can override that in the current state of the interpreter…”. Just the regular weak dynamic typing stuff. I like strong static and very expressive type systems, but I also like taking a break from them sometimes :)

                1. 2

                  no dollar signs

                  Are you talking about PHP?

                  1. 3

                    JS of course, compared to PHP (and Perl in this case, though Perl has even more signs :D)

              2. 11

                While PHP looks like a big hairy mess now, you could see things rather differently in the 90s. Instead of comparing to other popular scripting languages now and how web development has evolved, back in the 90s, the competition would be C/C++, Perl, and Shell. Compared to how error-prone and detail-oriented C family languages are and how Perl and Shell then had even messier syntax and less organization, PHP looks like a savior.

                1. 2

                  And ColdFusion if one had money.

                  1. 3

                    ColdFusion, please no more… Once upon a dark day in hell I ended up moving a primeval web app from ColdFusion to Perl to get around the atrocious limitations ColdFusion posed (at least back then, no idea if it evolved later), e.g. no nested queries. Doing the same thing in Perl was a breeze compared to the mess that was ColdFusion. This made it all the more clear that the ’net was ‘owned’ by free software and that all those commercial wanna-be’s had no place there. This project ended up being the start of a rather big move from commercial to free software at a former state-monopolist Telco in Europe, at least for where it concerned internet operations.

                    1. 4

                      I wrote a Cold Fusion web app in 1999 that had loads of frames, because it was 1999 and frames were cool and the simplest way to get a layout with various panels updating independently. It was a shopping app and so it had a shopping cart, and it used a server-side session to manage the cart.

                      At certain points in the app, seemingly somehow connected but never actually pin-down-able, it would just crash - and I don’t mean the page or the app, I mean the Cold Fusion server would throw hairy black-and-white error pages with C++ stack traces warning about scary-sounding things that didn’t seem like things a Cold Fusion developer should be having to worry about, semaphores and mutexes and ritual sacrifice.

                      We had a Cold Fusion support contract and so I spent a while trying to get them to explain what was going on. It took a lot less time than I expected; they came pretty much straight out and said “oh, yah, that’s because the session implementation’s not thread-safe, the server uses threads to manage connections but there’s no real locking or anything in the session code. We did that on purpose to make it faster, it was really slow when we tried doing it. Just don’t use sessions in frames and it’ll be fine”.

                      It was a shame because at that time it seemed like there were some good things about CF; in some senses the dev experience was quite polished, and there was a lot that came out of the box. Also it was neat that it had all this stuff designed to make it easy & familiar for web devs just using tags, but you could still drop into CFSCRIPT and use the same objects in an ECMA style. But of course all that’s for nothing if you do stuff like the above.

                      1. 1

                        Glad yall were able to get off it haha. The software was a waste of a perfectly-good name, too. Maybe something using metaprogramming that fused many styles at once with zero-cost abstractions and optimizing compiler that kept your CPU from running hot for no reason. Missed opportunity.

                2. 2

                  I don’t doubt it! But I’ve never had to touch Javascript or Mongo or <insert web tech stack thing here> and have still been able to footgun myself over and over. I shudder to think what enormities I could have committed.

                  1. 1

                    I read this as satire and irony. Surely you’re not serious?

                    1. 3

                      I am.

                  2. 10

                    MongoDB turned out to be an expensive (thanks to needing to keep so much index in RAM) decision for one project I worked on, but I stand by the choice at the time. In 2011, MongoDB was the only product that met our needs, which were mainly automated unattended failover and high availability so I could run ops by myself (and by “by myself” I mean ops would run themselves so I could focus on the app backend) and still get some sleep at night.

                    The product is shut down now, and I haven’t used MongoDB for a big project since then, but those automatic elections and remastering saved us many times.

                  3. 9

                    It’s a very well written write up . I am however disappointed with no performance metrics - even seeing a comparison of storage used would be interesting.

                    1. 2

                      I assumed from the lack of writeup around this that the answer was probably along the lines of “no clearly interesting difference one way or the other”. The declared motivation for moving was operational smoothness rather than performance.

                      1. 2

                        no clearly interesting difference one way or the other

                        That is an interesting information in itself for anyone planning a database migration. Especially if you consider that they most likely (or did they?) didn’t yet optimize their application and tune PostgreSQL for their workloads.

                    2. 5

                      I was a huge Mongo supporter in my own projects before I started my current job. At my current job we use SQL. After getting more advanced at that, I’ve also made the switch from MongoDB to PostgreSQL.

                      I’ll be honest, I don’t know much about databasing. I don’t have a good reason for picking PostgreSQL specifically. But, after coming off node as my primary language, I realized there was a lot more integrations and support on other languages for SQL DB’s.

                      1. 3

                        Anyone has experience with JSON support in Postgres?

                        1. 8

                          It’s good. This was already several years ago (9.4? 9.5?), but while doing other things my team converted existing text fields containing json to jsonb for a moderately large number (millions) of rows, continued ingestion of new rows with these native columns, and queried against them. I recall we had to fiddle with indices but performance was sufficiently good that I don’t remember thinking about it again.

                          1. 3

                            I’ll echo the same thing about indices–you’ll need to get them setup for good querying. PG will let you do indices using almost normal syntax on JSONB columns, which works well.

                            One thing to watch out for is that the JSONB columns don’t support the more advanced/interesting types, including datetimes–so, you’ll have to account for that in your queries.

                            1. 1

                              I’ll echo this. On a number of occasions at my current job we’ve decided to store the JSON response of an API call (rather than only the relevant extracted bits) in the event that we need to use more of the return payload later, or double-check our extraction code, etc. and doing this in Postgres has been painless both from a coding and performance standpoint.

                              1. 2

                                store the JSON response of an API call (rather than only the relevant extracted bits) in the event that we need to use more of the return payload later, or double-check our extraction code, etc

                                That’s exactly what we were doing. There were columns we populated and then the unplanned, changing, weird, or one-off that needed other bits and bobs from the response.

                          2. 3

                            That prompted me to make another donation to The Guardian. Mostly because they publish good articles, of course, but that technical post was surprisingly nice :-)

                            Consider donating too, if you can!