This is a pretty good high-level overview. As a long-time relational DB and backend dev recently mandated to use JavaScript and Mongo on a client project, here are some further notes:
MongoDB has no transactions across document collections. This means you have two choices about your data.
First, you can do the naive relational thing and try to normalize your data according to entities and relationships, as one does. But if you do that, then you will necessarily have to attempt to reimplement transactions at the application level. Recently I’ve had senior js/mongo people propose to me that we (a) do locking in a separate Redis instance; (b) do locking on a per-table basis manually, by keeping lock fields in the database and making every application write access to the database aware of, and 100% compliant with, the convention; (c) instead of modifying documents directly, add a document relationship table that keeps a Kafka-like list of pending writes to each table, and do the transactions offline in highly guarded code (essentially emulating an AP mode); and, somewhat shockingly, (d) not be upset if there’s data loss, because customer service will handle issues and complaints, and the chance of getting the database into a completely application-inconsistent state is “pretty low”.
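To make option (b) concrete, here is a minimal sketch of the lock-field convention, with the collection simulated in memory (the names `tryLock`, `unlock`, and the document shape are all invented for illustration; against real MongoDB the compare-and-set step would be a single `findOneAndUpdate` whose filter requires the lock field to be empty):

```javascript
// Option (b) sketch: a per-document lock field that every writer must honor
// purely by convention. The "collection" is an in-memory Map so the shape of
// the protocol is visible without a running database.
const docs = new Map([["cart-1", { _id: "cart-1", items: [], lockedBy: null }]]);

function tryLock(id, owner) {
  const doc = docs.get(id);
  if (!doc || doc.lockedBy !== null) return false; // someone else holds it
  doc.lockedBy = owner; // atomic here only because this sketch is single-threaded
  return true;
}

function unlock(id, owner) {
  const doc = docs.get(id);
  if (doc && doc.lockedBy === owner) doc.lockedBy = null;
}

// Every write path in the application must follow this dance -- nothing in
// the database enforces it, which is exactly the fragility described above.
const got = tryLock("cart-1", "worker-A");
const contended = tryLock("cart-1", "worker-B"); // false: lock already held
unlock("cart-1", "worker-A");
```

The weak point is visible even in the sketch: one write path that forgets `tryLock` silently breaks the scheme for everyone.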
Second, you can, instead of designing your data in a normalized way, create superdocuments that contain all entities and relationships that might be changed together in a single transaction. For example, the same senior js/mongo people proposed to me that we essentially design the database (supposing it were a shopping application) to include all users' shopping carts and the available inventory of all items in a single document. Because Mongo does have transactional semantics within a single document, this would work around the problem. However, it would mean that any read access to any of the contained entities would cause the entire structure to be read into memory and likely transmitted to each reader. In this scenario, I pointed out that the application servers would be constantly reading and writing a 41 MB Mongo document, and was told that Mongo was fast enough that nobody would notice. Perhaps this is true, but I harbor doubts.
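A sketch of what that superdocument looks like, again simulated in memory (the `_id`, field names, and `addToCart` helper are invented; in real MongoDB the mutation would be one `updateOne` on the single document, which is what makes it atomic):

```javascript
// Superdocument sketch: every cart and the entire inventory live in ONE
// document, so "decrement stock + add to cart" is a single-document write.
const superdoc = {
  _id: "shop",
  inventory: { widget: 10, gadget: 3 },
  carts: { alice: [], bob: [] },
};

function addToCart(doc, user, item) {
  // One logical transaction, one document touched -- the whole point of
  // the design. Plain in-memory mutation here, for illustration only.
  if ((doc.inventory[item] ?? 0) < 1) throw new Error("out of stock");
  doc.inventory[item] -= 1;
  doc.carts[user].push(item);
}

addToCart(superdoc, "alice", "widget");

// The catch: a reader interested in ONE cart still pays for the whole
// structure -- every cart plus the full inventory -- on every read.
const wholeDocBytes = JSON.stringify(superdoc).length;
```

The cost model falls straight out of the shape: writes contend on one document, and reads are priced at the size of the whole thing.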
Document-oriented data design requires more up-front analysis and thinking than a relational design. Because there are no joins and no transactions, you need to understand all of your queries much earlier in the process than you might with a relational model. While you might still have to refactor and/or add indices to a relational design when new query requirements are uncovered, the relative friction of rewriting your application when a document data model has to change is a little higher.
It’s apparently idiomatic in js/mongo to move a lot of the query heavy lifting – filtering, mapping, reducing, updating – into the application tier. There’s no real analogue to a multi-table update, so you don’t have the possibility of server-side ETL in the same way you do in SQL. This could be good or bad, depending on your workload. In my experience, however, databases tend to be I/O-bound rather than CPU-bound, and the benefits of data locality are very high, so the cost of two DB/network copies and a higher duty cycle on the application tier seems likely to be a net negative for the architecture.
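The idiom looks something like this (invented data; imagine the array arriving over the network from a `find()` that matched far more rows than the answer needs):

```javascript
// Application-tier aggregation, the js/mongo idiom described above. The
// SQL equivalent -- SELECT SUM(total) FROM orders WHERE status = 'paid' --
// would ship one number back; here every matching document crosses the wire
// first, and the CPU work lands on the application tier.
const orders = [
  { id: 1, status: "paid", total: 30 },
  { id: 2, status: "open", total: 10 },
  { id: 3, status: "paid", total: 5 },
];

const paidTotal = orders
  .filter((o) => o.status === "paid") // WHERE status = 'paid'
  .map((o) => o.total)                // SELECT total
  .reduce((sum, t) => sum + t, 0);    // SUM(total)
```

Elegant to read, but the data made two trips (disk to DB, DB to app) to produce one number.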
On the positive side, in your typical three-tier application involving a browser, you don’t have to write any ETL code at all if you have Mongo, so that’s good. But if you have a non-web client, such as a mobile device, and/or you want your data footprint to be light, then you’re a little bit screwed. One wonders what a protobuf-native database would be like.
It will probably come as no surprise that I recommend against Mongo or JavaScript as a production server-side solution for anything other than low-volume, web-only scenarios. But sometimes you go to war with the troops you have.