I guess this still has the problem where another thread concurrently updates the document in question. Consider the following sequence, where thread 1 uses Fawn and thread 2 uses anything else:
In this case, we’ve lost the change made by T2. Transactions generally protect against this by locking the document (or row) in question until T1 has completed a rollback or a commit.
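For concreteness, the lost-update race can be sketched with an in-memory stand-in for the database (purely illustrative; none of this is Fawn's actual API):

```javascript
// T1 snapshots the document, T2 updates it directly, then T1's
// rollback writes back its stale copy, silently discarding T2's change.

// Stand-in for a document store: read returns a copy, write replaces.
const db = new Map();
const read = (id) => ({ ...db.get(id) });
const write = (id, doc) => db.set(id, doc);

write("acct", { balance: 100 });

// T1 (two-phase commit): snapshots the document up front.
const t1Snapshot = read("acct");

// T2 (plain update): increments the balance while T1 is mid-transaction.
const t2Doc = read("acct");
t2Doc.balance += 50;
write("acct", t2Doc); // balance is now 150

// T1 rolls back by restoring its snapshot -- T2's +50 is lost.
write("acct", t1Snapshot);

console.log(read("acct").balance); // 100, not 150: the lost update
```

With document-level locking, T2's write would have blocked until T1 finished its rollback or commit.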
You’re right. It’s also mentioned in the MongoDB docs:
Because only single-document operations are atomic with MongoDB, two-phase commits can only offer transaction-like semantics. It is possible for applications to return intermediate data at intermediate points during the two-phase commit or rollback.
So it really depends on the use case. Most of the time in an application, you only have one update point per collection. If the app has multiple update points for a single collection, using two-phase commits does become sketchy.
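The "intermediate data" the docs warn about can be made concrete. A sketch of the two-phase transfer pattern, with in-memory stand-ins for the account documents (field names follow the docs' transfer example, but the code is illustrative):

```javascript
// Two-phase transfer of 30 from account A to account B. Because each
// document update is atomic but the sequence is not, a concurrent reader
// can observe the debit before the credit has landed.

const accounts = {
  A: { balance: 100, pendingTransactions: [] },
  B: { balance: 0,   pendingTransactions: [] },
};
const txn = { _id: "t1", source: "A", destination: "B", value: 30, state: "initial" };

// Step 1: mark the transaction pending.
txn.state = "pending";

// Step 2: apply the debit to the source account.
accounts.A.balance -= txn.value;
accounts.A.pendingTransactions.push(txn._id);

// A reader at this exact point sees the money gone from A but not yet
// arrived at B -- the intermediate state the docs warn about.
const observed = accounts.A.balance + accounts.B.balance; // 70, not 100

// Step 3: apply the credit to the destination, then mark applied.
accounts.B.balance += txn.value;
accounts.B.pendingTransactions.push(txn._id);
txn.state = "applied";
```

Readers that must not see in-flight transfers have to filter on `pendingTransactions` themselves, which is exactly the extra discipline a real transactional database makes unnecessary.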
I’ve seen this happen in a variety of ways: most notably, a bug in a client causes it to send the same request multiple times in quick succession, the client has retry logic, or a user hits a button twice…
Why not just use one of the other databases that were designed with transactions from the start, instead of bolting them on later?
My concern is that as we add these sorts of features to Mongo, we keep pushing it outside of what it’s good at. A similar thing I’ve seen, for example, is teams almost immediately adding some kind of schema mechanism on top of Mongo, even though being schemaless is one of its nominal strong points.
If you need transactions, use a real database.
Edit: Must. Be. Positive.
My concern is that as we add these sorts of features to Mongo, we keep pushing it outside of what it’s good at.
You know, in years of experience cleaning up the use of Mongo at various employers, I’m still not clear on what that is, other than marketing to people who don’t know better.
To a first approximation, none of the shops who chose to use it seem to have actually wanted a non-transactional document store with data so heterogeneous and unreliable that they needed to (but never did) write extremely granular key-existence checks every time they dealt with a record.
All they really wanted was “DB but fast”, which they mistook Mongo for. But yeah, I agree, don’t staple a two-phase commit protocol to Mongo – just use something less ill-suited for anything anyone actually wants in real life.
Well, if you like using a tool that doesn’t quite have all the features you want, there’s really nothing wrong with adding those features yourself; that’s why the MongoDB docs include a description of two-phase commits as a way to handle transactions. Also, most of these added features (schemas, for instance) are implemented via third-party libraries, so if they fit your use case, great; otherwise you can completely avoid them and use the DB as you see fit.
Be careful before using this: it’s trivially vulnerable to server-side request forgery (i.e. if your server fetches a user-submitted URL, an attacker can pass URLs or IP addresses on your private network).
Good point. There should definitely be some care taken with the URLs. I don’t think it’s that much of an issue in this case, though. jsdom only accepts HTML- and XML-formatted responses, so the worst-case scenario is that it returns all the Open Graph and Twitter tags from an internal document. Accepted URLs should be monitored closely, but sensitive information really shouldn’t be placed in Open Graph tags anyway.
Some internal services have GET endpoints like /quitquitquit and /abortabortabort that kill services - if your package was deployed internally, someone could use it to take down production services.

Not trying to be harsh - I think it’s a useful library if you make it secure by default. There are existing libraries that do SSRF filtering for you - you could swap in one of those to fetch the page contents and pass that to JSDOM.
Libraries really have to be secure by default, otherwise developers will mess it up, 100% guaranteed.
That’s really not harsh at all. I’m a student; I learn through conversations like this.
I see what you mean. It would make sense to have an SSRF-filtering option, and I’ll work on adding that. I’m pretty sure it’s only ever going to send GET requests, though, and GET endpoints that kill services should be heavily protected. I get that “should” doesn’t imply “would”, but having unauthenticated GET endpoints that kill internal services has to be seen as careless.