
    So my takeaway is that research and planning can save months of effort.

    I’d love to see a rundown of the final architecture vs the original and an analysis of where the actual improvements came from.

    Incidentally, the kind of analysis in this article is a great way to extract insights from a, let’s say, suboptimal series of choices.


      Indeed! The final architecture is a greatly simplified system where all the data goes through only one system, Monolith, before being written to Redis and then asynchronously read back in large batches (so we don’t spin up 8 threads for a single message and waste our time). Monolith also does all the data annotation, which was previously done by Research. Basically, I figured out how to put things together without making them slow. Originally, there were a bunch of scripts all over the place, either because I couldn’t find the time to learn async or because I didn’t want to, thanks to the sunk cost fallacy of having learned Twisted’s “async” with Deferreds and callbacks. There was also an analysis tool that ran over the data after it was ingested into the database.
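
To illustrate the "read back in large batches" idea, here is a minimal sketch, not the project's actual code. It assumes an in-process `asyncio.Queue` as a stand-in for Redis, and the names `annotate`, `read_batch` and `process_all` are hypothetical. The point it shows is that a consumer waits for one message and then drains whatever else is available, so work is done per batch rather than per message.

```python
import asyncio


def annotate(message):
    # Hypothetical stand-in for the annotation step done in the single service.
    return {"text": message, "length": len(message)}


async def read_batch(queue, max_items, timeout):
    """Wait for one item, then collect more until max_items or timeout."""
    batch = [await queue.get()]
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while len(batch) < max_items:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), remaining))
        except asyncio.TimeoutError:
            break  # nothing more arrived in time; hand over what we have
    return batch


async def process_all(messages, max_items=100, timeout=0.05):
    # The queue plays the role of the buffer (Redis in the description above).
    queue = asyncio.Queue()
    for message in messages:
        queue.put_nowait(annotate(message))

    batches = []
    while not queue.empty():
        batches.append(await read_batch(queue, max_items, timeout))
    return batches


if __name__ == "__main__":
    result = asyncio.run(process_all([f"msg {i}" for i in range(250)]))
    print([len(batch) for batch in result])
```

The `max_items` and `timeout` values are arbitrary here; in a real deployment they would be tuned against message rate and how stale a batch is allowed to get.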

      Additionally, my original problem was that clicking different pages would run the query again, which was fixed by caching, not a fancier DB. It would have taken 5 minutes of planning with a pen and some paper to realise this. It took around 30 minutes to implement. This alone would have solved my problem, but the superfluous database optimisation led me to restructure everything else in a very positive way, before doubling down on the initial database decision. I’m glad I learned about Manticore, it was awesome! I’ll certainly use it in the future if I need a quick database. And hey, I can always switch back by changing “ELASTICSEARCH” to “MANTICORE” in my app :) Druid was also cool, but rather overkill for what I needed, and not really budget-friendly in terms of RAM/CPU usage.
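
The caching fix mentioned above can be sketched roughly as follows. This is an assumption-laden illustration rather than the real implementation: `QueryCache`, `run_query` and the TTL value are all hypothetical, and the cache lives in process memory. It shows the query running once, with later page clicks served by slicing the stored results.

```python
import time


class QueryCache:
    """Cache full query results so paging does not re-run the query."""

    def __init__(self, run_query, ttl_seconds=60.0):
        self._run_query = run_query   # hypothetical callable: query -> iterable of rows
        self._ttl = ttl_seconds
        self._entries = {}            # query -> (expiry time, list of rows)

    def page(self, query, page_number, page_size=10):
        now = time.monotonic()
        entry = self._entries.get(query)
        if entry is None or entry[0] <= now:
            # Cache miss or expired entry: hit the database once.
            results = list(self._run_query(query))
            self._entries[query] = (now + self._ttl, results)
        else:
            results = entry[1]
        start = page_number * page_size
        return results[start:start + page_size]
```

For example, calling `cache.page("nick:foo", 0)` and then `cache.page("nick:foo", 2)` would invoke `run_query` only for the first call, as long as the second arrives within the TTL.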

      I still think most of the improvements were not to my project, but to my knowledge of application structure, prioritisation and asynchronous programming :)

      It’s interesting you say that. I’m assuming you mean a case where a user on IRC slips up and connects without a VPN one time 10 years ago but doesn’t notice. This was one of the ideas in my mind when I created it too. Realising this was possible really made me up my anonymity game, right up until I realised that keeping things secret is what gave others the power to hurt me with them, so I up and came out with my real name and place of residence. Nobody has shown up yet :)