1. 1

    A couple of clarifying questions:

    1. You state that if you haven’t received an ack within X milliseconds, to mark the current message as sent and proceed. If you don’t care about retries, why not remove the requirement to listen to acks in the first place?
    2. How important is event ordering to you? For most event architectures, it’s worth quashing that requirement because of the complexity it adds.
    3. What’s worse: a user not receiving a message, or a user receiving more than one copy of a message?
    1. 2
      1. I get acks 85%-90% of the time. So I would like to optimise it so that messages are ordered for the maximum number of users and let them go out of order for a few. Also, by adding this X ms of delay, messages are usually delivered to the user in order. The messages go out of order when I send them instantly.

      2. The current system is unordered and works really well (scale, maintainability). However, a lot of messages are sent out of order, so ordering is very important. My naive solution is to add a delay of X ms after every message, which should solve most cases. However, that would simply slow everything down, and I don’t want to do that.

      3. A user not receiving a message is worse. But I would try not to send multiple times either.

      1. 4

        Have you considered enabling PubSub ordering, with the ordering key being the user/room? Some of the tradeoffs are that you will be limited in your throughput (1MB/s) per ordering key, and will be vulnerable to hot sharding issues.
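
        With the Python Pub/Sub client, for example, that looks roughly like the sketch below (the project/topic names and the per-room key are placeholders, and the subscription also has to be created with message ordering enabled):

        from google.cloud import pubsub_v1

        # Ordering must be switched on in the publisher before ordering_key is honoured.
        publisher = pubsub_v1.PublisherClient(
            publisher_options=pubsub_v1.types.PublisherOptions(enable_message_ordering=True)
        )
        topic_path = publisher.topic_path("my-project", "outgoing-messages")

        def publish_in_order(room_id, payload):
            # Messages sharing an ordering_key are delivered in publish order,
            # at the cost of one ordered stream (and the ~1MB/s limit) per key.
            future = publisher.publish(topic_path, data=payload, ordering_key=room_id)
            return future.result()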

        After enabling ordering, if the ordering issue still impacts a greater fraction of users than you would like, then the problem is most likely on the upstream side (Insta/WhatsApp). AFAIK there is no ordering guarantee for those services, even if you wait for acks.

        My advice: if the current solution is working great without ordering, I would strongly suggest sticking with it.

        1. 2

          Once I enable ordering on the queue, it becomes difficult to distribute the work across multiple workers, right?

          if the current solution is working great without ordering, I would strongly suggest sticking with it.

          I so wish to do this, but seems I can’t :(

          1. 3

            Has someone actually quantified how impactful out of order messages are to the business? This is the kind of bug that a less-technical manager or PM can prioritize highly without doing due diligence.

            Another suggestion is to make a design, and be sure to highlight whatever infrastructure costs are changing (increasing most likely), as well as calling out the risks of increasing the complexity of the system. You have the agency to advocate for what you think is right. If they decide to proceed with the design then go ahead and get the experience and you’ll find out if the warnings were warranted over time.

            Quantifying the impact is a good exercise for you to do anyway, since if you build the system you can then put an estimate of the value you created on your resume.

            1. 2

              Correct; you will only be able to have one worker per ordering key, or you lose your ordering guarantee.

          2. 2

            If you want to avoid duplicates and lost messages, the only solution is to use idempotent APIs to send messages. If you do not receive an ack within some time, resend the message idempotently; lost sends/acks are repeated and the API provider filters the duplicates. Only proceed to sending message N+1 once you eventually succeed sending message N.
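
            A rough sketch of that loop (send_message and wait_for_ack here are hypothetical stand-ins for whatever the provider’s API offers):

            import uuid

            ACK_TIMEOUT = 2.0   # seconds; stand-in for the "X ms" above
            MAX_ATTEMPTS = 5

            def deliver_in_order(messages, send_message, wait_for_ack):
                for payload in messages:
                    key = str(uuid.uuid4())          # reuse the same key across retries of one message
                    for _ in range(MAX_ATTEMPTS):
                        send_message(payload, idempotency_key=key)
                        if wait_for_ack(key, timeout=ACK_TIMEOUT):
                            break                    # message N confirmed; only now move on to N+1
                    else:
                        raise RuntimeError("could not confirm message N; not sending N+1")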

            If your API provider does not provide an idempotent API, then you could try to query their API for your last sent message and compare it with the one you plan to send. But this is slow and, since it’s not atomic / transactional, is very racy.

        1. 2

          I usually find most of these already in the lint stage. Except instead of magic strings we just use TODO as most tools already look for this and prevent merge. Likewise, no-console or similar are mostly automatically there when you start new projects and shouldn’t be an issue.

          Comments in the build file are somewhat problematic. Sometimes I want a comment there, so how do you distinguish the needed ones from the forgotten ones?

          Code Review is important for this stuff.

          1. 2

            Indeed. The usefulness of this technique I think depends an awful lot on which language/tooling you’re using. Some languages have amazingly powerful linters that can catch most or all of these types of things. Others, not so much.

            1. 2

              One quite workable approach is to have a linter that uncomments one comment block at a time, and then tries to parse the AST. If it parses, it’s likely code.
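
              For Python sources, a rough sketch of that idea (glossing over how consecutive comment lines are grouped into blocks):

              import ast
              import re

              def looks_like_commented_out_code(comment_lines):
                  # comment_lines: one block of consecutive '#'-prefixed source lines.
                  # Strip the comment marker but keep the commented code's own indentation.
                  source = "\n".join(re.sub(r"^\s*#\s?", "", line) for line in comment_lines)
                  if not source.strip():
                      return False
                  try:
                      ast.parse(source)
                  except SyntaxError:
                      return False
                  # Caveat: a lone word like "TODO" also parses (as a bare expression),
                  # so real tooling would want extra heuristics against false positives.
                  return True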

              1. 1

                I’d suggest a slight modification: instead of looking for all comments except those starting with #-#, only look for comments that start with # -, since those are more likely to be commented-out commands. I guess you’d still get false positives from lists, though.

              1. 15

                The author presents code equivalent to char* p = malloc(8); char* q = p + 4; *q; and argues that it’s not useful to null-check q — because if malloc() returns NULL, then q would be 0x4, not 0x0 (thus passing the null check).

                However, this is begging the question — q is only able to have an invalid but non-null value after neglecting to null-check p in the first place.

                I don’t find the article convincing. Just because memset() needs to be usable with any mapped address (which might include 0x0) doesn’t mean that we shouldn’t check for NULL ever.

                1. 10

                  While his example of passing 0x4 might be bogus, the overall message isn’t. He’s not saying “never check for NULL” (his sample function dup_to_upper() does check for NULL when it calls malloc()), but rather that the immediate reflex to check the input pointer for NULL won’t really help when there are a large number of invalid, non-NULL addresses that could be passed in.

                  The point he made was better made in the book Writing Solid Code. That book changed how I code C, and now, instead of:

                  char *foo(char *s)
                  {
                    if (s == NULL)
                      return NULL;
                    ...
                  }
                  

                  I write:

                  char *foo(char *s)
                  {
                    assert(s != NULL);
                    ...
                  }
                  

                  In my mind, unless I have a very good reason, passing in NULL to a function is a bug. Checking for NULL and returning an error is quite possibly hiding a real bug. Why are you passing NULL? How did that NULL get there in the first place? Are you not checking for NULL elsewhere?

                  1. 2

                    Out of interest, why assert() instead of using the ptr and allowing the SEGV signal handler and/or core dump to give you similar info?

                    1. 9

                      Some reasons off the top of my head:

                      • To catch the null pointer as soon as possible. Otherwise, it may be propagated further and cause a segfault much farther away, which is harder to debug. Much more so if a null/corrupt pointer is stored in some data structure instead of just passed to callees, since the stack trace is no longer useful in that case.
                      • To document the expectations of the function to future developers.
                      • You can redefine ASSERT() in debug builds to fail unit tests if they are running, which is more dev-friendly than crashing your test system.
                      1. 1

                        Beside future developers, they’re also useful for documenting expectations for static analysis tools. The Clang static analyzer, for example, takes them into account.

                        1. 1

                          The “find the NULL as soon as possible” argument makes the most sense to me. I guess I was thinking that using the pointer (straight away) would provide this, but I agree we may do non-dangerous things (like storing it somewhere) before we deref it.

                          Thank you.

                        2. 1

                          Some systems are configured to allow mapping memory at NULL, which would open up a potential NULL ptr deref vulnerability wherein arbitrary data was stuffed at 0x0.

                        3. 2

                          I love using assert. It’s simple and concise. In a project I wrote to integrate with Pushover, I use assertions at the beginning of any exported function that takes pointers as arguments.

                          Sample code:

                          EXPORTED_SYM
                          bool
                          pushover_set_uri(pushover_ctx_t *ctx, const char *uri)
                          {
                          
                          	assert(ctx != NULL);
                          	assert(uri != NULL);
                          	assert(ctx->psh_uri == NULL);
                          
                          	ctx->psh_uri = strdup(uri);
                          	return (ctx->psh_uri != NULL);
                          }
                          

                          Also: I almost always use calloc over malloc, so that I know the allocation is in a known state. This also helps prevent infoleaks for structs, including compiler-introduced padding between fields. Using calloc does incur a small perf hit, but I prefer defensive coding techniques over performance.

                          1. 1

                            The one issue with using calloc() is that a NULL pointer on a system does not have to be all-bits-zero (the C standard doesn’t require it; POSIX does require a NULL pointer to be all zeros). Yes, in source code, a literal 0 in a pointer context is NULL, but internally, it’s converted to whatever the system deems a NULL address.

                            I’ve used a custom malloc() that would fill memory with a particular value carefully selected per architecture. For the x86, it would fill the memory with 0xCC. As a pointer, it will probably crash. As a signed integer, it’s a sizable negative number. As an unsigned integer, it’s a large number. And if executed, it’s the INT3 instruction, aka, breakpoint. For the Motorola 68000 series, 0xA1 is a good choice: all of the above apply, plus it can cause a misaligned read access for 16 or 32 bit quantities if used as an address. I forgot what value I used for MIPS, but it was again chosen for the same reasons.

                          2. 2

                            The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors, null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful, saving as much as is possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

                            How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

                            I don’t see how this is good guidance for anything but the simplest of C programs.

                            edit: I forgot about NDEBUG.

                            But that in effect makes the check useless for anything but (isolated) testing builds.

                            Checking for NULL and returning an error is quite possibly hiding a real bug.

                            I really don’t see how. Reporting an error and trying to compensate, either by soft-crashing after saving state, or using other / blacklisting data-sources, is not ‘hiding’ a bug. It’s the bare minimum any program should do to adapt to real world cases and scenarios.

                            1. 4

                              The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors, null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful, saving as much as is possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

                              Assertions are intended for catching bugs, not recoverable errors. An assertion failing means an unexpected or unrecoverable error has occurred and it isn’t safe to continue program execution.

                              How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

                              Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                              Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code.

                              1. Rule: The assertion density of the code should average to a minimum of two assertions per function. Assertions are used to check for anomalous conditions that should never happen in real-life executions.
                              1. 1

                                An error that isn’t recoverable or was unexpected has occurred and it isn’t safe to continue program execution.

                                A null pointer error doesn’t automagically invalidate all of the state that the user put into the program expecting to get it out again.

                                Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                                Of course.

                                Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code. […]

                                Right, but that’s not the usage of the above null check, which advocates for never saving state on null, ever.

                              2. 2

                                If the documentation says “this function requires a valid pointer”, why would I bother checking for NULL and returning an error? It’s a bug if NULL is passed in. The assert() is there just to make sure. I just checked, and when I pass NULL to memset() it crashed. I’m not going to blame memset() for this, as the semantics of the function don’t make sense if NULL is passed in for its pointer argument.

                            2. 6

                              I just want to salute you for using “begging the question” to describe circular reasoning. I don’t often see that original meaning in the wild.

                              1. 2

                                Assuming NULL is 0x0 is wrong. Of course. The article’s entire argument falls down right there.

                                1. 2

                                  On what architectures is NULL not 0x0? I can’t seem to think of any off-hand.

                                  1. 3

                                    Multics has it be -1, AFAIK. It’s probably not 0x0 on anything with non-integer pointers either, like x86 segmentation.

                                    1. 1

                                      Here’s a good page on exactly that question: http://c-faq.com/null/machexamp.html

                                1. 2

                                  I committed code that broke the build today because I defined something twice. The worst thing is that I proof-read it. It built, but I didn’t want to run all tests because it was only a six-line diff and I didn’t want to wait for over an hour. It also passed review.

                                  Let’s say I’m having trouble focusing while working from home in this heat.

                                  1. 1

                                    This highlights how friction can be an issue — if your tests take a full hour, it must be very trying on one’s patience. Is this something your team could theoretically speed up by throwing more hardware at it?

                                    1. 1

                                      I’ve thought of this as well. The simple answer is no – the tests are already run in parallel. The more detailed answer is no – but we could at least run the tests on a different machine than the development machine so that I could check out another branch and do some actual work while the tests are running.

                                      In this particular instance, I could and should have run a specific test suite which only takes about a minute and serves as a quick smoke test.

                                      1. 1

                                        With that many tests, ideally during development you run only the unit tests for the files/modules you have edited, and possibly their transitive dependencies. Bonus points if your testing tool watches for file changes and then takes a snapshot of changed files and runs tests on that, instead of on your development directory, allowing you to continue making changes.
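
                                        A rough sketch of the first idea (the src-to-tests naming convention here is made up; adapt it to your layout):

                                        import pathlib
                                        import subprocess

                                        # Files changed since the last commit.
                                        changed = subprocess.run(
                                            ["git", "diff", "--name-only", "HEAD"],
                                            capture_output=True, text=True, check=True,
                                        ).stdout.splitlines()

                                        # Map each changed module to its test file and run only those.
                                        test_files = []
                                        for p in map(pathlib.Path, changed):
                                            candidate = pathlib.Path("tests") / f"test_{p.name}"
                                            if p.suffix == ".py" and candidate.exists():
                                                test_files.append(str(candidate))

                                        if test_files:
                                            subprocess.run(["pytest", *test_files], check=False)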

                                        Another possibility is to push your complete commits to side branches, and defer merging to your main branch to a bot that rebases them, tests them, and then merges them (assuming Git). Something like bors (combined with git-test), if you’re using Github.

                                        1. 1

                                          Unfortunately, there are no unit tests, and not everything is nicely modular. Every test is an integration test. The test suite consists of hundreds of thousands of integration tests.

                                          This sounds horrible, but there are practical reasons that this is the way things work (and they are not all ‘no time to rewrite everything’).

                                          Another possibility is to push your complete commits to side branches, and defer merging to your main branch to a bot that rebases them, tests them, and then merges them (assuming Git). Something like bors (combined with git-test), if you’re using Github.

                                          We’re not, but that is a terrific idea. I just wonder how well it plays with the way things are currently set up.

                                        2. 1

                                          Got it.

                                          The context for my question was based on a mistake I made in the past. Basically I was too cheap to pay for sufficient parallelization on our continuous integration server (CircleCI), despite my head engineer saying the tests could be run 5x+ faster. Considering the cost of an engineer’s time vs. a few extra docker container instances, it was a false economy.

                                    1. 1

                                      Perhaps Gitlab might be something for you? Free for open source, and it might be feature complete enough for you:

                                      https://about.gitlab.com/solutions/open-source/projects/

                                      1. 1

                                        I did take a look at Gitlab, but their offerings seem to be closer to development tools; what our project is really struggling with is hosting for its deployed servers. AFAICT Gitlab doesn’t seem to offer anything in that space, unfortunately.

                                      1. 3

                                        In my (limited[0]) experience your best bet is to find a company that actually uses your open source project and ask them if they’d be open to sponsoring something.

                                        I know that this is usually a lot easier for bigger and older projects (until they are too big, then it’s sometimes not so easy to say “it’s nice that you want to sponsor a server, but we have certain requirements, so either you comply or it would be wasted”).

                                        I have no idea about your project and your computing needs, but if worse comes to worst, a few cloud VMs could be all that’s needed. In one project we have a single VM as a demo server (it’s web-based software), and while the original author does host the rest of the infra on a personal server, I’m pretty sure it could be handled by another 5 EUR VM, just for downloads, website, forum. (So maybe think hard about whether you really need several servers if the forum+wiki+bugtracker only get low traffic; that makes funding a lot easier.)

                                        Again it depends on the case, but if one of your employers is open source friendly, it also doesn’t hurt asking if they’d be willing to sponsor a server (in exchange for a “sponsored by” banner).

                                        Depending on the laws in your main contributors’ countries of residence, maybe it’s possible to just use Github Sponsors, or Ko-Fi or Patreon, so that one person can “collect”?

                                        Also there are several associations that act as a proxy for collecting donations for open source projects; https://techcultivation.org/#overview comes to mind, or the “Wau-Holland-Stiftung”, but I don’t know which projects they would support, I guess it depends on size again.

                                        [0]: exhibit a - the software I mention in the text, no big problem here, just the one demo server is nice, for security reasons I wouldn’t want to run several copies of the same software in all versions with public admin access on the infra that hosts the downloads

                                        exhibit b - I contributed to PHP in the past and it’s a massive project, afaik finding sponsors for hosting and hardware has never been a problem (for 20 years), but it’s more this “if you want to be a mirror, you need to fulfill these criteria”

                                        exhibit c… - most other projects I contributed to were fine with what SourceForge/Github/etc.pp offer for open source projects, more and more complemented by cheap cloud VMs where the maintainer(s) simply invest a little money. I know this might be speaking from privilege a bit, but for most software developers 1 EUR for the domain per month + 10 EUR for hosting is doable.

                                        1. 1

                                          Thank you for the thorough response!

                                          Unfortunately corporate sponsorship is not really an option, and we definitely don’t have the clout that PHP has/had (closer to the opposite in fact..). Our computing needs are rather lightweight, apart from needing a fair amount of disk storage; we could pile it onto one VM, and there’s certainly stuff that would fit in the ‘always free’ tiers of various cloud providers. So that is worth exploring.

                                          Managing ownership myself is a possibility, and prices would be affordable, but single-owner servers are what led to our current situation. I’m also reluctant to move to a model where one person is in charge of the entire cash flow unless the process is reasonably transparent. On that front, techcultivation.org looks worth checking out, thanks for the link.

                                        1. 7

                                          I’m currently a Python dev (apparently this is the most recent turn my career has taken), and I’m really bummed out by its web story outside of Django.

                                          My last gig was Elixir, before that Node, and some Rails and Laravel in there. The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                          It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs and multiple runtimes on my system and all of that. Like, what happened? Everybody groused about 2-to-3 (which is still hilarious) but like even without that I feel like the ecosystem has been vastly outstripped by “worse” technologies (see also, NodeJS).

                                          1. 4

                                            It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs

                                            One thing that made virtualenvs almost entirely painless for me was using direnv: in all my python project directories I have a bash script named .envrc that contains source .venv/bin/activate, and now cd-ing in/out of that directory will enter/exit the virtualenv automatically and instantaneously. It’s probably possible to set it up to switch pyenv environments as well.

                                            1. 3

                                              One of the reasons why Python packaging still feels so clunky compared to other ecosystems is that the Python ecosystem is a lot more diverse thanks to e.g. the scientific stack that has very different needs than the web peeps so there’s never gonna be an all-encompassing solution like Cargo. Pipenv tried and failed, poetry is carving a niche for itself.

                                              But the primitives are improving. pip is currently growing a proper resolver, and doesn’t Ruby, for example, still need a compiler to install binary packages? As long as you don’t use Alpine for your Docker images, Python’s wheels are great (they’re just a bit painful to build).

                                              1. 1

                                                How did pipenv fail?

                                                1. 4

                                                  Short answer: it’s too complex, which makes it buggy, and there hasn’t been a release in over a year. IOW: it’s collapsing under its own weight.

                                                  Long answer: https://hynek.me/articles/python-app-deps-2018/

                                              2. 3

                                                The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                                Currently working on a Rails app, coming from the Flask ecosystem. You have no idea how much I can miss SQLAlchemy and Alembic.

                                                I agree about dependency management, but certainly not about migrations. Modifying models and auto-generating migrations works much better than the other way around for me.

                                              1. 3

                                                For the past few weeks now I’ve been looking for a code review tool that could be used together with gitolite, and was surprised to find that all of the available options are non-ideal or unusable for me in one way or another. I’m reluctantly turning towards making my own. My requirements are somewhat minimal, but I’d also like to tackle some of the design flaws I think github’s PR review system suffers from, by e.g. storing all versions of a PR, making it easy to compare PR versions, being able to save review progress, etc.

                                                My current implementation is a flask app with a webhook endpoint that takes a repository name; when called, the named repository is fetched, and branches with names matching a given regex are treated as WIP branches. There are some views for diff rendering as well. It’s all refreshingly minimal.
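
                                                In outline, the webhook looks something like the sketch below (the repo root path and the WIP-branch pattern are illustrative, not the actual values):

                                                import re
                                                import subprocess
                                                from flask import Flask, abort

                                                app = Flask(__name__)
                                                REPO_ROOT = "/srv/review-clones"   # illustrative path to local clones
                                                WIP_RE = re.compile(r"^wip/.+")    # illustrative WIP-branch pattern

                                                @app.route("/hooks/<repo>", methods=["POST"])
                                                def refresh(repo):
                                                    path = f"{REPO_ROOT}/{repo}.git"
                                                    # Fetch the named repository from gitolite.
                                                    if subprocess.run(["git", "-C", path, "fetch", "--all"]).returncode != 0:
                                                        abort(404)
                                                    branches = subprocess.run(
                                                        ["git", "-C", path, "for-each-ref", "--format=%(refname:short)", "refs/heads"],
                                                        capture_output=True, text=True,
                                                    ).stdout.splitlines()
                                                    wip = [b for b in branches if WIP_RE.match(b)]
                                                    # ...refresh review state / rendered diffs for each WIP branch here...
                                                    return {"wip_branches": wip}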

                                                If I manage to get code review working nicely, I’d also like to have test, lint, and code coverage results displayed in-line, per-commit. But that’s a ways off yet.

                                                1. 6

                                                  I’m surprised nobody has mentioned Leo. It’s a tree-based text/code editor. The tree is very flexible, and leaf nodes contain code/text/whatnot. You can even reference leaves from more than one place.

                                                  It’s not code-specific. That is, you can use it for any text-based programming language (though there will be a step to export linear-text files), or you can use it for documentation, or todo lists, or whatever. It’s very flexible.

                                                  I used it a long time back to do some literate programming. It’s evolved quite a bit since then.

                                                  1. 2

                                                    Leo is maybe more an outliner than a tree editor – the nuance being that a tree editor should have a notion of schema, while an outliner can be more freeform. They’re definitely two branches of the same family tree, but I feel there are unique challenges when interactively editing structured data that needs to (eventually) conform to a given schema.

                                                    1. 1

                                                      That looks quite neat; reminds me of a similar terminal program called hnb, although hnb seems more minimal and more targeted towards personal information management. Leo seems much more powerful and worth exploring, thanks for bringing it to my attention!

                                                    1. 2

                                                      The same applies for Java as well. Java’s reference types don’t get passed by reference. Instead, the variables are effectively pointers, and the pointers get passed by value. It’s a subtle distinction but an important one when you’re talking about programming language design.
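
                                                      A quick way to see the distinction (shown here in Python, which has the same semantics):

                                                      def mutate(lst):
                                                          lst.append(1)    # visible to the caller: both names refer to the same list

                                                      def rebind(lst):
                                                          lst = [1, 2, 3]  # only rebinds the local name; the caller's variable is untouched

                                                      nums = []
                                                      mutate(nums)
                                                      rebind(nums)
                                                      print(nums)          # [1] -- the reference itself was passed by value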

                                                      1. 2

                                                        Barbara Liskov (for the CLU language) termed this call by sharing, where the callee gets references to the arguments but can’t change what they refer to. A nice thing is that you can understand primitives (that might be passed by value as an optimisation) in the same way, as long as you lack reference equality. Since they’re immutable you can’t change them and discover that they aren’t actually shared!

                                                        Nowadays most people just understand it as ‘references passed by value’, which is less fun. ☺️

                                                        1. 2

                                                          The term call by sharing still reveals a fundamentally mechanistic conception of argument passing. A better way to think about it is to make a distinction between values, which exist in the semantics of a programming language (a timeless mathematical object, so the question of whether a value is mutable doesn’t even make sense), and representations, which exist in computer memory (which obviously exists in time and might mutate). Now things become clearer: there is always a single number 42 and a single string “Hello, world!”, but they may be represented arbitrarily many times in computer memory.

                                                      1. 4

                                                        I’ve been hacking together a small and mostly LDoc-compatible documentation generator for distribution with luakit, so that up-to-date docs can be installed with the main program. Shortcoming: type inference isn’t as nice. Benefit: it’s very easy to tweak the documentation generation and include project-specific information, e.g. auto-generate nice-looking keybinding descriptions.

                                                        1. 6

                                                          The idea of stealthy ad blocking is nice, but I think it will be limited in practice. It will always be possible to draw opaque rectangles over ads, but anything further, such as reclaiming the space used by ads, is fairly easy to detect with JS. Since stealthy ad blocking merely obscures the ad, it also does not save bandwidth, prevent user tracking, save CPU usage, or protect against malware; all commonly cited reasons for blocking ads.

                                                          1. 8

                                                            A thing to consider: most websites do not actually require Javascript, they only claim to. Most of the sites that do require Javascript do it with a negative amount of benefit to the user. Also, a large fraction of them look much better if you kill all their CSS, but that’s another sad story.

                                                            I guess what happens is that an empty-profile container-separated (and not root inside the container, of course) browser instance will get spun up just long enough to load the page, scroll through it and load all the scroll-loaded content, then pass the now-static content to a sanitizer which can remove everything that looks like an ad.

                                                            Tracker handling is a complicated story if websites start checking the responses for more than presence and freshness; until then you can just load them with randomized parameters and a random referrer.

                                                            On the other hand, a single large enough click-spoofing false-flag-operation conflict can redraw the web landscape faster than throwaway containers come to browsing…

                                                          1. 2

                                                            This is great; I really liked Luakit when I used it previously before finding out about the massive security problems. I’m glad to see it updated to the Webkit 2 API.

                                                            However, I think the documentation here glosses over the fact that there are still serious practical security problems with this library even after switching to the new API: https://blogs.gnome.org/mcatanzaro/2017/02/08/an-update-on-webkit-security-updates/

                                                            1. 2

                                                              Thanks for the feedback! I’ve added a short summary of that link at the top of the download section.

                                                              1. 1

                                                                Glad to see it. I would love to try out LuaKit again once Debian fixes their Webkit packages.