1. 4

    I’ve had 2 ideas around CI systems:

    1. Make a super opinionated system based around Bazel. Basically your code builds in Bazel and the CI just runs build and run targets. Caching is pretty much solved out of the box because of Bazel, and multiple languages are supported to a degree. This is similar to the post.
    2. Create a flexible system using Lua as config. All these YAML languages end up being a PITA, so why not just use an actual language that can be sandboxed? Handling a DAG of tasks is just a matter of running some coroutines. Obviously it’s not super simple, because you want to have visualization and debugging, but at least you have actual IDE support.
    1. 2

      (1) is how Google’s presubmit works. It’s been effectively configuration-free in my experience.

      The insight here is that the build system, whether Bazel or something else, already has a dependency graph. If the CI can reuse this dependency graph, and if you add test targets as well as build targets, you get very granular caching and can also skip unaffected tests on each run.

      The thing that makes this work is the exhaustive declaration of all dependencies in Bazel (‘hermeticity’). I’m not sure that this would work with something like Make, where incorrect dependencies lead to “nothing to be done for …” rather than an immediate build failure.

      (2) sounds sort of like xmake. I haven’t tried it myself though.

      1. 1

        (1) Yeah, that’s similar to how it worked at Amazon as well with brazil ws --dry-run, which also ran automatically when you submit a code review. It would just run the default build and test targets. The part where this gets a bit trickier with Bazel is how to handle integration tests that require external services, e.g. postgres or redis. You still need some way of defining that, whether it’s a docker-compose config or NixOS server base or something else entirely. That also breaks the hermetic nature of the tests, since you can e.g. forget to clean up the DB in your tests.

        (2) Oh, that looks interesting, I’ll have to take a look. Now that I’ve thought about it a bit more, I’m wondering how to avoid that ending up like Gradle or Jenkins. Both have too big of an API surface area, and Jenkins in particular suffers from being difficult to reproduce locally due to plugins and manual configuration. The other big issue there is plugin conflicts due to the Java classpath. I think Lua avoids some of these problems since it can be embedded in a binary and requires explicit imports of other files. I think some other problems can be avoided to some extent by ensuring it can only be configured via code and being more batteries-included.

      2. 1

        I mentioned a couple deficiencies of Bazel in a sibling comment here.

        It works pretty well for a tightly coupled corporate codebase, and particularly C++, but I don’t think it works that well in open source. Or even for most small/medium companies, which tend to have more heterogeneous codebases that “delegate” to many tools/libraries.

        For example, many people will want to just use NPM in a container. Last I checked (and it’s been a while), if you want NPM-like things in Bazel you’ll be in for a rude awakening. Most package managers conflict heavily with its model.

        1. 2

          Yeah, when I envisioned my Bazel-based CI it was specifically for Java server applications; Java has the best Bazel support besides C++. For Java servers, you’re just taking dependencies on other projects and libs rather than being depended on directly. This idea was partially due to my frustrations with Gradle, which makes my head spin every time I look at the API docs.

          I think the other important piece is that when you focus on a single language, you can more easily do what’s mentioned elsewhere in the thread where you have a super tight end to end integration. You can have code coverage, linting, format checking, and deployment (to some extent) working without needing to set it up yourself.

      1. 3

        This is a good description of the bare basics of a build system. Where things get messy, though — even in simple projects — is when the source files have dependencies on each other, which are described within those files. In C terms, when a .c or .h file #includes .h files. Then changing a .h file requires recompiling all the .c files transitively dependent upon it.

        No problem, make can do that! Except (unless make has changed a lot since I last used it) those dependencies have to be described explicitly in the makefile. Now you’ve got a very nasty case of repeating yourself: it’s so easy and common to add an #include to a source file during development. But if you ever forget to add the equivalent dependency to the makefile, you’ve broken your build. And it can break in really nefarious ways that only manifest as runtime errors or crashes that are extremely hard to debug. This in turn leads to voodoo behaviors like “I dunno why it crashed, let’s delete all the .o files and build from scratch and hope it goes away.”

        So now you need a tool that scans your source files and discovers dependencies and updates your makefile. This is why CMake exists, basically. But it adds more complexity. This is a big part of why C/C++ are such a mess.

        (Or you could just use an IDE, of course. Frankly the only reason I have to deal with crap like makefiles is because not everyone who uses my code has Xcode…)

        1. 6

          None of this is necessary. It’s perfectly normal in make-based C/C++ projects to have a build rule which uses the compiler to generate the dependencies during the first build, and then to include those dependencies in the Makefile for subsequent incremental builds.

          There’s no need to keep track of the dependencies for C/C++ files by hand.

          (For reasons which are not entirely clear to me, Google’s Bazel does not appear to do this. Meson does though, if you want a nice modern build tool.)

          1. 2

            Maybe recursive make is where it breaks down. I have fond memories of hardcoding dependencies between libraries in the top-level makefile – an activity reserved for special occasions when someone had tracked down an obscure stale-rebuild issue.

            (I think recursive make, at least done the obvious top-down way, is flawed.)

            1. 1

              Yeah, you never want to be calling make from within make.

            2. 2

              I imagine the reason is that Bazel requires a static dependency graph, including for all autogenerated intermediate files. I’m not sure why the graph is encoded directly in files instead of maintained in a parallel index though.

              There’s internal tooling at Google to automatically update dependencies in BUILD files from source files, but it’s apparently not open sourced.

            3. 4

              You can’t add dependencies on the fly in Make, unfortunately. You can get a list of dependencies of a file in Makefile format with gcc using -MD and -MF, but that complicates things a lot. Ninja on the other hand has native support for these rules, but from what I’ve heard Ninja is mostly made to be used by higher-level build tools rather than directly. (I mean you can manually write your ninja file and use ninja just like that, but it’s not as pleasant to write and read as Makefiles.)

              1. 5

                from what I’ve heard Ninja is mostly made to be used by higher-level build tools rather than directly. (I mean you can manually write your ninja file and use ninja just like that, but it’s not as pleasant to write and read as Makefiles.)

                That’s an explicit design goal of Ninja. Make is not a good language to write by hand, but it’s just good enough that people do it. Ninja follows the UNIX philosophy. It does one thing: it checks dependencies and runs commands very, very quickly. It is intended to be the target for higher-level languages and by removing the requirement from the high-level languages that they have to be able to run the build quickly, you can more easily optimise them for usability.

                Unfortunately, the best tool for generating Ninja files is CMake, whose main selling point is that it’s not as bad as autoconf. It’s still a string-based macro processor pretending to be a programming language though. I keep wishing someone would pick up Jon Anderson’s Fabriquer (a strongly typed language where actions, files and lists are first-class types, with a module system for composition, intended for generating Ninja files) and finish it.

                1. 1

                  The best tool for generating Ninja files is Meson :P

                  Admittedly not the most flexible one, if you have very fancy auto-generators and other very unusual parts of the build you might struggle to integrate them, but for any typical unixy project Meson is an absolute no-brainer. It’s the new de-facto standard among all the unix desktop infrastructure at least.

                  1. 1

                    I’ve not used Meson, but it appears to have a dependency on Python, which is a deal breaker for me in a build system.

                  2. 1

                    CMake, whose main selling point is that it’s not as bad as autoconf. It’s still a string-based macro processor pretending to be a programming language though.

                    It’s kind of amazing how wretched a programming language someone can create, when they don’t realize ahead of time that they’re creating a programming language. “It’s just a {configuration file / build system / Personal Home Page templater}” … and then a few revisions later it’s metastasized into a Turing-complete Frankenstein. Sigh. CMake would be so much better if it were, say, a Python package instead of a language.

                    I recall Ierusalimschy saying that Lua was created in part to counter this, with a syntax simple enough to use for a static config file, but able to contain logic using a syntax that was well thought out in advance.

                  3. 5

                    You can’t add dependencies on the fly in Make, unfortunately.

                    The usual way to handle this is to write the Makefile to -include $DEPFILES or something similar, and generate all of the dependency make fragments (stored in DEPFILES, of course) with the -MMD/-MF flags on the initial compile.

                    1. 2

                      You can definitely do this, here’s an excerpt from one of my makefiles:

                      build/%.o: src/%.c | $(BUILDDIR)
                      	$(CC) -o "$@" -c "$<" $(CFLAGS) -MMD -MP -MF $(@:%.o=%.d)
                      
                      -include $(OBJFILES:%.o=%.d)
                      

                      Not the most elegant solution, but it definitely works! Just need to ensure you output to the right file; I wouldn’t call it particularly complicated, it’s a two-line change.

                      1. 3

                        You didn’t mention the reason this truly works, which is that if there is a rule for a file the Makefile includes, Make is clever enough to check the dependencies for that rule and rebuild the file as needed before including it! That means your dynamically generated dependencies are always up to date – you don’t have a two-step process of running Make to update the generated dependencies and then re-running it to build the project, you can just run it to build the project and Make will perform both steps if both are needed.

                  1. 1

                    A couple of clarifying questions:

                    1. You state that if you haven’t received an ack within X milliseconds, you mark the current message as sent and proceed. If you don’t care about retries, why not remove the requirement to listen for acks in the first place?
                    2. How important is event ordering to you? For most event architectures, it’s worth it to quash that requirement due to increased complexity.
                    3. What’s worse: a user not receiving a message, or a user receiving more than one copy of a message?
                    1. 2
                      1. I get acks 85%-90% of the time. So I would like to optimise it so that delivery is ordered for the maximum number of users, and let it go out of order for a few. Also, by adding this X amount of delay, the message is usually delivered to the user in order. The messages go out of order when I send them instantly.

                      2. The current system is unordered and works really well (scale, maintainability). However, a lot of messages are sent out of order, so ordering is very important. My naive solution is to add a delay of X ms after every message, which should cover most cases. However, that would simply slow everything down, and I don’t want to do that.

                      3. A user not receiving a message is worse. But I would try not to send multiple times either.

                      1. 4

                        Have you considered enabling PubSub ordering, with the ordering key being the user/room? Some of the tradeoffs are that you will be limited in your throughput (1MB/s) per ordering key, and will be vulnerable to hot sharding issues.

                        After enabling ordering, if the ordering issue still impacts a greater fraction of users than you would like, then the problem is most likely on the upstream side (Insta/WhatsApp). AFAIK there is no ordering guarantee for those services, even if you wait for acks.

                        My advice: if the current solution is working great without ordering, I would strongly suggest sticking with it.

                        1. 2

                          Once I enable ordering on the queue, it becomes difficult to distribute the work within multiple workers, right?

                          if the current solution is working great without ordering, I would strongly suggest sticking with it.

                          I so wish to do this, but seems I can’t :(

                          1. 3

                            Has someone actually quantified how impactful out of order messages are to the business? This is the kind of bug that a less-technical manager or PM can prioritize highly without doing due diligence.

                            Another suggestion is to make a design, and be sure to highlight whatever infrastructure costs are changing (increasing most likely), as well as calling out the risks of increasing the complexity of the system. You have the agency to advocate for what you think is right. If they decide to proceed with the design then go ahead and get the experience and you’ll find out if the warnings were warranted over time.

                            Quantifying the impact is a good exercise for you to do anyway, since if you build the system you can then put an estimate of the value you created on your resume.

                            1. 2

                              Correct; you will only be able to have one worker per ordering key, or you lose your ordering guarantee.

                          2. 2

                            If you want to avoid duplicates and lost messages, the only solution is to use idempotent APIs to send messages. If you do not receive an ack within some time, resend the message idempotently; lost sends/acks are repeated and the API provider filters the duplicates. Only proceed to sending message N+1 once you eventually succeed sending message N.

                            If your API provider does not provide an idempotent API, then you could try to query their API for your last sent message and compare it with the one you plan to send. But this is slow and, since it’s not atomic / transactional, is very racy.
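
                            A minimal sketch of the idempotent resend loop from the first paragraph, in C, with send_message() and its idempotency key standing in for whatever the provider actually offers (both hypothetical):

                            #include <stdbool.h>

                            /* Hypothetical provider call: the provider deduplicates by key,
                               so resending with the same key cannot produce a duplicate.
                               Returns true once the ack arrives. */
                            bool send_message(const char *idempotency_key, const char *body);

                            /* Block on message N until it is acked, then move on to N+1. */
                            bool send_in_order(const char *key, const char *body, int max_retries)
                            {
                                for (int attempt = 0; attempt < max_retries; attempt++) {
                                    if (send_message(key, body))
                                        return true;  /* acked: safe to send message N+1 */
                                }
                                return false;         /* gave up; do NOT move on to N+1 */
                            }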

                        1. 2

                          I usually find most of these already in the lint stage. Except instead of magic strings we just use TODO as most tools already look for this and prevent merge. Likewise, no-console or similar are mostly automatically there when you start new projects and shouldn’t be an issue.

                          Comments in the build file are somewhat problematic. Sometimes I want a comment there, so how do you distinguish the needed ones from the forgotten ones?

                          Code Review is important for this stuff.

                          1. 2

                            Indeed. The usefulness of this technique I think depends an awful lot on which language/tooling you’re using. Some languages have amazingly powerful linters that can catch most or all of these types of things. Others, not so much.

                            1. 2

                              One quite workable approach is to have a linter that uncomments one comment block at a time, and then tries to parse the AST. If it parses, it’s likely code.

                              1. 1

                                I’d suggest a slight modification: instead of looking for all comments except those starting with #-#, only look for comments that start with # -, since those are more likely to be commented-out commands. I guess you’d still get false positive from lists, though.

                              1. 15

                                The author presents code equivalent to char* p = malloc(8); char* q = p + 4; *q; and argues that it’s not useful to null-check q — because if malloc() returns NULL, then q would be 0x4, not 0x0 (thus passing the null check).

                                However, this is begging the question — q is only able to have an invalid but non-null value after neglecting to null-check p in the first place.
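
                                  To make that concrete, a minimal sketch: check the pointer where it is produced, and the derived pointer can never carry a bogus near-null value in the first place:

                                  #include <stdlib.h>

                                  char *make_buffer(void)
                                  {
                                      char *p = malloc(8);
                                      if (p == NULL)    /* check the allocation itself */
                                          return NULL;
                                      char *q = p + 4;  /* q can no longer be 0x4 here */
                                      *q = '\0';
                                      return p;
                                  }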

                                I don’t find the article convincing. Just because memset() needs to be usable with any mapped address (which might include 0x0) doesn’t mean that we shouldn’t check for NULL ever.

                                1. 10

                                    While his example of passing 0x4 might be bogus, the overall message isn’t. He’s not saying “never check for NULL” (because his sample function dup_to_upper() does check for NULL when it calls malloc()); rather, the immediate reflex of checking the input pointer for NULL won’t really help when there are plenty of invalid addresses that aren’t NULL that could be passed in.

                                  The point he made was better made in the book Writing Solid Code. That book changed how I code C, and now, instead of:

                                  char *foo(char *s)
                                  {
                                    if (s == NULL)
                                      return NULL;
                                    ...
                                  }
                                  

                                    I now write:

                                  char *foo(char *s)
                                  {
                                    assert(s != NULL);
                                    ...
                                  }
                                  

                                  In my mind, unless I have a very good reason, passing in NULL to a function is a bug. Checking for NULL and returning an error is quite possibly hiding a real bug. Why are you passing NULL? How did that NULL get there in the first place? Are you not checking for NULL elsewhere?

                                  1. 2

                                      The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors; null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful; saving as much as possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

                                    How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

                                    I don’t see how this is good guidance for anything but the simplest of C programs.

                                    edit: I forgot about NDEBUG.

                                    But that in effect makes the check useless for anything but (isolated) testing builds.

                                    Checking for NULL and returning an error is quite possibly hiding a real bug.

                                    I really don’t see how. Reporting an error and trying to compensate, either by soft-crashing after saving state, or using other / blacklisting data-sources, is not ‘hiding’ a bug. It’s the bare minimum any program should do to adapt to real world cases and scenarios.

                                    1. 4

                                        The use of assert() here goes against sanitary program behaviour. Returning an error (while printing out useful information) is preferable (for example in Koios I used an enum for errors; null is represented as KERR_NULL) because the program can then act on that and save state, even if some of it might be non-useful; saving as much as possible is just generally good behaviour. It also allows you to do more with that. Maybe you want to recheck assumptions and reload data, or some other kind of error-avoidance/compensation.

                                        Assertions are intended for catching bugs, not recoverable errors. A failed assertion means something unexpected and unrecoverable has occurred, and it isn’t safe to continue program execution.

                                      How would you like it, as a user, if Firefox or Libreoffice, or other programs in which people tend to commonly work, just up and failed for (to the user) no observable reason?

                                      Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                                      Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code.

                                      1. Rule: The assertion density of the code should average to a minimum of two assertions per function. Assertions are used to check for anomalous conditions that should never happen in real-life executions.
                                      1. 1

                                          A failed assertion means something unexpected and unrecoverable has occurred, and it isn’t safe to continue program execution.

                                        A null pointer error doesn’t automagically invalidate all of the state that the user put into the program expecting to get it out again.

                                        Many large systems including Firefox, the Linux kernel, LLVM, and others use a combination of assertions and error recovery.

                                        Of course.

                                        Assertion usage is one of the rules in NASA’s The Power of Ten – Rules for Developing Safety Critical Code. […]

                                        Right, but that’s not the usage of the above null check, which advocates for never saving state on null, ever.

                                      2. 2

                                          If the documentation says “this function requires a valid pointer”, why would I bother checking for NULL and returning an error? It’s a bug if NULL is passed in. The assert() is there just to make sure. I just checked, and when I pass NULL to memset() it crashed. I’m not going to blame memset() for this, as the semantics of the function don’t make sense if NULL is passed in.

                                      3. 2

                                          Out of interest, why assert() instead of using the ptr and allowing the SEGV signal handler and/or core to give you similar info?

                                        1. 9

                                          Some reasons off the top of my head:

                                          • To catch the null pointer as soon as possible. Otherwise, it may be propagated further and cause a segfault much farther away, which is harder to debug. Much more so if a null/corrupt pointer is stored in some data structure instead of just passed to callees, since the stack trace is no longer useful in that case.
                                          • To document the expectations of the function to future developers.
                                            • You can redefine ASSERT() in debug builds to fail unit tests if they are running, which is more dev-friendly than crashing your test system (see the sketch below).
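
                                            A minimal sketch of that last point, assuming a hypothetical test_fail() hook provided by the unit-test harness:

                                            #ifdef UNIT_TESTS
                                            /* Hypothetical harness hook: records a failure instead of aborting. */
                                            void test_fail(const char *expr, const char *file, int line);
                                            #define ASSERT(x) \
                                                ((x) ? (void)0 : test_fail(#x, __FILE__, __LINE__))
                                            #else
                                            #include <assert.h>
                                            #define ASSERT(x) assert(x)
                                            #endif
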
                                          1. 1

                                              Besides future developers, they’re also useful for documenting expectations for static analysis tools. The Clang static analyzer, for example, takes them into account.

                                            1. 1

                                                The “find the NULL as soon as possible” point makes the most sense to me. I guess I was thinking that using it (straight away) would provide this, but I agree we may do non-dangerous things (like storing it somewhere) before we deref it.

                                              Thank you.

                                            2. 1

                                              Some systems are configured to allow mapping memory at NULL, which would open up a potential NULL ptr deref vulnerability wherein arbitrary data was stuffed at 0x0.

                                            3. 2

                                              I love using assert. It’s simple and concise. In a project I wrote to integrate with Pushover, I use assertions at the beginning of any exported function that takes pointers as arguments.

                                              Sample code:

                                              EXPORTED_SYM
                                              bool
                                              pushover_set_uri(pushover_ctx_t *ctx, const char *uri)
                                              {
                                              
                                              	assert(ctx != NULL);
                                              	assert(uri != NULL);
                                              	assert(ctx->psh_uri == NULL);
                                              
                                              	ctx->psh_uri = strdup(uri);
                                              	return (ctx->psh_uri != NULL);
                                              }
                                              

                                                Also: I almost always use calloc over malloc, so that I know the allocation is in a known state. This also helps prevent infoleaks for structs, including compiler-introduced padding between fields. Using calloc does incur a small perf hit, but I prefer defensive coding techniques over performance.
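
                                                A minimal sketch of that pattern (hypothetical struct; the point is that the padding bytes get zeroed too):

                                                #include <stdlib.h>

                                                struct record {        /* hypothetical example struct */
                                                    int   id;          /* compiler may pad after this field */
                                                    char *name;
                                                };

                                                struct record *record_new(void)
                                                {
                                                    /* calloc zeroes the whole allocation, padding included,
                                                       so copying the struct out later cannot leak stale
                                                       heap data */
                                                    return calloc(1, sizeof(struct record));
                                                }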

                                              1. 1

                                                  The one issue with using calloc() is that a NULL pointer does not have to be all-zero on a given system (per the C standard; POSIX, however, requires a NULL pointer to be all zeros). Yes, in source code, a literal 0 in a pointer context is NULL, but internally, it’s converted to whatever the system deems a NULL address.

                                                  I’ve used a custom malloc() that would fill memory with a particular value carefully selected per architecture. For the x86, it would fill the memory with 0xCC. As a pointer, it will probably crash. As a signed integer, it’s a sizable negative number; as an unsigned integer, it’s a large one. And if executed, it’s the INT3 instruction, aka breakpoint. For the Motorola 68000 series, 0xA1 is a good choice for all the same reasons, plus it can cause a misaligned read access for 16 or 32 bit quantities if used as an address. I forgot what value I used for MIPS, but it was again chosen for the same reasons.
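
                                                  Presumably something along these lines (a sketch, using the x86 fill value described above):

                                                  #include <stdlib.h>
                                                  #include <string.h>

                                                  /* Fill fresh allocations with a poison byte so that use of
                                                     uninitialized memory fails loudly: 0xCC is INT3 on x86,
                                                     and a pointer made of 0xCC bytes will almost surely crash. */
                                                  void *debug_malloc(size_t n)
                                                  {
                                                      void *p = malloc(n);
                                                      if (p != NULL)
                                                          memset(p, 0xCC, n);
                                                      return p;
                                                  }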

                                            4. 6

                                              I just want to salute you for using “begging the question” to describe circular reasoning. I don’t often see that original meaning in the wild.

                                              1. 2

                                                Assuming NULL is 0x0 is wrong. Of course. The article’s entire argument falls down right there.

                                                1. 2

                                                  On what architectures is NULL not 0x0? I can’t seem to think of any off-hand.

                                                  1. 3

                                                        Multics has it be -1, AFAIK. It’s probably also not 0x0 on anything with non-integer pointers, like x86 segmentation.

                                                    1. 1

                                                      Here’s a good page on exactly that question: http://c-faq.com/null/machexamp.html

                                                1. 2

                                                      I committed code that broke the build today because I defined something twice. The worst thing is that I proof-read it. It built, but I didn’t want to run all tests because it was only a six-line diff and I didn’t want to wait for over an hour. It also passed review.

                                                      Let’s say I’m having trouble focusing while working from home in this heat.

                                                  1. 1

                                                        This highlights how friction can be an issue — if your tests take a full hour, that must be very trying to one’s patience. Is this something your team could theoretically speed up by throwing more hardware at it?

                                                    1. 1

                                                      I’ve thought of this as well. The simple answer is no – the tests are already run in parallel. The more detailed answer is no – but we could at least run the tests on a different machine than the development machine so that I could check out another branch and do some actual work while the tests are running.

                                                          In this particular instance, I could and should have run a specific testsuite which only takes about a minute and serves as a quick smoke test.

                                                      1. 1

                                                        Got it.

                                                        The context for my question was based on a mistake I made in the past. Basically I was too cheap to pay for sufficient parallelization on our continuous integration server (CircleCI), despite my head engineer saying the tests could be run 5x+ faster. Considering the cost of an engineer’s time vs. a few extra docker container instances, it was a false economy.

                                                        1. 1

                                                          With that many tests, ideally during development you run only the unit tests for the files/modules you have edited, and possibly their transitive dependencies. Bonus points if your testing tool watches for file changes and then takes a snapshot of changed files and runs tests on that, instead of on your development directory, allowing you to continue making changes.

                                                          Another possibility is to push your complete commits to side branches, and defer merging to your main branch to a bot that rebases them, tests them, and then merges them (assuming Git). Something like bors (combined with git-test), if you’re using Github.

                                                          1. 1

                                                            Unfortunately, there are no unit tests, and not everything is nicely modular. Every test is an integration test. The test suite consists of hundreds of thousands of integration tests.

                                                            This sounds horrible, but there are practical reasons that this is the way things work (and they are not all ‘no time to rewrite everything’).

                                                            Another possibility is to push your complete commits to side branches, and defer merging to your main branch to a bot that rebases them, tests them, and then merges them (assuming Git). Something like bors (combined with git-test), if you’re using Github.

                                                                We’re not, but that is a terrific idea. I just wonder how well it plays with the way things are currently set up.

                                                    1. 1

                                                      Perhaps Gitlab might be something for you? Free for open source, and it might be feature complete enough for you:

                                                      https://about.gitlab.com/solutions/open-source/projects/

                                                      1. 1

                                                          I did take a look at Gitlab, but their offerings seem to be closer to development tools; what our project is really short on is hosting for deployed servers. AFAICT Gitlab doesn’t seem to offer anything in that space, unfortunately.

                                                      1. 3

                                                        In my (limited[0]) experience your best bet is to find a company that actually uses your open source project and ask them if they’d be open to sponsoring something.

                                                        I know that this is usually a lot easier for bigger and older projects (until they are too big, then it’s sometimes not so easy to say “it’s nice that you want to sponsor a server, but we have certain requirements, so either you comply or it would be wasted”).

                                                          I have no idea about your project and your computing needs, but if worse comes to worst, a few cloud VMs could be all that’s needed. In one project we have a single VM as a demo server (it’s a web-based software) and while the original author does host the rest of the infra on a personal server, I’m pretty sure it could be handled by another 5 EUR VM, just for downloads, website, forum. (So maybe think hard about whether you really need several servers; if forum+wiki+bugtracker only get low traffic, that makes funding a lot easier.)

                                                        Again it depends on the case, but if one of your employers is open source friendly, it also doesn’t hurt asking if they’d be willing to sponsor a server (in exchange for a “sponsored by” banner).

                                                          Depending on the laws in your main contributors’ countries of residence, maybe it’s possible to just use Github Sponsors, or Ko-Fi or Patreon, so that one person can “collect”?

                                                          Also there are several associations that act as a proxy for collecting donations for open source projects; https://techcultivation.org/#overview comes to mind, as does the “Wau-Holland-Stiftung”, but I don’t know which projects they would support, I guess it depends on size again.

                                                        [0]: exhibit a - the software I mention in the text, no big problem here, just the one demo server is nice, for security reasons I wouldn’t want to run several copies of the same software in all versions with public admin access on the infra that hosts the downloads

                                                        exhibit b - I contributed to PHP in the past and it’s a massive project, afaik finding sponsors for hosting and hardware has never been a problem (for 20 years), but it’s more this “if you want to be a mirror, you need to fulfill these criteria”

                                                          exhibit c… - most other projects I contributed to were fine with what SourceForge/Github/etc. offer for open source projects, more and more complemented by cheap cloud VMs where the maintainer(s) simply invest a little money. I know this might be speaking from privilege a bit, but for most software developers 1 EUR per month for the domain + 10 EUR for hosting is doable.

                                                        1. 1

                                                          Thank you for the thorough response!

                                                          Unfortunately corporate sponsorship is not really an option, and we definitely don’t have the clout that PHP has/had (closer to the opposite in fact..). Our computing needs are rather lightweight, apart from needing a fair amount of disk storage; we could pile it onto one VM, and there’s certainly stuff that would fit in the ‘always free’ tiers of various cloud providers. So that is worth exploring.

                                                            Managing ownership myself is a possibility, and prices would be affordable, but single-owner servers are what led to our current situation. I’m also reluctant to move to a model where one person is in charge of the entire cash flow unless the process is reasonably transparent. On that front, techcultivation.org looks worth checking out, thanks for the link.

                                                        1. 7

                                                          I’m currently a Python dev (apparently this is the most recent turn my career has taken), and I’m really bummed out by its web story outside of Django.

                                                          My last gig was Elixir, before that Node, and some Rails and Laravel in there. The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                                          It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs and multiple runtimes on my system and all of that. Like, what happened? Everybody groused about 2-to-3 (which is still hilarious) but like even without that I feel like the ecosystem has been vastly outstripped by “worse” technologies (see also, NodeJS).

                                                          1. 4

                                                            It singlehandedly sold me on Docker just so I didn’t have to mess with virtualenvs

                                                            One thing that made virtualenvs almost entirely painless for me was using direnv: in all my python project directories I have a bash script named .envrc that contains source .venv/bin/activate, and now cd-ing in/out of that directory will enter/exit the virtualenv automatically and instantaneously. It’s probably possible to set it up to switch pyenv environments as well.

                                                            1. 3

                                                              One of the reasons why Python packaging still feels so clunky compared to other ecosystems is that the Python ecosystem is a lot more diverse thanks to e.g. the scientific stack that has very different needs than the web peeps so there’s never gonna be an all-encompassing solution like Cargo. Pipenv tried and failed, poetry is carving a niche for itself.

                                                                But the primitives are improving. pip is currently growing a proper resolver, and doesn’t e.g. Ruby still need a compiler to install binary packages? As long as you don’t use Alpine for your Docker images, Python’s wheels are great (they’re just a bit painful to build).

                                                              1. 1

                                                                How did pipenv fail?

                                                                1. 4

                                                                    Short answer: it’s too complex, which makes it buggy, and there wasn’t a release in over a year. IOW: it’s collapsing under its own weight.

                                                                  Long answer: https://hynek.me/articles/python-app-deps-2018/

                                                              2. 3

                                                                The tooling in the Python ecosystem, especially around migrations and dependency management, just feels clunky.

                                                                  Currently working on a Rails app, coming from the Flask ecosystem. You have no idea how much I miss SQLAlchemy and Alembic.

                                                                I agree about dependency management, but certainly not about migrations. Modifying models and auto-generating migrations works much better than the other way around for me.

                                                              1. 3

                                                                For the past few weeks now I’ve been looking for a code review tool that could be used together with gitolite, and was surprised to find that all of the available options are non-ideal or unusable for me in one way or another. I’m reluctantly turning towards making my own. My requirements are somewhat minimal, but I’d also like to tackle some of the design flaws I think github’s PR review system suffers from, by e.g. storing all versions of a PR, making it easy to compare PR versions, being able to save review progress, etc.

                                                                My current implementation is a flask app with a webhook endpoint that takes a repository name; when called, the named repository is fetched, and branches with names matching a given regex are treated as WIP branches. There are some views for diff rendering as well. It’s all refreshingly minimal.

                                                                If I manage to get code review working nicely, I’d also like to have test, lint, and code coverage results displayed in-line, per-commit. But that’s a ways off yet.

                                                                1. 6

                                                                I’m surprised nobody has mentioned Leo. It’s a tree-based text/code editor. The tree is very flexible, and leaf nodes contain code/text/whatnot. You can even reference leaves from more than one place.

                                                                  It’s not code-specific. That is, you can use it for any text-based programming language (though there will be a step to export linear-text files), or you can use it for documentation, or todo lists, or whatever. It’s very flexible.

                                                                  I used it a long time back to do some literate programming. It’s evolved quite a bit since then.

                                                                  1. 2

                                                                    Leo is maybe more an outliner than a tree editor – the nuance being that a tree editor should have a notion of schema, while an outliner can be more freeform. They’re definitely two branches of the same family tree, but I feel there are unique challenges when interactively editing structured data that needs to (eventually) conform to a given schema.

                                                                    1. 1

                                                                  That looks quite neat; reminds me of a similar terminal program called hnb, although hnb seems more minimal and more targeted towards personal information management. Leo seems much more powerful and worth exploring, thanks for bringing it to my attention!

                                                                    1. 2

                                                                      The same applies for Java as well. Java’s reference types don’t get passed by reference. Instead, the variables are effectively pointers, and the pointers get passed by value. It’s a subtle distinction but an important one when you’re talking about programming language design.
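
                                                                    The same distinction is easy to see in C, where the pointer is explicit (a minimal sketch): writes through the copied pointer are visible to the caller, but rebinding it is not.

                                                                    #include <stdio.h>

                                                                    void callee(int *p)
                                                                    {
                                                                        *p = 42;     /* visible: the pointee changed */
                                                                        int other = 7;
                                                                        p = &other;  /* invisible: only the local copy
                                                                                        of the pointer is rebound */
                                                                    }

                                                                    int main(void)
                                                                    {
                                                                        int x = 0;
                                                                        int *q = &x;
                                                                        callee(q);
                                                                        printf("%d\n", x);  /* prints 42; q still
                                                                                               points at x */
                                                                        return 0;
                                                                    }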

                                                                      1. 2

                                                                        Barbara Liskov (for the CLU language) termed this call by sharing, where the callee gets references to the arguments but can’t change what they refer to. A nice thing is that you can understand primitives (that might be passed by value as an optimisation) in the same way, as long as you lack reference equality. Since they’re immutable you can’t change them and discover that they aren’t actually shared!

                                                                        Nowadays most people just understand it as ‘references passed by value’, which is less fun. ☺️

                                                                        1. 2

                                                                        The term call by sharing still reveals a fundamentally mechanistic conception of argument passing. A better way to think about it is to make a distinction between values, which exist in the semantics of a programming language (a timeless mathematical object, so the question of whether a value is mutable doesn’t even make sense), and representations, which exist in computer memory (which obviously exists in time and might mutate). Now things become clearer: there is always a single number 42 and a single string “Hello, world!”, but they may be represented arbitrarily many times in computer memory.

                                                                      1. 4

                                                                        I’ve been hacking together a small and mostly LDoc-compatible documentation generator for distribution with luakit, so that up-to-date docs can be installed with the main program. Shortcoming: type inference isn’t as nice. Benefit: it’s very easy to tweak the documentation generation and include project-specific information, e.g. auto-generate nice-looking keybinding descriptions.

                                                                        1. 6

                                                                          The idea of stealthy ad blocking is nice, but I think it will be limited in practice. It will always be possible to draw opaque rectangles over ads, but anything further, such as reclaiming the space used by ads, is fairly easy to detect with JS. Since stealthy ad blocking merely obscures the ad, it also does not save bandwidth, prevent user tracking, save CPU usage, or protect against malware; all commonly cited reasons for blocking ads.

                                                                          1. 8

                                                                        A thing to consider: most websites do not actually require Javascript, they only claim to. Most of the sites that do require Javascript do it with a negative amount of benefit to the user. Also, a large fraction look much better if you kill all their CSS, but that’s another sad story.

                                                                        I guess what happens is that an empty-profile, container-separated (and not root inside the container, of course) browser instance will get spun up just long enough to load the page, scroll through it and load all the scroll-loaded content, then pass the now-static content to a sanitizer which can remove everything that looks like an ad.

                                                                        Tracker handling is a complicated story if websites start checking the responses for more than presence and freshness; until then, you can just load them with randomized parameters and a random referrer.

                                                                            On the other hand, a single large enough click-spoofing false-flag-operation conflict can redraw the web landscape faster than throwaway containers come to browsing…

                                                                          1. 2

                                                                            This is great; I really liked Luakit when I used it previously before finding out about the massive security problems. I’m glad to see it updated to the Webkit 2 API.

                                                                            However, I think the documentation here glosses over the fact that there are still serious practical security problems with this library even after switching to the new API: https://blogs.gnome.org/mcatanzaro/2017/02/08/an-update-on-webkit-security-updates/

                                                                            1. 2

                                                                              Thanks for the feedback! I’ve added a short summary of that link at the top of the download section.

                                                                              1. 1

                                                                                Glad to see it. I would love to try out LuaKit again once Debian fixes their Webkit packages.