You might consider putting some basic benchmark comparisons in the README.
Thanks for the suggestion! We added some :)
I’ve never quite understood the reason to prefer this sort of deployment pattern for features rather than using environment variables or command-line arguments to toggle feature flags. I know that it is later-bound, which is preferable in general, but the extra moving parts look like additional maintenance costs to me.
It starts making sense (to me, at least) once you have enough hosts. I’ve worked on systems running on around a thousand hosts, and being able to toggle flags without doing an infra rollout is nice.
At Google, flags were flipped using command-line arguments to containers, and special flag-flipping deployments could be done to only flip flags but not update or relocate containers. It did not impact velocity, because in practice, flag flips were batched and given their own deployment cadence.
I worry that a fancy feature gate becomes an anxiety-relief tool where product owners can endlessly toggle features on and off in order to fine-tune the user experience. But feature gates and flag flips are not meant for that; they are meant to allow code to roll out incrementally without incrementally enabling new features, improving stability. Googlers were encouraged to deprecate and remove old flags as part of service maintenance. This cuts down on the overall number of code paths, improving code quality.
I’ve seen this failure mode play out a number of times… the number of feature gates grows out of control and it becomes impossible to predict how the system is going to behave. The product ends up relying on gates not as a release control tool but as some kind of central database that it cannot work without, and then bad things happen.
Just like with any tool, there are trade-offs, and teams can shoot themselves in the foot if they lack discipline.
The core idea is to separate the deployment of code from the release of features. Environment variables are effectively immutable once a process has started, so updating them requires restarting processes. When there are a few hundred instances of a program running, restarting them all can sometimes take hours. This means that if something goes wrong, you need as much time to roll back to a stable state as you spent rolling out the latest version, and meanwhile your customers are suffering from the software defect.
Decoupling deployment from release using feature gates helps scale engineering teams and reduce the risk associated with shipping software updates. Feature gate changes propagate in under 10 seconds, and a rollback takes just as little time if something goes wrong.
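As a rough sketch of the difference (the `GateStore` class, the `NEW_CHECKOUT` variable, and the `new-checkout` gate name are all hypothetical, not taken from any particular library), a flag read from the environment is fixed at process start, while a gate held in a store can change while the process keeps running:

```python
import os
import threading


class GateStore:
    """Hypothetical in-memory gate store; a real one would receive pushed updates."""

    def __init__(self):
        self._lock = threading.Lock()
        self._gates = {}

    def update(self, name, enabled):
        # Stands in for a change propagated from a central service.
        with self._lock:
            self._gates[name] = enabled

    def is_enabled(self, name):
        # Unknown gates default to closed, so new code paths stay dark.
        with self._lock:
            return self._gates.get(name, False)


# Environment variable: read once at startup, so changing it means a restart.
NEW_CHECKOUT_FROM_ENV = os.environ.get("NEW_CHECKOUT", "0") == "1"

# Gate: evaluated on every call, so a flip (or a rollback) needs no restart.
store = GateStore()


def checkout_version():
    return "new" if store.is_enabled("new-checkout") else "old"
```

Calling `store.update("new-checkout", True)` switches `checkout_version()` to the new path, and calling it again with `False` is the rollback. How the update reaches each host is left out here; that is where the extra moving parts live.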
But the major advantage of feature gates is the ability to enable updates for specific customers or groups of customers. A typical release driven by feature gates will go through a dozen updates, enabling customer tiers from the least risky to the most, which usually means from free to highest-paying accounts. This can be expressed with immutable configuration as well, but it would require a restart of the fleet each time a new customer tier is enabled.
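To make the tier-by-tier idea concrete, here is a minimal sketch. The tier names, the `Gate` class, and its methods are assumptions made up for illustration, not any real product's API:

```python
from dataclasses import dataclass, field

# Hypothetical tiers, ordered from least to most risky to enable.
TIERS = ["free", "pro", "enterprise"]


@dataclass
class Gate:
    name: str
    enabled_tiers: set = field(default_factory=set)

    def is_enabled(self, customer_tier):
        # A request sees the feature only if its customer's tier is open.
        return customer_tier in self.enabled_tiers

    def enable_next_tier(self):
        # Open the gate for the next tier in rollout order, if any remain.
        for tier in TIERS:
            if tier not in self.enabled_tiers:
                self.enabled_tiers.add(tier)
                return tier
        return None

    def rollback(self):
        # Close the gate for everyone in one step.
        self.enabled_tiers.clear()
```

Each call to `enable_next_tier()` corresponds to one of the release updates described above, and `rollback()` closes the gate for every tier at once without touching the deployed code.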
This is not the only way to solve these problems, but it is one that is often useful.