1. 15
  1. 8

    One must also make sure that feature flags are ephemeral. At the time of writing, our application has 310 different configuration options, many of them flags that enable or disable certain features. This makes sense because not every customer wants the same thing.

    I would have used a more time-limited system for flags if we had introduced them for A/B testing or similar. Every configuration option that has ever existed must be kept around for legacy reasons.

    1. 10

      This was a hard-learned lesson for me. I have “temporary” feature flags that have been in production for nearly a decade now. Any feature flag system I’d be integrating today needs some kind of expiry date and notification process.
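
      A minimal sketch of what an expiry date plus a notification check could look like. The `FeatureFlag` class, the flag names, the owners, and the dates are all hypothetical, and actually delivering the reminder (email, chat, ticket) is left out:

      ```python
      from dataclasses import dataclass
      from datetime import date


      @dataclass(frozen=True)
      class FeatureFlag:
          name: str
          owner: str     # who gets reminded once the flag is overdue
          expires: date  # date by which the flag should be gone from the code


      def expired_flags(flags, today):
          """Return the flags whose expiry date has passed, oldest first."""
          overdue = [f for f in flags if f.expires < today]
          return sorted(overdue, key=lambda f: f.expires)


      def notifications(flags, today):
          """Build one reminder message per expired flag."""
          return [
              f"{f.owner}: flag '{f.name}' expired {(today - f.expires).days} days ago"
              for f in expired_flags(flags, today)
          ]


      # Hypothetical flags, for illustration only.
      FLAGS = [
          FeatureFlag("new_checkout", "alice", date(2024, 1, 31)),
          FeatureFlag("beta_search", "bob", date(2024, 6, 30)),
      ]

      for message in notifications(FLAGS, today=date(2024, 3, 1)):
          print(message)
      ```

      Running a check like this in CI or on a schedule is one way to keep a "temporary" flag from quietly becoming permanent.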

      1. 2

        Yeah, I think you have to have a process in place to integrate feature-flagged stuff into your product after a while so you don’t have to deal with it a decade later. That, of course, can be done if you have a SaaS or single-install-source solution. If you have a situation like Enpo above, with different, customized installations for each client, you are pretty much toast.

      2. 2

        That is … mind-boggling. How on earth do you even attempt to test any number of the setup options?

        How many of those flags are single-deploy / single-user ones written more or less on demand for a certain client and hence only used by one deploy? Was doing it as a fork / patch / (some other way using version control) ever considered? What is it like to work with day to day?

        Sorry, I have so many questions – it is just such an extreme case that I am curious how it actually works day to day – is it a pain most days or just something you don’t think about?

      3. 6

        I try to avoid feature flags unless deployment and release absolutely must be separated. Each feature flag doubles the number of available code paths. Ten feature flags yield 1,024 possible combinations. How sure are you that each feature flag is truly independent and that there is no undesirable behavior that can emerge from unanticipated combinations of feature flags? I am rarely sure.
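
        To make the doubling concrete, here is a small sketch that enumerates every on/off assignment for a set of independent boolean flags. The flag names are made up, and it assumes every flag really is a plain two-state switch:

        ```python
        from itertools import product


        def flag_combinations(flag_names):
            """Yield every on/off assignment for the given flags as a dict."""
            for states in product([False, True], repeat=len(flag_names)):
                yield dict(zip(flag_names, states))


        # Hypothetical flag names, for illustration only.
        flags = [f"flag_{i}" for i in range(10)]
        combos = list(flag_combinations(flags))
        print(len(combos))  # 2 ** 10 == 1024
        ```

        Each dict could feed one run of a test suite, which is exactly why exhaustive coverage stops being practical after a handful of flags.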

        Sometimes there are good reasons to deploy code with features disabled. More often, in my experience, the reasons are poor and the decision is made by non-technical business functions (marketing, sales) because they do not understand the cost in complexity of littering code with these kinds of branches.

        1. 3

          Not having the flag means that you are coupling “put out this new feature” with “roll out these fixes”, so if there are any issues with the new feature, rolling it back also means rolling back the fixes.

          If you’re on a decent sized team, with several people working on independent features, making sure that “merge into master” doesn’t immediately couple the fate of all these features is very valuable.

          Granted, in our case we use feature flags almost exclusively for “features in QA/testing before release”, and at most two feature flags can affect code execution in any given block of code. If you are writing feature flags that interact with each other in 1,024 different ways, I think you don’t have a feature flag problem, you have a coupling problem.

        2. 6

          My experience is that you want as few of these as possible. They can attract tech debt, especially with large products.

          It’s worth thinking about how you test and maintain your product when using Feature Flags. If you are just using these to roll out code then I don’t see much problem with them because it’s a fairly local engineering concern. Using feature flags to control complete features or A/B tests is more problematic, because you can end up using tools that delegate flags to PMs who may roll a flag out to 100% without the flag ever being removed from the code.

          Further, you have to add up all your user toggles plus feature flags to understand all of the variations of your product. There is a tendency to want to say that you don’t want to add toggles for end users, but then pressures from sales, marketing, etc. force you down a path of adding Feature Flags. A properly designed in-product toggle may have had the lowest cost of ownership, since it is then taken seriously as a variation you support rather than something that can be somewhat hidden and missed by later engineers. Plus, it can be a lever to create collateral like public documentation that helps people understand why a feature exists or works the way it does. This doesn’t happen when someone puts a hack behind a feature flag that later gets enabled for all of your top high-touch customers.

          1. 2

            > Further, you have to add up all your user toggles plus feature flags to understand all of the variations of your product.

            It’s worse than addition. I think it’ll be 2^n variations, assuming you have n toggles and each one has two states, on or off.

          2. 4

            This is like asking for technical debt

            1. 3

              Part of me is like, “this is very 2007”

              Another part of me is like, “the number of people who still don’t do this is quite disappointing”

              but I guess that goes for most things that engineering companies get away with :/

              1. 3

                I have yet to grasp the benefit of feature flags over feature branches. If you want people to test ‘pre-release’ code, build and deploy a separate instance off your feature branch.

                The likelihood that you want to target specific users with half-done functionality in production seems remarkably rare to me.

                1. 2

                  Judging by how much success teams like the Google Chrome team have with this approach, it’s obviously got big upsides.

                  I have a slight worry. What do you do about this scenario: flagged out feature deployed, its lightly tested code has a serious bug (data destruction or leaking, i.e. one where the consequences persist even if you turn the flag off after), it gets switched on by accident or malice in production, havoc ensues?

                  1. 9

                    This is similar to the incident that destroyed Knight Capital Group. A feature flag was re-purposed (!) to toggle a feature unrelated to the original feature for which the flag had been added. A deploy went wrong such that some servers were running the old code with the old flag semantics and some the new code with the new flag semantics. They lost $440 million in 45 minutes.
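
                    A toy sketch of that failure mode, not a reconstruction of the actual Knight Capital code: the flag name `special_mode`, the handlers, and the server names are all invented. The point is only that one flag value means two different things once old and new code run side by side:

                    ```python
                    def handle_order_v1(order, flags):
                        """Old code: the flag enables the original feature."""
                        if flags.get("special_mode"):
                            return f"old-feature({order})"
                        return f"normal({order})"


                    def handle_order_v2(order, flags):
                        """New code: the SAME flag name now enables an unrelated feature."""
                        if flags.get("special_mode"):
                            return f"new-feature({order})"
                        return f"normal({order})"


                    # Hypothetical partial deploy: one server was missed and still runs v1.
                    fleet = {
                        "server-1": handle_order_v2,
                        "server-2": handle_order_v2,
                        "server-3": handle_order_v1,
                    }
                    flags = {"special_mode": True}

                    results = {name: handler("order-42", flags) for name, handler in fleet.items()}
                    for name, result in results.items():
                        print(name, result)
                    ```

                    Giving the new feature a fresh flag name would at least have left the stale server on its normal path instead of reviving the old behaviour.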

                    1. 1

                      Ooh thanks. I was sure there was a recent-ish example in the news but I couldn’t remember.

                      1. 1

                        And this is why some FAANG companies use Immutable Infrastructure internally.

                        Code and configuration are built and packed together and go through all integration and pre-production CI steps without changes.
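
                        Roughly, the idea can be sketched like this. It is only an illustration under my own assumptions, not how any particular company does it; `Artifact`, `build_artifact`, and the version and flag names are hypothetical:

                        ```python
                        import hashlib
                        import json
                        from dataclasses import dataclass
                        from types import MappingProxyType


                        @dataclass(frozen=True)
                        class Artifact:
                            code_version: str
                            config: MappingProxyType  # read-only view of the baked-in configuration
                            digest: str               # identity of code and config taken together


                        def build_artifact(code_version, config):
                            """Bake a code version and its configuration into one immutable unit."""
                            payload = json.dumps({"code": code_version, "config": config}, sort_keys=True)
                            digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
                            return Artifact(code_version, MappingProxyType(dict(config)), digest)


                        a = build_artifact("1.4.0", {"new_feature": False})
                        b = build_artifact("1.4.0", {"new_feature": True})
                        print(a.digest != b.digest)  # flipping a flag yields a different artifact
                        ```

                        Because the digest covers the configuration too, changing a flag means building and promoting a new artifact through the same pipeline rather than flipping a switch on a running server.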

                    2. 1

                      Recently I stumbled across https://martinfowler.com/articles/feature-toggles.html, which might be worth checking out if you want a more in-depth look at feature flag implementations.