1. 39

  2. 4

    I think one important feature missing from most secret-handling solutions is the ability to limit access to secrets to specific users/groups. In NixOps, users need to be in the keys group to access secrets, but once they are, they can access all of them. The solution presented in the blog post seems to have the same problem.

    So in my solution for secret management (part of my experimental deployment system), I require each secret to specify an owning user or group, which the code then installs into a path only accessible to that user/group. The file structure looks like this:

    /var/lib/nixus-secrets/active  root:root    0755   # Directory containing all active persisted secrets and the data needed to support them
      + included-secrets           root:root    0440   # A file containing line-delimited JSON values describing all present secrets
      + per-user                   root:root    0755   # A directory containing all secrets owned by users
      | |
      | + <user>                   <user>:root  0500   # A directory containing all secrets owned by <user>
      |   |                                            # Permissions are as restrictive as possible; some programs like ssh require this
      |   |
      |   + <name>                 <user>:root  0400   # A file containing the secret <name>
      + per-group                  root:root    0755   # A directory containing all secrets owned by groups
        |
        + <group>                  root:<group> 0050   # A directory containing all secrets owned by <group>
          |
          + <name>                 root:<group> 0040   # A file containing the secret <name>
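
    To illustrate, declaring an owned secret in such a scheme might look like this (the option names here are hypothetical, made up for illustration, not the actual interface):

    ```nix
    {
      # Hypothetical options — a secret readable only by one user …
      deployment.secrets.db-password = {
        file = ./secrets/db-password;
        user = "postgres";  # lands in per-user/postgres/db-password, mode 0400
      };
      # … and one shared via a group
      deployment.secrets.tls-key = {
        file = ./secrets/tls.key;
        group = "nginx";    # lands in per-group/nginx/tls-key, mode 0040
      };
    }
    ```
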
    1. 4

      How so? With something like this I am at least 98% sure that only the relevant user (xesite in this case) can read the secrets:

          within.secrets.xesite = {
            source = ./secrets/xesite.env;
            dest = "/srv/within/xesite/.env";
            owner = "xesite";
            group = "within";
            permissions = "0400";
          };
      1. 3

        Ah! I had a misunderstanding there regarding the encryption. While the secret is encrypted using the host’s public key, it has to be decrypted with the private key, which is only accessible to root, not to every user. Sorry for that, taking back what I said above!

        1. 2

          Argh, I feel a bit stupid now. NixOps supports that as well. The only point I could make is that the filesystem layout I showed doesn’t expose which user has which secrets, but I’m not sure how much that’s actually worth :P

          1. 1

            I’d be willing to argue that your system is a more conventional thing than mine is. Conventions do help, but they can also quickly grow into bloated behemoths. I wanted to just put files where I expect them to be so I can remove a bunch of hacks around loading config.

      2. 1

        Interesting approach! The one thing that slightly bothers me is that a side channel that lets you somehow retrieve the host key from sshd would mean that all secrets in the store suddenly become readable.

        I guess it would be easy enough, however, to generate a unique key per host that isn’t read by any persistent process.

        1. 1

          Yeah, the only reason I chose ssh host keys was that it was easy to prototype with. In a production setting I’d probably use some kind of persistent host key.

          1. 1

            Nice approach! I like the exploration via systemd, as I learned something new there :)

            I have a question about storing secrets: in the mkSecretOnDisk function, wouldn’t this:

            age -a -r "${key}" -o $out ${source}

            store the secret in plaintext on the builder? If you only build on the local machine, this might not be an issue for you, I would assume.
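
            For context, here is a rough sketch of the kind of derivation being discussed (my guess at the shape, assuming a runCommand wrapper around age; the real definition in the blog post may differ):

            ```nix
            # Sketch only — illustrates why the plaintext is visible on the
            # builder: `source` is copied into the Nix store before age
            # encrypts it inside the build sandbox.
            mkSecretOnDisk = name: { source, key }:
              pkgs.runCommand "${name}-secret" { } ''
                ${pkgs.age}/bin/age -a -r "${key}" -o $out ${source}
              '';
            ```
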

            1. 1

              I built a small internal tool at $work to automate our NixOS deployments that integrates with Terraform. I still want to push to open-source it, but simply haven’t had time to work on it lately.

              The setup we have is that secrets are defined in the Nix language but are not part of any derivation. (Like NixOps, I believe.) We then make them available in /run/secrets during activation only. Regular activation scripts are used to install them wherever needed.

              (A minor obstacle is that we always need to check for existence, because activation runs are not always initiated by our tool.)

              1. 2

                We (not the royal we, but my team) also built something that helps with deployments and integrates with Terraform, but we haven’t had time to open-source it. It’s mostly based on https://github.com/tweag/terraform-nixos, but we had to rework the whole process of sending secrets to hosts (Terraform is not great at copying files, as it leaves them lying around). For us, the secrets are defined in Terraform and stored in the Terraform state, but during deployment they are stored on the remote machine, outside the Nix store. The downside is that we have to redeploy on every secret change (which is simple) and ensure the services reload (which is not so simple).

                1. 1

                  Our setup is different from the Tweag approach in that we have a server/agent setup. A Terraform provider uploads a flake to the server which builds it, instead of building locally on the machine running Terraform. We pass secrets and other variables from Terraform to Nix by injecting a vars.json into the flake as we upload. Once the build completes, an agent (running on the target machine) downloads and activates the configuration.

                  Service reload is still an issue, yes, because the Nix activation doesn’t notice any change if only the secrets were updated. I still have to tackle it, but I was thinking of adding a hash of (a subset of) the secrets to, for example, the systemd unit as a comment, just so that the file (and derivation) changes and Nix understands it needs to restart the service.
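
                  For what it’s worth, NixOS already has a restartTriggers option that embeds arbitrary strings into the generated unit for exactly this purpose. A sketch, assuming the secret file is readable at evaluation time (which it may not be in this setup):

                  ```nix
                  {
                    systemd.services.my-service = {
                      # The hash ends up as an X-Restart-Triggers line in the
                      # unit file, so changing the secret changes the unit and
                      # NixOS restarts the service on activation.
                      restartTriggers =
                        [ (builtins.hashFile "sha256" ./secrets/my-service.env) ];
                    };
                  }
                  ```
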

                  1. 2

                    We keep a secret per file/service and then this helps a bit:

                    systemd.paths = {
                      # we rely on this to detect changes to keys and
                      # automatically trigger the restart of the service
                      hydra-server-watcher = {
                        wantedBy = [ "multi-user.target" ];
                        pathConfig = {
                          PathChanged = [ "/var/keys/admin_password" ];
                        };
                      };
                    };

                    systemd.services = {
                      hydra-server-watcher = {
                        description = "Restart hydra-server on credentials change";
                        wantedBy = [ "multi-user.target" ];
                        after = [ "network.target" ];
                        serviceConfig = {
                          Type = "oneshot";
                          ExecStart = "${pkgs.systemd}/bin/systemctl restart hydra-server.service";
                        };
                      };
                    };

                    I forgot why we couldn’t rely on systemd.paths alone. This works reasonably well; however, we sometimes run into nginx not reloading with the latest Let’s Encrypt certificates (they are also pushed from the deployer via Terraform, as our machines don’t have internet access, so we can’t use the HTTP challenge).