1. 39
    1. 10

      I have to plug my article about this marvellous little piece of hardware: Fixing the TPM: Hardware Security Modules Done Right.

      To me, that’s how big the TKey is: its main idea (deriving an application-specific secret) obsoleted the TPM. Before this I could kind of forgive the TPM’s complexity, but now any justification for this pile of bloat is gone: we have a better way.

      1. 4

        FYI, we’re aiming to ship the first CHERIoT chips in 2024. I think it would be a much better platform for your ideas because you can properly compartmentalise access to keys and so on (we’re most likely using the GF 22nm process, so will have non-volatile storage on the die, which will let you build requirements about persistent key storage into your code signing rules). If you’ve got an Arty A7, you can play with our prototyping platform now, but I hope we can get you one of the chips once they’re packaged.

        1. 2

          I think it would be a much better platform for your ideas because you can properly compartmentalise access to keys and so on

          Yes, I remember the long thread where you eventually sold me on compartmentalisation, which the TKey as such cannot do. I may be able to implement that on the unlocked version, but (i) the tiny FPGA it runs on is already packed full, and (ii) I have yet to write a single line of Verilog. Not to mention the other safety features of CHERIoT, so, yeah, colour me enthused.

          If you’ve got an Arty A7, you can play with our prototyping platform now,

          I don’t, though maybe I’ll purchase one if (once?) we have a decent free software toolchain for it. But first, I need to write some hello-world blinking LED on my TKey unlocked and learn how to FPGA.

          but I hope we can get you one of the chips once they’re packaged.

          That would be beyond awesome.

          1. 3

            I don’t, though maybe I’ll purchase one if (once?) we have a decent free software toolchain for it.

            We have a F/OSS toolchain for the software bits (which are the only bits shared between the prototyping platform and the final version).

            It looks as if openFPGALoader supports the board, which is a really useful discovery because I’d been wondering how we’d distribute updated bit files to partners (installing Vivado is a huge amount of pain and suffering, since it requires waiting for a human to verify your export compliance status and does not give helpful error messages that this is the reason for the problem).

            I am currently using Vivado in a Docker container with Rosetta. This is fine for building, since that can run from the command line, but programming the FPGA requires using X11 to display the (awful Java) GUI and running a little program on the Mac that exposes the USB programming interface via their remote cable protocol. This is a lot of string and duct tape. Being able to just run openFPGALoader on my Mac will be a huge improvement.

            F4PGA supports the board, but I don’t know how much integration work is needed to make our prototyping platform build with it. It looks like it should support our existing constraints files. I’d love to make that a supported flow.

            1. 1

              How’s the performance of Vivado in Rosetta?

              1. 4

                Hard to compare timing exactly (we’re not yet forcing a fixed seed, so the timing is pretty variable across runs), but it seems to take me about as long on my laptop as it takes Kunyan on the (x86-64) build server that he’s using to build his bitfiles. Producing a 20 MHz bitfile for the CHERIoT Ibex took me <10 minutes. The 33 MHz one takes about 45 (we’re pretty close to the edge for timing at 33 MHz), but took 15 the first time I ran it. It’s single threaded for almost the entire run, which is annoying (11 cores on my laptop sitting idle, even with max threads set to 12, and the wall clock time is more than half the CPU time).

                After @Loup-Vaillant’s comment, I played a bit with the open source FPGA tools. A lot of the design is in Tcl files, and I couldn’t work out how to translate them into something that F4PGA could understand (it seems to assume that all of your build is either Verilog or constraints; possibly the Tcl is setting things that could be expressed in the constraints file somehow?). Loading with openFPGALoader was much faster than using the Vivado GUI, though, so I can now throw that away and just build and load from the command line.

          2. 1

            Oooh, CHERIoT chips are coming next year? Any idea about pricing, either for just the chips or for devboards? I would really like to get my hands on a real CHERI system.

            1. 3

              Pricing isn’t finalised, we’re working on the exact feature set (driven by customer demands, if you know anyone who might want to buy a lot of them then let me know!). I’m aiming to get close to $1 for v2 in bulk, but v1 will be more expensive. Much cheaper than the FPGA dev boards though. We’re aiming to sell both bare chips and M.2 MicroModules, and probably use an existing dev board that can house the M.2 (there are a bunch of nice off-the-shelf ones).

              We’re using the Arty A7 as a prototyping platform. It currently runs the CHERIoT Ibex at 33 MHz and has a working Ethernet interface (I’ll be open sourcing the compartmentalised network stack in January; on my desk it connects to my home network and happily works with IPv4 and v6, but currently has almost everything shoved into one big compartment). The ASIC should be 200-300 MHz, somewhat dependent on the power envelope. The A7 is only about $300, which is fairly cheap for a dev board, but more than an order of magnitude more than an ASIC for final deployment.

              1. 1

                Are M.2 MicroModules the same as the SparkFun MicroMod system? Are there other compatible suppliers?

                I am vaguely interested in higher-density connections for MCU dev boards than the usual 0.1 in pitch pads/pins, especially if there are existing ecosystems I can use. (Tho right now I am more interested in FPC ribbon cables than direct board-to-board connections.)

                1. 1

                  Yup. I think that’s the system the hardware folks have been looking at (I stop at digital logic. Anything that involves physics is someone else’s problem).

                2. 1

                  driven by customer demands, if you know anyone who might want to buy a lot of them then let me know!

                  Alas, I only know hobbyists and small scale makers who might want to buy tens of chips on average. I know a few people who work at large companies that could conceivably ship large volumes, but that’s probably a bit too indirect :p

                  I’m aiming to get close to $1 for v2 in bulk

                  That’s very reasonable. Once V2 is available I’ll pester some of the local electronics distributors to stock a few reels. (ordering small quantities from them tends to be much cheaper than small quantities from international distributors, IME)

                  We’re aiming to sell both bare chips and M.2 MicroModules, and probably use an existing dev board that can house the M.2 (there are a bunch of nice off-the-shelf ones).

                  I hadn’t heard of this before. Is it the same as what SparkFun calls MicroMod? (that was what I found when googling, anyway). Do you have any specific recs for a nice one?

                  I wonder if anyone might want to produce boards in the RPi Pico form factor, which I’ve found quite convenient.

                  I’ll be open sourcing the compartmentalised network stack in January, on my desk it connects to my home network and happily works with IPv4 and v6 but currently has almost everything shoved into one big compartment

                  I’m looking forward to seeing it!

                  The A7 is only about $300, which is fairly cheap for a dev board, but more than an order of magnitude more than an ASIC for final deployment.

                  That’s not so bad, I think I’ll get one of those if I find a job anytime soon.

                  so will have non-volatile storage on the die

                  How much, if you don’t mind me asking? Is it enough to reasonably store firmware, or smaller and suitable just for application data?

                  Edit: final question. How good is it at generating entropy on-chip?

                  1. 2

                    Once V2 is available I’ll pester some of the local electronics distributors to stock a few reels.

                    Note, v2 will not exist unless we sell enough of v1, though hopefully most of those can go to military and critical infrastructure providers, who are willing to pay a (modest) premium for security features that they can’t get anywhere else. My goal has always been to approach no-security microcontrollers in price though.

                    How much, if you don’t mind me asking? Is it enough to reasonably store firmware, or smaller and suitable just for application data?

                    Still finalising that a bit. It looks as if we have quite a bit of area to play with because we’re pad-limited (we need area along the edge of the chip to solder wires to, the smallest chip we can make that has space for all of the external connections we need leaves loads of space in the middle for logic). I really hope we can get enough NVRAM for A/B firmware with execute in place, since that eliminates the need for most secure boot complexity (you validate signatures writing to the B firmware and grant write access to it and the boot toggle only to the compartment that will do that), which also gives us more crypto agility since we can move to quantum-safe signature algorithms when we need to.

                    How good is it at generating entropy on-chip?

                    There will be an on-chip entropy source, which should be adequate for crypto operations (not sure what its sample rate will be yet).

                    1. 1

                      Note, v2 will not exist unless we sell enough of v1,

                      Good luck!

                      I really hope we can get enough NVRAM for A/B firmware with execute in place, since that eliminates the need for most secure boot complexity

                      That’d be really great.

                      There will be an on-chip entropy source, which should be adequate for crypto operations (not sure what its sample rate will be yet).

                      Even a not so good sample rate should be enough to seed a CSPRNG at boot, and depending on the threat model and how hard it is to read NVRAM externally, perhaps to save a seed until the next boot?
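
                      To make the "seed at boot, save a seed for next boot" idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: `fresh_entropy` is a hypothetical stand-in for the on-chip source (it just calls `os.urandom`), and using keyed BLAKE2s with two personalisation strings is only one of several reasonable ways to do the mixing.

                      ```python
                      import hashlib
                      import os

                      def fresh_entropy(n_bytes: int = 32) -> bytes:
                          """Hypothetical stand-in for the on-chip entropy source."""
                          return os.urandom(n_bytes)

                      def boot_reseed(saved_seed: bytes, fresh: bytes) -> tuple[bytes, bytes]:
                          """Mix the seed stored at the previous boot with fresh entropy.

                          Returns (csprng_key, next_saved_seed). The saved seed (at most
                          32 bytes) is used as the BLAKE2s key, and the two outputs use
                          different personalisation strings so they are distinct values.
                          """
                          csprng_key = hashlib.blake2s(fresh, key=saved_seed, person=b"run").digest()
                          next_seed = hashlib.blake2s(fresh, key=saved_seed, person=b"save").digest()
                          return csprng_key, next_seed

                      # Example boot: the all-zero seed is a placeholder for what NVRAM held.
                      csprng_key, next_seed = boot_reseed(bytes(32), fresh_entropy())
                      ```

                      In a design like this you would presumably write `next_seed` back to NVRAM before handing out any output derived from `csprng_key`, so that an unexpected reset never reuses the same stored seed.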

                      1. 2

                        Even a not so good sample rate should be enough to seed a CSPRNG at boot

                        Yup, that’s my expectation. It gets a little bit interesting with multiple compartments having to trust the random number generator but that’s no different from multiple processes on a conventional OS trusting /dev/random (actually, better, since they know exactly the code in the CSPRNG compartment and know nothing else in the OS can tamper with its internal state).

          3. 3

            I think their website is missing a tl;dr description of the hardware, so let me try it: the main component is a modified PicoRV32-based SoC running on an iCE40 FPGA. That FPGA is interfaced to USB via a CH552 micro-controller (cheap 8051 that natively supports USB). They seem to have a custom hardware RNG.

            1. 7

              The phrase “custom hardware rng” doesn’t fill me with joy

              1. 3

                You can read more details about their TRNG design here: https://github.com/tillitis/tillitis-key1/tree/main/hw/application_fpga/core/trng. tl;dr: Many free-running ring oscillators being sampled in a smart way.

                Betrusted (FPGA-based secure device) took a different path: they both have an avalanche noise generator and an in-FPGA TRNG similar to the one found in Tillitis (see https://www.bunniestudios.com/blog/?p=6097).

                I’m not sure what should be considered acceptable in that domain.

                1. 2

                  I would regard that as a hardware entropy source, rather than a hardware random number generator. It looks great as an input into Fortuna (or Yarrow if you enjoy doing difficult maths), not as a replacement.

                  1. 1

                    I confess I was not convinced by their exact technique: if there’s a bias, even very slight, in the RNG, it is liable to affect every single bit the same way. So instead of using it directly I would rather accumulate somewhere between 256 and 512 bits from that source, then hash it with BLAKE2s to obtain 256 bits I’ll be pretty sure will be close enough to uniformly random.
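
                    Roughly what I have in mind, as a sketch: `read_raw_entropy` is a hypothetical placeholder for however you pull raw samples off the device (here it just calls `os.urandom` so the snippet runs), and the 512-bit default is simply the upper end of the range I mentioned.

                    ```python
                    import hashlib
                    import os

                    def read_raw_entropy(n_bytes: int) -> bytes:
                        """Hypothetical stand-in for sampling the hardware entropy source."""
                        return os.urandom(n_bytes)

                    def conditioned_seed(raw_bits: int = 512) -> bytes:
                        """Collect raw_bits of raw samples and compress them to 256 bits."""
                        if raw_bits < 256 or raw_bits % 8:
                            raise ValueError("need a whole number of bytes, at least 256 bits")
                        raw = read_raw_entropy(raw_bits // 8)
                        # Hashing spreads whatever entropy is in the input over the
                        # whole 256-bit output, so a small per-bit bias in the raw
                        # samples does not show up directly in the result.
                        return hashlib.blake2s(raw, digest_size=32).digest()

                    seed = conditioned_seed()
                    ```

                    This only helps if the raw input really contains enough entropy in total; hashing cannot add any that was not there.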

                    1. 1

                      That seems to be what they recommend, anyway.

                      I spent a little time trying to find a guaranteed good enough procedure for sampling the RP2040 randombit, to feed into Gimli, but I put it on the back burner a while back. I had really hoped that the RPi engineers would actually characterize it, but instead they just merged a really crappy way of using it for low quality random numbers into their SDK.

                      1. 1

                        That seems to be what they recommend, anyway.

                        Oh, I didn’t know, my bad.

            2. 2

              U2F/FIDO [unchecked]

              Does this mean it’s a goal & could be done? Neat device regardless.

              1. 4

                Having purchased a TKey and played with it, I can confirm it can be done. The main problem here is the communication between the web browser (or any user agent) and the TKey: the TKey itself doesn’t follow any specific standard (it’s a simple serial interface you can use to load programs and transmit data), and it certainly does not behave like a Yubikey, so we probably need to write a dedicated browser plugin to make it work.

                1. 3

                  Seems like a cool project now that Android Fx finally opened up to add-ons (again). Yubico’s products were closed source & unmodifiable, so this is a step in the right direction IMO.

              2. 2

                There is no way of storing a device application (or any other data) on the TKey. A device app has to be loaded onto the TKey every time you plug it in.

                I’m confused as to what this provides. If the host is compromised so is any program loaded.

                1. 14

                  The main feature seems to be that all cryptographic keys the loaded program uses are derived at boot time from a hash of the program and a secret device-unique identifier. That way if the program loaded onto the device is modified, a different key will be derived and any returned signatures would no longer match.

                  I could see this being really useful for a single sign-on project I’m working on where a central auth service manages OAuth with the identity provider and services can request short-lived access tokens to act on behalf of the user. With this, the auth provider service would only need to focus on managing sessions while verifying OAuth and issuing access tokens could all be done on the device, removing the risk of the auth server being compromised and issuing bad tokens.
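
                  A minimal sketch of the derivation described at the top of this comment, to show why a modified program ends up with a different key. The construction here (keyed BLAKE2s over a digest of the program) and the names `derive_app_secret` and `device_secret` are my own assumptions for illustration, not necessarily what the TKey firmware does.

                  ```python
                  import hashlib

                  def derive_app_secret(device_secret: bytes, program: bytes) -> bytes:
                      """Derive a 32-byte secret bound to both the device and the loaded program."""
                      program_digest = hashlib.blake2s(program).digest()
                      # Keyed BLAKE2s: the device secret (at most 32 bytes) is the key,
                      # the program digest is the message.
                      return hashlib.blake2s(program_digest, key=device_secret).digest()

                  # Placeholder value; a real device would keep this in hardware.
                  device_secret = bytes(range(32))

                  original = b"signer app v1"
                  patched = b"signer app v1 (modified)"

                  secret_a = derive_app_secret(device_secret, original)
                  secret_b = derive_app_secret(device_secret, patched)
                  ```

                  Any change to the program bytes changes `program_digest`, so `secret_b` differs from `secret_a`, and any key pair generated from it would differ too.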

                  1. 2

                    Updates are a bit of a problem though. If there’s a vulnerability in the program, you need to fix it and then provide an update, but the updated version will have a different boot hash and so will generate different keys. If you’re just using it for identification, that’s probably fine: you log in with the insecure version, load the new code, and replace the device’s old public key with its new one on the server. At update points, you’re now vulnerable to attacker-in-the-browser problems though.

                2. 1

                  https://tillitis.se/products/tkey/ has a bit more information on what it can do currently

                  It’s kinda easy to think of applications, kinda hard to make them foolproof however.

                  I expect there will be an import key/export wrapped key/import wrapped key set of functions for a lot of applications, so one doesn’t lock themselves into one specific version of an application (with the original key going into offline storage in a safe in another safe in a bunker, etc etc).

                  One example application uses Ed25519, which is very nice.

                  I’d love one, but no idea what I’d use it for yet.