1. 18
  1. 15

    I’m not an electrical engineer, and my connection with CPU design is limited and a long way from the actual behavior of physical devices, but here is my interpretation. Because I had difficulty understanding everything, I also had difficulty simply reading the paper, so I have probably missed things.

    First off, the actual paper is here: https://www.researchgate.net/publication/355683752_Nanometer-Scale_Ge-Based_Adaptable_Transistors_Providing_Programmable_Negative_Differential_Resistance_Enabling_Multivalued_Logic/link/61799f01eef53e51e1f556de/download

    The supplementary info: https://pubs.acs.org/doi/suppl/10.1021/acsnano.1c06801/suppl_file/nn1c06801_si_001.pdf

    They’re using a germanium bridge/layer/gate controller thing. A property of germanium means they can change the type of the transistor. This allows them to convert gates between NAND and NOR, and then, through mechanisms I don’t understand, to change behavior based on need.
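
    To make the NAND↔NOR conversion concrete, here’s a toy Boolean sketch (purely illustrative — the function name and the `program` signal are my invention, not the paper’s terminology): the same cell computes either gate depending on a configuration bit, which is roughly what flipping the transistor type buys you at the logic level.

    ```python
    def adaptable_gate(a: int, b: int, program: int) -> int:
        """Toy reconfigurable cell: program=0 -> NAND, program=1 -> NOR.

        All values are 0/1. This models only the truth tables, not any
        device physics.
        """
        if program == 0:
            return 1 - (a & b)  # NAND: low only when both inputs are high
        return 1 - (a | b)      # NOR: high only when both inputs are low
    ```

    Since NAND and NOR are each functionally complete, a fabric of such cells could in principle be repurposed wholesale — which is, as far as I can tell, where the claimed transistor savings are supposed to come from.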

    I can’t determine how practical this is, as I don’t understand the paper well enough, and I can’t find the actual mechanism by which they’re proposing the transistor reduction.

    Because they’re using germanium, the host CPU suffers the standard germanium problem (as experienced by Cray), in which the device fails at relatively low temperatures (~80°C). While they have a temp<->perf chart, I don’t see any reference to how fragile the device is (e.g. does an overheat permanently destroy the transistor? It seems like something that would).

    I couldn’t work out the performance of the circuits compared to standard transistor approaches, nor how quickly or cheaply a circuit can transition between configurations.

    In the absence of info regarding what class of CPU they’re comparing against, it’s also difficult to determine how realistic the transistor gains are. A huge proportion of the transistors in a high-end CPU are in memory and caches (take an M1 Max: https://images.anandtech.com/doci/17019/M1MAX.jpg - the SLC is memory, the regular blocks in the various CPUs are caches, etc), and the examples they give in the paper seem to involve transitioning memory or arithmetic circuits. A reduction in the arithmetic transistor count does not seem like it would be hugely valuable; the SRAM blocks are generally super performance-sensitive, and given their volume, additional logic controlling their behaviour seems unlikely to be valuable either (converting an SRAM block to<->from an ALU, in the extreme, wouldn’t be useful, as CPUs already have difficulty saturating their existing ALUs).
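
    A quick back-of-the-envelope calculation shows why the caches dominate. This assumes the standard 6-transistor SRAM cell; the 48 MB figure is an illustrative round number for total on-die SRAM, not an exact M1 Max spec:

    ```python
    SRAM_TRANSISTORS_PER_BIT = 6  # classic 6T SRAM cell

    def sram_transistors(megabytes: float) -> int:
        """Transistor count for a given amount of 6T SRAM."""
        bits = megabytes * 1024 * 1024 * 8
        return int(bits * SRAM_TRANSISTORS_PER_BIT)

    # A hypothetical 48 MB of on-die SRAM (SLC + L2 + L1s combined):
    print(f"{sram_transistors(48) / 1e9:.1f} billion transistors")  # ~2.4 billion
    ```

    Compare that with an integer ALU, which needs on the order of tens of thousands of transistors: even halving every arithmetic circuit would barely move the total.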

    At the low end they’re already using very few transistors, to the extent that the bulk are interface and control logic.

    Overall I think the article and reviewers needed stronger evidence to establish the claimed CPU transistor count reduction.

    1. 10

      germanium problem (as experienced by cray) in which the device will fail at relatively low temperatures (~80C)

      Minor nit: Cray’s problem was with gallium (in gallium arsenide), not germanium.

      1. 2

        Oh derrrr :D

      2. 10

        This is an application of a technology that the paper’s authors have previously explored. I think they have some more details about how they change the conduction mode in this paper: https://www.researchgate.net/publication/311852864_The_RFET_-_A_reconfigurable_nanowire_transistor_and_its_application_to_novel_electronic_circuits_and_systems . The full text isn’t available on ResearchGate but a preprint is available on the, erm, you know, that other big place where they have a lot of papers.

        I am (allegedly :-P) an EE but it’s been so long since I last did substantial engineering that my brain pulled a muscle trying to read that paper. That being said, I’ve been seeing papers claiming that nanowire transistors could squeeze a few more years out of Moore’s law since about 2010 or so. As far as I know (but with the caveat above) they’re still confined to labs.

        The conduction process is a whole other story compared to field-effect transistors (this paper: https://www.sciencedirect.com/science/article/abs/pii/S0038110111002139 has more details). This is an even bigger problem than the Ge part. While the way these things are supposed to work is well understood (by people other than you and me :P), there’s still the matter of getting useful design tools that would allow people to design and simulate circuits that use these devices. I’ve been involved in a project that sought to do exactly that, albeit for a whole other kind of device, and I think it took about five years (of admittedly modestly-funded research, by industrial standards) to get from a sound theoretical understanding to something you could plug into a VLSI design tool. I mean, something that someone else could use as a starting point to develop something reliable and useful; plugging my shit proof-of-concept code into anything would’ve probably been hopeless.

        I’m also going to go out on a limb here and guess that, at best, we only have some simulations and vague guesses about high-frequency operation for really tightly-packed devices of this kind. Most nanowire-based designs I remember seeing use heavy doping, which poses significant problems for Si devices. IIRC Ge devices wouldn’t suffer as much, but modern CPUs don’t exactly work at 25 MHz, and they aren’t notoriously chilly, either. There’s some literature about high-frequency nanowire transistors, but I don’t think I’ve seen anything recent about typical challenges for highly-integrated designs (e.g. leakage current). I think this: https://aip.scitation.org/doi/abs/10.1063/1.4932172 is the most recent thing I’ve read, and the results are actually encouraging, but that’s a micrometer-scale device.

        Then there’s the entirely non-trivial matter of reliably fabricating highly-integrated arrays of RFETs, at scale, and packaging them – presumably – along with a big hunk of CMOS logic. This is obviously a problem with any new design – I’m not trying to pull the smartass “but how are you gonna do this in practice nerd boy?” card here. Semiconductor manufacturing is already bogged down by a lot of problems. Coming up with a way to fabricate these things reliably is presumably no harder than with any novel design, but then you also have to make all that fit into an existing industrial process of mind-boggling complexity.

        IMHO, Tom’s Hardware is at least a decade away, if not more, from reviewing the first device that includes a non-trivial application of nanowire transistors, and I’m pretty sure that non-trivial application is going to be closer to a low-pass filter than a CPU. I’m not saying that’s good or bad, I haven’t been following the field lately. I’m a little skeptical but not because of anything I’ve read in this paper – academia has a tradition of beating dead horses until the last citation has been extracted out of the poor animal’s carcass, and nanotechnologies departments over here in Europe seem to have developed a dead horse fetish in the last fifteen years or so.

        1. 3

          Assuming that this could be made into mass production and the temperature / frequency bits addressed, it’s still not clear exactly how useable it would be. Modern processor design includes a lot of reuse. Each layer in the tool stack depends on some fairly standard components in the lower levels. When a fab introduces a new process, they provide an implementation of a cell library, containing a load of building blocks that the back ends of the higher-level tools can use. These let you move designs from one process to another fairly easily (at least, in comparison to creating a new processor from scratch).
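
          The cell-library reuse described above can be sketched roughly like this (all names are invented for illustration): the synthesized netlist references abstract cells, and each process supplies its own implementations, so retargeting is a lookup rather than a redesign.

          ```python
          # A synthesized netlist references abstract cells by name.
          NETLIST = [("u1", "NAND2"), ("u2", "NOR2"), ("u3", "INV")]

          # Hypothetical per-process cell libraries: cell name -> layout id.
          LIB_7NM = {"NAND2": "nand2_7n", "NOR2": "nor2_7n", "INV": "inv_7n"}
          LIB_5NM = {"NAND2": "nand2_5n", "NOR2": "nor2_5n", "INV": "inv_5n"}

          def retarget(netlist, lib):
              """Map every instance onto the target process's implementation."""
              return [(inst, lib[cell]) for inst, cell in netlist]

          print(retarget(NETLIST, LIB_5NM))
          ```

          An adaptable RFET cell has no entry in any existing library, so the mapping step has nothing to target — that’s the reuse problem in miniature.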

          My intuition is that this technique wouldn’t make it easy to create more efficient implementations of existing cells so much as enable new, more flexible cells. It’s not clear how these would then be exposed higher up the stack. If a new process technology requires a redesign of the high-level bits and prevents reuse of components across microarchitectures, then it’s going to be very hard to adopt.