1. 2

    Today I finally made the jump and gave my 8-year-old but still fantastic laptop (a Toshiba A300-1M9; it has more stuff on it than modern laptops!) a long overdue upgrade. Nothing as fancy as what’s in the article, sadly; 256GB was all I could afford. I received the SSD, unpacked it, and was already flummoxed that this little thing, weighing almost nothing, can pack so much data and work at such high speed.

    To satisfy my curiosity I copied a 10GB ccache volume from the old disk to a zram block device on my desktop, and from there to the SSD. It took exactly 30 seconds, which is just, I have to say the word, amazing.
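
    That works out to roughly a third of a gigabyte per second - a quick back-of-the-envelope check (treating “10GB” as decimal gigabytes; binary gigabytes would push it a bit higher):

```python
# Back-of-the-envelope: 10 GB copied in 30 s, with "10GB" read as decimal.
size_bytes = 10 * 10**9
seconds = 30
rate_mb_s = size_bytes / seconds / 10**6
print(round(rate_mb_s))  # ≈ 333 MB/s
```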

    On one hand it’s money well spent. But on the other hand, the price for these things - mine cost €80 - seems artificially high. Why oh why do they have to be so expensive? And now this, 15TB, on a 2.5" device with even higher transfer speeds, too! Holy mackerel!

    Sadly, Samsung said nothing about the price in the article. But it must be unaffordable. Or will developments like these drop the prices for smaller sizes such as mine?

    Whatever the case may be, 15TB on 2.5" seems incredible. Where does it end, i.e. how much is the theoretical maximum for a 2.5" drive?

    1. 3

      The price is high for a reason - there is massive demand. As popular as SSDs for laptops etc. are, something like 96% of the NAND produced is consumed by cellphones.

      SSD prices are coming down steadily too - every year there is improvement in the fabrication process (50nm -> 30nm -> 21nm, etc.), which means better density per dollar, although we’re hitting the limits of that now. There are also technology improvements along the way, like Samsung’s V-NAND (3D-stacked NAND), which is what makes this 15TB drive possible. The same kind of flash (48-layer V-NAND) is used in this 2TB USB3 SSD.

      I’d expect the price for that 15TB drive to be somewhere between US$0.50 and US$1/GB (so, $8K -> $15K, at a guess), although a SAS SSD like this one usually commands a premium over SATA SSDs, so the upper end of that range is more likely.
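
      Spelled out, with the usual assumption that a “15TB” drive is actually 15.36TB:

```python
# Price range implied by the US$0.50-$1/GB guess above.
capacity_gb = 15.36 * 1000          # "15TB" drives are typically 15.36 TB
low_total = capacity_gb * 0.50      # ≈ $7,680
high_total = capacity_gb * 1.00     # ≈ $15,360
print(low_total, high_total)
```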

      On the flip side, in terms of IOPS, you could replace an entire disk shelf of the fastest spinning drives with one of these and still have spare capacity. For some parts of the industry, this is exactly what they’ve been waiting for: an extremely high-capacity drive that will sustain 3 complete drive writes per day.
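
      To put that endurance figure in perspective, here is what 3 drive writes per day means as a sustained write rate (assuming the full 15.36 TB is writable):

```python
# What "3 complete drive writes per day" implies as continuous bandwidth,
# assuming a 15.36 TB drive written around the clock.
capacity_bytes = 15.36 * 10**12
writes_per_day = 3
sustained_mb_s = capacity_bytes * writes_per_day / 86_400 / 10**6
print(round(sustained_mb_s))  # ≈ 533 MB/s, i.e. near-saturated SATA III, nonstop
```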

      1. 1

        “Whatever the case may be, 15TB on 2.5" seems incredible. Where does it end, i.e. how much is the theoretical maximum for a 2.5" drive?”

        I don’t know, but I’m guessing the tradeoff with capacity is how reliably you can get the data out. Filesystems like ZFS and Btrfs might be required with these larger-capacity drives simply because the drives are more likely to corrupt data (I don’t know that they are, but it seems likely, at least for now).

      1. 5

        “Furthermore, those are probably the maximum ‘turbo’ frequencies being quoted, which are unlikely to be sustained under any kind of real multi-core load.”

        A simple Google search could have supplied him with the answer to this, which is extra weird since he did link to the Intel ARK page for the i5 chip. Spoiler: the Atom does not have a “turbo” mode.

        The Atom also isn’t a terrible option, as it does have some benefits. It supports up to 64GB of ECC RAM and up to PCIe x16, which could be handy if you need to slap in a 10G Chelsio card or something. I think the Atom also has a larger L2 cache (I vaguely recall the i5 having a smaller L2 cache but a larger L3 cache). The Atom chip is also a bit cheaper.

        I do agree that the integrated stuff from Netgate/pfSense does seem a bit pricey. The ones from the pfSense store come with support, though; if you don’t need the support, it’s cheaper to simply purchase directly from Netgate without it.

        1. 3

          It gets worse. That’s actually an 8-core Rangeley CPU (a variant of Avoton, which was pitched as a “server-grade Atom”), which was designed for routing/comms workloads. Factoring in the 8 cores, it actually performs about the same as his chosen i5-5200U. Rangeley also featured Intel QuickAssist, which does a lot of crypto offload - basically exactly what you want for a VPN endpoint, the stated intended use.

          Of course, for $1400 you get a lot of scooter computers. They’re almost certainly the right fit for him, as he ended up using them for other things. If I wanted a cheap VPN endpoint I’d go for a Ubiquiti EdgeRouter Lite at the moment - sub-$100, with hardware-accelerated forwarding at gigabit speeds. It’s not a generic “scooter computer”, though…

          1. 2

            EdgeRouter Lites are pretty nice. I have one, in fact! I got it to fiddle around with - I currently use a Lanner box with pfSense on it for my home firewall. However, do note that if you aren’t using IPsec, VPN throughput is going to be slow, as OpenVPN does not utilize the EdgeRouter’s offload/acceleration chip. Just something to be aware of.

            As to $1400 buying lots of scooters… this (or even just the board) is a bit more comparable if you don’t need support for PCIe/10G stuff.

        1. 6

          This seems like a bad compromise.

          No computer should need to be powered off to replace a disk in a RAID set (SATA and SAS are electrically hotswappable, and the majority of controller chipset drivers actively deal with hotswapping; the ones that don’t can be manually rescanned anyway in my experience). If your chassis is such that you have to power it off to physically remove a drive, then it’s worth looking at fixing that, rather than introducing a second system with presumably the same issues. Regarding the RAID toolset, “having to re-learn” the RAID management tools is just another way of saying “I didn’t document my tools properly”.
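
          For the manual-rescan case, here is a minimal sketch of what that looks like on Linux (the sysfs path and the “- - -” trigger are the standard kernel interface, but host numbering varies per machine and writing to it needs root):

```python
# Ask every SCSI/SATA host adapter to rescan for hot-added drives on Linux.
# Writing "- - -" (all channels, all targets, all LUNs) to a host's scan
# file is the standard sysfs trigger; on a real system this needs root.
from pathlib import Path

def rescan_scsi_hosts(sysfs_root="/sys/class/scsi_host"):
    rescanned = []
    for host in sorted(Path(sysfs_root).glob("host*")):
        scan = host / "scan"
        if scan.exists():
            scan.write_text("- - -\n")
            rescanned.append(host.name)
    return rescanned
```

          The one-liner equivalent from a shell is `echo "- - -" > /sys/class/scsi_host/hostN/scan` for the host in question.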

          Having an entire second system running as a redundant copy is expensive in terms of space and power. It also introduces its own risks; the obvious one is that there’s no way of detecting silent bit errors, and any such errors will be silently copied to the redundant system. Modern RAID systems (even mdadm) will do proactive disk scanning and will pick up any silent bit errors and correct them from the RAID set if needed.

          This approach still doesn’t cover you against a double-disk failure: what if a disk in your primary fails, and at the same time a disk in the secondary system containing the same (or at least some of the same) data fails? There is somewhat less risk here than with a second disk in a single RAID5 set, but it’s still there. Just as RAID is not equivalent to a backup, neither is this approach.
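
          The “somewhat less risk” part can be made concrete with a toy model (the 1% window probability and the 6-drive set are made-up numbers, and real drive failures are correlated, so treat this as a sketch only):

```python
# Toy comparison of the two double-failure scenarios. p and n are assumed
# numbers purely for illustration; real failures are correlated (same
# batch, same enclosure, same vibration), so this is only a sketch.
p = 0.01   # chance a given drive dies during the repair/copy window
n = 6      # drives in the hypothetical RAID5 set

# RAID5: after one failure, losing ANY of the n-1 survivors loses data.
p_raid5_data_loss = 1 - (1 - p) ** (n - 1)

# Two-box copy: only the one secondary disk holding the same data matters.
p_two_box_data_loss = p

print(round(p_raid5_data_loss, 3), p_two_box_data_loss)  # 0.049 vs 0.01
```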

          1. 3

            The irony here is that they, an enterprise, have enough redundancy and infrastructure to handle the failure rate of consumer-grade drives while I, a consumer, just want the damn thing to work so should probably get an enterprise-grade drive.

            1. 1

              It sounded like enterprise drives offered no measurable improvement in reliability.

              1. 2

                Close: Enterprise drives offered no measurable improvement in reliability for their sample and their workload.

                They didn’t do great statistics; they just said something like “we saw 4.6% failure on the enterprise drives and 4.2% failure on the consumer drives”, with no expression of confidence. Furthermore, their systems with enterprise drives were mostly running their core services, and their consumer drives were not. They had one pod with enterprise SATA drives and claimed a ‘statistically consistent’ result, but this is a tiny sample - 45 ES drives in that pod vs. ~370 ES drives in total vs. ~14,700 consumer SATA drives in total.
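
                For what it’s worth, even a rough normal-approximation interval shows how wide the uncertainty is at those sample sizes. The failure counts below are back-derived from the quoted percentages and fleet sizes (an assumption on my part), and using drive counts rather than drive-years as the denominator is itself a simplification:

```python
import math

def wald_ci_95(failures, n):
    """Rough 95% normal-approximation CI for a failure proportion."""
    p = failures / n
    half = 1.96 * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

# Counts back-derived from the quoted 4.6% / 4.2% rates and fleet sizes.
ent_lo, ent_hi = wald_ci_95(round(0.046 * 370), 370)        # ~370 enterprise drives
con_lo, con_hi = wald_ci_95(round(0.042 * 14700), 14700)    # ~14,700 consumer drives

# The enterprise interval easily contains the consumer rate, i.e. the
# 4.6% vs 4.2% difference is nowhere near statistically significant.
print(f"enterprise: {ent_lo:.3f}-{ent_hi:.3f}, consumer: {con_lo:.3f}-{con_hi:.3f}")
```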

                Furthermore, 4.2% is still atrocious given their pod workload. Or at least their presumed workload, because they haven’t gone into it much, at least as far as I’ve seen. What I do know is that their model is for customers to do bulk uploads (slow sequential writes) to their systems; the data then sits around for ages and maybe gets read occasionally. It’s fairly unlikely that data gets overwritten at any great rate. Nothing wrong with this model at all, but it’s nothing like an active mail, database, or file server, for example.

                As caboteria said, this doesn’t really matter for Backblaze - they can deal quite happily with a 4.2% failure rate because their operational and data models work fine for it. So if you’re looking at doing something that has similar data/drive use characteristics to what Backblaze do and can swing the same operational model, you’re fine.

                But that’s about as far as their analysis goes, and I wish they’d be more honest about that.