I’m pretty much the last guy on the internet to comment on this cuz the WordPress bookmarklet ate my post last week and I never noticed. Anyway, here’s the link:
Petabytes on a budget: How to build cheap cloud storage | Backblaze Blog.
When I first read it I thought “holy shit these guys are bordering on criminal malpractice” but remembering the age-old rule “all engineering is tradeoffs,” their setup makes more sense once you think it through. Basically, they must be handling data safety at a higher network layer and not within the actual box. Seen that way, it’s really just a JBOD and there’s nothing special going on here. All the critical safety flaws of an individual box are mitigated because you can lose the entire 45-disk box and not lose data. To be honest, I don’t even know why they bothered with RAID6. The drives aren’t hot swappable, so you’re taking the entire box down for a disk failure anyway. And the RAID6 performance penalty is atrocious, but if you’re a backups company where most of your storage never even gets read again I suppose you don’t care.
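To be clear, I have no idea how Backblaze actually does this, so treat the following as a toy sketch of what “safety at a higher layer” could look like, not their design. The function names, the `pod-N` box names, and the choice of two copies per object are all made up for illustration. The point is just that if every object lives on at least two different boxes, killing any one box loses nothing.

```python
import hashlib


def place_replicas(object_id: str, boxes: list[str], copies: int = 2) -> list[str]:
    """Pick `copies` distinct boxes for an object (hypothetical placement rule)."""
    if copies > len(boxes):
        raise ValueError("not enough boxes for the requested number of copies")
    # Rank boxes by a hash of (object, box) and take the first few.
    # Deterministic, and each object gets its own ordering of boxes.
    ranked = sorted(
        boxes,
        key=lambda box: hashlib.sha256(f"{object_id}:{box}".encode()).hexdigest(),
    )
    return ranked[:copies]


def survives_box_loss(placement: dict[str, list[str]], dead_box: str) -> bool:
    """True if every object still has at least one replica off the dead box."""
    return all(
        any(box != dead_box for box in replicas)
        for replicas in placement.values()
    )


# Assumed setup: five boxes, a thousand backup objects, two copies each.
boxes = [f"pod-{i}" for i in range(5)]
placement = {
    f"backup-{n}": place_replicas(f"backup-{n}", boxes) for n in range(1000)
}

# Kill each box in turn and check that nothing becomes unreadable.
all_ok = all(survives_box_loss(placement, box) for box in boxes)
print(all_ok)
```

With something like this in place, whatever RAID level runs inside a single box matters a lot less, since the box itself is the unit you expect to lose.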
So if you implement your own network-level data protections and don’t care at all about access speed, this is a pretty cool setup. For anyone else it’s data suicide.