What is Erasure Code?
Hadoop 3.0 isn’t out yet, but it’s scheduled to include something called “erasure code.” What the heck is that, you ask? Here’s a quick preview.
The short answer is that erasure code, in this context, means Reed-Solomon error-correcting codes, which will be used in Hadoop 3.0 as an alternative to brute-force triple replication. This new feature is intended to provide high data availability while using much less disk space.
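To put a rough number on "much less disk space," here is a back-of-the-envelope sketch in Python. It assumes a Reed-Solomon layout of 6 data units plus 3 parity units, often written RS(6,3); that layout is an assumption made for illustration and is not specified above. The helper name `storage_overhead` is likewise made up for this example.

```python
from fractions import Fraction


def storage_overhead(data_units: int, extra_units: int) -> Fraction:
    """Raw units stored per unit of user data."""
    return Fraction(data_units + extra_units, data_units)


# Triple replication: 1 original block plus 2 extra copies.
replication = storage_overhead(data_units=1, extra_units=2)

# Assumed RS(6,3) layout: 6 data units plus 3 parity units.
reed_solomon = storage_overhead(data_units=6, extra_units=3)

print(f"triple replication: {float(replication)}x raw storage")
print(f"RS(6,3):            {float(reed_solomon)}x raw storage")
```

Under these assumptions, triple replication stores three bytes for every byte of user data, while RS(6,3) stores one and a half, and it can still rebuild the data after losing any three of the nine units.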
The longer answer follows.