Scalable distributed file system for large distributed data-intensive applications.It is fault tolerant, runs on commodity hardware, provides high aggregate performance.
Why?
- component failures are norms rather than the exception. they fail often, therefore constant monitoring, error detection, fault tolerance and automatic recovery is integral to the file system
- Files are huge by traditional standards and working with large number of them is a pain when it comes to I/O on these files. So it is important to consider I/O and block sizes when designing
- Files are mutated by appending rather than overwriting the existing data. Reads are usually sequential. Given this access pattern, appending becomes the focus of performance optimization, atomicity and caching.
- Co-designing apps and file system API benifits overall, considering the interaction between the files and applications
Architecture
- Single Master and multiple Chunk Servers, that can be accessed by clinets
- Files are split into fixed size chunks (64 MB), each chunk is identified by 64-bit globally unique chunk_handle assigned by the Master
- Chunks are written in the local disk of the chunk servers.
- Master only stores the metadata of the chunk servers and chunks
- Each chunk is replicated(3-replicas by default) on multiple chunk-servers for reliability.
... unfinished