Let's start from scratch.
Hadoop consists of three core components:
- HDFS (Hadoop Distributed File System)

- MapReduce

- YARN (Yet Another Resource Negotiator)
 
HDFS, as the name suggests, is a distributed file system that stores data on commodity hardware. It can store any type of data, whether structured, semi-structured, or unstructured. It achieves fault tolerance by replicating data blocks across nodes, at the cost of extra storage. Being a plain file system, it stores data in flat files, and it lacks random read-write capability.
- Speeds up access to Big Data by reading blocks in parallel

- Follows a "write once, read many" access model

- Lacks random read-write capability
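The block splitting and replication behind these properties can be sketched in a few lines of plain Java. The block size, replication factor, and round-robin placement below are toy simplifications (real HDFS uses 128 MB blocks and rack-aware placement), and none of these names are Hadoop APIs:

```java
import java.util.*;

// Toy illustration of HDFS-style block splitting and replication.
// Not the Hadoop API; constants and placement are simplified.
public class HdfsSketch {
    static final int BLOCK_SIZE = 4;   // bytes per block (real HDFS: 128 MB)
    static final int REPLICATION = 3;  // default replication factor

    // Split a file's bytes into fixed-size blocks.
    static List<byte[]> split(byte[] file) {
        List<byte[]> blocks = new ArrayList<>();
        for (int off = 0; off < file.length; off += BLOCK_SIZE) {
            blocks.add(Arrays.copyOfRange(file, off,
                Math.min(off + BLOCK_SIZE, file.length)));
        }
        return blocks;
    }

    // Place each block on REPLICATION distinct "datanodes"
    // (round-robin here; real HDFS is rack-aware).
    static Map<Integer, List<Integer>> place(int numBlocks, int numNodes) {
        Map<Integer, List<Integer>> placement = new HashMap<>();
        for (int b = 0; b < numBlocks; b++) {
            List<Integer> nodes = new ArrayList<>();
            for (int r = 0; r < REPLICATION; r++) {
                nodes.add((b + r) % numNodes);
            }
            placement.put(b, nodes);
        }
        return placement;
    }

    public static void main(String[] args) {
        byte[] file = "write once, read many".getBytes();
        List<byte[]> blocks = split(file);
        Map<Integer, List<Integer>> placement = place(blocks.size(), 5);
        System.out.println("blocks=" + blocks.size());
        System.out.println("block 0 on nodes " + placement.get(0));
    }
}
```

Keeping three copies of every block is why HDFS survives node failures, and also why it "ends up replicating the data" at a storage cost.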
 
MapReduce is a framework for computing and processing Big Data in parallel batch jobs. Like HDFS, it is built for sequential scans over large files, so neither component offers low-latency random reads and writes. This is where HBase comes into the picture.
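The map and reduce phases described above can be illustrated with a toy in-memory word count. This is plain Java, not the Hadoop MapReduce API; the class and method names are illustrative only:

```java
import java.util.*;
import java.util.stream.*;

// Toy in-memory word count showing MapReduce's two phases.
// This is plain Java, not the Hadoop MapReduce API.
public class WordCountSketch {

    // Map phase: turn each input line into (word, 1) pairs.
    static Stream<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                     .filter(w -> !w.isEmpty())
                     .map(w -> Map.entry(w, 1));
    }

    // Reduce phase: after the pairs are grouped by key (the "shuffle"),
    // sum the counts for each word.
    static Map<String, Integer> reduce(Stream<Map.Entry<String, Integer>> pairs) {
        return pairs.collect(
            Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
    }

    static Map<String, Integer> wordCount(List<String> lines) {
        return reduce(lines.stream().flatMap(WordCountSketch::map));
    }

    public static void main(String[] args) {
        List<String> lines = List.of("hadoop stores big data",
                                     "hadoop processes big data");
        System.out.println(wordCount(lines));
    }
}
```

In real Hadoop, the map tasks run on the nodes holding the HDFS blocks and the framework handles the shuffle across the network; the sketch only shows the programming model.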
- HBase stores data as key-value pairs

- It offers low-latency random access regardless of the size of the data set it must search

- It has a flexible data model: columns can be added to any row at any time
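These three properties can be illustrated with a toy key-value store. The sketch below is not the HBase client API; it only mimics the idea of cells kept sorted by row key so that single-row reads and writes stay fast:

```java
import java.util.*;

// Toy sketch of HBase's data model: a sorted map from
// (row key, column) to value, supporting random reads and writes.
// Class and method names are illustrative, not the HBase client API.
public class KeyValueStoreSketch {
    // A TreeMap keeps row keys sorted, mirroring how HBase stores
    // cells ordered by row key so lookups need no full scan.
    private final NavigableMap<String, Map<String, String>> rows = new TreeMap<>();

    // Random write: put a value into one column of one row.
    public void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new HashMap<>()).put(column, value);
    }

    // Random read: fetch one cell by row key and column.
    public String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        KeyValueStoreSketch store = new KeyValueStoreSketch();
        // Flexible model: each row may carry a different set of columns.
        store.put("user#42", "info:name", "Ada");
        store.put("user#42", "info:city", "London");
        System.out.println(store.get("user#42", "info:name"));
    }
}
```

Real HBase adds column families, versioned cells, and persistence on top of HDFS, but the access pattern is the same: get and put by row key rather than scanning a flat file.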
 
YARN is the cluster's resource manager: it allocates CPU and memory across the nodes and schedules processing frameworks such as MapReduce on top of HDFS.
In short, Hadoop's MapReduce is used for batch processing, while HBase serves real-time read-write needs.