Thursday, December 8, 2016

What is Hadoop?

What is Hadoop? What are the basic components that make up a Hadoop (Big Data) system?

In the simplest terms, these are the core components that make up a Hadoop ecosystem.

Hadoop Distributed File System (HDFS)


A Java-based, distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster. It lets you store and process vast quantities of data in a storage layer that scales linearly, and it is designed to run on low-cost commodity hardware.
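To make the storage model concrete, here is a toy Python sketch of the idea behind HDFS: a file is split into fixed-size blocks, and each block is replicated across several machines. This is only an illustration of the concept, not the real HDFS implementation; the node names and tiny block size are made up for demonstration (real HDFS defaults to 128 MB blocks and a replication factor of 3, with rack-aware placement).

```python
# Toy illustration of HDFS-style block splitting and replication.
# Sizes and node names are invented for demonstration purposes;
# real HDFS defaults to 128 MB blocks and a replication factor of 3.

BLOCK_SIZE = 4      # bytes, tiny for demonstration (HDFS default: 128 MB)
REPLICATION = 3     # copies kept of each block (HDFS default: 3)
DATANODES = ["node1", "node2", "node3", "node4"]

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks, as an HDFS client does."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=DATANODES, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes.

    Round-robin placement here; real HDFS uses rack-aware placement.
    """
    placement = {}
    for idx, _ in enumerate(blocks):
        placement[idx] = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
    return placement

data = b"hello hadoop!"
blocks = split_into_blocks(data)
placement = place_replicas(blocks)
print(blocks)        # [b'hell', b'o ha', b'doop', b'!']
print(placement[0])  # ['node1', 'node2', 'node3']
```

Losing any single machine still leaves two copies of every block, which is why HDFS can use cheap hardware: failure is handled by replication rather than by expensive, highly reliable disks.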

Hadoop YARN


A resource-management platform responsible for managing computing resources in clusters and scheduling users’ applications on them. It provides resource management and a pluggable architecture that enables a wide variety of data-access methods to operate on data stored in Hadoop with predictable performance and service levels.

Hadoop MapReduce


A programming model for large-scale data processing: a map function transforms input records into key/value pairs, and a reduce function aggregates all values that share a key.
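The model is easiest to see with the classic word-count example. Below is a minimal pure-Python sketch of the map, shuffle, and reduce phases; it illustrates the programming model only, not the actual Hadoop Java API (where you would subclass `Mapper` and `Reducer` instead).

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for each word in a line of input."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reducer: sum the counts emitted for one word."""
    return (key, sum(values))

lines = ["hadoop stores big data", "hadoop processes big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'hadoop': 2, 'stores': 1, 'big': 2, 'data': 2, 'processes': 1}
```

In a real cluster, the map calls run in parallel on the machines that hold each input block, and the shuffle moves data over the network so each reducer sees every value for its keys.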

Hadoop Common 


Libraries and utilities needed by other Hadoop modules.



Reference:


https://github.com/hortonworks/tutorials/blob/hdp-2.5/tutorials/hortonworks/hello-hdp-an-introduction-to-hadoop/hello-hdp-section-2.md#13-apache-hadoop

