Thursday, December 8, 2016

What is Hadoop?

What is Hadoop? What are the basic components that make up a Hadoop (Big Data) system?

In the simplest terms, these are the core components that make up a Hadoop ecosystem.

Hadoop Distributed File System (HDFS)


A Java-based, distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster. It lets you store and process vast quantities of data in a storage layer that scales linearly, and it is designed to run on low-cost commodity hardware.
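To make the storage model concrete, here is a toy Python sketch of the idea behind HDFS: a file is split into fixed-size blocks, and each block is replicated across several machines. This is only an illustration of the concept, not the real HDFS implementation; the node names and tiny block size are made up for demonstration (real HDFS defaults to 128 MB blocks and a replication factor of 3, with rack-aware placement).

```python
# Toy illustration of HDFS-style block splitting and replication.
# Sizes and node names are invented for demonstration purposes;
# real HDFS defaults to 128 MB blocks and a replication factor of 3.

BLOCK_SIZE = 4      # bytes, tiny for demonstration (HDFS default: 128 MB)
REPLICATION = 3     # copies kept of each block (HDFS default: 3)
DATANODES = ["node1", "node2", "node3", "node4"]

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte string into fixed-size blocks, as an HDFS client does."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=DATANODES, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes.

    Round-robin placement here; real HDFS uses rack-aware placement.
    """
    placement = {}
    for idx, _ in enumerate(blocks):
        placement[idx] = [nodes[(idx + r) % len(nodes)] for r in range(replication)]
    return placement

data = b"hello hadoop!"
blocks = split_into_blocks(data)
placement = place_replicas(blocks)
print(blocks)        # [b'hell', b'o ha', b'doop', b'!']
print(placement[0])  # ['node1', 'node2', 'node3']
```

Losing any single machine still leaves two copies of every block, which is why HDFS can use cheap hardware: failure is handled by replication rather than by expensive, highly reliable disks.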

Hadoop YARN


A resource-management platform responsible for managing computing resources in clusters and scheduling users’ applications on them. It provides resource management and a pluggable architecture that enables a wide variety of data-access methods to operate on data stored in Hadoop with predictable performance and service levels.

Hadoop MapReduce


A programming model for large-scale data processing: a map function transforms input records into key/value pairs, and a reduce function aggregates all values that share a key.
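The model is easiest to see with the classic word-count example. Below is a minimal pure-Python sketch of the map, shuffle, and reduce phases; it illustrates the programming model only, not the actual Hadoop Java API (where you would subclass `Mapper` and `Reducer` instead).

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for each word in a line of input."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reducer: sum the counts emitted for one word."""
    return (key, sum(values))

lines = ["hadoop stores big data", "hadoop processes big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)  # {'hadoop': 2, 'stores': 1, 'big': 2, 'data': 2, 'processes': 1}
```

In a real cluster, the map calls run in parallel on the machines that hold each input block, and the shuffle moves data over the network so each reducer sees every value for its keys.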

Hadoop Common 


Libraries and utilities needed by other Hadoop modules.



Reference:


https://github.com/hortonworks/tutorials/blob/hdp-2.5/tutorials/hortonworks/hello-hdp-an-introduction-to-hadoop/hello-hdp-section-2.md#13-apache-hadoop

