hadoop | Linuxaria

Introduction to Hadoop

Nov 02 2013

Apache Hadoop is an open source software project written in Java. It is a framework for running applications on large clusters of servers. It is designed to scale from a single server to thousands of machines with a very high degree of fault tolerance. Rather than relying on high-end hardware, these clusters get their reliability from the software’s ability to detect and handle failures on its own.

Credit for creating Hadoop goes to Doug Cutting and Michael J. Cafarella. Cutting, a Yahoo employee, named it after his son’s toy elephant, “Hadoop”. It was originally developed to support distributed processing for the Nutch search engine project, which had to handle large amounts of index data.

Hadoop – The Small Application for Big Data

May 22 2012

Back in December 2011, data-intensive Linux users rejoiced as Apache Hadoop reached its 1.0.0 milestone. Setting a benchmark for distributed computing software, this wonderful little program is now at release 1.0.3. But what is Hadoop, and how can you benefit from using it?

Designed with ‘web-scale’ operations in mind, Hadoop can handle massive amounts of information, letting you quickly and efficiently process volumes of data that other systems simply cannot handle. But that’s just the beginning. Hadoop also lets you spread this processing over a network: it can distribute large amounts of work across a cluster of machines, so you can take on workloads that a single machine simply cannot manage.
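
To make the idea of distributing work a little more concrete, here is a minimal sketch in Python. It is not Hadoop's API and does not use a real cluster: a thread pool stands in for the machines, and the function names (`split_into_chunks`, `count_words`, `distributed_word_count`) are made up for this illustration. It only shows the general pattern of splitting a job into slices, processing each slice independently, and merging the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, n_chunks):
    """Divide the input into roughly equal slices, one per worker."""
    size = max(1, -(-len(lines) // n_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def count_words(chunk):
    """Work done independently on one slice of the input."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, n_workers=4):
    """Fan the slices out to workers, then merge the partial results.

    Assumption: threads play the role of cluster nodes here purely
    for illustration; a real cluster would ship each slice to a
    different machine.
    """
    chunks = split_into_chunks(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = ["big data big cluster", "data node", "Big elephant"]
    print(distributed_word_count(sample, n_workers=2))
```

Because each slice is counted without reference to the others, adding more workers lets the same code take on more input, which is the property the paragraph above describes.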