Distributed data processing with Hadoop, Part 2: Going further - Related links

http://www.ibm.com

The first article in this series showed how to use Hadoop in a single-node cluster. This article continues with a more advanced setup that uses multiple nodes for parallel processing. It demonstrates the various node types required for multinode clusters and explores MapReduce functionality in a parallel environment. This article also digs into the management aspects of Hadoop, both command-line and Web-based.


Created by born.again.linuxer 14 years 17 weeks ago
Category: High End


Distributed data processing with Hadoop, Part 1: Getting started 14 years 17 weeks ago: This article—the first in a series on Hadoop—explores the Hadoop framework, including its fundamental elements, such as the Hadoop file system (HDFS), and node types that are commonly used. Learn how to install and configure a single-node Hadoop cluster, and delve into the MapReduce application. Finally, discover ways to monitor and manage Hadoop using its core Web interfaces.
Hands-on Hadoop for cluster computing 15 years 50 weeks ago: Hadoop is a distributed computing platform that provides a framework for storing and processing petabytes of data. Because it is Java-based, Hadoop runs on Linux, Windows, Solaris, BSD, and Mac OS X. Hadoop is widely used in organizations that demand a scalable, economical (read: commodity hardware), efficient, and reliable platform for processing vast amounts of data.
Cloud computing with Linux and Apache Hadoop 14 years 52 weeks ago: This article shows you how to use Apache Hadoop to build a MapReduce framework to make a Hadoop Cluster and how to create a sample MapReduce application which runs on Hadoop. You will also learn how to set up a time/disk-consuming task on the cloud.
Hadoop: When grownups do open source 16 years 8 weeks ago: Hadoop is a library for writing distributed data processing programs using the MapReduce framework. It's got all the makings of a blogosphere hit: cluster computing, large datasets, parallelism, algorithms published by Google, and open source.
HowTo setup a Quorum Disk under Red Hat Linux 15 years 6 weeks ago: Today's tutorial will be on the infamous Quorum disk. When I first set up my GFS2 shared cluster of 3 nodes, I was quite impressed with the fact that 3 nodes were sharing the same file system. Now that everything was up and running, I wanted to see what would happen if I brought down 2 out of the 3 nodes in the cluster.
Google grants license for Apache Hadoop 14 years 23 weeks ago: Google has granted a license for a recently granted MapReduce process patent to the Apache Software Foundation for the Apache Hadoop open source framework for distributed computing.
How To Create A Cluster Testbed Using CentOS 5 Virtualization And iSCSI 16 years 10 weeks ago: This guide attempts to provide a Xen based test environment where you can practice setting up a two node cluster (cluster setup itself is not discussed here - I'm merely giving you what you need to set it up).
Can You Top This? 15 Practical Linux Top Command Examples 14 years 38 weeks ago: This article is part of the ongoing 15 example series, where 15 examples are provided for a specific command or functionality. Earlier in this series, we discussed the find command, crontab examples, the grep command, the history command, the ping command, and wget examples. In this article, let us review 15 examples of the Linux top command that will be helpful for both newbies and experts.
How to monitor your Red Hat Cluster using Python and SNMP 15 years 5 weeks ago: Now that I am done with the implementation of RHE Cluster with GFS2, I need to set up monitoring. As you all know, monitoring is a vital part of any environment. Even though we have a cluster of nodes set up, we still need to be aware of what is happening. I created two new commands. The first command is to emulate clustat, but through Python and SNMP.

Free Software Daily
