Blog Archives

An overview of Impala

As enterprises move to Hadoop-based data solutions, a key pattern is that ‘big data’ processing happens in Hadoop land and the resulting derived datasets (such as fine-grained aggregates) are migrated into a traditional data warehouse for further consumption. The reason this pattern exists

Posted in Hadoop, Technical

Elephants in the Clouds

Over the past year, there have been many new product and project announcements related to running Hadoop in cloud environments. While Amazon’s Elastic MapReduce continued with enhancements over its base platform, players like Qubole, Mirantis, VMware, Rackspace

Posted in Hadoop, Technical

Setting up a single node Hadoop cluster based on trunk

Apache Hadoop is an open-source project that provides the capability to store and process petabytes of data. The project releases tested software that can be downloaded and used on clusters of varying sizes. However, this post is a compilation

Posted in Hadoop, Technical

YDN blog on first Indian Hadoop summit

The first Indian Hadoop summit was held on February 28th, 2010 in Bangalore, as part of an event organized along with CloudCamp. A blog post on this is available from Yahoo! Developer Network. The post has links to all the

Posted in Technical