Blog Archives

An overview of Impala

As enterprises move to Hadoop-based data solutions, a common pattern is that ‘big data’ processing happens in Hadoop and the resulting derived datasets (such as fine-grained aggregates) are migrated into a traditional data warehouse for further consumption. The reason this pattern exists

Posted in Hadoop, Technical

A couple of package management tricks

A few weeks back, I spent some time trying out Cloudera Impala. The target was a 15-node cluster of CentOS VMs running CDH 4.5. The idea of the trial was to gain experience installing and running Impala on a dataset for a

Posted in Technical