Coder Social home page Coder Social logo

impala's Introduction

Cloudera Impala

Cloudera Impala is a distributed query execution engine that runs against data stored natively in Apache HDFS and Apache HBase. This public repository is a snapshot of our internal development repository that will be updated periodically as we prepare new releases.

The rest of this README describes how to build Cloudera Impala from this repository. Further documentation about Cloudera Impala can be found here.

Building Cloudera Impala on CentOS 6.2

Prerequisites

Installing prerequisite packages

sudo yum install boost-test boost-program-options libevent-devel automake libtool flex bison gcc-c++ openssl-devel \
make cmake doxygen.x86_64 glib-devel boost-devel python-devel bzip2-devel svn libevent-devel cyrus-sasl-devel \
wget git unzip openldap-devel db4-devel

Note: Impala requires Boost 1.4.2 or later.

Note: Ubuntu 12.04 (and later) requires the libevent1-dev package to work with Thrift v0.9

Install LLVM

wget http://llvm.org/releases/3.3/llvm-3.3.src.tar.gz
tar xvzf llvm-3.3.src.tar.gz
cd llvm-3.3.src/tools
svn co http://llvm.org/svn/llvm-project/cfe/tags/RELEASE_33/final/ clang
cd ../projects
svn co http://llvm.org/svn/llvm-project/compiler-rt/tags/RELEASE_33/final/ compiler-rt
cd ..
./configure --with-pic
make -j4 REQUIRES_RTTI=1
sudo make install

Install the JDK

Make sure that the Oracle Java Development Kit 6 is installed (not OpenJDK), and that JAVA_HOME is set in your environment.

Install Maven

wget http://www.fightrice.com/mirrors/apache/maven/maven-3/3.0.4/binaries/apache-maven-3.0.4-bin.tar.gz
tar xvf apache-maven-3.0.4.tar.gz && sudo mv apache-maven-3.0.4 /usr/local

Add the following three lines to your .bashrc:

export M2_HOME=/usr/local/apache-maven-3.0.4
export M2=$M2_HOME/bin  
export PATH=$M2:$PATH 

And make sure you pick up the changes either by logging in to a fresh shell or running:

source ~/.bashrc

Confirm by running:

mvn -version

and you should see at least:

Apache Maven 3.0.4...

Building Cloudera Impala

Clone the Impala repository

git clone https://github.com/cloudera/impala.git

Set the Impala environment

cd impala
. bin/impala-config.sh

Confirm your environment looks correct:

(11:11:21@desktop) ~/src/cloudera/impala-public (master) $ env | grep "IMPALA.*VERSION"
IMPALA_CYRUS_SASL_VERSION=2.1.23
IMPALA_HBASE_VERSION=0.92.1-cdh4.1.0
IMPALA_SNAPPY_VERSION=1.0.5
IMPALA_GTEST_VERSION=1.6.0
IMPALA_GPERFTOOLS_VERSION=2.0
IMPALA_GFLAGS_VERSION=2.0
IMPALA_GLOG_VERSION=0.3.2
IMPALA_HADOOP_VERSION=2.0.0-cdh4.1.0
IMPALA_HIVE_VERSION=0.9.0-cdh4.1.0
IMPALA_MONGOOSE_VERSION=3.3
IMPALA_THRIFT_VERSION=0.9.0

Download required third-party packages

cd thirdparty
./download_thirdparty.sh

Build Impala

cd ${IMPALA_HOME}
./build_public.sh -build_thirdparty

Wrapping up

After a successful build, there should be an impalad binary in ${IMPALA_HOME}/be/build/debug/service.

You can start an Impala backend by running:

${IMPALA_HOME}/bin/start-impalad.sh -use_statestore=false

Note that the start-impalad.sh script sets some environment variables that are necessary for Impala to run successfully.

To configure Impala's use of HDFS, HBase or the Hive metastore, place the relevant configuration files somewhere in the CLASSPATH established by bin/set-classpath.sh. Internally we use fe/src/test/resources for this purpose, you may find it convenient to do the same.

The Impala Shell

The Impala shell is a convenient command-line interface to Cloudera Impala. To run from a source repository, do the following:

${IMPALA_HOME}/bin/impala-shell.sh

impala's People

Contributors

lskuff avatar skye avatar alanchoi avatar ishaan avatar ubell avatar toddlipcon avatar awleblang avatar elicollins avatar laserson avatar

Watchers

James Cloos avatar Than Phan avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.