Parlour

Parlour: a place that sells scoops of ice-cream; a cascading-sqoop integration.

Parlour provides a basic Cascading/Scalding Sqoop integration allowing import to and export from HDFS.

It also provides support for the Cloudera/Teradata Connector.

Third-Party Libraries

For Parlour to work, you need to include the following third-party JARs with the application that uses it.

To use them within the parlour repository itself, put them in lib/.

Oracle Support:

  • ojdbc6.jar: the Oracle JDBC Adapter

Teradata Support:

  • sqoop-connector-teradata-1.2c5.jar: Cloudera Connector Powered by Teradata
  • tdgssconfig.jar: Teradata Driver (Security configuration)
  • terajdbc4.jar: Teradata JDBC Adapter
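
Because these JARs are not bundled, it can help to check that they are present before launching a job. The following is a minimal sketch, not part of Parlour: the `ThirdPartyJars` object and its `missing` helper are hypothetical names, and the file names are simply those listed above (your connector version may differ).

```scala
import java.io.File

// Hypothetical helper, not part of Parlour: reports which of the
// third-party JARs listed above are absent from a directory such as lib/.
object ThirdPartyJars {
  // File names taken from the lists above; adjust versions as needed.
  val oracle: Seq[String] = Seq("ojdbc6.jar")
  val teradata: Seq[String] = Seq(
    "sqoop-connector-teradata-1.2c5.jar",
    "tdgssconfig.jar",
    "terajdbc4.jar"
  )

  /** Returns the names in `required` that are not regular files in `libDir`. */
  def missing(libDir: File, required: Seq[String]): Seq[String] =
    required.filterNot(name => new File(libDir, name).isFile)
}
```

For example, `ThirdPartyJars.missing(new File("lib"), ThirdPartyJars.teradata)` returns an empty sequence when all three Teradata JARs are in place.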

Cascade Job

import au.com.cba.omnia.parlour.SqoopSyntax._

new ImportSqoopJob(
  TeradataParlourImportDsl()
    .inputMethod(SplitByAmp)
    .connectionString("jdbc:teradata://some.server/database=DB1")
    .username("some username")
    .password(System.getenv("DATABASE_PASSWORD"))
    .tableName("some table").toSqoopOptions,
  TypedTsv[String]("hdfs/path/to/target/dir")
)(args)

new ExportSqoopJob(
  TeradataParlourExportDsl()
    .outputMethod(BatchInsert)
    .connectionString("jdbc:teradata://some.server/database=DB1")
    .username("some username")
    .password(System.getenv("DATABASE_PASSWORD"))
    .tableName("some table").toSqoopOptions,
  TypedPsv[String]("hdfs/path/to/data/to/export")
)(args)
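
Both jobs above read the password with `System.getenv("DATABASE_PASSWORD")`, which returns `null` when the variable is unset. One way to fail early with a clear message is sketched below. The `Credentials` object and `requiredEnv` function are hypothetical names introduced for illustration and are not part of Parlour.

```scala
// Hypothetical helper, not part of Parlour: looks up an environment
// variable and fails fast if it is missing or empty.
object Credentials {
  /** Returns the value of `name` from `env`, or throws if it is unset or empty.
    * `env` defaults to the process environment and can be replaced in tests.
    */
  def requiredEnv(name: String, env: Map[String, String] = sys.env): String =
    env.get(name).filter(_.nonEmpty).getOrElse(
      throw new IllegalStateException(s"Environment variable $name is not set")
    )
}
```

Under this assumption, `.password(Credentials.requiredEnv("DATABASE_PASSWORD"))` could stand in for the `System.getenv` call in either job.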

Console Job

Parlour includes a sample job that can be invoked from the command-line:

hadoop jar <parlour-jar> \
    com.twitter.scalding.Tool \
    au.com.cba.omnia.parlour.ExportSqoopConsoleJob \
    --hdfs \
    --input /data/on/hdfs/to/sqoop \
    --teradata-method internal.fastload \
    --teradata-fastload-socket-hostname myhostname1 \
    --connection-string "jdbc:teradata://database/database=test" \
    --table-name test \
    --username user1 \
    --password "$PASSWORD" \
    --mappers 1 \
    --input-field-delimiter \| \
    --input-line-delimiter '\n'

Teradata Fastload Support

Teradata Internal Fastload requires the use of a coordinating service that runs on the machine that launches the jobs.

As a result, you may need to specify manually which network adapter (by host name) the service should bind to. This is done using:

TeradataParlourExportDsl(sqoopOptions).fastloadSocketHostName("myhostname")
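
When the host name varies between launch machines, it could be resolved at runtime rather than hard-coded. The sketch below is an assumption-laden illustration, not part of Parlour: `FastloadHost` and `resolve` are hypothetical names, and the fallback uses the JVM's standard `InetAddress.getLocalHost`, which may not return the adapter you want on multi-homed machines.

```scala
import java.net.InetAddress

// Hypothetical helper, not part of Parlour: chooses the host name to pass
// to fastloadSocketHostName.
object FastloadHost {
  /** Uses the explicit override when it is non-blank; otherwise falls back
    * to the local host name as reported by the JVM.
    */
  def resolve(overrideName: Option[String]): String =
    overrideName.map(_.trim).filter(_.nonEmpty)
      .getOrElse(InetAddress.getLocalHost.getHostName)
}
```

The result can then be passed to `fastloadSocketHostName`, for instance with the override read from an environment variable of your choosing.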
