Kafka Connector for SAP HANA

kafka-connect-hana is a Kafka Connector for copying data to and from SAP HANA.

Install

To install the connector from source, use the following command.

mvn clean install -DskipTests

Include the SAP HANA JDBC JAR

  • Follow the steps in this guide to obtain the SAP HANA JDBC JAR.
  • Place it in the same directory as the Kafka Connector JAR, or in a directory on the CLASSPATH.
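If the driver JAR is not co-located with the connector JAR, one alternative is to put it on the classpath before starting Kafka Connect. A minimal sketch, assuming the driver was downloaded to /path/to/ngdbc.jar (adjust the path to your environment):

```shell
# Hypothetical path; point this at wherever you placed the SAP HANA JDBC driver (ngdbc.jar).
export CLASSPATH="$CLASSPATH:/path/to/ngdbc.jar"
```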

QuickStart

To get started with this connector, complete the following steps.

  • Create the sink config file named kafka-connect-hana-sink.properties.
name=test-sink
connector.class=com.sap.kafka.connect.sink.hana.HANASinkConnector
tasks.max=1
topics=test_topic
connection.url=jdbc:sap://<url>/
connection.user=<username>
connection.password=<password>
auto.create=true
schema.registry.url=<schema registry url>
test_topic.table.name="SYSTEM"."DUMMY_TABLE"
  • Start the kafka-connect-hana sink connector using the following command.
./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/kafka/kafka-connect-hana-sink.properties
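To verify the sink end to end, you can push a test record into the sink topic with the Avro console producer shipped with the Schema Registry. The single-field schema below is a hypothetical one matching a one-column table; the producer command itself requires a running broker and Schema Registry, so it is left commented:

```shell
# Hypothetical single-field Avro schema for records on the sink topic.
SCHEMA='{"type":"record","name":"test","fields":[{"name":"DUMMY","type":"string"}]}'
# Requires a running Kafka broker and Schema Registry:
# ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_topic \
#   --property value.schema="$SCHEMA"
# Then type a record on stdin, e.g.: {"DUMMY": "X"}
```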
  • Create the source config file named kafka-connect-hana-source.properties.
name=kafka-connect-source
connector.class=com.sap.kafka.connect.source.hana.HANASourceConnector
tasks.max=1
topics=kafka_source_1,kafka_source_2
connection.url=jdbc:sap://<url>/
connection.user=<username>
connection.password=<password>
kafka_source_1.table.name="SYSTEM"."com.sap.test::hello"
  • Start the kafka-connect-hana source connector using the following command.
./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/kafka/kafka-connect-hana-source.properties
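Once the source connector is running, the rows it publishes can be inspected with the Avro console consumer. The command needs a running broker and Schema Registry, so it is left commented:

```shell
# Topic name taken from the source configuration above.
TOPIC=kafka_source_1
# Requires a running Kafka broker and Schema Registry:
# ./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 \
#   --topic "$TOPIC" --from-beginning
```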

Distributed Mode

In a production environment, it is recommended to run the Kafka Connector in distributed mode.
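In distributed mode, connectors are not started from a properties file; instead, the configuration is submitted as JSON to the Kafka Connect REST API. A sketch for the sink configuration above, assuming a Connect worker listening on localhost:8083:

```shell
# Same settings as kafka-connect-hana-sink.properties, expressed as a REST payload.
PAYLOAD='{
  "name": "test-sink",
  "config": {
    "connector.class": "com.sap.kafka.connect.sink.hana.HANASinkConnector",
    "tasks.max": "1",
    "topics": "test_topic",
    "connection.url": "jdbc:sap://<url>/",
    "connection.user": "<username>",
    "connection.password": "<password>",
    "auto.create": "true",
    "test_topic.table.name": "\"SYSTEM\".\"DUMMY_TABLE\""
  }
}'
# Requires a running Connect worker:
# curl -X POST -H "Content-Type: application/json" --data "$PAYLOAD" http://localhost:8083/connectors
```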

Configuration

The Kafka connector for SAP HANA provides a wide range of configuration options for both the source and the sink.

The full list of configuration options is as follows:

  • Sink

    • topics - This setting can be used to specify a comma-separated list of topics. Must not have spaces.

    • auto.create - This setting allows creation of a new table in SAP HANA if the table specified in {topic}.table.name does not exist. Should be a Boolean. Default is false.

    • batch.size - This setting specifies the number of records pushed into the SAP HANA table in a single flush. Should be an Integer. Default is 3000.

    • max.retries - This setting specifies the maximum number of retries made to re-establish the connection to SAP HANA if the connection is lost. Should be an Integer. Default is 10.

    • {topic}.table.name - This setting specifies the SAP HANA table that the data is written to. Should be a String. Must be a fully qualified SAP HANA table name such as "SCHEMA"."TABLE".

    • {topic}.table.type - This is a HANA-specific setting which allows creation of row and column tables when auto.create is set to true. Default is column; supported values are column, row.

    • {topic}.pk.mode - This setting specifies the primary-key mode used when auto.create is set to true and the table specified in {topic}.table.name does not exist in SAP HANA. Default is none; supported values are record_key, record_value.

    • {topic}.pk.fields - This setting specifies a comma-separated list of primary-key fields when {topic}.pk.mode is set to record_key or record_value. Must not contain spaces.

    • {topic}.table.partition.mode - This is a HANA sink-specific setting which determines the table partitioning in HANA. Default is none; supported values are none, hash, round_robin.

    • {topic}.table.partition.count - This is a HANA sink-specific setting which determines the number of partitions the table should have. Required when auto.create is set to true and the table specified in {topic}.table.name does not exist in SAP HANA. Should be an Integer. Default is 0.
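Putting several of the sink options above together, a hypothetical per-topic configuration (the table, field, and partition-count values are illustrative) might look like:

```properties
auto.create=true
test_topic.table.name="SYSTEM"."DUMMY_TABLE"
test_topic.table.type=column
test_topic.pk.mode=record_value
test_topic.pk.fields=ID,NAME
test_topic.table.partition.mode=hash
test_topic.table.partition.count=4
```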

  • Source

    • topics - This setting can be used to specify a comma-separated list of topics. Must not have spaces.

    • mode - This setting specifies the mode in which data is fetched from the SAP HANA table. Default is bulk; supported values are bulk, incrementing.

    • queryMode - This setting specifies the query mode in which data is fetched from the SAP HANA table. Default is table; supported values are table, query (to support SQL queries).

    • {topic}.table.name - This setting specifies the SAP HANA table that the data is read from. Should be a String. Must be a fully qualified SAP HANA table name such as "SCHEMA"."TABLE".

    • {topic}.poll.interval.ms - This setting specifies the poll interval at which data is fetched from the SAP HANA table. Should be an Integer. Default is 60000.

    • {topic}.incrementing.column.name - In order to fetch data from an SAP HANA table when mode is set to incrementing, an incrementing (or auto-incrementing) column needs to be provided. The type of the column can be Int, Float, Decimal, or Timestamp. This also covers SAP HANA time-series tables. Should be a valid column name (represented as a String) present in the table.

    • {topic}.partition.count - This setting specifies the number of topic partitions that the source connector can use to publish the data. Should be an Integer. Default is 1.
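As an illustration of the source options above, a hypothetical incrementing-mode configuration (the ID column name is an assumption) could be:

```properties
mode=incrementing
queryMode=table
kafka_source_1.table.name="SYSTEM"."com.sap.test::hello"
kafka_source_1.incrementing.column.name=ID
kafka_source_1.poll.interval.ms=60000
kafka_source_1.partition.count=1
```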

Default Configurations

Examples

The unit tests provide examples of every mode in which the connector can be configured.
