arekczarnik / kafka-backup

This project forked from itadventurer/kafka-backup

Backup and Restore for Apache Kafka

License: Apache License 2.0

Shell 15.76% Python 6.61% Java 76.74% Dockerfile 0.89%

Kafka Backup

Hi all, I am currently not able to maintain this project on my own. If you are interested in supporting me, please let me know, for example by opening an issue.

Kafka Backup is a tool to back up and restore your Kafka data, including all (configurable) topic data and, notably, consumer group offsets. To the best of our knowledge, Kafka Backup is the only viable solution to take a cold backup of your Kafka data and restore it correctly.

It is designed as two connectors for Kafka Connect: a sink connector (backing data up) and a source connector (restoring data).

Currently kafka-backup supports backup and restore to/from the file system.

Features

  • Backup and restore topic data
  • Backup and restore consumer-group offsets
  • Currently supports only backup/restore to/from local file system
  • Released as a jar file or packaged as a Docker image

Getting Started

Option A) Download binary

Download the latest release from GitHub and unzip it.

Option B) Use Docker image

Pull the latest Docker image from Docker Hub.

DO NOT USE THE latest TAG IN PRODUCTION. Images tagged latest are automatic builds of the master branch. Be careful!

Option C) Build from source

Just run ./gradlew shadowJar in the root directory of Kafka Backup. You will find the CLI tools in the bin directory.

Start Kafka Backup

backup-standalone.sh --bootstrap-server localhost:9092 \
    --target-dir /path/to/backup/dir --topics 'topic1,topic2'

In Docker:

docker run -d -v /path/to/backup-dir/:/kafka-backup/ --rm \
    kafka-backup:[LATEST_TAG] \
    backup-standalone.sh --bootstrap-server kafka:9092 \
    --target-dir /kafka-backup/ --topics 'topic1,topic2'

You can pass options via CLI arguments or using environment variables:

| Parameter | Type/required? | Description |
| --- | --- | --- |
| --bootstrap-server / BOOTSTRAP_SERVER | [REQUIRED] | The Kafka server to connect to |
| --target-dir / TARGET_DIR | [REQUIRED] | Directory where the backup files should be stored |
| --topics / TOPICS | <T1,T2,…> | List of topics to be backed up. You must provide either --topics or --topics-regex, not both |
| --topics-regex / TOPICS_REGEX | | Regex of topics to be backed up. You must provide either --topics or --topics-regex, not both |
| --max-segment-size / MAX_SEGMENT_SIZE | | Size of the backup segments in bytes. Default: 1 GiB |
| --command-config / COMMAND_CONFIG | | Property file containing configs to be passed to Admin Client. Only useful if you have additional connection options |
| --debug / DEBUG=y | | Print debug information |
| --help | | Prints this message |
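
The environment-variable form can be checked before the backup is started. The following is a minimal sketch, not part of Kafka Backup: `validate_backup_env` is a hypothetical helper that only mirrors the rules stated in the parameter list (two required values, and exactly one of `TOPICS` or `TOPICS_REGEX`). The server address, directory, and topic names are placeholders.

```shell
# Hypothetical pre-flight check; Kafka Backup does not ship this function.
validate_backup_env() {
    # Both of these are marked [REQUIRED] in the parameter list.
    if [ -z "${BOOTSTRAP_SERVER:-}" ]; then
        echo "BOOTSTRAP_SERVER is required" >&2
        return 1
    fi
    if [ -z "${TARGET_DIR:-}" ]; then
        echo "TARGET_DIR is required" >&2
        return 1
    fi
    # Exactly one of TOPICS / TOPICS_REGEX must be set.
    if [ -n "${TOPICS:-}" ] && [ -n "${TOPICS_REGEX:-}" ]; then
        echo "set either TOPICS or TOPICS_REGEX, not both" >&2
        return 1
    fi
    if [ -z "${TOPICS:-}" ] && [ -z "${TOPICS_REGEX:-}" ]; then
        echo "set one of TOPICS or TOPICS_REGEX" >&2
        return 1
    fi
    return 0
}

# Placeholder values; adjust to your cluster and backup location.
export BOOTSTRAP_SERVER="localhost:9092"
export TARGET_DIR="/path/to/backup/dir"
export TOPICS="topic1,topic2"
# The segment size is given in bytes: 512 MiB here instead of the 1 GiB default.
export MAX_SEGMENT_SIZE=$((512 * 1024 * 1024))

if validate_backup_env; then
    echo "configuration looks complete; backup-standalone.sh can be started"
fi
```

With the variables exported, `backup-standalone.sh` should be able to run without further arguments, since options can be supplied through the environment.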

Kafka Backup does not stop! The backup process is a continuous background job that runs forever, because Kafka models data as a stream without end. See Issue 52: Support point-in-time snapshots for more information.

Restore data

restore-standalone.sh --bootstrap-server localhost:9092 \
    --source-dir /path/to/backup/dir --topics 'topic1,topic2'

In Docker:

docker run -v /path/to/backup/dir:/kafka-backup/ --rm \
    kafka-backup:[LATEST_TAG] \
    restore-standalone.sh --bootstrap-server kafka:9092 \
    --source-dir /kafka-backup/ --topics 'topic1,topic2'

You can pass options via CLI arguments or using environment variables:

| Parameter | Type/required? | Description |
| --- | --- | --- |
| --bootstrap-server / BOOTSTRAP_SERVER | [REQUIRED] | The Kafka server to connect to |
| --source-dir / SOURCE_DIR | [REQUIRED] | Directory where the backup files are found |
| --topics / TOPICS | [REQUIRED] | List of topics to restore |
| --batch-size / BATCH_SIZE | | Batch size. Default: 1 MiB |
| --offset-file / OFFSET_FILE | | File where to store offsets. THIS FILE IS CRUCIAL FOR A CORRECT RESTORATION PROCESS. IF YOU LOSE IT, YOU NEED TO START THE BACKUP FROM SCRATCH. OTHERWISE YOU WILL HAVE DUPLICATE DATA. Default: [source-dir]/restore.offsets |
| --command-config / COMMAND_CONFIG | | Property file containing configs to be passed to Admin Client. Only useful if you have additional connection options |
| --help / HELP | | Prints this message |
| --debug / DEBUG | | Print debug information (if using the environment variable, set it to 'y') |
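
Because the offset file matters so much, it can help to check where it lives before starting or repeating a restore. The sketch below is not part of Kafka Backup: `resolve_offset_file` and `restore_mode` are hypothetical helpers. They assume only the documented default of `[source-dir]/restore.offsets` and treat a non-empty offset file as a sign that a previous restore already made progress. A temporary directory stands in for the backup directory.

```shell
# Hypothetical helpers; Kafka Backup does not ship these functions.

# Print the offset file path: OFFSET_FILE if set, otherwise the documented
# default [source-dir]/restore.offsets.
resolve_offset_file() {
    source_dir="${1%/}"          # strip one trailing slash, if any
    printf '%s\n' "${OFFSET_FILE:-$source_dir/restore.offsets}"
}

# Report whether saved offsets exist ("resume") or not ("fresh").
restore_mode() {
    offset_file="$(resolve_offset_file "$1")"
    if [ -s "$offset_file" ]; then
        echo "resume"
    else
        echo "fresh"
    fi
}

# Demo against a temporary directory standing in for the backup directory.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
unset OFFSET_FILE

echo "offset file: $(resolve_offset_file "$demo_dir/")"
echo "mode before first run: $(restore_mode "$demo_dir")"

# Simulate an offset file left behind by an earlier restore run.
echo "placeholder" > "$demo_dir/restore.offsets"
echo "mode after offsets exist: $(restore_mode "$demo_dir")"
```

A check like this is only a guard against accidentally restoring without the offset file. It does not inspect the file's contents, whose format is not described here.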

More Documentation

License

This project is licensed under the Apache License Version 2.0 (see LICENSE).

Contributors

itadventurer, wesselvs, gstuder-ona, lbmaster001, nickcharles, loffek, cschellenbach, jay7x
