instaclustr / cassandra-ttl-remover

Tool for rewriting SSTables to not contain TTLs

Home Page: https://instaclustr.com

Shell 1.44% Java 98.56%
cassandra ttl remove remover time-to-live sstable sstables netapp-public

cassandra-ttl-remover's Introduction

Cassandra TTL Remover

Tool for rewriting SSTables to remove TTLs

TTL remover removes TTL information from SSTables by rewriting them into new ones, so the data looks as if it never had an expiry and never will. This is handy for testing and debugging when you want data to be visible in Cassandra but it has expired in the underlying SSTables in the meantime. You can then load the rewritten SSTables back into Cassandra, e.g. with sstableloader.

The following Cassandra versions are supported:

  • Cassandra 2 line (2.2.19)

  • Cassandra 3 line (3.11.14)

  • Cassandra 4 line (4.0.7)

  • Cassandra 4.1 line (4.1.0)

Usage

You can either download the binaries from Maven Central or build the project yourself.

The project consists of these modules:

  • impl - the implementation of the CLI plugin

  • cassandra-{2,3,4} - the implementations of the TTL remover

  • buddy-agent - Byte Buddy agent used for Cassandra 3 and 4 TTL removal

Each remover strips TTLs from SSTables a little differently, depending on its Cassandra version. Cassandra is notoriously awkward to reuse from other projects, so we copy some Cassandra classes and rewrite the parts that cannot be used out of the box (this is particularly true for Cassandra 2.2.x).

The impl module contains the interface SSTableTTLRemover, which all Cassandra-specific modules implement. Java's SPI mechanism (ServiceLoader) loads the concrete remover simply because it is on the class path. Hence, to switch between Cassandra versions, place the matching implementation JAR on the class path and the impl module does the rest.
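The class path switch can be sketched as a small shell helper. This is only an illustration: the function name remover_classpath is hypothetical, and the JAR paths follow the layout used by the run.sh examples in this README (./impl/target and ./cassandra-N/target), which is an assumption about where your build output lives.

```shell
# Hypothetical helper: print the class path entries for a given Cassandra
# major version (2, 3 or 4). The JAR locations mirror the run.sh examples
# and are assumptions about your build layout.
remover_classpath() {
    version="$1"
    case "$version" in
        2|3|4)
            printf '%s:%s\n' \
                "./impl/target/ttl-remover.jar" \
                "./cassandra-$version/target/ttl-remover-cassandra-$version.jar"
            ;;
        *)
            echo "unsupported Cassandra version: $version" >&2
            return 1
            ;;
    esac
}

# Example: class path fragment for Cassandra 3
remover_classpath 3
```

Only one version-specific JAR ends up on the class path, so only one implementation is visible to the loader at a time.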

buddy-agent contains an agent that is used during Cassandra 3 and 4 TTL removal. Its purpose is to mock some static methods of DatabaseDescriptor. Normally this class is initialized when a Cassandra process starts, but we are not starting one. The removal logic uses these methods; without the agent we would need a proper database schema in $CASSANDRA_HOME/data, and the logic that reads and writes SSTables would, for example, also fetch data from system tables. This is not desirable. With this module, the only thing needed is a $CASSANDRA_HOME with its libs/jars so that we can populate the class path.

The released artifacts do not ship Cassandra itself. You have to set $CASSANDRA_HOME so that it points to the Cassandra installation whose SSTables you want to strip of TTL information.
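As a rough sketch of how a class path could be populated from $CASSANDRA_HOME, the helper below collects the JARs of an installation. The function name cassandra_classpath is hypothetical, and the lib/ subdirectory is an assumption based on the standard Cassandra tarball layout; the real run.sh may do this differently.

```shell
# Sketch: collect every JAR under <cassandra home>/lib into a class path
# string. Each entry is followed by ":" so that further JARs can be
# appended directly, as in CLASSPATH=$CLASSPATH./impl/target/ttl-remover.jar
# The lib/ location is an assumption about a standard Cassandra tarball.
cassandra_classpath() {
    home="$1"
    if [ -z "$home" ] || [ ! -d "$home/lib" ]; then
        echo "CASSANDRA_HOME is not set or has no lib directory" >&2
        return 1
    fi
    result=""
    for jar in "$home"/lib/*.jar; do
        [ -e "$jar" ] || continue   # the glob matched nothing
        result="$result$jar:"
    done
    printf '%s\n' "$result"
}

# Usage:
#   CLASSPATH=$(cassandra_classpath "$CASSANDRA_HOME") && export CLASSPATH
```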

run.sh script

We recommend using the run.sh helper script to remove TTLs. You are welcome to modify the end of the script to fit your case. At the bottom, you will see:

# $CLASSPATH is expected to be empty or to end with ":" so the JARs append cleanly
CLASSPATH=$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-2/target/ttl-remover-cassandra-2.jar

java -cp "$CLASSPATH" \
    $JVM_OPTS \
    com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI \
    --cassandra-version=2 \
    --sstables \
    /tmp/sstables2/test \
    --output-path \
    /tmp/stripped \
    --cassandra-yaml \
    $CASSANDRA_HOME/conf/cassandra.yaml \
    --cassandra-storage-dir \
    $CASSANDRA_HOME/data

For Cassandra 3 and 4, on the other hand, the command looks like this. Note that, unlike in the Cassandra 2 case, we use the Byte Buddy agent. We also pass a CQL statement: we do not have access to the data directory (it may be completely empty), but we need the table metadata for TTL removal, so we construct it programmatically.

java -javaagent:./buddy-agent/target/byte-buddy-agent.jar \
    -cp "$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-3/target/ttl-remover-cassandra-3.jar" \
    $JVM_OPTS \
    com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI \
    --cassandra-version=3 \
    --sstables \
    /tmp/original-3/test/test \
    --output-path \
    /tmp/stripped \
    --cql \
    'CREATE TABLE IF NOT EXISTS test.test (id uuid, name text, surname text, PRIMARY KEY (id)) WITH default_time_to_live = 10;'

All configuration options are listed below. You get this output by passing the help option right after the class name:

Usage: ttl-remove [-hV] [-c=[INTEGER]] [-d=[DIRECTORY]] [-f=[FILE]] -p=
                  [DIRECTORY] [-q=[CQL]] [-s=[DIRECTORY]] [-t=[FILE]]
command for removing TTL from SSTables
  -p, --output-path=[DIRECTORY]
                         Destination where SSTable will be generated.
  -f, --cassandra-yaml=[FILE]
                         Path to cassandra.yaml file for loading generated
                           SSTables to Cassandra, relevant only in case of
                           Cassandra 2.
  -d, --cassandra-storage-dir=[DIRECTORY]
                         Path to cassandra data dir, relevant only in case of
                           Cassandra 2.
  -s, --sstables=[DIRECTORY]
                         Path to a directory for which all SSTables will have
                           TTL removed.
  -t, --sstable=[FILE]   Path to .db file of a SSTable which will have TTL
                           removed.
  -c, --cassandra-version=[INTEGER]
                         Version of Cassandra to remove TTL for, might be 2, 3
                           or 4, defaults to 3
  -q, --cql=[CQL]        CQL statement which creates table we want to remove
                           TTL from. This has to be set in case
                           --cassandra-version is 3 or 4
  -h, --help             Show this help message and exit.
  -V, --version          Print version information and exit.

--cassandra-storage-dir is the directory holding all data/SSTables of your Cassandra installation, typically something like /var/lib/cassandra/data. It has to be specified explicitly.

--cassandra-yaml is the path to your cassandra.yaml; it has to be specified explicitly too.

Both --cassandra-storage-dir and --cassandra-yaml are only necessary for Cassandra 2 TTL removal.

Lastly, --output-path has to be specified as well; it is where the SSTables stripped of TTLs will be written.
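The option rules can be summarised in a small pre-flight check, sketched below. The function check_ttl_remover_args and its positional arguments are hypothetical and not part of the tool; it only mirrors the requirements stated in the help output and may be stricter or looser than what the CLI itself enforces.

```shell
# Hypothetical pre-flight check mirroring the rules in the help output:
#   every version  -> needs an output path
#   version 2      -> needs cassandra.yaml and the storage dir
#   version 3 or 4 -> needs a CQL CREATE TABLE statement
# Arguments: version output_path cassandra_yaml storage_dir cql
check_ttl_remover_args() {
    version="$1"; output_path="$2"; yaml="$3"; storage_dir="$4"; cql="$5"

    [ -n "$output_path" ] || { echo "--output-path is required" >&2; return 1; }

    case "$version" in
        2)
            [ -n "$yaml" ] || { echo "--cassandra-yaml is required for Cassandra 2" >&2; return 1; }
            [ -n "$storage_dir" ] || { echo "--cassandra-storage-dir is required for Cassandra 2" >&2; return 1; }
            ;;
        3|4)
            [ -n "$cql" ] || { echo "--cql is required for Cassandra 3 and 4" >&2; return 1; }
            ;;
        *)
            echo "--cassandra-version must be 2, 3 or 4" >&2
            return 1
            ;;
    esac
    return 0
}

# Usage:
#   check_ttl_remover_args 2 /tmp/stripped "$CASSANDRA_HOME/conf/cassandra.yaml" "$CASSANDRA_HOME/data" ""
```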

Load TTL-Removed SSTable to a New Cluster

  1. Create the keyspace and table of the target SSTable in the new cluster.

  2. In the source cluster, use the following command to load the TTL-removed SSTable into the new cluster.

    ./sstableloader -d <ip address of new cluster node> [path to the ttl-removed sstable folder]
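sstableloader takes the target keyspace and table from the last two components of the directory it is given, so the stripped SSTables need to sit in a keyspace/table directory. The sketch below shows what a given path would resolve to. The helper name loader_target is hypothetical, and it assumes plain directory names without a table ID suffix.

```shell
# Hypothetical helper: print "<keyspace>.<table>" as implied by the last
# two components of an SSTable directory path.
loader_target() {
    dir="${1%/}"                                # drop one trailing slash, if any
    table=$(basename "$dir")                    # last component
    keyspace=$(basename "$(dirname "$dir")")    # second to last component
    printf '%s.%s\n' "$keyspace" "$table"
}

loader_target /tmp/stripped/test/test   # prints test.test
```

Running this before sstableloader is a cheap way to confirm that the keyspace and table created in step 1 match the directory being loaded.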

Build

$ mvn clean install

To skip the tests, run mvn clean install -DskipTests.

Please make sure that $CASSANDRA_HOME is not set when building. The unit tests start an embedded Cassandra instance which sets its own "Cassandra home"; an externally set value would confuse the tests, as they would pick up a different Cassandra home.
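If $CASSANDRA_HOME is normally exported in your shell, one way to build without touching your environment is to unset it in a subshell only. A minimal sketch; the wrapper name without_cassandra_home is hypothetical:

```shell
# Run a command in a subshell with CASSANDRA_HOME removed from the
# environment; the calling shell keeps its own value.
without_cassandra_home() {
    (
        unset CASSANDRA_HOME
        "$@"
    )
}

# Usage:
#   without_cassandra_home mvn clean install
```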

Further Information

See Danyang Li’s blog ["TTLRemover: Tool for Removing Cassandra TTLs for Recovery and Testing Purposes"](https://www.instaclustr.com/ttlremover-tool-for-removing-cassandra-ttls-for-recovery-and-testing-purposes/)

cassandra-ttl-remover's Issues

Support 3.11.x

Two PRs are necessary, one for each C* branch, supporting 3.0 and 3.11.

Error in TTL Remover Script

Hi Team,

There is a typo in the TTL remover script.
"$JAVA" $JAVA_AGENT -cp "$CLASSPATH" $JVM_OPTS -Dstorage-config="$CASSANDRA_CONF"
-Dcassandra.storagedir="$cassandra_storagedir"
-Dlogback.configurationFile=logback-tools.xml
org.apach

It should be
"$JAVA" $JAVA_AGENT -cp "$CLASSPATH" $JVM_OPTS -Dstorage-config="$CASSANDRA_CONF"
-Dcassandra.storagedir="/opt/cassandra/"
-Dlogback.configurationFile=logback-tools.xml
org.apache.cassandra.noTTL.TTLRemover

Also, after making the change, it gives the following error:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/cli/ParseException
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.cli.ParseException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more

TTLRemover does not compile with Cassandra 2.2

Hello,

This is a very useful and instructive piece of software, thank you for sharing it.

However, I followed the instructions in the README and it did not compile:

    [javac] /home/ferran/TTLRemover/cassandra-2.2-src/src/java/org/apache/cassandra/noTTL/NoTTLReader.java:1958: error: cannot find symbol
    [javac]         return sstableMetadata.replayPosition;
    [javac]                               ^
    [javac]   symbol:   variable replayPosition
    [javac]   location: variable sstableMetadata of type StatsMetadata

And the compiler is right: this variable does not exist on 2.2 (but it exists on 1.6).

To quickly work around the issue, I just deleted the code that references it (noTTL/NoTTLReader.java:1958):

    public ReplayPosition getReplayPosition()
    {
        return sstableMetadata.replayPosition;
    }

And it works!

Exception in thread "main" org.apache.cassandra.db.KeyspaceNotDefinedException: Keyspace system_schema does not exist

Hi Team,

While executing TTL remover, I get the error below.

./TTLRemover /opt/cassandra/data/feb_data/table_1-9fbb0adc48f711e8a8b7e55aaace1278/ mc-56-big-Data.db -p /opt/cassandra/data/feb_ttl_drop/

Error:

Exception in thread "main" org.apache.cassandra.db.KeyspaceNotDefinedException: Keyspace system_schema does not exist
at org.apache.cassandra.thrift.ThriftValidation.validateKeyspace(ThriftValidation.java:84)
at org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily(ThriftValidation.java:108)
at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:893)
at org.apache.cassandra.cql3.statements.SelectStatement$RawStatement.prepare(SelectStatement.java:888)
at org.apache.cassandra.cql3.QueryProcessor.getStatement(QueryProcessor.java:515)
at org.apache.cassandra.cql3.QueryProcessor.parseStatement(QueryProcessor.java:224)
at org.apache.cassandra.cql3.QueryProcessor.prepareInternal(QueryProcessor.java:268)
at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:276)
at org.apache.cassandra.schema.SchemaKeyspace.query(SchemaKeyspace.java:1239)
at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:861)
at org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:853)
at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:136)
at org.apache.cassandra.noTTL.TTLRemover.main(TTLRemover.java:160)

TTL Remover not working in Cassandra 4.

run.sh for Cassandra 4:

# For Cassandra 3 and 4.0

CLASSPATH=$CLASSPATH./impl/target/ttl-remover.jar:./cassandra-4/target/ttl-remover-cassandra-4.jar

# change versions of jars on classpath to target 3 or 4
# change --cassandra-version if necessary

java -javaagent:/opt/cassandra-ttl-remover/buddy-agent/target/byte-buddy-agent.jar \
    -cp "/opt/cassandra-ttl-remover/impl/target/ttl-remover.jar:/opt/cassandra-ttl-remover/cassandra-4/target/ttl-remover-cassandra-4.jar" \
    $JVM_OPTS \
    com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI \
    --cassandra-version=4 \
    --sstables \
    /var/lib/cassandra/data/cycling \
    --output-path \
    /var/lib/cassandra/data/cycling/stripped \
    --cql \
    'CREATE TABLE IF NOT EXISTS test.test (id uuid, name text, surname text, PRIMARY KEY (id)) WITH default_time_to_live = 10;'

=============================================================================
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.0.0 | Cassandra 4.0.7 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh> show version;
[cqlsh 6.0.0 | Cassandra 4.0.7 | CQL spec 3.4.5 | Native protocol v5]
cqlsh>

Below is the error log from running run.sh.

root@:/opt/cassandra-ttl-remover# sh run.sh
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Exception in thread "main" java.util.ServiceConfigurationError: com.instaclustr.cassandra.ttl.SSTableTTLRemover: com.instaclustr.cassandra.ttl.Cassandra4TTLRemover Unable to get public no-arg constructor
at java.base/java.util.ServiceLoader.fail(ServiceLoader.java:582)
at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:673)
at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1233)
at java.base/java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
at java.base/java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
at java.base/java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
at java.base/java.util.Iterator.forEachRemaining(Iterator.java:132)
at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.getTTLRemover(TTLRemoverCLI.java:134)
at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.run(TTLRemoverCLI.java:100)
at picocli.CommandLine.executeUserObject(CommandLine.java:1919)
at picocli.CommandLine.access$1100(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2332)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2326)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2291)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2159)
at picocli.CommandLine.execute(CommandLine.java:2058)
at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.execute(TTLRemoverCLI.java:119)
at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:77)
at com.instaclustr.cassandra.ttl.cli.TTLRemoverCLI.main(TTLRemoverCLI.java:73)
Caused by: java.lang.NoClassDefFoundError: org/apache/cassandra/db/lifecycle/ILifecycleTransaction
at java.base/java.lang.Class.getDeclaredConstructors0(Native Method)
at java.base/java.lang.Class.privateGetDeclaredConstructors(Class.java:3137)
at java.base/java.lang.Class.getConstructor0(Class.java:3342)
at java.base/java.lang.Class.getConstructor(Class.java:2151)
at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:660)
at java.base/java.util.ServiceLoader$1.run(ServiceLoader.java:657)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/java.util.ServiceLoader.getConstructor(ServiceLoader.java:668)
... 23 more
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.db.lifecycle.ILifecycleTransaction
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)

=================================================================
echo $JAVA_HOME
/opt/zulu11.60.19-ca-jdk11.0.17-linux_x64
root@# echo $CASSANDRA_HOME

root@:/opt/cassandra-ttl-remover# echo $CLASSPATH
/opt/cassandra-ttl-remover-1.1.2/impl/target/ttl-remover.jar:/opt/cassandra-ttl-remover-1.1.2/cassandra-4/target/ttl-remover-cassandra-4.jar
