cassandra-sstable-tools

Compile

$ ant -Dcassandra.version=2.x.y

Install

Copy ic-sstable-tools.jar to /usr/share/cassandra

Copy the bin/ic-* files into a directory on your $PATH

Documentation

Command      Description
ic-summary   Summary information about all column families, including how much of the data is repaired
ic-sstables  Print out metadata for sstables that belong to a column family
ic-pstats    Partition size statistics for a column family
ic-cfstats   Detailed statistics about cells in a column family
ic-purge     Statistics about reclaimable data for a column family

ic-summary

Provides summary information about all column families. Useful for finding the largest column families and how much data has been repaired by incremental repairs.

Usage

ic-summary

Output

Column         Description
Keyspace       Keyspace the column family belongs to
Column Family  Name of column family
SSTables       Number of sstables on this node for the column family
Disk Size      Compressed size on disk for this node
Data Size      Uncompressed size of the data for this node
Last Repaired  Time of the last incremental repair
Repair %       Percentage of data marked as repaired by incremental repair
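
As a rough illustration of how a figure like Repair % could be derived, the sketch below divides the bytes held in repaired sstables by the total. It is not taken from the tool's source: the `SSTable` record and `repaired_percent` function are hypothetical names, and it assumes that an unrepaired sstable carries a repaired-at value of 0 and that sizes are compared in bytes.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    """Hypothetical record holding the two fields the calculation needs."""
    size_bytes: int
    repaired_at: int  # assumed: 0 means the sstable has not been repaired


def repaired_percent(sstables):
    """Return the percentage of bytes that live in repaired sstables."""
    total = sum(s.size_bytes for s in sstables)
    if total == 0:
        return 0.0  # no data at all; avoid dividing by zero
    repaired = sum(s.size_bytes for s in sstables if s.repaired_at > 0)
    return 100.0 * repaired / total


tables = [
    SSTable(size_bytes=300, repaired_at=1_450_000_000_000),
    SSTable(size_bytes=100, repaired_at=0),
]
print(repaired_percent(tables))  # 75.0
```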

ic-sstables

Prints sstable metadata for a column family. Useful for tuning compaction settings.

Usage

ic-sstables <keyspace> <column-family>

Output

Column              Description
SSTable             Data.db filename of the sstable
Disk Size           Size of the sstable on disk
Total Size          Uncompressed size of the data contained in the sstable
Min Timestamp       Minimum cell timestamp contained in the sstable
Max Timestamp       Maximum cell timestamp contained in the sstable
Duration            Time span between the minimum and maximum cell timestamps
Level               Level of the sstable under leveled compaction
Keys                Number of partition keys
Avg Partition Size  Average partition size
Max Partition Size  Maximum partition size
Avg Column Count    Average number of columns in a partition
Max Column Count    Maximum number of columns in a partition
Droppable           Estimated droppable tombstones
Repaired At         Time when the sstable was marked as repaired by incremental repair
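
The Duration column is the difference between the Max Timestamp and Min Timestamp columns. A minimal sketch of that arithmetic follows, assuming the cell timestamps are microseconds since the Unix epoch (the usual Cassandra convention, although clients can supply their own values). The function name `timestamp_span` is hypothetical.

```python
from datetime import timedelta


def timestamp_span(min_timestamp_us, max_timestamp_us):
    """Return the span between two cell timestamps as a timedelta.

    Assumes both values are microseconds since the Unix epoch.
    """
    if max_timestamp_us < min_timestamp_us:
        raise ValueError("max timestamp is earlier than min timestamp")
    return timedelta(microseconds=max_timestamp_us - min_timestamp_us)


span = timestamp_span(1_450_000_000_000_000, 1_450_086_400_000_000)
print(span)  # 1 day, 0:00:00
```

A long span in a single sstable means it mixes old and new writes, which is one of the signals to look at when tuning compaction.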

ic-pstats

Tool for finding the largest partitions. It reads only the Index.db files, so it is relatively quick.

Usage

ic-pstats [-n <num>] [-t <snapshot>] [-f <filter>] <keyspace> <column-family>
  -h  Display help
  -b  Batch mode. Uses a progress indicator that is friendly for running in batch jobs.
  -n  Number of partitions to display
  -t  Snapshot to analyse. A snapshot is created if none is specified.
  -f  Comma-separated list of Data.db sstables to filter on
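
When the tool is driven from a scheduled job, it can help to assemble the argument list in code. The helper below is a hypothetical sketch that uses only the options documented above; the keyspace and column family names are placeholders.

```python
def pstats_command(keyspace, column_family, num=None, snapshot=None,
                   sstables=None, batch=False):
    """Build an argument list for ic-pstats from the documented options."""
    args = ["ic-pstats"]
    if batch:
        args.append("-b")
    if num is not None:
        args += ["-n", str(num)]
    if snapshot is not None:
        args += ["-t", snapshot]
    if sstables:
        args += ["-f", ",".join(sstables)]  # -f takes a comma-separated list
    return args + [keyspace, column_family]


cmd = pstats_command("my_keyspace", "my_table", num=20, batch=True)
print(" ".join(cmd))  # ic-pstats -b -n 20 my_keyspace my_table
```

On a node where the tools are installed, the list can be passed to `subprocess.run(cmd, check=True)`.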

Output

Summary: Summary statistics about partitions

Column             Description
Count (Size)       Number of partition keys on this node
Total (Size)       Total uncompressed size of all partitions on this node
Total (SSTable)    Number of sstables on this node
Minimum (Size)     Minimum uncompressed partition size
Minimum (SSTable)  Minimum number of sstables a partition belongs to
Maximum (Size)     Maximum uncompressed partition size
Maximum (SSTable)  Maximum number of sstables a partition belongs to
Average (Size)     Average (mean) uncompressed partition size
Average (SSTable)  Average (mean) number of sstables a partition belongs to

Largest partitions: The top N largest partitions

Column         Description
Key            The partition key
Size           Total uncompressed size of the partition
SSTable Count  Number of sstables that contain the partition

SSTable Leaders: The top N partitions that belong to the most sstables

Column         Description
Key            The partition key
SSTable Count  Number of sstables that contain the partition
Size           Total uncompressed size of the partition
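
The two top-N lists above rank the same per-partition totals in different ways: one by size, the other by the number of sstables that contain the key. The sketch below shows that aggregation on made-up data. It is an illustration, not the tool's implementation: the `(sstable, key, size_bytes)` tuples are a hypothetical stand-in for whatever is read from the Index.db files.

```python
import heapq
from collections import defaultdict


def partition_leaders(entries, n):
    """Aggregate per-sstable entries and return the two top-N lists.

    `entries` is an iterable of (sstable, key, size_bytes) tuples.
    """
    size = defaultdict(int)      # key -> total size across sstables
    sstables = defaultdict(set)  # key -> sstables containing the key
    for sstable, key, size_bytes in entries:
        size[key] += size_bytes
        sstables[key].add(sstable)

    rows = [(key, size[key], len(sstables[key])) for key in size]
    largest = heapq.nlargest(n, rows, key=lambda r: r[1])
    most_sstables = heapq.nlargest(n, rows, key=lambda r: r[2])
    return largest, most_sstables


entries = [
    ("sstable-1", "user:1", 500),
    ("sstable-2", "user:1", 200),
    ("sstable-3", "user:1", 100),
    ("sstable-1", "user:2", 900),
    ("sstable-2", "user:3", 50),
]
largest, most_sstables = partition_leaders(entries, n=1)
print(largest)        # [('user:2', 900, 1)]
print(most_sstables)  # [('user:1', 800, 3)]
```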

SSTables: Metadata about sstables as it relates to partitions.

Column              Description
SSTable             Data.db filename of the sstable
Size                Uncompressed size
Min Timestamp       Minimum cell timestamp in the sstable
Max Timestamp       Maximum cell timestamp in the sstable
Level               Level of the sstable under leveled compaction
Partitions          Number of partition keys in the sstable
Avg Partition Size  Average uncompressed partition size in the sstable
Max Partition Size  Maximum uncompressed partition size in the sstable

ic-cfstats

Tool for getting detailed cell statistics that can help identify issues with the data model.

Usage

ic-cfstats [-r <limit>] [-n <num>] [-t <snapshot>] [-f <filter>] <keyspace> <column-family>
  -h  Display help
  -b  Batch mode. Uses a progress indicator that is friendly for running in batch jobs.
  -r  Limit read throughput to <limit> MB/s
  -n  Number of partitions to display
  -t  Snapshot to analyse. A snapshot is created if none is specified.
  -f  Comma-separated list of Data.db sstables to filter on

Output

Summary: Summary statistics about partitions

Column             Description
Count (Size)       Number of partition keys on this node
Total (Size)       Total uncompressed size of all partitions on this node
Total (SSTable)    Number of sstables on this node
Minimum (Size)     Minimum uncompressed partition size
Minimum (SSTable)  Minimum number of sstables a partition belongs to
Maximum (Size)     Maximum uncompressed partition size
Maximum (SSTable)  Maximum number of sstables a partition belongs to
Average (Size)     Average (mean) uncompressed partition size
Average (SSTable)  Average (mean) number of sstables a partition belongs to

Largest partitions: Partitions with largest uncompressed size

Column         Description
Key            The partition key
Size           Total uncompressed size of the partition
Tombstones     Number of cell or range tombstones
(droppable)    Number of tombstones that can be dropped as per gc_grace_seconds
Cells          Number of cells in the partition
SSTable Count  Number of sstables that contain the partition
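
The (droppable) figure depends on gc_grace_seconds. As a simplified sketch of that rule, the code below counts a tombstone as droppable once the grace period has fully elapsed since its deletion time. It deliberately ignores the other conditions a real compaction checks, such as overlapping sstables, and the function names are hypothetical.

```python
def is_droppable(local_deletion_time, gc_grace_seconds, now):
    """True once gc_grace_seconds have fully elapsed since the deletion.

    All three arguments are in seconds; `now` is passed in for testability.
    """
    gc_before = now - gc_grace_seconds
    return local_deletion_time < gc_before


def count_droppable(deletion_times, gc_grace_seconds, now):
    """Count tombstones whose grace period has expired."""
    return sum(is_droppable(t, gc_grace_seconds, now) for t in deletion_times)


now = 1_000_000
gc_grace = 864_000  # 10 days, the Cassandra default for gc_grace_seconds
deletions = [now - 900_000, now - 864_000, now - 10]
print(count_droppable(deletions, gc_grace, now))  # 1
```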

Widest partitions: Partitions with the most cells

Column         Description
Key            The partition key
Cells          Number of cells in the partition
Tombstones     Number of cell or range tombstones
(droppable)    Number of tombstones that can be dropped as per gc_grace_seconds
Size           Total uncompressed size of the partition
SSTable Count  Number of sstables that contain the partition

Tombstone Leaders: Partitions with the most tombstones

Column         Description
Key            The partition key
Tombstones     Number of cell or range tombstones
(droppable)    Number of tombstones that can be dropped as per gc_grace_seconds
Cells          Number of cells in the partition
Size           Total uncompressed size of the partition
SSTable Count  Number of sstables that contain the partition

SSTable Leaders: Partitions that are in the most sstables

Column         Description
Key            The partition key
SSTable Count  Number of sstables that contain the partition
Size           Total uncompressed size of the partition
Cells          Number of cells in the partition
Tombstones     Number of cell or range tombstones
(droppable)    Number of tombstones that can be dropped as per gc_grace_seconds

SSTables: Metadata about sstables as it relates to partitions.

Column         Description
SSTable        Data.db filename of the sstable
Size           Uncompressed size
Min Timestamp  Minimum cell timestamp in the sstable
Max Timestamp  Maximum cell timestamp in the sstable
Partitions     Number of partitions
(deleted)      Number of row-level partition deletions
(avg size)     Average uncompressed partition size in the sstable
(max size)     Maximum uncompressed partition size in the sstable
Cells          Number of cells in the sstable
Tombstones     Number of cell or range tombstones in the sstable
(droppable)    Number of tombstones that are droppable according to gc_grace_seconds
(range)        Number of range tombstones
Cell Liveness  Percentage of live cells, calculated as the ratio of non-tombstoned cells to the total number of cells. Does not account for tombstones or cell updates that shadow other cells.
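
Cell Liveness, as described above, is a plain ratio. A minimal sketch, assuming the cell count includes tombstoned cells and that only cell tombstones (not range tombstones) are subtracted; both are assumptions, and `cell_liveness` is a hypothetical name:

```python
def cell_liveness(cells, cell_tombstones):
    """Percentage of cells that are not tombstones.

    Assumes `cells` is the total cell count, tombstoned cells included.
    """
    if cells <= 0:
        raise ValueError("cells must be positive")
    return 100.0 * (cells - cell_tombstones) / cells


print(cell_liveness(cells=2000, cell_tombstones=500))  # 75.0
```

Because shadowed cells are still counted as live, the real share of readable data can be lower than this figure suggests.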

ic-purge

Finds the largest reclaimable (GCable) partitions. This is an intensive process: it effectively performs "fake" compactions to calculate the metrics.

Usage

ic-purge [-r <limit>] [-n <num>] [-t <snapshot>] [-f <filter>] <keyspace> <column-family>
  -h  Display help
  -b  Batch mode. Uses a progress indicator that is friendly for running in batch jobs.
  -r  Limit read throughput to <limit> MB/s
  -n  Number of partitions to display
  -t  Snapshot to analyse. A snapshot is created if none is specified.

Output

Largest reclaimable partitions: Partitions with the largest amount of reclaimable data

Column       Description
Key          The partition key
Size         Total uncompressed size of the partition
Reclaim      Reclaimable uncompressed size
Generations  SSTable generations the partition belongs to

Contributors

grom358