whisper-migrator's Introduction

whisper-migrator

A tool for migrating data from Graphite Whisper files to InfluxDB TSM files (version 0.10.0).

This tool can be used in three modes:

  1. Get whisper file information. This option displays the number of points in the file and the oldest timestamp in the file.

    migration.go -wspinfo -wspPath=whisper folder

  2. Write to InfluxDB using the Go client, clientv2. This option uses the InfluxDB Go client (clientv2) and migrates data by calling the HTTP API. It can be invoked as

    migration.go -option=ClientV2 -wspPath=whisper folder -from=<2015-11-01> -until=<2015-12-30> -dbname=migrated -host=http://localhost -port=8086 -retentionPolicy=default -tagconfig=config.json

  3. Write to InfluxDB using TSMWriter. This option uses TSMWriter and creates .tsm files directly in the InfluxDB data folder. It writes the Graphite data faster than the ClientV2 option. It can be invoked as follows

    migration.go -option=TSMW -wspPath=whisper folder -influxDataDir=influx data folder -from=<2015-11-01> -until=<2015-12-30> -dbname=migrated -retentionPolicy=default -tagconfig=config.json

    The influxd daemon process must be restarted to see the migrated data.

Tag Config file

This file is required to specify tags and measurement name for a given pattern. Please see the sample tagconfig file, migration_config.json
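
As a rough illustration of the file's shape, the sketch below decodes a config with the keys that appear in the sample configs quoted in the issues on this page (pattern, measurement, tags with tagkey/tagvalue, field). The Go type names, field names, and the parseTagConfigs helper are assumptions made for this example, not necessarily what migration.go uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// TagKeyValue and TagConfig mirror the JSON keys seen in the sample configs.
// The Go names are illustrative assumptions.
type TagKeyValue struct {
	TagKey   string `json:"tagkey"`
	TagValue string `json:"tagvalue"`
}

type TagConfig struct {
	Pattern     string        `json:"pattern"`
	Measurement string        `json:"measurement"`
	Tags        []TagKeyValue `json:"tags"`
	Field       string        `json:"field"`
}

// sampleConfig is copied from one of the issue reports on this page.
const sampleConfig = `[
  {
    "pattern": "TEST.2013.#TEXT1.#TEXT2",
    "measurement": "#TEXT2",
    "tags": [{"tagkey": "host", "tagvalue": "#TEXT1"}],
    "field": "value"
  }
]`

// parseTagConfigs is a hypothetical helper that decodes the config array.
func parseTagConfigs(raw []byte) ([]TagConfig, error) {
	var configs []TagConfig
	if err := json.Unmarshal(raw, &configs); err != nil {
		return nil, err
	}
	return configs, nil
}

func main() {
	configs, err := parseTagConfigs([]byte(sampleConfig))
	if err != nil {
		log.Fatal(err)
	}
	for _, c := range configs {
		fmt.Printf("pattern=%s measurement=%s tags=%v field=%s\n",
			c.Pattern, c.Measurement, c.Tags, c.Field)
	}
}
```

Each #TEXTn placeholder in the pattern stands for one dot-separated segment of the metric name and can be reused in the measurement or in a tag value.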

whisper-migrator's People

Contributors

pauldix, uttamgandhi

whisper-migrator's Issues

cannot use time.Unix as type int64 in argument to tsm1.NewValue

The whisper migrator doesn't appear to run on my FreeBSD system

# env GOPATH=`pwd` go get github.com/influxdata/whisper-migrator
# env GOPATH=`pwd` go run src/github.com/influxdata/whisper-migrator/migration.go
# command-line-arguments
src/github.com/influxdata/whisper-migrator/migration.go:474: cannot use time.Unix(int64(wspPoint.Timestamp), 0) (type time.Time) as type int64 in argument to tsm1.NewValue

issues running migration.go

I am getting this error when I run the migration go file. I run this command with parameters: migration.go -option=ClientV2 -wspPath=<Whisper_servername>:/data/new/graphite/storage/whisper/ -from=2019-01-01 -until=2019-02-27 -dbname=migrated -host=http://<influxdb_servername> -port=8086, -retentionPolicy=default -tagconfig=config.json -username=myuser,-password=mypassword

But I get this error.

command-line-arguments

./migration.go:340:10: undefined: client.NewHTTPClient
./migration.go:340:31: undefined: client.HTTPConfig
./migration.go:346:19: undefined: client.NewQuery
./migration.go:353:11: undefined: client.NewBatchPoints
./migration.go:353:33: undefined: client.BatchPointsConfig
./migration.go:365:12: undefined: client.NewPoint
./migration.go:371:11: undefined: client.NewQuery
./migration.go:394:26: undefined: client.NewQuery
./migration.go:515:38: cannot use tsmPoint.key (type string) as type []byte in argument to tsmWriter.Write
./migration.go:645:10: undefined: client.NewHTTPClient
./migration.go:645:10: too many errors

I was able to fix the missing packages but not sure how to fix this to start my migration to influxdb

Migrator seems to no longer work.

durr@graphical:~/whisper-migrator$ go run migration.go -wspinfo -wspPath=/tank/carbon/whisper/
# command-line-arguments
./migration.go:515:38: cannot use tsmPoint.key (type string) as type []byte in argument to tsmWriter.Write

I'm not familiar with go at all, so I can't really help fix.

need to have graphite storage finder

This might not be the best place for the issue, but this project deals with whisper/graphite, so it might be.

AFAIK, graphite still has certain advantages over influx when it comes to querying data from different series (I'm thinking specifically of diffSeries or divideSeries). If there were an influx storage backend (http://graphite.readthedocs.org/en/latest/storage-backends.html), you could have the best of both worlds in tools like grafana (which is great because I can mix and match storage engines based on the need). For straightforward queries you could use influx directly. In other cases, where graphite offers more query manipulation, you could use graphite with an influx finder.

thanks!

WriteTSMPoints: initialization of the mtf.Tags needs to be outside the for-loop

Now it looks like

for i := 1; i < len(patternStr)-1; i++ {
  patternTagValue := strings.Trim(patternStr[i], ".")
  //For each # string, find a match in tag values
  mtf.Tags = make([]TagKeyValue, len(tagConfig.Tags))
  for j, tagkeyvalue := range tagConfig.Tags {
    ...
  }
  ...
}

and we have an empty mtf.Tags every iteration. So if we have more than 1 tag in our JSON-config we will get only the last of them and the others will be empty.

I think, that you need to rewrite it like this

mtf.Tags = make([]TagKeyValue, len(tagConfig.Tags))
for i := 1; i < len(patternStr)-1; i++ {
  patternTagValue := strings.Trim(patternStr[i], ".")
  //For each # string, find a match in tag values
  for j, tagkeyvalue := range tagConfig.Tags {
    ...
  }
  ...
}
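
The effect described in this report can be reproduced in isolation. The sketch below is a simplified stand-in for the loop, not the actual WriteTSMPoints code: fillTags and its parameters are hypothetical, and each outer iteration fills exactly one tag slot. Allocating the slice inside the loop discards every slot filled by earlier iterations.

```go
package main

import "fmt"

// TagKeyValue is a minimal stand-in for the type named in the report.
type TagKeyValue struct {
	Tagkey   string
	Tagvalue string
}

// fillTags fills one tag slot per iteration. With allocInside set, the slice
// is re-created on every iteration, as in the reported code.
func fillTags(keys, values []string, allocInside bool) []TagKeyValue {
	var tags []TagKeyValue
	if !allocInside {
		tags = make([]TagKeyValue, len(keys)) // allocate once, before the loop
	}
	for i := range values {
		if allocInside {
			tags = make([]TagKeyValue, len(keys)) // wipes slots filled earlier
		}
		tags[i] = TagKeyValue{Tagkey: keys[i], Tagvalue: values[i]}
	}
	return tags
}

func main() {
	keys := []string{"tier", "host", "metric"}
	values := []string{"dev", "asterix", "cpu-idle"}

	fmt.Printf("allocated inside loop:  %+v\n", fillTags(keys, values, true))
	fmt.Printf("allocated outside loop: %+v\n", fillTags(keys, values, false))
}
```

With the allocation inside the loop only the last slot keeps its key and value; with the allocation hoisted out, all three tags survive.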

Compatibility with InfluxDB 2.0

I cannot find a way to migrate my graphite-whisper db to a new (dockerized) InfluxDB 2.0 database. I'm currently running the Influx CLI client 2.0.4.
Is the whisper-migrator tool able to do so? The InfluxDB 2.0 API now requires auth tokens, and I didn't find any way to provide this parameter in the command-line syntax.
Or do I have to, for instance, first export the graphite DB in line-protocol format (is that possible?) and then import it into InfluxDB 2.0? I've already used this trick (export in line protocol and then import it back into InfluxDB 2.0) for another DB coming from a previous InfluxDB 1.8 setup.
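
The thread does not say whether the tool supports InfluxDB 2.0. For the line-protocol route mentioned above, the sketch below shows one way a single whisper point could be rendered as a line-protocol record. The toLineProtocol helper and its parameters are hypothetical, reading the whisper file is not shown, and the timestamp is assumed to be Unix seconds that must be written with nanosecond precision. Escaping covers commas, equals signs, and spaces only.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// Line protocol requires escaping of commas and spaces in measurement names,
// and of commas, equals signs, and spaces in tag keys, tag values, and field keys.
var (
	measurementEscaper = strings.NewReplacer(",", `\,`, " ", `\ `)
	keyEscaper         = strings.NewReplacer(",", `\,`, "=", `\=`, " ", `\ `)
)

// toLineProtocol renders one float point; unixSeconds is converted to nanoseconds.
func toLineProtocol(measurement string, tags map[string]string, field string, value float64, unixSeconds uint32) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // sorted tag keys give deterministic output

	var b strings.Builder
	b.WriteString(measurementEscaper.Replace(measurement))
	for _, k := range keys {
		b.WriteString(",")
		b.WriteString(keyEscaper.Replace(k))
		b.WriteString("=")
		b.WriteString(keyEscaper.Replace(tags[k]))
	}
	b.WriteString(" ")
	b.WriteString(keyEscaper.Replace(field))
	b.WriteString("=")
	b.WriteString(strconv.FormatFloat(value, 'f', -1, 64))
	b.WriteString(" ")
	b.WriteString(strconv.FormatInt(int64(unixSeconds)*1e9, 10))
	return b.String()
}

func main() {
	line := toLineProtocol("CPU", map[string]string{"host": "TEST-2013-1142"}, "value", 0.5, 1478518800)
	fmt.Println(line)
}
```

A file of such lines could then be handed to whatever import path the target InfluxDB version provides.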

Tag matching doesn't work

With this config:

[
  {
    "pattern": "collectd.#TEXT1.#TEXT2.aggregation-cpu-average.#TEXT3",
    "measurement": "aggregation-cpu-average",
    "tags": [
      {
        "tagkey": "tier",
        "tagvalue": "#TEXT1"
      },
      {
        "tagkey": "host",
        "tagvalue": "#TEXT2"
      },
      {
        "tagkey": "metric",
        "tagvalue": "#TEXT3"
      }
    ],
    "field": "value"
  }
]

And this whisper path:

/data/whisper/collectd/dev/asterix/aggregation-cpu-average/cpu-idle.wsp

I get this:

Whisper File /data/whisper/collectd/dev/asterix/aggregation-cpu-average/cpu-idle.wsp
TSM Key-> cpu-idle,tier=dev,=,=#!~#value

"measurement" is ignored and only one tag is matched.

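For reference, the result the reporter presumably expects (measurement aggregation-cpu-average with tier, host, and metric tags) can be sketched as a segment-by-segment match. This is not the tool's implementation: metricFromPath, matchPattern, and resolve are hypothetical helpers, and the whisper root /data/whisper is assumed from the path shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// metricFromPath turns a .wsp file path into a dotted metric name.
func metricFromPath(root, file string) string {
	rel := strings.TrimSuffix(strings.TrimPrefix(file, root), ".wsp")
	return strings.ReplaceAll(strings.Trim(rel, "/"), "/", ".")
}

// matchPattern compares pattern and metric segment by segment. Segments that
// start with '#' are placeholders and capture the corresponding metric segment.
func matchPattern(pattern, metric string) (map[string]string, bool) {
	p := strings.Split(pattern, ".")
	m := strings.Split(metric, ".")
	if len(p) != len(m) {
		return nil, false
	}
	captures := make(map[string]string)
	for i, seg := range p {
		if strings.HasPrefix(seg, "#") {
			captures[seg] = m[i]
		} else if seg != m[i] {
			return nil, false
		}
	}
	return captures, true
}

// resolve replaces a placeholder with its captured value and leaves literals as-is.
func resolve(value string, captures map[string]string) string {
	if v, ok := captures[value]; ok {
		return v
	}
	return value
}

func main() {
	metric := metricFromPath("/data/whisper",
		"/data/whisper/collectd/dev/asterix/aggregation-cpu-average/cpu-idle.wsp")
	captures, ok := matchPattern("collectd.#TEXT1.#TEXT2.aggregation-cpu-average.#TEXT3", metric)
	if !ok {
		fmt.Println("no match")
		return
	}
	fmt.Println("measurement:", resolve("aggregation-cpu-average", captures))
	fmt.Println("tier:", resolve("#TEXT1", captures))
	fmt.Println("host:", resolve("#TEXT2", captures))
	fmt.Println("metric:", resolve("#TEXT3", captures))
}
```
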
Migration with Option TSMW : fatal : "from time is not less than until time"

I have many wsp files and don't know the exact date of the oldest data.
I only know it is around February 2013, so I use the following command:
./migration -option=TSMW -wspPath=TEST -from="2013-02-01" -until="2016-01-01" -dbname=migrated -influxDataDir=/data/db -tagconfig=migration_config.json

When I launch this command, the migrator ends with a fatal error:
log.Fatal(err) in the function MapWSPToTSMByWhisperFile().

In MapWSPToTSMByShard, the first shard processed has from=2013-02-01 and until=2013-02-04.
The FetchUntilTime function from whisper.go (in "github.com/uttamgandhi24/whisper-go/whisper") is called with these parameters.
It changes the "from" value to 2013-02-21 because my wsp files contain no earlier data. At this point "from" is 2013-02-21 and "until" is 2013-02-04, so whisper.go returns an error and the migrator stops.

With some debugging I found that I have to use a "from" of 2013-02-21, but it took me some time ;-)

Maybe it would be possible to ignore this error ("from time is not less than until time"), or to tell the user which "from" value to use?
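
One way the suggestion could look, as a minimal sketch: clamp each shard window to the oldest timestamp in the file and skip windows that end up empty instead of aborting. The clampWindow and day helpers are hypothetical and do not exist in the migrator or in whisper-go; the dates are the ones from this report.

```go
package main

import (
	"fmt"
	"time"
)

// day is a small helper for building UTC dates in this example.
func day(year int, month time.Month, d int) time.Time {
	return time.Date(year, month, d, 0, 0, 0, 0, time.UTC)
}

// clampWindow moves from forward to oldest when the window starts before any
// data exists. ok is false when the resulting window contains no data at all.
func clampWindow(from, until, oldest time.Time) (time.Time, time.Time, bool) {
	if from.Before(oldest) {
		from = oldest
	}
	if !from.Before(until) {
		return from, until, false
	}
	return from, until, true
}

func main() {
	oldest := day(2013, time.February, 21)
	windows := [][2]time.Time{
		{day(2013, time.February, 1), day(2013, time.February, 4)},
		{day(2013, time.February, 18), day(2013, time.February, 25)},
	}
	for _, w := range windows {
		from, until, ok := clampWindow(w[0], w[1], oldest)
		if !ok {
			fmt.Printf("skipping %s .. %s: no data before %s\n",
				w[0].Format("2006-01-02"), w[1].Format("2006-01-02"), oldest.Format("2006-01-02"))
			continue
		}
		fmt.Printf("migrating %s .. %s\n", from.Format("2006-01-02"), until.Format("2006-01-02"))
	}
}
```

The skip message also tells the user the earliest usable "from" value, which addresses the second half of the request.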

Only Dummy data with option TSMW

When I migrate data with ClientV2, everything is Ok. But when I use TSMW option, I have only dummy values.

./migration -option=TSMW -wspPath=TEST -from="2013-02-21" -until="2016-01-01" -dbname=migrated -influxDataDir=/data/db -tagconfig=migration_config.json

[...]

Migrating Data From  TEST/2013/TEST-2013-1142/CPU.wsp For TimeRange  2015-12-28 00:00:00 +0000 UTC 2016-01-01 00:00:00 +0000 UTC Size 3784320
Conversion : 2015-12-28 00:00:00 +0000 UTC, 2016-01-01 00:00:00 +0000 UTC
Migrating Data From  TEST/2013/TEST-2013-1142/DISK.wsp For TimeRange  2015-12-28 00:00:00 +0000 UTC 2016-01-01 00:00:00 +0000 UTC Size 3784320
Conversion : 2015-12-28 00:00:00 +0000 UTC, 2016-01-01 00:00:00 +0000 UTC
Migrating Data From  TEST/2013/TEST-2013-1142/RAM.wsp For TimeRange  2015-12-28 00:00:00 +0000 UTC 2016-01-01 00:00:00 +0000 UTC Size 3784320
Conversion : 2015-12-28 00:00:00 +0000 UTC, 2016-01-01 00:00:00 +0000 UTC
CPU,host=TEST-2013-1142#!~#value
DISK,host=TEST-2013-1142#!~#value
RAM,host=TEST-2013-1142#!~#value
TSM File  /data/db/migrated/default/1561/000000001-000000002.tsm Size  6527
|------------------------------------|
|------Migration Summary-------------|
|------------------------------------|
| No. of whisper files migrated 3|
| TimeTaken 9.861987672s |
| Total Whisper File Size 10.83 MB |
| Total TSM File Size     1.25 MB |
| Percentage of size reduction 88.49
|------------------------------------|

> show measurements
name: measurements
------------------
name
dummy

And with ClientV2 :

./migration -option=ClientV2 -host=http://localhost -port=8086 -from="2013-02-21" -until="2016-01-01" -dbname=migrated -retentionPolicy=default -wspPath=TEST -tagconfig=migration_config.json -username=xxxx  -password=xxxx
> SHOW MEASUREMENTS
name: measurements
------------------
name
CPU
DISK
RAM

I use the same config file in the 2 cases :

[
  {
    "pattern": "TEST.2013.#TEXT1.#TEXT2",
    "measurement": "#TEXT2",
    "tags": [
      {
        "tagkey": "host",
        "tagvalue": "#TEXT1"
      }
    ],
    "field": "value"
  }
]

I tried with influxdb 0.10.0 and 0.10.1. I compiled migration.go with Go 1.6.

CreateShards: only default Addr setting in the client.NewHTTPClient initialization

The client.NewHTTPClient object is currently initialized like this

func (migrationData *MigrationData) CreateShards() error {
  c, _ := client.NewHTTPClient(client.HTTPConfig{
    Addr: "http://localhost:8086",
  })

I think that you need to rewrite it like this to use command line settings for the host and port if they were defined

func (migrationData *MigrationData) CreateShards() error {
  c, _ := client.NewHTTPClient(client.HTTPConfig{
    Addr: migrationData.host + ":" + migrationData.port,
  })
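
A small sketch of how the address could be assembled with a fallback when a flag is left empty. The buildAddr helper is hypothetical, and the defaults are simply the hard-coded values from the snippet above.

```go
package main

import (
	"fmt"
	"strings"
)

// buildAddr joins host and port into a client address. Empty values fall back
// to the defaults that are currently hard-coded.
func buildAddr(host, port string) string {
	if host == "" {
		host = "http://localhost"
	}
	if port == "" {
		port = "8086"
	}
	return strings.TrimRight(host, "/") + ":" + port
}

func main() {
	fmt.Println(buildAddr("", ""))
	fmt.Println(buildAddr("http://influx.example.com/", "9086"))
}
```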

How do you use this thing?

I'm trying to get InfluxDB set up on a ubuntu box that previously ran a Graphite/Carbon database.

I have not worked with go before.

InfluxDB is installed via the apt repository described in the docs.

I have go installed from the ubuntu apt repositories.

durr@graphical:~/whisper-migrator$ ./migration.go
bash: ./migration.go: Permission denied
durr@graphical:~/whisper-migrator$ chmod +x migration.go
durr@graphical:~/whisper-migrator$ ./migration.go
./migration.go: line 1: package: command not found
./migration.go: line 3: syntax error near unexpected token `newline'
./migration.go: line 3: `import ('
durr@graphical:~/whisper-migrator$ go run migration.go
migration.go:7:2: cannot find package "github.com/influxdata/influxdb/client/v2" in any of:
        /usr/lib/go-1.6/src/github.com/influxdata/influxdb/client/v2 (from $GOROOT)
        ($GOPATH not set)
migration.go:8:2: cannot find package "github.com/influxdata/influxdb/tsdb/engine/tsm1" in any of:
        /usr/lib/go-1.6/src/github.com/influxdata/influxdb/tsdb/engine/tsm1 (from $GOROOT)
        ($GOPATH not set)
migration.go:9:2: cannot find package "github.com/uttamgandhi24/whisper-go/whisper" in any of:
        /usr/lib/go-1.6/src/github.com/uttamgandhi24/whisper-go/whisper (from $GOROOT)
        ($GOPATH not set)

How do you actually run this thing? Do I need to have built influxdb or something?

Archive this tool as deprecated.

After many hours of beating my head attempting to get this to function with versions from this decade, I give up.

Please archive this utility to avoid wasting your users' time.

Incorrect datetime

I'm running GO version go1.7.3 darwin/amd64 on OSX 10.12.1

When I use the TSMW option to import data from a whisper database, my datetime is incorrectly imported into InfluxDB.
(Screenshot omitted: screen shot 2016-11-09 at 15 41 12)

It looks like the code at line 473 converts incorrectly from uint32 to int64

for j, wspPoint := range wspPoints {
  fmt.Println("wspPoint.Timestamp: ", wspPoint.Timestamp);
  tsmPoint.values[j] = tsm1.NewValue(int64(wspPoint.Timestamp), wspPoint.Value)
  fmt.Println("tsmPoint["+strconv.Itoa(j)+"]: ", tsmPoint.values[j])
}

Log output:

wspPoint.Timestamp:  1478518800
tsmPoint[0]:  1970-01-01 01:00:01.4785188 +0100 CET 0.002958
wspPoint.Timestamp:  1478519100
tsmPoint[1]:  1970-01-01 01:00:01.4785191 +0100 CET 3.9e-05
wspPoint.Timestamp:  1478519400
tsmPoint[2]:  1970-01-01 01:00:01.4785194 +0100 CET 5.2e-05
wspPoint.Timestamp:  1478519700
tsmPoint[3]:  1970-01-01 01:00:01.4785197 +0100 CET 5.9e-05
wspPoint.Timestamp:  1478520000
tsmPoint[4]:  1970-01-01 01:00:01.47852 +0100 CET 5.1e-05
wspPoint.Timestamp:  1478520300

I don't have any problem when I use the ClientV2 option.
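
The log above is consistent with a Unix timestamp in seconds being interpreted as nanoseconds: 1478518800 ns is about 1.48 s after the epoch. The sketch below shows the conversion with the standard library only. It assumes the TSM value expects nanoseconds, which the log suggests but the thread does not confirm, and secondsToNanos is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// secondsToNanos converts a whisper timestamp (Unix seconds) to Unix nanoseconds.
func secondsToNanos(ts uint32) int64 {
	return time.Unix(int64(ts), 0).UnixNano()
}

func main() {
	ts := uint32(1478518800) // first timestamp from the log above

	// Passing the raw value where nanoseconds are expected lands near the epoch.
	fmt.Println("read as nanoseconds:", time.Unix(0, int64(ts)).UTC())

	// Converting to nanoseconds first recovers the intended date.
	fmt.Println("converted first:    ", time.Unix(0, secondsToNanos(ts)).UTC())
}
```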

Invalid memory address or nil pointer dereference

Migration fails at the very beginning of run with an error "Invalid memory address or nil pointer dereference".

[root@metrics whisper-migrator]# go run migration.go -option=TSMW -wspPath=/tmp/metrics/wsp/ -influxDataDir=/var/lib/influxdb/data/ -from=2017-01-01 -dbname=graphite -retentionPolicy=ten_weeks -tagconfig=config.json

Whisper File /tmp/metrics/wsp/
TSM Key-> ,=#!~#value

Whisper File /tmp/metrics/wsp/TT1.wsp
TSM Key-> TT1,=#!~#value

Whisper File /tmp/metrics/wsp/TT1_TT2.wsp
TSM Key-> TT1_TT2,=#!~#value

Whisper File /tmp/metrics/wsp/TT2.wsp
TSM Key-> TT2,=#!~#value
Do you want to continue the migration? Yes/No :
Yes
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x73ae41]

goroutine 9 [running]:
main.(*MigrationData).MapWSPToTSMByWhisperFile(0xc420079200, 0xed0514080, 0xc400000000, 0x0, 0xed0529200, 0x0, 0x0, 0x0, 0x0, 0x0)
/root/go/src/github.com/influxdata/whisper-migrator/migration.go:438 +0x201
main.(*MigrationData).MapWSPToTSMByShard.func1(0xc42001d500, 0xc420079200, 0xc4200b1da0, 0xc4200b1dc0)
/root/go/src/github.com/influxdata/whisper-migrator/migration.go:417 +0x72
created by main.(*MigrationData).MapWSPToTSMByShard
/root/go/src/github.com/influxdata/whisper-migrator/migration.go:418 +0x2ae
exit status 2

config.json:

[
  {
    "pattern": "#TEXT1",
    "measurement": "stats.gauges.prod.my-service.gauge.fooCount.#TEXT1",
    "tags": [
      {
        "tagkey": "count",
        "tagvalue": "#TEXT1"
      }
    ],
    "field": "value"
  }
]

Symbol "_" in wspPath and regexps in patterns

Hello!
I have many metrics from our servers, collected with collectd. The names of these metrics include the symbol _ to make collectd's hostname more readable (e.g. <service_name>_<region>_<role1>_<role2>_..._<hostname>).
In my migration_config.json I have

[
  {
    "pattern": "<service>_#TEXT1_#TEXT2_#TEXT3.#TEXT4.#TEXT5.#TEXT6",
    "measurement": "#TEXT6",
    "tags": [
      {
        "tagkey": "service",
        "tagvalue": "<some service name>"
      },
      {
        "tagkey": "region",
        "tagvalue": "#TEXT1"
      },
      {
        "tagkey": "role",
        "tagvalue": "#TEXT2"
      },
      {
        "tagkey": "host",
        "tagvalue": "#TEXT3"
      },
      {
        "tagkey": "type",
        "tagvalue": "#TEXT4"
      },
      {
        "tagkey": "subtype",
        "tagvalue": "#TEXT5"
      }
    ],
    "field": "value"
  }
]

I get errors while whisper-migrator tries to parse this pattern and match it.

Q1: I think that the symbol "_" is reserved in your program. How can I work around this?
Q2: Could you support regexp-like patterns in migration_config.json? Like this

[
  {
    "pattern": "service_([a-zA-Z0-9]+)_([a-zA-Z0-9_]+)_([a-zA-Z0-9]+).([a-zA-Z0-9]+).([a-zA-Z0-9]+).([a-zA-Z0-9]+)",
    "measurement": "$6",
    "tags": [
      {
        "tagkey": "service",
        "tagvalue": "<some service name>"
      },
      {
        "tagkey": "region",
        "tagvalue": "$1"
      },
      {
        "tagkey": "role",
        "tagvalue": "$2" // will catch role1_role2_..._roleN
      },
      {
        "tagkey": "host",
        "tagvalue": "$3"
      },
      {
        "tagkey": "type",
        "tagvalue": "$4"
      },
      {
        "tagkey": "subtype",
        "tagvalue": "$5"
      }
    ],
    "field": "value"
  }
]
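
To illustrate the request (this is a sketch of the proposal, not an existing feature), Go's regexp package can expand $N templates against a match. The expand helper and the sample metric name are hypothetical. Unlike the pattern quoted above, the dots are escaped and the expression is anchored so that "." only matches a literal dot.

```go
package main

import (
	"fmt"
	"regexp"
)

// metricPattern follows the proposed pattern, with escaped dots and anchors.
var metricPattern = regexp.MustCompile(
	`^service_([a-zA-Z0-9]+)_([a-zA-Z0-9_]+)_([a-zA-Z0-9]+)\.([a-zA-Z0-9]+)\.([a-zA-Z0-9]+)\.([a-zA-Z0-9]+)$`)

// expand substitutes $1, $2, ... in template with the groups matched in metric.
// The second return value is false when the metric does not match.
func expand(re *regexp.Regexp, template, metric string) (string, bool) {
	m := re.FindStringSubmatchIndex(metric)
	if m == nil {
		return "", false
	}
	return string(re.ExpandString(nil, template, metric, m)), true
}

func main() {
	metric := "service_eu_web_db_host1.cpu.idle.value" // hypothetical metric name
	for _, tmpl := range []string{"$6", "$1", "$2", "$3", "$4", "$5"} {
		if v, ok := expand(metricPattern, tmpl, metric); ok {
			fmt.Printf("%s -> %s\n", tmpl, v)
		}
	}
}
```

Because the second group also accepts underscores, it captures a multi-part role such as web_db, which is the case the "#TEXT" placeholders cannot express.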

Expected import is not found go.uber.org/zap

I always get the same error when executing "go get" command inside whisper_migration folder:
/root/go/src/github.com/influxdata/influxdb/influxql/query_executor.go:12:2: code in directory /root/go/src/github.com/uber-go/zap expects import "go.uber.org/zap"

go version: 1.8
OS: Centos 7

Installation - code in directory /data/src/github.com/uber-go/zap expects import "go.uber.org/zap"

Hi,
I don't have experience running Go programs. I am trying to run: migration.go utility.

I am getting this when trying to run it with command: go run migration.go being at /data/whisper-migrator.

/data/src/github.com/influxdata/influxdb/query/query_executor.go:15:2: code in directory /data/src/github.com/uber-go/zap expects import "go.uber.org/zap"

Before running it I downloaded the imports with go get -v, but that now leaves me with the same error.
Before running go get -v, I was getting this when running migration.go:

migration.go:7:2: cannot find package "github.com/influxdata/influxdb/client/v2" in any of:
	/usr/local/go/src/github.com/influxdata/influxdb/client/v2 (from $GOROOT)
	/data/src/github.com/influxdata/influxdb/client/v2 (from $GOPATH)
migration.go:8:2: cannot find package "github.com/influxdata/influxdb/tsdb/engine/tsm1" in any of:
	/usr/local/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1 (from $GOROOT)
	/data/src/github.com/influxdata/influxdb/tsdb/engine/tsm1 (from $GOPATH)
migration.go:9:2: cannot find package "github.com/uttamgandhi24/whisper-go/whisper" in any of:
	/usr/local/go/src/github.com/uttamgandhi24/whisper-go/whisper (from $GOROOT)
	/data/src/github.com/uttamgandhi24/whisper-go/whisper (from $GOPATH)

My Go variables are configured like this.


export PATH=$PATH:/usr/local/go/bin
export GOPATH=/data

I've installed Go on my Ubuntu 17.04 and this is the: go version go1.9 linux/amd64

Any help much appreciated

Detection of malformed json

When the migration_config.json file is not valid (for example, you add a comma in the wrong place ;-) )
the tool gives no warning.

Example of a file that caused me a problem (it took me some minutes to find out why the migrator didn't understand my pattern):

[
  {
    "pattern": "TEST.2013.#TEXT1.#TEXT2",
    "measurement": "#TEXT2",
    "tags": [
      {
        "tagkey": "host",
        "tagvalue": "#TEXT1"
      }
    ],
    "field": "value"
  },
]

It would be a great help if the migration tool returned a warning when it can't parse the JSON.

Perhaps it is possible to take the error returned by json.Unmarshal and display it as a warning (in the function ReadTagConfig)

err = json.Unmarshal(raw, &migrationData.tagConfigs)
if err != nil {
  fmt.Printf("Can not parse migration_config.json : %s", err)
}
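
A slightly fuller sketch of the same idea, using only encoding/json: for syntax problems such as the trailing comma above, the decoder returns a *json.SyntaxError that carries a byte offset, which can be included in the message. The describeJSONError helper is hypothetical and is not part of the tool.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// badConfig has a trailing comma after the last array element, as in the report.
const badConfig = `[
  {
    "pattern": "TEST.2013.#TEXT1.#TEXT2",
    "measurement": "#TEXT2",
    "field": "value"
  },
]`

// describeJSONError returns an empty string for valid JSON, otherwise a message
// that includes the byte offset when the decoder reports a syntax error.
func describeJSONError(raw []byte) string {
	var v interface{}
	err := json.Unmarshal(raw, &v)
	if err == nil {
		return ""
	}
	var syntaxErr *json.SyntaxError
	if errors.As(err, &syntaxErr) {
		return fmt.Sprintf("invalid JSON at byte offset %d: %v", syntaxErr.Offset, err)
	}
	return fmt.Sprintf("cannot decode JSON: %v", err)
}

func main() {
	if msg := describeJSONError([]byte(badConfig)); msg != "" {
		fmt.Println("Can not parse migration_config.json:", msg)
	}
}
```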

Thanks for this tool, it will be very useful !
