go-mysql-transfer's People

Contributors

wj596, wufly632

go-mysql-transfer's Issues

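Importing into MongoDB uses the MySQL table id as the document id, so later rows overwrite earlier ones

Hello,
First of all, thank you for building such a good tool. I recently needed an ETL tool, so I tried go-mysql-transfer for a full import from MySQL into MongoDB. I have several MySQL databases with the same schema and the same tables, and the matching tables are merged into one collection in MongoDB. Normally MongoDB generates the id automatically, but the tool uses the MySQL primary key directly as the id, so a row with the same id overwrites the previous one. Is there any way to avoid the overwrite? Thanks!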
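
One possible way to avoid the collision described above is to derive the document id from the source database, the table, and the primary key together. The following is a minimal sketch under that assumption; the function name compositeID and the "schema:table:pk" layout are hypothetical and are not part of go-mysql-transfer.

```go
package main

import "fmt"

// compositeID builds a document id that stays unique when rows from several
// MySQL databases are merged into one MongoDB collection.
// The "schema:table:pk" layout is an assumption made for illustration.
func compositeID(schema, table string, pk interface{}) string {
	return fmt.Sprintf("%s:%s:%v", schema, table, pk)
}

func main() {
	// Two rows with the same primary key from different source databases
	// now map to different ids.
	fmt.Println(compositeID("shop_a", "orders", 1))
	fmt.Println(compositeID("shop_b", "orders", 1))
}
```

The original MySQL key can still be stored as an ordinary field so that queries by the source id keep working.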

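Suggestion: enrich the parsed message format

A very nice tool. One suggestion: the JSON format could be richer, similar to Maxwell (http://maxwells-daemon.io/). Adding a timestamp field (millisecond precision) would help, because a row is often updated several times and consumers need to pick the latest version.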

insert

{
    "database": "test",
    "table": "maxwell",
    "type": "insert",
    "ts": 1449786310,
    "xid": 940752,
    "commit": true,
    "data": { "id":1, "daemon": "Stanislaw Lem" }
  }

update: add an "old" key so consumers can see what changed; downstream systems can then monitor or filter the data.

{
    "database": "test",
    "table": "maxwell",
    "type": "update",
    "ts": 1449786341,
    "xid": 940786,
    "commit": true,
    "data": {"id":1, "daemon": "Firebus!  Firebus!"},
    "old":  {"daemon": "Stanislaw Lem"}
  }
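
The requested format could be produced along these lines. This is a minimal sketch, not go-mysql-transfer code: the ChangeEvent type and the changedColumns helper are hypothetical names, and the "ts" field is filled with milliseconds as the suggestion asks (the Maxwell samples above show second precision).

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"time"
)

// ChangeEvent mirrors the Maxwell-style layout quoted above (hypothetical type).
type ChangeEvent struct {
	Database string                 `json:"database"`
	Table    string                 `json:"table"`
	Type     string                 `json:"type"`
	TsMillis int64                  `json:"ts"` // milliseconds since the Unix epoch
	Data     map[string]interface{} `json:"data"`
	Old      map[string]interface{} `json:"old,omitempty"`
}

// changedColumns returns the previous values of every column whose value
// differs between the before and after images of a row.
func changedColumns(before, after map[string]interface{}) map[string]interface{} {
	old := make(map[string]interface{})
	for col, prev := range before {
		cur, ok := after[col]
		if !ok || !reflect.DeepEqual(cur, prev) {
			old[col] = prev
		}
	}
	return old
}

func main() {
	before := map[string]interface{}{"id": 1, "daemon": "Stanislaw Lem"}
	after := map[string]interface{}{"id": 1, "daemon": "Firebus!  Firebus!"}

	ev := ChangeEvent{
		Database: "test",
		Table:    "maxwell",
		Type:     "update",
		TsMillis: time.Now().UnixNano() / int64(time.Millisecond),
		Data:     after,
		Old:      changedColumns(before, after),
	}
	out, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With a millisecond timestamp on every message, a consumer that receives several updates for one row can keep the one with the largest ts.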

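Redis list type: can a table column's value be used as the key?

My current scenario: I need to look up a set of other values by the value of column A. Several tables are involved, each of them has column A, and the values of column A repeat within each table. For these reasons I decided to use the list type, but I found that the list type cannot use the value of column A as the key. Is there a way around this?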

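How to start syncing from a specified position

Does the command below only set the position, without starting the sync from it? I ask because the command exits as soon as it finishes.
Also, how should I write the command if I want to start syncing from a specified position?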
[root@test transfer]# ./go-mysql-transfer -config app.yml -position test-bin.000002 597715219
2021-02-03 18:28:10.906826 I | process id: 39716
2021-02-03 18:28:10.906894 I | GOMAXPROCS :2
2021-02-03 18:28:10.906898 I | source mysql(xxxx:3306)
2021-02-03 18:28:10.906903 I | destination elasticsearch(http://xxxx:9200)
The current dump position is : test-bin.000002 597715219

Why does transfer fail to restart?

Problem description:
While syncing data to ES, the transfer process crashed, and restarting it fails with the following error:
[root@test transfer]# cat nohup.out

panic: interface conversion: interface {} is nil, not map[string]interface {}

goroutine 1 [running]:
go-mysql-transfer/service/endpoint.(*Elastic6Endpoint).updateIndexMapping(0xc000350940, 0xc00024aa00, 0xc00003e0a8, 0x301fa01)
D:/dev/golang/projects/go-mysql-transfer/service/endpoint/elastic6.go:131 +0xa73
go-mysql-transfer/service/endpoint.(*Elastic6Endpoint).indexMapping(0xc000350940, 0xc00003e0a8, 0xc0002636a0)
D:/dev/golang/projects/go-mysql-transfer/service/endpoint/elastic6.go:79 +0x1bc
go-mysql-transfer/service/endpoint.(*Elastic6Endpoint).Connect(0xc000350940, 0x1eaf5e0, 0xc000350940)
D:/dev/golang/projects/go-mysql-transfer/service/endpoint/elastic6.go:69 +0x18c
go-mysql-transfer/service.(*TransferService).initialize(0xc0002690a0, 0xc0002690a0, 0xc000080a20)
D:/dev/golang/projects/go-mysql-transfer/service/transfer_service.go:86 +0x238
go-mysql-transfer/service.Initialize(0x0, 0x0)
D:/dev/golang/projects/go-mysql-transfer/service/service.go:35 +0x7d
main.main()
D:/dev/golang/projects/go-mysql-transfer/main.go:121 +0xfe
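
The panic message says that a type assertion to map[string]interface {} was applied to a nil value. In Go, the two-value ("comma-ok") form of the assertion reports the mismatch instead of panicking. The sketch below illustrates that pattern only; the function name mappingProperties and the "properties" key are assumptions, since the code at elastic6.go:131 is not shown in the report.

```go
package main

import (
	"errors"
	"fmt"
)

// mappingProperties extracts a nested object from a decoded JSON mapping.
// The comma-ok assertion returns an error instead of panicking when the
// key is missing or holds a different type.
func mappingProperties(mapping map[string]interface{}) (map[string]interface{}, error) {
	props, ok := mapping["properties"].(map[string]interface{})
	if !ok {
		return nil, errors.New("mapping has no properties object")
	}
	return props, nil
}

func main() {
	// An empty mapping, as might be returned for an index that was only
	// partially created before a crash (an assumed scenario).
	if _, err := mappingProperties(map[string]interface{}{}); err != nil {
		fmt.Println("handled without panic:", err)
	}
}
```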

Error when syncing a fairly large data set, just under 1 million records

fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x1a59567, 0x16)
D:/dev/golang/root/src/runtime/panic.go:1112 +0x72
runtime.sysMap(0xc070000000, 0x4000000, 0x3022678)
D:/dev/golang/root/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x300cf00, 0x400000, 0x2ba00c0, 0x0)
D:/dev/golang/root/src/runtime/malloc.go:715 +0x1cd
runtime.(*mheap).grow(0x300cf00, 0x1, 0x0)
D:/dev/golang/root/src/runtime/mheap.go:1286 +0x11c
runtime.(*mheap).allocSpan(0x300cf00, 0x1, 0x7f1943b10e00, 0x3022688, 0x7f1943332a48)
D:/dev/golang/root/src/runtime/mheap.go:1124 +0x6a0
runtime.(*mheap).alloc.func1()
D:/dev/golang/root/src/runtime/mheap.go:871 +0x64
runtime.systemstack(0x0)
D:/dev/golang/root/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
D:/dev/golang/root/src/runtime/proc.go:1041

goroutine 64 [running]:
runtime.systemstack_switch()
D:/dev/golang/root/src/runtime/asm_amd64.s:330 fp=0xc0004f9880 sp=0xc0004f9878 pc=0x462a30
runtime.(*mheap).alloc(0x300cf00, 0x1, 0x10e, 0x0)
D:/dev/golang/root/src/runtime/mheap.go:865 +0x81 fp=0xc0004f98d0 sp=0xc0004f9880 pc=0x425651
runtime.(*mcentral).grow(0x301d6d8, 0x0)

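MongoDB Enterprise 4.4: login fails after enabling user authentication

Today I tried syncing MySQL 8 to MongoDB (server version 4.4.2) and found that login fails once authentication is enabled.
Error: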
2020-12-28 01:09:24 info GOMAXPROCS :4
2020-12-28 01:09:24 error connection() : auth error: sasl conversation error: unable to authenticate using mechanism "SCRAM-SHA-1": (AuthenticationFailed) Authentication failed.
2020-12-28 01:09:24 error connection() : auth error: sasl conversation error: unable to authenticate using mechanism "SCRAM-SHA-1": (AuthenticationFailed) Authentication failed.
2020-12-28 01:09:24 info closing transfer
[2020/12/28 01:09:24] [info] canal.go:242 closing canal
[2020/12/28 01:09:24] [info] binlogsyncer.go:176 syncer is closing...
[2020/12/28 01:09:24] [info] binlogsyncer.go:203 syncer is closed
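The account's permissions have been tested and are fine.
app.yml configuration:
# MongoDB connection settings
mongodb_addrs: 127.0.0.1:27017 # MongoDB addresses; separate multiple addresses with commas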
mongodb_username: lan
mongodb_password: lan

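A few questions after reading through the source code

  1. Does multi-node deployment currently only provide primary/standby failover? Since a standby node does not open canal, is it fair to say that it does no work?
  2. Each row delivered to onRow is currently processed serially. When MySQL handles a very high volume of concurrent inserts, updates, and deletes, will processing speed become a bottleneck?
  3. Could you suggest a way to add some business logic while syncing MySQL data, for example running some calculations when a particular table is synced?

Thanks!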
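
For the second question, one common way to parallelize row handling while keeping changes to the same row in order is to shard events by primary key across a fixed set of workers. The following is a minimal sketch of that idea, not a description of how go-mysql-transfer works; the rowEvent type and the shardFor and dispatch functions are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// rowEvent is a simplified stand-in for a binlog row change.
type rowEvent struct {
	Key    string // primary key rendered as text
	Action string
}

// shardFor maps a key to a worker index, so every event for the same key
// is always handled by the same worker.
func shardFor(key string, workers int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(workers))
}

// dispatch fans events out to the workers and waits until all are handled.
// Events that share a key keep their relative order.
func dispatch(events []rowEvent, workers int, handle func(worker int, ev rowEvent)) {
	queues := make([]chan rowEvent, workers)
	var wg sync.WaitGroup
	for i := range queues {
		queues[i] = make(chan rowEvent, 16)
		wg.Add(1)
		go func(id int, in <-chan rowEvent) {
			defer wg.Done()
			for ev := range in {
				handle(id, ev)
			}
		}(i, queues[i])
	}
	for _, ev := range events {
		queues[shardFor(ev.Key, workers)] <- ev
	}
	for _, q := range queues {
		close(q)
	}
	wg.Wait()
}

func main() {
	events := []rowEvent{{"1", "insert"}, {"2", "insert"}, {"1", "update"}}
	dispatch(events, 4, func(worker int, ev rowEvent) {
		fmt.Printf("worker %d: %s key=%s\n", worker, ev.Action, ev.Key)
	})
}
```

The trade-off is that ordering is only guaranteed per key; changes to different rows may reach the destination in a different order than they were committed.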

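Log level setting has no effect; please advise, thanks

With the setting below, store/log/system.log still shows info-level log entries. Is something wrong with this setting?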


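# System settings
#data_dir: D:\transfer # where the application stores its data, including logs and cached data; defaults to the store folder under the current working directory
logger:
  level: error # log level; supported: debug|info|warn|error, default info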

./go-mysql-transfer -stock fails at runtime with elastic: Error 503 (Service Unavailable)

[19 14:29:59 root@iZbp1i72h4fylprepdgn5nZ /data/go-mysql-transfer]# ./go-mysql-transfer -stock
2021-03-23 14:30:08.290079 I | process id: 7676
2021-03-23 14:30:08.290096 I | GOMAXPROCS :8
2021-03-23 14:30:08.290101 I | source mysql(47.99.188.92:3306)
2021-03-23 14:30:08.290105 I | destination elasticsearch(http://127.0.0.1:9200)
2021-03-23 14:30:38.313107 I | elastic: Error 503 (Service Unavailable)
elastic: Error 503 (Service Unavailable)
/data/go-mysql-transfer/service/stock_service.go:85:

I followed the installation instructions, and the final full-sync step reports elastic: Error 503 (Service Unavailable). What could be the cause? Any advice is appreciated.

Error after switching MySQL servers

After switching to the backup MySQL server, this error appears:
2021-01-30 12:02:50.876824 I | transfer run from position(mysql-bin.000396 49461910)
2021-01-30 12:02:50.882445 I | start transfer : ERROR 1236 (HY000): Could not find first log file name in binary log index file
2021-01-30 12:02:50.885584 I | transfer stop

How can I reset it?

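Question about the MySQL binlog position in transfer

First, thanks to the author for the hard work. I ran into the following problem while using the tool and hope the author can help.
1.
After I ran reset master on MySQL, which reset the binlog position, go-mysql-transfer reported this error: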
[error] binlogstreamer.go:77 close sync with err: ERROR 1236 (HY000): could not find next log; the first event 'test-bin.000003' at 553, the last event read from './test-bin.000004' at 553, the last byte read from './test-bin.000004' at 553.

So when restarting go-mysql-transfer I want to specify the MySQL binlog position myself. Following the usage notes I tried the command below, but it failed. How can I solve this?
./go-mysql-transfer -config app.yml -position test-bin.000002 553
error: The parameter File must be like: mysql-bin.000001
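
The error text suggests that the file name check expects a particular name pattern, although the actual check is not shown in this report. MySQL lets the binlog base name be configured, so a check that accepts any base name followed by a numeric suffix would cover names such as test-bin.000002. The sketch below is an assumption about what such a check could look like; validBinlogFile is a hypothetical name and not the tool's own function.

```go
package main

import (
	"fmt"
	"regexp"
)

// binlogName accepts any base name without path separators, followed by a
// dot and a numeric suffix of at least six digits.
var binlogName = regexp.MustCompile(`^[^/\\]+\.\d{6,}$`)

// validBinlogFile reports whether name looks like a binlog file name.
func validBinlogFile(name string) bool {
	return binlogName.MatchString(name)
}

func main() {
	for _, name := range []string{"mysql-bin.000001", "test-bin.000002", "test-bin"} {
		fmt.Println(name, validBinlogFile(name))
	}
}
```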

Support validating app.yml and reporting the offending line, section, table name, and column name

D:\Program Files\Devs\go-mysql-transfer>go-mysql-transfer -config app.yml
2021-04-07 13:31:42.213401 I | process id: 9348
2021-04-07 13:31:42.230354 I | GOMAXPROCS :16
2021-04-07 13:31:42.230354 I | source mysql(127.0.0.1:3306)
2021-04-07 13:31:42.230354 I | destination rocketmq(127.0.0.1:9876)
D:/dev/golang/projects/go-mysql-transfer/global/rule.go:360: column_mappings must be table column
D:/dev/golang/projects/go-mysql-transfer/service/transfer_service.go:265:
D:/dev/golang/projects/go-mysql-transfer/service/transfer_service.go:73:

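Why not support syncing to MySQL?

Why not add MySQL as a sync target?
I already use otter to sync databases,
but I was looking for a tool for a one-off copy of a few hundred GB of data. I looked at this tool and found that it has no MySQL endpoint.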

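Redis: multi-dimensional keys and expiry time settings

Background

Presumably everyone who syncs to Redis this way uses it as a cache layer.

Expiry time

The expiry time for data synced to Redis is currently hard-coded to 0. Real workloads may need to set it per use case, for example 10 seconds or 1 day, and to renew it automatically on update.

Multiple dimensions

Example: a user table has a phone number, a WeChat ID, and an ID. A typical application writes the full record to the cache under the ID and makes each other dimension, such as the phone number, a key that maps to the ID key. In Redis this looks like:
user:1 value: a JSON string
user:telephone:13888888888 value: user:1
user:weid:1234 value: user:1

Of course, this is only a requirement when Redis is used as a cache. It can also be done with a Lua script, but performance drops by half, which is hard to accept.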
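
The key layout and expiry described above can be sketched as a pure function that turns one user row into the cache entries to write. This is an illustration under stated assumptions, not go-mysql-transfer code: the user and cacheEntry types and the entriesFor function are hypothetical, and the actual Redis write (for example SET with an expiry per entry) is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// cacheEntry is one key/value pair to write, with its time to live.
type cacheEntry struct {
	Key   string
	Value string
	TTL   time.Duration
}

// user holds the three dimensions named in the request.
type user struct {
	ID        int64  `json:"id"`
	Telephone string `json:"telephone"`
	WeID      string `json:"weid"`
}

// entriesFor returns the primary entry (full JSON under user:<id>) plus one
// pointer entry per secondary dimension, all sharing the same TTL so that
// rewriting them on update also renews the expiry.
func entriesFor(u user, ttl time.Duration) ([]cacheEntry, error) {
	body, err := json.Marshal(u)
	if err != nil {
		return nil, err
	}
	primary := fmt.Sprintf("user:%d", u.ID)
	return []cacheEntry{
		{Key: primary, Value: string(body), TTL: ttl},
		{Key: "user:telephone:" + u.Telephone, Value: primary, TTL: ttl},
		{Key: "user:weid:" + u.WeID, Value: primary, TTL: ttl},
	}, nil
}

func main() {
	entries, err := entriesFor(user{ID: 1, Telephone: "13888888888", WeID: "1234"}, 10*time.Second)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("%s -> %s (ttl %s)\n", e.Key, e.Value, e.TTL)
	}
}
```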

Redis times out when syncing large amounts of data; could a timeout setting be added to the config file?

2021-02-03 16:21:42 error read tcp 127.0.0.1:51892->127.0.0.1:6379: i/o timeout
[2021/02/03 16:21:44] [info] canal.go:242 closing canal
[2021/02/03 16:21:45] [info] binlogsyncer.go:176 syncer is closing...
2021-02-03 16:21:46 info Canal is Closed
[2021/02/03 16:21:48] [info] binlogsyncer.go:850 kill last connection id 4641
[2021/02/03 16:21:49] [info] binlogsyncer.go:203 syncer is closed
[2021/02/03 16:21:50] [info] binlogsyncer.go:144 create BinlogSyncer with config {1002 mysql 192.168. 1.201 3306 root utf8 false false false UTC false 0 0s 0s 0 false false 0}
[2021/02/03 16:21:51] [info] dump.go:199 skip dump, use last binlog replication pos (mysql-bin. 000031, 203049874) or GTID set
[2021/02/03 16:21:51] [info] binlogsyncer.go:360 begin to sync binlog from position (mysql-bin. 000031, 203049874)
[2021/02/03 16:21:52] [info] sync.go:25 start sync binlog at binlog file (mysql-bin.000031, 203049874)
[2021/02/03 16:21:52] [info] binlogsyncer.go:777 rotate to (mysql-bin.000031, 203049874)
[2021/02/03 16:21:53] [info] sync.go:68 received fake rotate event, next log name is mysql-bin.000031

Is syncing multiple databases and tables into one ES index supported?

The scenario is as follows:
The user database has a user table:

uid username gendor age address
1 zhangshan male 33 beijing

The info database has an info table:

uid realname
1 张三

The vip database has a vipinfo table:

uid amount
1 1234

Then merge the three tables and sync the result to ES:

{
    "uid":1,
    "username":"zhangshan",
    "gendor":"male",
    "age":33,
    "address":"beijing",
    "realname":"张三",
    "amount":1234
}

After that, an update to any of the tables should automatically update ES.
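
The merge requested above amounts to a partial update keyed by uid: each source table contributes only its own columns to the shared document. The sketch below shows that logic with in-memory maps; mergeInto is a hypothetical helper and not a go-mysql-transfer feature, and the Elasticsearch write itself is not shown.

```go
package main

import "fmt"

// mergeInto copies the columns of one source row into the document for its
// uid, creating the document on first sight. Columns contributed by other
// tables are left untouched.
func mergeInto(docs map[int64]map[string]interface{}, uid int64, row map[string]interface{}) {
	doc, ok := docs[uid]
	if !ok {
		doc = map[string]interface{}{"uid": uid}
		docs[uid] = doc
	}
	for col, val := range row {
		doc[col] = val
	}
}

func main() {
	docs := make(map[int64]map[string]interface{})

	// One row from each of the three tables in the example.
	mergeInto(docs, 1, map[string]interface{}{
		"username": "zhangshan", "gendor": "male", "age": 33, "address": "beijing",
	})
	mergeInto(docs, 1, map[string]interface{}{"realname": "张三"})
	mergeInto(docs, 1, map[string]interface{}{"amount": 1234})

	fmt.Println(len(docs[1]), docs[1]["realname"], docs[1]["amount"])
}
```

Because each table only touches its own fields, a later update to any one table changes just those fields in the merged document.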

Why does a full sync only sync part of the data, and what causes this error?

2020-09-21 11:43:11 info source mysql(rm-######.mysql.cn-chengdu.rds.aliyuncs.com:3306)
2020-09-21 11:43:11 info destination redis(192.168.2.240:6379)
[2020/09/21 11:43:11] [info] binlogsyncer.go:144 create BinlogSyncer with config {1001 mysql rm-#####.mysql.cn-chengdu.rds.aliyuncs.com 3306 xinhu utf8 false false false UTC false 0 0s 0s 0 false false 0}
2020-09-21 11:43:11 info GOMAXPROCS :16
2020-09-21 11:43:16 error write tcp 192.168.9.27:33456->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:17 error write tcp 192.168.9.27:33458->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:17 error write tcp 192.168.9.27:33460->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:17 error write tcp 192.168.9.27:33462->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:17 error write tcp 192.168.9.27:33464->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:18 error write tcp 192.168.9.27:33466->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:18 error write tcp 192.168.9.27:33472->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:19 error write tcp 192.168.9.27:33474->192.168.2.240:6379: i/o timeout
2020-09-21 11:43:20 info closing transfer
[2020/09/21 11:43:20] [info] canal.go:242 closing canal
[2020/09/21 11:43:20] [info] binlogsyncer.go:176 syncer is closing...
[2020/09/21 11:43:20] [info] binlogsyncer.go:203 syncer is closed
