Coder Social home page Coder Social logo

mysqlbinlog_flashback's Introduction

#目前运行情况 现在已经在阿里的rds上,db为utf8字符集的生产环境下使用。其他环境没有在生产环境下使用,请小心。 #工具简介 ##概述 mysqlbinlog_back.py 是在线读取row格式的mysqld的binlog,然后生成反向的sql语句的工具。一般用于数据恢复的目的。 所谓反向的sql语句就是如果是insert,则反向的sql为delete。如果delete,反向的sql是insert,如果是update, 反向的sql还是update,但是update的值是原来的值。

最简单的例子为 python mysqlbinlog_back.py --host="127.0.0.1" --username="root" --port=43306 --password="" --schema=test --table="test5"

下面是程序输出结果 ls -l log/*

-rw-r--r-- 1 root root 2592 Nov 9 15:44 log/save_data_dml_test_20161109_154434.sql
-rw-r--r-- 1 root root 1315 Nov 9 15:44 log/flashback_test_20161109_154434.sql <--- 反向sq文件
-rw-r--r-- 1 root root 441 Nov 9 15:44 log/save_data_create_table_test_20161109_154434.sql

它会在线连接参数指定mysql,读取binlog,仅仅抽取对schema为test 表名test5的binlog,生成反向sq文件保存在log目录下,其中flash_开头的文件是反向的sql语句。

用mysql命令导入数据是一定指定字符集为utf8mb4,比如

mysql ... --default-character-set=utf8mb4 test < flashback_test_20161109_154434.sql

##详细描述 mysqlbinlog_back.py在线连接参数指定mysql,读取binlog,如果缺省,它通过show binary logs命令找到最近的binlog文件,从文件开头开始解析,一直解析到当前时间退出。

如果指定开始binary log文件名和位置(BINLOG_START_FILE_NAME,BINLOG_START_FILE_POSITION),会从指定binary log文件名和位置开始解析,一直BINLOG_END_TIME结束,中间会自动扫描跨多个binlog.

生成文件目录可以通过OUTPUT_FILE_PATH来指定。目录下有2个类: 一类是反向解析的文件,格式为flashback_schema名_当前时间.sql . 另一类用于审查数据的sql,审查数据的sql用于记录操作类型,sql的老、新值。其中, save_data_create_table_开头的文件用于生成建表语句,save_data_dml用于插入到新的表中。

##参数说明 python mysqlbinlog_back.py --help 看在线的帮助

另外也可以看一下CHANGELOG.txt

##依赖的包和环境 python2.6

pymysql

##内部原理 内部原理

#使用限制 1.支持mysql版本为MySQL 5.5 and 5.6.因为底层使用的是python-mysql-replication包。

2.数据库必须是row格式的。原因看这个链接

3.反向生成的表必须有主键。

4.日志必须在主库存在

5.反向生成的mysql数据类型列出在下面。没有列出的类型没有经过严格的测试,也许有问题

6.支持的类型

允许解析的字段类型,不在里面的会报错

ALLOW_TYPE={ "varchar":True, "char":True, "datetime":True, "date":True, "time":True, "timestamp":True, "bigint":True, "mediumint":True, "smallint":True, "tinyint":True, "int":True, "smallint":True, "decimal":True, "float":True, "double":True, "longtext":True, "tinytext":True, "text":True, "mediumtext":True }

#FAQ 1.mysqlbinlog_back.py 是否对数据库性能造成影响?

基本没有影响。因为代码对mysql的操作就是2种,第一种是伪装成mysql的从库去在线读取日志,对mysql的压力就是传输一下binlog.第二种会读取information_schema.columns系统表

2.对mysql字符集的支持什么

utf8测试通过。gbk方式没有测试,应该问题不大。

原理角度python都用utf8的方式读出数据,内部转换成unicode的方式,然后写文件输出到utf8编码格式的文件

如果数据库是utf8mb4的编码格式,因为产生的sql是utf8格式的,包括了utfmb4的编码,所以没有问题,但是特别注意2点

a.导入mysql时一定指定,mysql ... --default-character-set=utf8mb4

b.如果需要对产生的sql再次编辑,一定注意,因为包括utfmb4的一些符号对你的编辑器来说是不可见得,所以千万不要搞掉了哦。

3.对mysql时间类型的支持是什么

datetime没有时区的概念,所以是啥就是啥。 timestamp经过python转换成datetime,转换成运行程序的环境时区相关的时间

4.底层是用的python-mysql-replication 包,是否可以用原生态的python-mysql-replication替换呢?

不行,因为原生态的包开发的接口不够多,有些功能不具备。所以在它的代码基础上改了部分

5.指定event位置时是否会找出语句的丢失?

一定不能指定位置时指定在dml的位置,位置至少应该在dml之前的table_map的位置,当然更加好的位置应该是在事物开始的位置,也就是begin的位置。 因为一个dml会对应2个event,一个table_map,另一个是dml的event

#联系方式 mail:[email protected]

mysqlbinlog_flashback's People

Contributors

yilai123 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

mysqlbinlog_flashback's Issues

生成申查SQL有个小bug

具体情况如下:
DB version:MySQL 5.6.29 64bit 社区版
OS: RHEL 6.4 x86_64

1、解析出来的SQL
[root@rac3 log]# cat save_data_dml_test_20161206_164530.sql
#end_log_pos 1014 2016-12-06T16:42:36 1481013756 mysql-bin.000001;
insert into _wuyong_keep_data_(bfr_sex,bfr_id,bfr_user_id,op_datetime,op) values(None,1,2,'2016-12-06 16:42:36','delete');

2、说明
--sex字段如果有空值,对应的空值会被解析成None

3、执行插入会报如下错误,导致插入失败
mysql> insert into _wuyong_keep_data_(bfr_sex,bfr_id,bfr_user_id,op_datetime,op) values(None,1,2,'2016-12-06 16:42:36','delete');
ERROR 1054 (42S22): Unknown column 'None' in 'field list'

4、建议,将空值解析成null,即可避免这个问题的出现。
mysql>insert into _wuyong_keep_data_(bfr_sex,bfr_id,bfr_user_id,op_datetime,op) values(null,1,2,'2016-12-06 16:42:36','delete');
Query OK, 1 row affected (0.11 sec)

UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 673: ordinal not in range(128)

===log will also write to .//mysqlbinlog_flashback.log===
parameter={'start_binlog_file': 'mysql-bin.043654', 'stream': None, 'keep_data': True, 'file': {'data_create': None, 'flashback': None, 'data': None}, 'add_schema_name': False, 'start_time': None, 'keep_current_data': False, 'start_to_timestamp': 1680229920, 'mysql_setting': {'passwd': 'weflow@pms', 'host': '9.148.120.87', 'charset': 'utf8', 'port': 3306, 'user': 'root'}, 'table_name': 'ACT_GE_BYTEARRAY', 'skip_delete': False, 'schema': 'test_full_link_1', 'stat': {'flash_sql': {}}, 'table_name_array': ['ACT_GE_BYTEARRAY'], 'one_binlog_file': False, 'output_file_path': './log', 'start_position': 4, 'skip_update': False, 'dump_event': False, 'end_to_timestamp': 1680230400, 'skip_insert': False, 'schema_array': ['test_full_link_1']}
scan 10000 events ....from binlogfile=mysql-bin.043654,timestamp=2023-03-31T10:32:22
scan 20000 events ....from binlogfile=mysql-bin.043654,timestamp=2023-03-31T10:32:24
scan 30000 events ....from binlogfile=mysql-bin.043654,timestamp=2023-03-31T10:32:24
scan 40000 events ....from binlogfile=mysql-bin.043654,timestamp=2023-03-31T10:32:30
error:'ascii' codec can't decode byte 0xe5 in position 673: ordinal not in range(128)
=====Additional info:dump stack to diagnose =======
Traceback (most recent call last):
File "mysqlbinlog_back.py", line 190, in main
deal_all_event(parameter)
File "/mysqlbinlog_flashback/flashback.py", line 201, in deal_all_event
deal_delete_rows(event,parameter.file,parameter.stat,logfile=stream.log_file,add_schema_name=parameter.add_schema_name)
File "/mysqlbinlog_flashback/flashback.py", line 306, in deal_delete_rows
sql=joint_insert_sql(event.schema,event.table,event.primary_key,row,add_schema_name)
File "/mysqlbinlog_flashback/joint_sql.py", line 41, in joint_insert_sql
(columns,values)=generate_two_array_column_and_value(row["values"])
File "/mysqlbinlog_flashback/joint_sql.py", line 231, in generate_two_array_column_and_value
logger.debug(u"dump row ele: {0}={1},type={2}".format(key,row[key],type(row[key])))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 673: ordinal not in range(128)

回滚语句有误问题请教

mysql版本:8.0.18
mysqlbinlog_back版本:0.1.3
有如下测试:

1、新建一个测试表
2、插入2两条数据
3、根据主键更新
mysql> create table t_moshan_test(id int not null auto_increment primary key,age int not null);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t_moshan_test select 0,1;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql> insert into t_moshan_test select 0,2;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0

mysql> update t_moshan_test set age = 1 where id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> update t_moshan_test set age = 10 where id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> select * from t_moshan_test;
+----+-----+
| id | age |
+----+-----+
| 1 | 1 |
| 2 | 10 |
+----+-----+
2 rows in set (0.00 sec)

mysql> update t_moshan_test set age = 11 where id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> select * from t_moshan_test;
+----+-----+
| id | age |
+----+-----+
| 1 | 1 |
| 2 | 11 |
+----+-----+
2 rows in set (0.00 sec)

mysql> update t_moshan_test set age = 12 where id = 2;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

mysql> select * from t_moshan_test;
+----+-----+
| id | age |
+----+-----+
| 1 | 1 |
| 2 | 12 |
+----+-----+
2 rows in set (0.00 sec)

mysql>

使用工具生成回滚语句如下
/home/moshan/py2env/bin/python mysqlbinlog_back.py --host=172.16.0.20 --username=myuser --port=3306 --password=123456--schema=test --tables=t_moshan_test -S 3306-binlog.000015 -l tmp_sql

#end_log_pos 70122394 2020-04-15T11:02:53 1586919773 3872-binlog.000015;
delete from t_moshan_test where id=1;
#end_log_pos 70122743 2020-04-15T11:02:54 1586919774 3872-binlog.000015;
delete from t_moshan_test where id=2;
#end_log_pos 70123863 2020-04-15T11:05:10 1586919910 3872-binlog.000015;
update t_moshan_test setage=2,id=2 where id=1;
#end_log_pos 70124241 2020-04-15T11:05:15 1586919915 3872-binlog.000015;
update t_moshan_test setage=2,id=1 where id=10;
#end_log_pos 70126105 2020-04-15T11:08:52 1586920132 3872-binlog.000015;
update t_moshan_test setage=2,id=10 where id=11;
#end_log_pos 70126483 2020-04-15T11:08:56 1586920136 3872-binlog.000015;
update t_moshan_test setage=2,id=11 where id=12;

看这个回滚语句明显是有问题的,而且能稳定复现,有空请 帮忙看看

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.