Coder Social home page Coder Social logo

binlog2sql's Introduction

binlog2sql

从MySQL binlog解析出你要的SQL。根据不同选项,你可以得到原始SQL、回滚SQL、去除主键的INSERT SQL等。

用途

  • 数据快速回滚(闪回)
  • 主从切换后新master丢数据的修复
  • 从binlog生成标准SQL,带来的衍生功能

项目状态

正常维护。应用于部分公司线上环境。

  • 已测试环境
    • Python 2.7, 3.4+
    • MySQL 5.6, 5.7

安装

shell> git clone https://github.com/danfengcao/binlog2sql.git && cd binlog2sql
shell> pip install -r requirements.txt

git与pip的安装问题请自行搜索解决。

使用

MySQL server必须设置以下参数:

[mysqld]
server_id = 1
log_bin = /var/log/mysql/mysql-bin.log
max_binlog_size = 1G
binlog_format = row
binlog_row_image = full

user需要的最小权限集合:

select, super/replication client, replication slave

建议授权
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 

权限说明

  • select:需要读取server端information_schema.COLUMNS表,获取表结构的元信息,拼接成可视化的sql语句
  • super/replication client:两个权限都可以,需要执行'SHOW MASTER STATUS', 获取server端的binlog列表
  • replication slave:通过BINLOG_DUMP协议获取binlog内容的权限

基本用法

解析出标准SQL

shell> python binlog2sql.py -h127.0.0.1 -P3306 -uadmin -p'admin' -dtest -t test3 test4 --start-file='mysql-bin.000002'

输出:
INSERT INTO `test`.`test3`(`addtime`, `data`, `id`) VALUES ('2016-12-10 13:03:38', 'english', 4); #start 570 end 736
UPDATE `test`.`test3` SET `addtime`='2016-12-10 12:00:00', `data`='中文', `id`=3 WHERE `addtime`='2016-12-10 13:03:22' AND `data`='中文' AND `id`=3 LIMIT 1; #start 763 end 954
DELETE FROM `test`.`test3` WHERE `addtime`='2016-12-10 13:03:38' AND `data`='english' AND `id`=4 LIMIT 1; #start 981 end 1147

解析出回滚SQL

shell> python binlog2sql.py --flashback -h127.0.0.1 -P3306 -uadmin -p'admin' -dtest -ttest3 --start-file='mysql-bin.000002' --start-position=763 --stop-position=1147

输出:
INSERT INTO `test`.`test3`(`addtime`, `data`, `id`) VALUES ('2016-12-10 13:03:38', 'english', 4); #start 981 end 1147
UPDATE `test`.`test3` SET `addtime`='2016-12-10 13:03:22', `data`='中文', `id`=3 WHERE `addtime`='2016-12-10 12:00:00' AND `data`='中文' AND `id`=3 LIMIT 1; #start 763 end 954

选项

mysql连接配置

-h host; -P port; -u user; -p password

解析模式

--stop-never 持续解析binlog。可选。默认False,同步至执行命令时最新的binlog位置。

-K, --no-primary-key 对INSERT语句去除主键。可选。默认False

-B, --flashback 生成回滚SQL,可解析大文件,不受内存限制。可选。默认False。与stop-never或no-primary-key不能同时添加。

--back-interval -B模式下,每打印一千行回滚SQL,加一句SLEEP多少秒,如不想加SLEEP,请设为0。可选。默认1.0。

解析范围控制

--start-file 起始解析文件,只需文件名,无需全路径 。必须。

--start-position/--start-pos 起始解析位置。可选。默认为start-file的起始位置。

--stop-file/--end-file 终止解析文件。可选。默认为start-file同一个文件。若解析模式为stop-never,此选项失效。

--stop-position/--end-pos 终止解析位置。可选。默认为stop-file的最末位置;若解析模式为stop-never,此选项失效。

--start-datetime 起始解析时间,格式'%Y-%m-%d %H:%M:%S'。可选。默认不过滤。

--stop-datetime 终止解析时间,格式'%Y-%m-%d %H:%M:%S'。可选。默认不过滤。

对象过滤

-d, --databases 只解析目标db的sql,多个库用空格隔开,如-d db1 db2。可选。默认为空。

-t, --tables 只解析目标table的sql,多张表用空格隔开,如-t tbl1 tbl2。可选。默认为空。

--only-dml 只解析dml,忽略ddl。可选。默认False。

--sql-type 只解析指定类型,支持INSERT, UPDATE, DELETE。多个类型用空格隔开,如--sql-type INSERT DELETE。可选。默认为增删改都解析。用了此参数但没填任何类型,则三者都不解析。

应用案例

误删整张表数据,需要紧急回滚

闪回详细介绍可参见example目录下《闪回原理与实战》example/mysql-flashback-priciple-and-practice.md

test库tbl表原有数据
mysql> select * from tbl;
+----+--------+---------------------+
| id | name   | addtime             |
+----+--------+---------------------+
|  1 | 小赵   | 2016-12-10 00:04:33 |
|  2 | 小钱   | 2016-12-10 00:04:48 |
|  3 | 小孙   | 2016-12-13 20:25:00 |
|  4 | 小李   | 2016-12-12 00:00:00 |
+----+--------+---------------------+
4 rows in set (0.00 sec)

mysql> delete from tbl;
Query OK, 4 rows affected (0.00 sec)

20:28时,tbl表误操作被清空
mysql> select * from tbl;
Empty set (0.00 sec)

恢复数据步骤

  1. 登录mysql,查看目前的binlog文件

    mysql> show master status;
    +------------------+-----------+
    | Log_name         | File_size |
    +------------------+-----------+
    | mysql-bin.000051 |       967 |
    | mysql-bin.000052 |       965 |
    +------------------+-----------+
  2. 最新的binlog文件是mysql-bin.000052,我们再定位误操作SQL的binlog位置。误操作人只能知道大致的误操作时间,我们根据大致时间过滤数据。

    shell> python binlog2sql/binlog2sql.py -h127.0.0.1 -P3306 -uadmin -p'admin' -dtest -ttbl --start-file='mysql-bin.000052' --start-datetime='2016-12-13 20:25:00' --stop-datetime='2016-12-13 20:30:00'
    输出:
    INSERT INTO `test`.`tbl`(`addtime`, `id`, `name`) VALUES ('2016-12-13 20:26:00', 4, '小李'); #start 317 end 487 time 2016-12-13 20:26:26
    UPDATE `test`.`tbl` SET `addtime`='2016-12-12 00:00:00', `id`=4, `name`='小李' WHERE `addtime`='2016-12-13 20:26:00' AND `id`=4 AND `name`='小李' LIMIT 1; #start 514 end 701 time 2016-12-13 20:27:07
    DELETE FROM `test`.`tbl` WHERE `addtime`='2016-12-10 00:04:33' AND `id`=1 AND `name`='小赵' LIMIT 1; #start 728 end 938 time 2016-12-13 20:28:05
    DELETE FROM `test`.`tbl` WHERE `addtime`='2016-12-10 00:04:48' AND `id`=2 AND `name`='小钱' LIMIT 1; #start 728 end 938 time 2016-12-13 20:28:05
    DELETE FROM `test`.`tbl` WHERE `addtime`='2016-12-13 20:25:00' AND `id`=3 AND `name`='小孙' LIMIT 1; #start 728 end 938 time 2016-12-13 20:28:05
    DELETE FROM `test`.`tbl` WHERE `addtime`='2016-12-12 00:00:00' AND `id`=4 AND `name`='小李' LIMIT 1; #start 728 end 938 time 2016-12-13 20:28:05
  3. 我们得到了误操作sql的准确位置在728-938之间,再根据位置进一步过滤,使用flashback模式生成回滚sql,检查回滚sql是否正确(注:真实环境下,此步经常会进一步筛选出需要的sql。结合grep、编辑器等)

    shell> python binlog2sql/binlog2sql.py -h127.0.0.1 -P3306 -uadmin -p'admin' -dtest -ttbl --start-file='mysql-bin.000052' --start-position=3346 --stop-position=3556 -B > rollback.sql | cat
    输出:
    INSERT INTO `test`.`tbl`(`addtime`, `id`, `name`) VALUES ('2016-12-12 00:00:00', 4, '小李'); #start 728 end 938 time 2016-12-13 20:28:05
    INSERT INTO `test`.`tbl`(`addtime`, `id`, `name`) VALUES ('2016-12-13 20:25:00', 3, '小孙'); #start 728 end 938 time 2016-12-13 20:28:05
    INSERT INTO `test`.`tbl`(`addtime`, `id`, `name`) VALUES ('2016-12-10 00:04:48', 2, '小钱'); #start 728 end 938 time 2016-12-13 20:28:05
    INSERT INTO `test`.`tbl`(`addtime`, `id`, `name`) VALUES ('2016-12-10 00:04:33', 1, '小赵'); #start 728 end 938 time 2016-12-13 20:28:05
  4. 确认回滚sql正确,执行回滚语句。登录mysql确认,数据回滚成功。

    shell> mysql -h127.0.0.1 -P3306 -uadmin -p'admin' < rollback.sql
    
    mysql> select * from tbl;
    +----+--------+---------------------+
    | id | name   | addtime             |
    +----+--------+---------------------+
    |  1 | 小赵   | 2016-12-10 00:04:33 |
    |  2 | 小钱   | 2016-12-10 00:04:48 |
    |  3 | 小孙   | 2016-12-13 20:25:00 |
    |  4 | 小李   | 2016-12-12 00:00:00 |
    +----+--------+---------------------+

限制(对比mysqlbinlog)

  • mysql server必须开启,离线模式下不能解析
  • 参数 binlog_row_image 必须为FULL,暂不支持MINIMAL
  • 解析速度不如mysqlbinlog

优点(对比mysqlbinlog)

  • 纯Python开发,安装与使用都很简单
  • 自带flashback、no-primary-key解析模式,无需再装补丁
  • flashback模式下,更适合闪回实战
  • 解析为标准SQL,方便理解、筛选
  • 代码容易改造,可以支持更多个性化解析

贡献者

联系我

有任何问题,请与我联系。邮箱:[email protected]

欢迎提问题提需求,欢迎pull requests!

binlog2sql's People

Contributors

danfengcao avatar imzcy1987 avatar php-cpm avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

binlog2sql's Issues

Python3.6 测试ok,加到readme里?

[root@db-admin01 binlog2sql]# time python3.6 ./binlog2sql.py -hXXXX -P3306 -uXXX -p'XXX' -dXXX -tXXX --start-file='mysql-bin.000546' --stop-file='mysql-bin.000547'  > /tmp/1.sql



real    12m15.685s
user    12m8.111s
sys     0m5.974s 

解析出的SQL看了下也是正常的~

所以除了Python3.4,Python3.6也能顺利的解析出了SQL,加到readme里?

第一次跑报错! cursors.py:323: Warning: (1366, "Incorrect string value: '\\xCA\\xC7\\xB7\\xF1\\xCF\\xD4...' for column 'COLUMN_COMMENT' at row)

binlog2sql.py -h XX.XX.XX.XX -u XXX -p XXX -P 3306 --start-file sql-bin.000703 --stop-file sql-bin.000704 --start-datetime '2019-02-28 22:00:00' --stop-datetime '2019-02-28 22:30:00' -d YYY -t YYY > 111.sql

/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysql/cursors.py:323: Warning: (1366, "Incorrect string value: '\xCA\xC7\xB7\xF1\xCF\xD4...' for column 'COLUMN_COMMENT' at row 1")
self._do_get_result()
^CTraceback (most recent call last):
File "/opt/dba_dir/binlog2sql/binlog2sql/binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "/opt/dba_dir/binlog2sql/binlog2sql/binlog2sql.py", line 74, in process_binlog
for binlog_event in stream:
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/binlogstream.py", line 432, in fetchone
self.__fail_on_table_metadata_unavailable)
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/packet.py", line 139, in init
fail_on_table_metadata_unavailable=fail_on_table_metadata_unavailable)
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/row_event.py", line 604, in init
col = Column(byte2int(column_type), column_schema, from_packet)
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/column.py", line 14, in init
self.__parse_column_definition(*args)
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/column.py", line 45, in __parse_column_definition
self.__read_string_metadata(packet, column_schema)
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/column.py", line 62, in __read_string_metadata
metadata = (packet.read_uint8() << 8) + packet.read_uint8()
File "/opt/python3_venvs/test_venv/lib/python3.6/site-packages/pymysqlreplication/packet.py", line 270, in read_uint8
return struct.unpack('<B', self.read(1))[0]
KeyboardInterrupt

环境 python3.6.6, 已执行 pip3 install -r requirement.txt 操作。
请问如何解决该错误?

binlog2sql 无法生成回滚sql

[root@test test]# binlog2sql-master/binlog2sql/binlog2sql.py -h localhost -P 3306 -u root -p -d aproject --start-file="mysql-bin.000001" --start-datetime="2018-11-13 17:00:00" --stop-datetime="2018-11-13 19:00:00" -B > huifu4.sql
[root@test test]# cat huifu4.sql

为空

RDS binlog无法解析

阿里云RDS binlog无法解析,不知道是不是我的方法不对,还是本来就不行

启动多个客户端的时候,server_id冲突

请问是否是有意这样设计,如果是的话是因为什么?

如果没有特别设计,解决该问题path,请参考是否采纳

--- a/binlog2sql/binlog2sql.py
+++ b/binlog2sql/binlog2sql.py
@@ -4,6 +4,7 @@
 import sys
 import datetime
 import pymysql
+import random
 from pymysqlreplication import BinLogStreamReader
 from pymysqlreplication.event import QueryEvent, RotateEvent, FormatDescriptionEvent
 from binlog2sql_util import command_line_args, concat_sql_from_binlog_event, create_unique_file, temp_open, \
@@ -56,8 +57,9 @@ class Binlog2sql(object):
                 if binlog2i(self.start_file) <= binlog2i(binary) <= binlog2i(self.end_file):
                     self.binlogList.append(binary)

-            cursor.execute("SELECT @@server_id")
-            self.server_id = cursor.fetchone()[0]
+            #cursor.execute("SELECT @@server_id")
+            #self.server_id = cursor.fetchone()[0]
+            self.server_id=random.randint(1024,4294967295)
             if not self.server_id:
                 raise ValueError('missing server_id in %s:%s' % (self.conn_setting['host'], self.conn_setting['port']))

问题咨询

elif type(binlogevent) in (WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent):
for row in binlogevent.rows:
mysql有没有用到定制版呢?
这里获取的binlogevent.rows怎么是空呢?
{'_RowsEvent__only_schemas': None,
'_RowsEvent__only_tables': None,
'_RowsEvent__rows': [],
'_ctl_connection': <pymysql.connections.Connection object at 0x7fb0fcfa87d0>,
'_fail_on_table_metadata_unavailable': False,
'_processed': True,
'columns': [],
'columns_present_bitmap': '\xff',
'columns_present_bitmap2': '\xff',
'complete': False,
'event_size': 28,
'event_type': 31,
'extra_data': '',
'extra_data_length': 2,
'flags': 1,
'number_of_columns': 2,
'packet': <pymysqlreplication.packet.BinLogPacketWrapper object at 0x7fb0fcfa8f10>,
'primary_key': '',
'schema': u'zcy',
'table': u'tb1',
'table_id': 71,
'table_map': {71: <pymysqlreplication.table.Table object at 0x7fb0fcfa8e90>},
'timestamp': 1481685929}

表中文字段转码报错

Traceback (most recent call last):
File "binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "binlog2sql.py", line 121, in process_binlog
self.print_rollback_sql(filename=tmp_file)
File "binlog2sql.py", line 129, in print_rollback_sql
for line in reversed_lines(f_tmp):
File "C:\Users\30883\Downloads\binlog2sql-master\binlog2sql-master\binlog2sql\binlog2sql_util.py", line 249, in reversed_lines
block = block.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd6 in position 278: invalid continuation byte

表信息:) ENGINE=InnoDB DEFAULT CHARSET=utf8;

工具依赖的模块版本号大于指定版本行不行?

工具很好用,谢谢大神!

我看 requirements.txt 中的模块版本号用的是 “==”,请教下如果这些模块的版本号大于 requirements.txt 中指定的版本,工具是否可以正常使用?
这3个模板都已经有新版本了

PyMySQL==0.7.11
wheel==0.29.0
mysql-replication==0.13

mysql8.0.16 执行命令报错

python binlog2sql.py -h192.168.56.101 -P3306 -uroot -pXUq12 --start-file='binlog.000008'

Traceback (most recent call last):
File "binlog2sql.py", line 152, in
back_interval=args.back_interval, only_dml=args.only_dml, sql_type=args.sql_type)
File "binlog2sql.py", line 48, in init
self.connection = pymysql.connect(**self.conn_setting)
File "/usr/local/lib/python2.7/site-packages/pymysql/init.py", line 90, in Connect
return Connection(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/pymysql/connections.py", line 706, in init
self.connect()
File "/usr/local/lib/python2.7/site-packages/pymysql/connections.py", line 931, in connect
self._get_server_information()
File "/usr/local/lib/python2.7/site-packages/pymysql/connections.py", line 1269, in _get_server_information
self.server_charset = charset_by_id(lang).name
File "/usr/local/lib/python2.7/site-packages/pymysql/charset.py", line 38, in by_id
return self._by_id[id]
KeyError: 255

PyMySQL 0.7.11这个版本相对于mysql8以上的版低了,我更新到PyMySQL-0.9.3就可以了

mysql如果字段是bit(1),回滚会报错Data too long for column 'active' at row 1

Data too long for column 'active' at row 1
之前是通过暴力的int(value),发现部分string的也被转换成了int,不太满足需求
解决办法:
处理 数据fix_object的时候先查询出db schema, 判断字段是否为bint且长度为1,针对这种做int处理
关于mysql description:https://www.python.org/dev/peps/pep-0249/#cursor-attributes

注意事项
1.需要对description按照row['values'].items()排序后才能和row['values'].values()一一对应
2.UpdateRowsEvent需要按照row['before_values'].keys() + row['after_values'].items()排序
3.DeleteRowsEvent按照 row['values'].keys()排序

def fix_object(value, description=None):
        ..........
        if description[1] == 16 and description[3] == 1:
            return int(str_v)
def get_table_description(cursor, binlogevent):
    table = binlogevent.schema + "." + binlogevent.table
    sql = "select * from %s limit 1" % table
    cursor.execute(sql)

    return cursor.description

使用时抛出异常

C:\projects\binlog2sql\binlog2sql>py binlog2sql.py -uuser -ppassword -h127.0.0.1 -P3306 --start-file="mysql-bin.001958"
Traceback (most recent call last):
File "binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "binlog2sql.py", line 107, in process_binlog
row=row, flashback=self.flashback, e_start_pos=e_start_pos)
File "C:\projects\binlog2sql\binlog2sql\binlog2sql_util.py", line 178, in concat_sql_from_binlog_event
sql = cursor.mogrify(pattern['template'], pattern['values'])
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\cursors.py", line 143, in mogrify
query = query % self._escape_args(args, conn)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\cursors.py", line 118, in _escape_args
return tuple(conn.literal(arg) for arg in args)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\cursors.py", line 118, in
return tuple(conn.literal(arg) for arg in args)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\connections.py", line 821, in literal
return self.escape(obj, self.encoders)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\connections.py", line 814, in escape
return escape_item(obj, self.charset, mapping=mapping)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\converters.py", line 25, in escape_item
val = encoder(val, charset, mapping)
File "C:\Users\henry\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pymysql\converters.py", line 42, in escape_sequence
return "(" + ",".join(n) + ")"
TypeError: sequence item 0: expected str instance, dict found

Double类型参数解析错误

原始数据某个字段为double类型,值有 6位小数,但是经过解析成SQL过后只有5位,向上取整了。导致update语句无法更新到具体的数据,更新失败

报了一个错误,python 2.6.6

File "binlog2sql.py", line 73
with temp_open(tmp_file, "w") as f_tmp, self.connection as cursor:
^
SyntaxError: invalid syntax

pymysql.err.InternalError: (1046, u'No database selected')

环境:

  • OS: CentOS 6.5 (阿里云主机)
  • Python: 2.7.6
  • MySQL: 5.6.16 (阿里云RDS)

终端执行命令:

python binlog2sql.py -h10.10.10.10 -uusername -p'password' -P3306 -ddbname --start-file='mysql-bin.000789' > /tmp/raw.sql

错误日志:

Traceback (most recent call last):
File "binlog2sql.py", line 125, in
binlog2sql.process_binlog()
File "binlog2sql.py", line 70, in process_binlog
for binlogevent in stream:
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysqlreplication/binlogstream.py", line 391, in fetchone
self.__freeze_schema)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysqlreplication/packet.py", line 98, in init
freeze_schema = freeze_schema)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysqlreplication/row_event.py", line 542, in init
self.column_schemas = self._ctl_connection._get_table_information(self.schema, self.table)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysqlreplication/binlogstream.py", line 468, in __get_table_information
""", (schema, table))
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
result = self._query(query)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
conn.query(q)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/connections.py", line 837, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/connections.py", line 1021, in _read_query_result
result.read()
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/connections.py", line 1304, in read
first_packet = self.connection._read_packet()
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/connections.py", line 983, in _read_packet
packet.check_error()
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/connections.py", line 395, in check_error
err.raise_mysql_exception(self._data)
File "/usr/local/python2.7/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.InternalError: (1046, u'No database selected')

参数

log_bin ON
server_id 8366
binlog_format ROW
max_binlog_size 524288000

备注

并非解析每个binlog文件都会报该错误;
如果指定的数据库名称为不存在的数据库时反而不会报错,但解析出的sql中只有DDL而没有DML语句。

局域网怎么安装

你好,你这样的,我们的生产环境是不可以上外网的,用官网的这种安装方式安装不了呢,pip和git的安装可以解决.
但是运行:pip install -r requirements.txt 的时候会报错,
image
是需要在线安装:
PyMySQL==0.7.11
wheel==0.29.0
mysql-replication==0.13
这三个软件吗?

cann't build

E:\binlog2sql\binlog2sql>python binlog2sql.py -h127.0.0.1 -P3306 -uadmin -p'admi
n' -dtest -t test3 test4 --start-file='mysql-bin.000002'
File "binlog2sql.py", line 67
def compare_items((k, v)):
^
SyntaxError: invalid syntax

bit类型字段反解析问题

执行语法报错:
INSERT INTO TABLE1(ID,NAME.ISxx) VALUES(1,'CDF','0');
执行语法正确:
INSERT INTO TABLE1(ID,NAME.ISxx) VALUES(1,'CDF',b'0');

自动化操作时,参数endFile未知处理

首先非常感谢提供这个工具!

准备利用这个工具自动导出做一些增量备份,不过自动导出时,endFile经常是未知的,是否可以加个参数,实现处理从startFile开始后的所有文件的binlog (应用中,自己已经在这么做了 - - )

Passing of variables

I would like to use cronjob to automate this task.That means those argument,the ones you are using in the command line need to be passed in the script not the terminal.How am i suppose to go about?

--stop-never option has no effect

Hi,

First thank you for your tool because it works well for my usage.

I do have a small issue with --stop-never option that doesn't seem to work because the script ends when it arrives at the end. I suppose that the expected behavior is to wait until a new event arrive in the binary log ? if so the following command grabs SQL correctly but it stop at the end :

python binlog2sql/binlog2sql.py -h X.X.X.X -u root -p --start-file mysql-bin.000001 --start-position=17024124 --stop-never

Thanks
Laurent

解析出的语句会包含非指定表的SQL

你好,我用binlog2sql解析指定的表,发现解析出来的语句中会带有其他表的操作,加上-B选项生成的回滚语句是指定表的,请问是怎么回事?
还有,请问一下这个是只能生成DML的回滚语句么?
下面是我的操作过程:

python binlog2sql.py  -h127.0.0.1 -P3306 -uroot -proot -d test -t t1 --start-file='binlog.000361'  >a.sql
USE b'fdd_esf_push';

##下面是输出结果
alter table t_user_reach_msgs_old drop index idx_create_status;
USE b'fdd_esf_push';
alter table t_user_reach_msgs drop index IX_handle_time;
DELETE FROM `test`.`t1` WHERE `user`='u1' AND `action`=1 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
DELETE FROM `test`.`t1` WHERE `user`='u1' AND `action`=1 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
DELETE FROM `test`.`t1` WHERE `user`='u1' AND `action`=2 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
DELETE FROM `test`.`t1` WHERE `user`='u2' AND `action`=1 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
DELETE FROM `test`.`t1` WHERE `user`='u3' AND `action`=2 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
DELETE FROM `test`.`t1` WHERE `user`='u4' AND `action`=2 LIMIT 1; #start 382 end 571 time 2018-07-24 15:27:28
USE b'test';
alter table t1 add column username varchar(20) default 'a';

python

hi,你好
现在 binlog2sql.py 里面的 python路径是写死的 ,我们当前的机器上保留了 2.6 2.7版本的python,默认是寻找2.6版本的,但是根据 requirements.txt 安装软件时,会安装到 2.7版本下面,/usr/bin/python 默认去找 2.6 。
对于机器上有多个版本的情况 是否可以修改
#!/usr/bin/python

#!/usr/bin/env python
这样可以适配更多的环境类型

MySQL 5.7.20 timestamp(3)回滚以后,毫秒部分右移三位

表结构如下:
CREATE TABLE recover_test (
id int(11) NOT NULL AUTO_INCREMENT COMMENT 'id',
name varchar(20) NOT NULL DEFAULT '' COMMENT '名称',
create_by varchar(50) NOT NULL DEFAULT 'system',
create_time timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3),
modify_by varchar(50) NOT NULL DEFAULT 'system',
modifytime timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
disabled tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (id)
) ENGINE=InnoDB;

binlog日志如下:

at 14464

#180412 15:28:38 server id 736 end_log_pos 14548 CRC32 0x1693107f Query thread_id=81 exec_time=0 error_code=0
SET TIMESTAMP=1523518118.115372/!/;
BEGIN
/!/;

at 14548

#180412 15:28:38 server id 736 end_log_pos 14669 CRC32 0x5769e4de Rows_query

UPDATE ygop_o2o_sop.recover_test SET id=6, name='小吴' WHERE id=3 AND name='小孙'

at 14669

#180412 15:28:38 server id 736 end_log_pos 14746 CRC32 0xf3e74488 Table_map: ygop_o2o_sop.recover_test mapped to number 246

at 14746

#180412 15:28:38 server id 736 end_log_pos 14860 CRC32 0x09380bfa Update_rows: table id 246 flags: STMT_END_F

UPDATE ygop_o2o_sop.recover_test

WHERE

@1=3 /* INT meta=0 nullable=0 is_null=0 */

@2='小孙' /* VARSTRING(60) meta=60 nullable=0 is_null=0 */

@3='system' /* VARSTRING(150) meta=150 nullable=0 is_null=0 */

@4=1523517247.276 /* TIMESTAMP(3) meta=3 nullable=0 is_null=0 */

@5='system' /* VARSTRING(150) meta=150 nullable=0 is_null=0 */

@6=1523517247.276 /* TIMESTAMP(3) meta=3 nullable=0 is_null=0 */

@7=0 /* TINYINT meta=0 nullable=0 is_null=0 */

SET

@1=6 /* INT meta=0 nullable=0 is_null=0 */

@2='小吴' /* VARSTRING(60) meta=60 nullable=0 is_null=0 */

@3='system' /* VARSTRING(150) meta=150 nullable=0 is_null=0 */

@4=1523517247.276 /* TIMESTAMP(3) meta=3 nullable=0 is_null=0 */

@5='system' /* VARSTRING(150) meta=150 nullable=0 is_null=0 */

@6=1523518118.115 /* TIMESTAMP(3) meta=3 nullable=0 is_null=0 */

@7=0 /* TINYINT meta=0 nullable=0 is_null=0 */

at 14860

#180412 15:28:38 server id 736 end_log_pos 14891 CRC32 0xcff37532 Xid = 681
COMMIT/!/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog / /!*/;
DELIMITER ;

End of log file

通过binlog2sql解析出来的sql如下:

python binlog2sql.py -hXXX -PXXX -uXXX -pXXX -dygop_o2o_sop -trecover_test --start-file='mysql-bin.000051' --start-datetime='2018-04-12 15:27:00'

UPDATE ygop_o2o_sop.recover_test SET name='小吴', create_by='system', disabled=0, modify_by='system', create_time='2018-04-12 15:14:07.000276', modifytime='2018-04-12 15:28:38.000115', id=6 WHERE name='小孙' AND create_by='system' AND disabled=0 AND modify_by='system' AND create_time='2018-04-12 15:14:07.000276' AND modifytime='2018-04-12 15:14:07.000276' AND id=3 LIMIT 1; #start 14464 end 14860 time 2018-04-12 15:28:38

-d 参数不能有效过滤

对于ddl,通过都是:
use dbname;
alter table tablename ...;
因为tablename前面没有schema名称,没有做到有效过滤,将非-d的schema也解析出来了

增加JSON格式解析支持(已附上代码),修复UnicodeDecodeError,增加默认sql_type

  1. MySQL 5.7新增JSON格式,虽然没有MongoDB支持得那么好,但是用起来已经很方便了。烦请有时间新增对JSON格式的解析支持,谢谢啦~

    想了想,自己动手先实现一下,还请参考:

import json
...
def process_binlog(self):
    ...
    elif is_dml_event(binlog_event) and event_type(binlog_event) in self.sql_type:
        for row in binlog_event.rows:
            # add self.json
            if self.json:
                for column in binlog_event.columns:
                    # json(column.type) = 245
                    if column.type == 245:
                        value = row['values'][column.name]
                        if isinstance(value, dict):
                            value = json.dumps({fix_object(k): fix_object(v) for k, v in value.items()}, ensure_ascii=False)
                        elif isinstance(value, list):
                            value = json.dumps([fix_object(k) for k in value], ensure_ascii=False)
                        row['values'][column.name] = value
            
            sql = concat_sql_from_binlog_event(cursor=cursor, binlog_event=binlog_event, no_pk=self.no_pk,
                                                row=row, flashback=self.flashback, e_start_pos=e_start_pos)
  1. UnicodeDecodeError: 'utf8' codec can't decode byte ...
    block = platform.system() == 'Windows' and block.decode("gbk") or block.decode("utf-8")
    https://github.com/danfengcao/binlog2sql/issues/33#issuecomment-449703838,这个`bug`也麻烦修复一下~

  2. 感觉设置默认sql_type更加方便新手调试
    self.sql_type = [t.upper() for t in sql_type] if sql_type else ['INSERT', 'UPDATE', 'DELETE']

Set类型解析有问题

CREATE TABLE `ttt` (
  `id` int(11) NOT NULL,
  `c1` set('a','b','c') COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `c2` bit(16) DEFAULT NULL,
  `c3` varchar(64) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

insert into ttt values(1,'c,a',b'11111','{"a":\'M&s\',"c":"asda"}');

解析出来的sql如下:

INSERT INTO `test`.`ttt`(`c3`, `c2`, `c1`, `id`) VALUES ('{\"a\":\'M&s\',\"c\":\"asda\"}', '0000000000011111', ('a','c'), 1);

set类型的值需要做处理,要用逗号join一下

如果有外健关联的情况下不能删除

(1451, 'Cannot delete or update a parent row: a foreign key constraint fails (*.*, CONSTRAINT key_wp_agent FOREIGN KEY (*) REFERENCES * (*) ON DELETE NO ACTION ON UPDATE CASCADE)')

为什么解析一段时间后就Lost connection to MySQL?

Traceback (most recent call last):
File "binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "binlog2sql.py", line 121, in process_binlog
self.print_rollback_sql(filename=tmp_file)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 844, in exit
self.commit()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 788, in commit
self._read_ok_packet()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 767, in _read_ok_packet
pkt = self._read_packet()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 987, in _read_packet
packet_header = self._read_bytes(4)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1033, in _read_bytes
CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')

增加筛选DML类型的参数

在实际应用中,可能只关注某一类型的操作,比如INSERT/DELETE/UPDATE,增加类型参数,便于筛选

UnicodeDecodeError: 'utf-8' codec can't decode byte xxx

python binlog2sql/binlog2sql.py xxxxxxxxxxx --start-file='mysqld.000006' --start-datetime='2018-06-24 13:30:00' --start-position=170802294 -B >callback_script.sql
报错:(代码已更新至目前——20180624 14:16时的master版本)
Traceback (most recent call last):
File "binlog2sql/binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "binlog2sql/binlog2sql.py", line 121, in process_binlog
self.print_rollback_sql(filename=tmp_file)
File "binlog2sql/binlog2sql.py", line 129, in print_rollback_sql
for line in reversed_lines(f_tmp):
File "/root/softwares/binlog2sql/binlog2sql/binlog2sql_util.py", line 243, in reversed_lines
block = block.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbc in position 0: invalid start byte

不加-B的话,能够正常解析出错误的操作,但是加了-B就报上面的错误,内容是中文,有一些有全角或半角的中文符号

命令执行后错了

Traceback (most recent call last):
File "binlog2sql.py", line 149, in
back_interval=args.back_interval, only_dml=args.only_dml, sql_type=args.sql_type)
File "binlog2sql.py", line 47, in init
with self.connection as cursor:
AttributeError: exit

默认值会被解析成null

你好,当字段为null时,该字段的默认值,在生成回滚SQL时,会被解析成null,导致where匹配不到。

image

生成的回滚语句:
UPDATE test.t2 SET id=4, name='ddd', type=0, status=1, created_at=NULL WHERE id=4 AND name='meizi' AND type=0 AND status=1 AND created_at IS NULL LIMIT 1; #start 259 end 514 time 2017-12-28 13:45:58

可以发现:created_at 被解析为了IS NULL, 没法恢复

我看了这不是你脚本的问题,这是mysql-replication解析的问题,请问如何处理?

解析正向SQL报错

Traceback (most recent call last):
File "binlog2sql/binlog2sql.py", line 125, in
binlog2sql.process_binlog()
File "binlog2sql/binlog2sql.py", line 70, in process_binlog
for binlogevent in stream:
File "/usr/lib/python2.6/site-packages/pymysqlreplication/binlogstream.py", line 370, in fetchone
pkt = self._stream_connection._read_packet()
File "/usr/lib/python2.6/site-packages/pymysql/connections.py", line 983, in _read_packet
packet.check_error()
File "/usr/lib/python2.6/site-packages/pymysql/connections.py", line 395, in check_error
err.raise_mysql_exception(self._data)
File "/usr/lib/python2.6/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.InternalError: (1236, u'Could not open log file')
binlog文件500M,本地一两M的可以解析,连远程的mysql,文件大一点的就抛异常了。

工具对一个1.1M的2进制日志解析没到1/5 就报错

UPDATE manufacture.produce_task SET amount_product_planned=200.000000, workshop_id=24, amount_product_qualified=143.000000, start_time_planned='2019-03-06 08:50:00', project_code='AWTP030267', workgroup_id=NULL, project_product_code='93312001', config_scan_once=0, operator_ids=('445'), production_line_id=0, end_time_planned='2019-03-07 08:50:00', id=34962, process_group_name=NULL, attachment=NULL, operator_group_id=NULL, config_task_operator_exclusive=0, status=2, output_material_code='91312010A', task_code=1903060005, process_name='卷膜(设备)', origin_qc_passed=1, last_handover_operator_id=NULL, updated_at='2019-03-07 10:25:57', end_time_real=NULL, assigned_worker_ids=('445'), re_produce=0, process_seq='2', config_fifo=1, work_station_id=273, created_at='2019-03-06 08:50:45', org_id=30, start_time_real='2019-03-06 08:58:11', amount_product_faulty=0.000000, process_code='21', equipment_ids=() WHERE amount_product_planned=200.000000 AND workshop_id=24 AND amount_product_qualified=142.000000 AND start_time_planned='2019-03-06 08:50:00' AND project_code='AWTP030267' AND workgroup_id IS NULL AND project_product_code='93312001' AND config_scan_once=0 AND operator_ids=('445') AND production_line_id=0 AND end_time_planned='2019-03-07 08:50:00' AND id=34962 AND process_group_name IS NULL AND attachment IS NULL AND operator_group_id IS NULL AND config_task_operator_exclusive=0 AND status=2 AND output_material_code='91312010A' AND task_code=1903060005 AND process_name='卷膜(设备' AND origin_qc_passed=1 AND last_handover_operator_id IS NULL AND updated_at='2019-03-07 09:57:57' AND end_time_real IS NULL AND assigned_worker_ids=('445') AND re_produce=0 AND process_seq='2' AND config_fifo=1 AND work_station_id=273 AND created_at='2019-03-06 08:50:45' AND org_id=30 AND start_time_real='2019-03-06 08:58:11' AND amount_product_faulty=0.000000 AND process_code='21' AND equipment_ids=() LIMIT 1; #start 349340245 end 349362132 time 2019-03-07 10:25:57
Traceback (most recent call last):
File "binlog2sql.py", line 150, in
binlog2sql.process_binlog()
File "binlog2sql.py", line 107, in process_binlog
row=row, flashback=self.flashback, e_start_pos=e_start_pos)
File "/usr/local/src/binlog2sql/binlog2sql/binlog2sql_util.py", line 178, in concat_sql_from_binlog_event
sql = cursor.mogrify(pattern['template'], pattern['values'])
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 143, in mogrify
query = query % self._escape_args(args, conn)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 118, in _escape_args
return tuple(conn.literal(arg) for arg in args)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 118, in
return tuple(conn.literal(arg) for arg in args)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 821, in literal
return self.escape(obj, self.encoders)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 814, in escape
return escape_item(obj, self.charset, mapping=mapping)
File "/usr/lib/python2.7/site-packages/pymysql/converters.py", line 25, in escape_item
val = encoder(val, charset, mapping)
File "/usr/lib/python2.7/site-packages/pymysql/converters.py", line 42, in escape_sequence
return "(" + ",".join(n) + ")"
TypeError: sequence item 0: expected string, dict found

请教2个问题

非常感谢这个软件,点赞

想要提问几个问题

1.如果是mysql5.5.9,解析执行SQL是没有问题的,但--flaskback好像不能反向解析出来,这是版本问题对吗?

2.我有一个18G的binlog文件,有没有办法做到离线分析呢?毕竟不想消耗主库的资源

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.