Comments (1)
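- The built-in full-sync logic locks neither the database nor the table; it simply runs ordinary SELECT statements over the whole table. If the table holds more than 10,000 rows, the work goes to a thread pool whose size equals the number of CPU cores, and it is split into tasks that each fetch 10,000 rows.
However, performance is poor on large tables. First, it runs into the deep-pagination problem. Second, this is not a CPU-bound task, so sizing the thread pool purely by core count seems poorly thought out.
- During a full sync you need to take care of incremental changes as well, otherwise updates may be lost.
- I modified the source to use a simple single-threaded streaming query. It covers most scenarios and is relatively stable. Streaming queries do have their own issues, of course: handling of incremental data needs even more attention, and you have to deal with long-running transactions, among others.
- You can also consider combining DataX for the offline full sync with canal for the incremental sync. DataX is also based on streaming queries, but it adds task splitting and parallel processing.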
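
To make the points above concrete, here is a minimal Java sketch. It is not canal's or DataX's actual code. The class `FullSyncSketch` and all of its method names are hypothetical, and it assumes a numeric, roughly dense primary key. It shows three things: splitting a table into primary-key ranges instead of `LIMIT offset, size` pages (which avoids deep pagination), a common rule of thumb for sizing a pool for blocking work (cores × (1 + wait/compute)), and a single-threaded streaming read using the MySQL Connector/J fetch-size hint `Integer.MIN_VALUE`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration only; not part of canal.
class FullSyncSketch {

    /** Callback that receives the ResultSet positioned on the current row. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /**
     * Split [minId, maxId] into inclusive ranges covering at most chunkSize ids each.
     * Assumes a numeric primary key; gaps in the ids make some chunks smaller.
     */
    static List<long[]> splitByIdRange(long minId, long maxId, long chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long lo = minId; lo <= maxId; lo += chunkSize) {
            long hi = Math.min(lo + chunkSize - 1, maxId);
            ranges.add(new long[] {lo, hi});
        }
        return ranges;
    }

    /**
     * Range query on the primary key: the server seeks by index instead of
     * scanning and discarding `offset` rows. Table and column names are
     * concatenated, so they must come from trusted configuration.
     */
    static String rangeQuery(String table, String pkColumn) {
        return "SELECT * FROM " + table
                + " WHERE " + pkColumn + " BETWEEN ? AND ?"
                + " ORDER BY " + pkColumn;
    }

    /** Rule-of-thumb pool size for blocking (I/O-bound) tasks. */
    static int ioBoundPoolSize(int cores, double waitToComputeRatio) {
        return Math.max(1, (int) Math.round(cores * (1.0 + waitToComputeRatio)));
    }

    /**
     * Single-threaded streaming read. With MySQL Connector/J, a forward-only,
     * read-only statement with fetch size Integer.MIN_VALUE streams rows one
     * by one instead of buffering the whole result set in memory.
     */
    static void streamTable(Connection conn, String table, RowHandler handler)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT * FROM " + table,
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY)) {
            ps.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    handler.handle(rs);
                }
            }
        }
    }
}
```

For example, `splitByIdRange(1, 25000, 10000)` yields the ranges 1 to 10000, 10001 to 20000 and 20001 to 25000, each of which can be bound to the `rangeQuery` statement and run as an independent task. The wait/compute ratio passed to `ioBoundPoolSize` has to be measured for the actual workload; the value is an input to the heuristic, not something this sketch can determine. Neither approach removes the need to handle incremental changes that arrive while the full sync is running.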
from canal.