TODO list about syncer

zzt93 commented on July 18, 2024
TODO list

from syncer.

Comments (3)

zzt93 commented on July 18, 2024

Done:

  • Output needs customization of Spring EL -- removed Spring EL
  • MySQL input field check
  • Add new cold start: batch select (order by id) & batch insert
  • Shorten id:
    • change serverId to port/clientId?
      • serverId: used for debugging, not for uniqueness -- removed to save memory
    • variable integer encoding for position: xxx/123456/gap/xxx
    • shorten offset
  • Support setting the start binlog file name & position in the config file (makes it easier to rebuild)
  • Refactor clone & dup semantics -- changed to create
  • Reduce memory footprint of StandardEvaluationContext (20% memory reduction)
  • Add file as data source: to read binlog file
  • Update failure log format: do not escape JSON strings
  • Order problem: route the same id to the same thread; strict mode: retry the failed item and all remaining items; or retry only the failed item
  • Output channel reconnection logic: MySQL & ES
  • Adjust logging level dynamically
  • Add health check endpoint
  • Upsert for ES output channel on 404
  • Add shutdown hook for cleanup: stop sending data to the output target, avoid duplicate key exceptions
  • Update to Spring Boot 2.0 for better YAML prompts when configuring
  • Skip items that are already synced at startup
  • Add kafka output channel
    • Kafka message consumers have to handle events idempotently
    • send events using the primary key as the message key
    • deploy SyncData and SyncUtil as a separate jar to Maven Central
  • Refactor config naming:
    input:
      masters:
        - connection:
            address: ${HOST_ADDRESS}
            port: 27018
          type: Mongo
          repos:
            - name: "chat"
              entities:
              - name: messages
                fields: [time, content]
  • Package refactor:
    • For syncer-data deploy
    • Refactor config package
  • Add Kafka version compatibility to the README.
  • Remove unnecessary dependencies: remove Spring Boot
  • Refactor filter module design flaw & add nested if and/or enhance switcher
  • Use javassist/cglib/byte buddy JavaCompiler to generate code dynamically rather than spring el
  • Support lower-hyphen config keys
  • Binlog checksum type auto detection
  • Kafka MESSAGE_TOO_LARGE error
  • Share the same table definition across multiple remotes
  • Test framework
  • Refactor SyncData: update events should carry both before & fields data:
    • add updated() & updated(String name) methods
    • add before to get the data before the update
  • Test framework: add update/delete test
  • Update README config example: remove and link to test config dir.
  • Test framework: mongo
  • Check whether the registered MongoDB db/collection exists
  • Batch buffer bug
  • Optimize logging: Ack log, MasterConnector
  • Connect to latest binlog flag (cold start usage)
    • de-register cold-start consumer?
    • or use same consumer, different filter?
  • Add consumerId in log
    • or report thread-consumer relation in http port
    • or change thread name to syncer-consumerId-filter-1
  • ConsumerId syntax check: "-" is not supported
  • FileBasedMap record last removed position if map is empty
  • Change from tailing the oplog to the change stream API: check the MongoDB version at startup
  • ES output channel: support nested objects
  • On ALTER TABLE, automatically re-sync the MySQL column index so no restart is needed
  • ES client upgrade (5.x and 7.x: not all features; 6.x: all features) -- REST client & basic auth replace X-Pack & low-level REST client
  • Test framework:
    • Mongo update/delete
  • Change filter module to a single thread; add partition key support in SyncData, to be used by the (multi-threaded) output module
  • Order problem when id is changed: add scheduler key
    • Joining like this will inevitably cause data inconsistency because of the at-least-once semantics, so it will not be done.
    • ES can do it with nested objects
    • Kafka needs this
  • Filter module: do not shut down, use the failure log instead -- pressure test continues
    • Degradation & bounded queue size: change to a fixed-size queue
  • Column filter: _all
  • Cold start
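
The ordering item above ("route the same id to the same thread") can be illustrated with a minimal sketch. It is not syncer's actual implementation: the class name IdRouter and the method workerFor are hypothetical, and the sketch only assumes that each worker drains its own queue in FIFO order, so events sharing an id keep their relative order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not syncer's real code): events with the same id
// always map to the same worker index, so their relative order is kept.
final class IdRouter {
    private final int workers;

    IdRouter(int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        this.workers = workers;
    }

    // floorMod keeps the index in [0, workers) even for negative hash codes.
    int workerFor(Object id) {
        return Math.floorMod(id.hashCode(), workers);
    }

    public static void main(String[] args) {
        IdRouter router = new IdRouter(4);
        List<List<String>> queues = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            queues.add(new ArrayList<>());
        }
        // {id, operation} pairs in arrival order; id "42" appears three times.
        String[][] events = {
            {"42", "insert"}, {"7", "insert"}, {"42", "update"}, {"42", "delete"}
        };
        for (String[] e : events) {
            queues.get(router.workerFor(e[0])).add(e[1] + ":" + e[0]);
        }
        // All three events for id 42 sit in one queue, in arrival order.
        System.out.println(queues);
    }
}
```

The same idea applies to the Kafka item: using the primary key as the message key lets the producer's default partitioner place all events for one row in one partition.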

zzt93 commented on July 18, 2024

Testing & Implementing

  • Update position even for events no consumer is interested in

  • Share storage in k8sMode

    • Sync meta info to ZK like
    • k8sMode needs an instanceId to differentiate instances
      • storage path /instanceId/syncer/xx
    • config file?
  • Kafka output: timestamp to long;

  • Mysql output: auto add id;

  • #8 [Test Pending] MySQL upsert support: for join table order problem -- ref

  • [Impl Pending] Update sync meta position when the consumer is not interested in the event?

    • Implement with a simple position-flusher event type?
    • emit when trying to shut down?
    • emit after a number of uninteresting events have occurred
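
The pending "position flusher" idea could look roughly like the sketch below. Everything here is an assumption for illustration: the class SkipPositionFlusher, its methods, and the use of plain strings as positions are hypothetical, and the in-memory list merely stands in for wherever the sync meta is persisted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: remember the latest skipped position and flush it
// after a fixed number of skipped events, or when shutting down.
final class SkipPositionFlusher {
    private final int threshold;
    // Stands in for persisted sync meta in this sketch.
    private final List<String> flushed = new ArrayList<>();
    private int skippedSinceFlush = 0;
    private String lastSkipped = null;

    SkipPositionFlusher(int threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        this.threshold = threshold;
    }

    // Called for each event that no consumer is interested in.
    void onSkipped(String position) {
        lastSkipped = position;
        skippedSinceFlush++;
        if (skippedSinceFlush >= threshold) {
            flush();
        }
    }

    // Called from a shutdown hook so the last skipped position is not lost.
    void onShutdown() {
        flush();
    }

    private void flush() {
        if (lastSkipped != null) {
            flushed.add(lastSkipped);
            lastSkipped = null;
        }
        skippedSinceFlush = 0;
    }

    List<String> flushedPositions() {
        return flushed;
    }
}
```

The sketch ignores one real concern: a skipped position must not be persisted while an earlier, interesting event is still unacknowledged, otherwise a restart could skip that event.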

zzt93 commented on July 18, 2024

Not Do

  • Schema mis-match problem -- fix by new cold-start method -- ETL
    • Write schema of all tables to local file, then parse all DDL to update it.
    • Start to load schema from files
    • Cold start
      • connect to latest binlog (can't resolve mis-match in this situation)
  • Netty as http client (idempotence is hard to achieve)
  • Support rpc output channel (idempotence is hard to achieve)
  • Support websocket for long lived connection (idempotence is hard to achieve)
  • Join by query extra data source in output?
  • Make the output module non-blocking with callbacks, to reduce filter-output threads?
    • May cause events to be reordered -- make it a config option: non-block-mode
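
Several items above were declined because "idempotence is hard to achieve". A small sketch of why that matters under at-least-once delivery follows. The class ReplayDemo and both methods are hypothetical, and a plain map stands in for the output target: an upsert keyed by primary key tolerates a replayed event, while a relative operation, such as an arbitrary RPC call might perform, does not.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: compare replaying an idempotent write with
// replaying a non-idempotent one against an in-memory "target".
final class ReplayDemo {
    // Idempotent: writing the same value for the same key twice
    // leaves the same final state.
    static void upsert(Map<String, Integer> store, String id, int value) {
        store.put(id, value);
    }

    // Not idempotent: every replay changes the state again.
    static void increment(Map<String, Integer> store, String id, int delta) {
        store.merge(id, delta, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> upserted = new HashMap<>();
        Map<String, Integer> incremented = new HashMap<>();
        // Deliver the same event twice, as a retry after a timeout could.
        for (int attempt = 0; attempt < 2; attempt++) {
            upsert(upserted, "1", 10);
            increment(incremented, "1", 10);
        }
        System.out.println("upsert: " + upserted + ", increment: " + incremented);
    }
}
```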
