Coder Social home page Coder Social logo

missingrowfinder's Introduction

MissingRowFinder

Missing Row Finder or: How I Learned To Stop Worrying and Love the Garbage Collector.

This was one of my first OOP applications.

Problem

A common infrastructure model is to have one central database, referred to here as the Publisher, and multiple Subscriber databases with the same schema at company branches. The Publisher assigns ranges for the PK on a table to its Subscribers, and the data held at the Publisher represents the sum total of the data held at all the Subscribers. [In reality in the model being used both the central and local databases act as publishers and subscribers, replicating data in both directions. In particular with this scenario the local DBs publish up to the central one; the existing name convention is nevertheless followed here and in the code.]

An issue that can sometimes occur, normally after server/network issues, is that at a Subscriber one or more rows will fail to be marked for replication up to the Publisher. The presence of these rows can be found using a simple COUNT(*) on a table and seeing a discrepency between the two DBs. Finding the Id of the missing row(s), however, is non-trivial.

Solution

This application queries both the Publisher and a given Subscriber, effectively performing a binary search to find the Subscriber row that does not exist at the Publisher.

i.e.

  1. Between IDs 1-50 the Subscriber has 40 rows. Publisher has 30.
  2. Split id range in half.
  3. Look at first half. Subscriber has 20 rows between 1-20, Publisher also has 20. Move on and ignore this.
  4. Look at second half. Subscriber has 20 rows between 20-40, Publisher has 10. Split again 1. Look at first half. Subscriber has 10 rows between 20-30, Publisher also has 10. Move on and ignore this. 2. Look at second half. Subscriber has 10 rows between 30-40, Publisher has zero. Add IDs 30-40 to list.
  5. Return list (containt IDs 30-40) to the user.

To prompt replication of the row an innocuous update can then be performed e.g.

UPDATE Table SET BitFlagColumn = BitFlagColumn WHERE Id = @Id

Business Layer

The Console project and Data Access are fairly straight forward, but the Business Layer warrants explanation:

  • PartitionNodeCollection represents the "tree", it holds the first, root node.
  • PartitionNode is one node. It has one parent, and contains two child PartitionNodes: a left and a right.
  • A PartitionNode holds a PartitionPair. A PartitionPair holds the count for number of rows at Publisher (PublisherPartitionCount) and Subscriber (SubscriberPartitionCount).
  • TableSize gets the Maximum replicated ID. We use this for the table size as otherwise we would also bring up any IDs that have not yet replicated.

missingrowfinder's People

Contributors

matthewmcgowan avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.