Coder Social home page Coder Social logo

dbscan-clustering-scheme's Introduction

Merge step of DBSCAN algorithm using Scheme

Instructions

  1. Open import.scm file
  2. Run the file
  3. Write (saveList "clusters.txt" (mergeClusters (import)))

Description

This implementation focuses on the third step of the parallel DBSCAN algorithm. We will merge intersecting clusters from adjacent partitions. The parallel DBSCAN algorithm extracts the clusters of a set by subdividing the region into a number of overlapping partitions. If the partitions overlap with each other, it implies that some points (at the periphery of the partitions) might belong to more than one partition. Consequently, some clusters may contain the same point(s) and are then said to intersect. In this case, these clusters must be merged because they should in fact constitute one large cluster covering more than one partition. The merging can be simply done by changing the label of one of the clusters to the one of the second.

Algorithm explanation

Suppose we have the 5 points of cluster A and the 8 points of cluster C in current ClusterList. We are now processing the partition containing clusters B and D. We first consider cluster D; it has no intersection with the points in ClusterList so its 4 points will simply be added to the list. Now if we consider cluster B, it has 3 points intersecting with cluster A and 4 points intersecting with cluster C. These 7 points will constitute the intersection set I and the labels of I are A and C. Consequently, the cluster label of all points in ClusterList having label A will be changed to B and same for the points having label C. Finally, the points in B are inserted into the ClusterList.

Results

Here is the list of groups obtained after merging. There were 25 groups initially, after the merge step there are 21 left.

Cluster ID before merge step:

(65000001 65000002 65000003 65000004 74000001 74000002 74000003 75000001 75000002
76000001 76000002 76000004 76000003 84000001 84000003 84000002 84000004 85000004
85000001 85000002 85000003 85000007 85000005 85000006 86000001)

Cluster ID after merge step:

(65000001 65000002 65000003 65000004 74000001 74000002 75000002 76000001 76000002
76000004 76000003 84000003 84000002 84000004 85000004 85000001 85000002 85000003
85000007 85000005 86000001)

Clusters ID that were removed: 74000003,75000001,84000001,85000006.


From CSI2520 (Programming Paradigms) Course - Scheme Project

dbscan-clustering-scheme's People

Contributors

celinewmk avatar

Watchers

 avatar

dbscan-clustering-scheme's Issues

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.