single-linkage++

Description

This program builds the hierarchical clustering of an n-dimensional dataset using the single-linkage algorithm described in [1].
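For orientation, the textbook single-linkage criterion measures the distance between two clusters as the smallest distance between any pair of members, one taken from each cluster. The sketch below illustrates only that definition. It is not taken from this repository, and `Point`, `euclidean`, and `single_link_distance` are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical point type for this illustration (not the repository's Cluster class).
using Point = std::vector<double>;

// Euclidean distance between two points of equal dimension.
double euclidean(const Point& p, const Point& q) {
  assert(p.size() == q.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < p.size(); ++i) {
    const double diff = p[i] - q[i];
    sum += diff * diff;
  }
  return std::sqrt(sum);
}

// Single-linkage distance: the minimum distance over all cross-cluster pairs.
// Returns +infinity if either cluster is empty.
double single_link_distance(const std::vector<Point>& a,
                            const std::vector<Point>& b) {
  double best = std::numeric_limits<double>::infinity();
  for (const Point& p : a) {
    for (const Point& q : b) {
      const double d = euclidean(p, q);
      if (d < best) best = d;
    }
  }
  return best;
}
```

For instance, with the clusters {(2, 2)} and {(7, 7), (9, 9)}, the function returns the distance between (2, 2) and (7, 7), which is sqrt(50), roughly 7.07.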

Running an example

Compile the file:

g++ -o single_linkage single_linkage.cpp

Then, execute the compiled program:

./single_linkage

This will display the following output:

Data:
A: 1, 1 
B: 1.5, 1.5 
C: 2, 2 
D: 1, 1 
E: 9, 9 
F: 1, 1 
G: 7, 7 
H: 9, 11 

Hierarchical cluster:
A,D,F,B,C,E,H,G: 3.9375, 4.1875 
               A,D,F,B,C: 1.3, 1.3 
                        A,D,F,B: 1.125, 1.125 
                               A,D,F: 1, 1 
                                    A,D: 1, 1 
                                       A: 1, 1 
                                       D: 1, 1 
                                    F: 1, 1 
                               B: 1.5, 1.5 
                        C: 2, 2 
               E,H,G: 8.33333, 9 
                    E,H: 9, 10 
                       E: 9, 9 
                       H: 9, 11 
                    G: 7, 7 

The result shown in the terminal can be interpreted as a dendrogram of the eight data points.
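The coordinates printed next to each merged cluster are consistent with the arithmetic mean of its member points. The sketch below reproduces a few of those values from the data listed above. The helper `mean_point` and the `Point` alias are hypothetical names used only for this check and are not part of the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;

// Arithmetic mean of a non-empty set of points that share one dimension.
Point mean_point(const std::vector<Point>& points) {
  assert(!points.empty());
  Point mean(points.front().size(), 0.0);
  for (const Point& p : points) {
    assert(p.size() == mean.size());
    for (std::size_t i = 0; i < mean.size(); ++i) mean[i] += p[i];
  }
  for (double& m : mean) m /= static_cast<double>(points.size());
  return mean;
}

// Example data copied from the "Data:" listing above.
const Point A = {1.0, 1.0}, B = {1.5, 1.5}, C = {2.0, 2.0}, D = {1.0, 1.0};
const Point E = {9.0, 9.0}, F = {1.0, 1.0}, G = {7.0, 7.0}, H = {9.0, 11.0};
```

With these definitions, `mean_point({A, D, F, B})` gives (1.125, 1.125) and `mean_point({A, B, C, D, E, F, G, H})` gives (3.9375, 4.1875), matching the corresponding lines of the output.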

Using code in your own projects

The data shown above corresponds to the example in the function check_example. That function shows how to add data using the class Cluster and how to construct the hierarchical cluster with the function clustering. The following example adds a single data point labeled "C" with a centroid of (1.3, 2.4):

const int dim = 2;
std::set<Cluster*> clusters;

// Add new data
{
  Cluster* new_cluster = new Cluster();
  // Configure new cluster
  new_cluster->set_label("C");
  float point[dim] = {1.3f, 2.4f};
  new_cluster->set_centroid(point, dim);
  clusters.insert(new_cluster); // Adding new data
}
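When several points have to be added, the block above can be wrapped in a small helper. The following is a minimal sketch, not code from the repository: `add_point` and `add_example_points` are hypothetical names, and the `Cluster` class shown here is only a stand-in that mimics the methods used in this README so that the example compiles on its own. It assumes that `set_centroid` copies the coordinates it receives.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Stand-in for the repository's Cluster class, reduced to the methods this
// README uses. ASSUMPTION: set_centroid copies the coordinates it is given.
class Cluster {
 public:
  void set_label(const std::string& label) { label_ = label; }
  void set_centroid(const float* point, int dim) {
    centroid_.assign(point, point + dim);
  }
  const float* get_centroid() const { return centroid_.data(); }
  int get_dim_centroid() const { return static_cast<int>(centroid_.size()); }

 private:
  std::string label_;
  std::vector<float> centroid_;
};

// Hypothetical helper: creates one labeled data point and inserts it into
// the set. The caller remains responsible for deleting the returned object.
Cluster* add_point(std::set<Cluster*>& clusters, const std::string& label,
                   const float* point, int dim) {
  assert(point != nullptr && dim > 0);
  Cluster* new_cluster = new Cluster();
  new_cluster->set_label(label);
  new_cluster->set_centroid(point, dim);
  clusters.insert(new_cluster);
  return new_cluster;
}

// Adds the first three points of the example data (A, B, C).
void add_example_points(std::set<Cluster*>& clusters) {
  const int dim = 2;
  const float a[dim] = {1.0f, 1.0f};
  const float b[dim] = {1.5f, 1.5f};
  const float c[dim] = {2.0f, 2.0f};
  add_point(clusters, "A", a, dim);
  add_point(clusters, "B", b, dim);
  add_point(clusters, "C", c, dim);
}
```

This README does not say who owns the objects allocated with `new`, so the sketch leaves cleanup to the caller.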

After adding the data, we can call the function clustering to construct the hierarchical cluster:

Cluster* big_cluster = clustering(clusters);
big_cluster->show(); // Show hierarchical cluster in terminal

The signature of the function clustering is:

Cluster * clustering(const std::set < Cluster * > & clusters, norm_function norm = norml2)

Note that, by default, the distance between clusters is computed with the l2-norm (see the function norml2 in the source code). This behavior can be changed by implementing another function of type norm_function and passing it as the second argument:

typedef const float (*norm_function)(Cluster* a, Cluster* b);

// l2-norm
const float norml2(Cluster* a, Cluster* b) {
  const float* centroid_a = a->get_centroid();
  const float* centroid_b = b->get_centroid();
  int dim_centroid = a->get_dim_centroid();
  float dot_product = 0;

  for (int i = 0; i < dim_centroid; i++) {
    float diff = centroid_a[i] - centroid_b[i];
    dot_product += diff * diff;
  }

  return sqrt(dot_product);
}
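As an illustration of swapping the norm, the following sketch defines an l1-norm (Manhattan distance) with the same signature as norml2. The function `norml1` is a hypothetical name and is not part of the repository. The `Cluster` class below is only a stand-in exposing the accessors used by norml2, included so that the example compiles on its own.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Stand-in for the repository's Cluster class, limited to the accessors
// that norml2 uses. ASSUMPTION: the real class behaves equivalently.
class Cluster {
 public:
  void set_centroid(const float* point, int dim) {
    centroid_.assign(point, point + dim);
  }
  const float* get_centroid() const { return centroid_.data(); }
  int get_dim_centroid() const { return static_cast<int>(centroid_.size()); }

 private:
  std::vector<float> centroid_;
};

// Same function-pointer type as shown in the README.
typedef const float (*norm_function)(Cluster* a, Cluster* b);

// l1-norm (Manhattan distance) between the centroids of two clusters.
const float norml1(Cluster* a, Cluster* b) {
  assert(a->get_dim_centroid() == b->get_dim_centroid());
  const float* centroid_a = a->get_centroid();
  const float* centroid_b = b->get_centroid();
  const int dim_centroid = a->get_dim_centroid();
  float sum = 0.0f;

  for (int i = 0; i < dim_centroid; i++) {
    sum += std::fabs(centroid_a[i] - centroid_b[i]);
  }

  return sum;
}

// Compile-time check that norml1 is assignable to norm_function.
static const norm_function norml1_as_norm_function = norml1;
```

Given the signature of clustering shown earlier, such a function would presumably be passed as `clustering(clusters, norml1)`.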

Author

Roger Figueroa Quintero (contact via LinkedIn)

License

This project is licensed under the MIT License, which permits use, modification, and distribution subject to the terms and conditions of the license.

References

[1] Anil K. Jain and Richard C. Dubes. 1988. Algorithms for Clustering Data. Prentice-Hall, Inc., USA.
