annembed's People

Contributors

jean-pierreboth

annembed's Issues

UMAP speed benchmarks?

Hello, I use the UMAP Python library and was wondering if y'all have done any benchmarks to see which is faster. It would also be good to know whether both implementations produce the exact same results, all else being equal. I use UMAP for work, so if the Rust version of UMAP is significantly faster, I can help make Python bindings for annembed.

command line interface

Please create a command-line interface that accepts a CSV file as the source embedding file, applies the UMAP or HDBSCAN algorithm to it, and writes the results to disk, so that people who don't know Rust can also use your amazing project.
Thanks

Mutual K-NN graph and clarifying KGraph

Hello!

Thanks for making this crate and putting in the effort of emulating UMAP in Rust. I look forward to using this crate in my work; however, I would like to use annembed with a mutual K-NN graph. To clarify, a mutual K-NN graph is a K-NN graph where an edge is only formed between two nodes when each node is within the k nearest neighbours of the other.

However, I just want to double check that I'm handling the indexing used by KGraph correctly. Here is the function I have currently:

use annembed::fromhnsw::kgraph::KGraph;
use anyhow::Result;
use log::debug;
use num_traits::{FromPrimitive, Float};
use rayon::prelude::*;

/// Take a k-nearest neighbour graph and return a mutual k-nearest neighbour graph
/// A mutual k-nearest neighbour graph is a nearest neighbour graph where edges are only kept if they are mutual
/// i.e. if node A is a nearest neighbour of node B, and node B is a nearest neighbour of node A
pub fn mutual_knn<F: FromPrimitive + Float + std::fmt::UpperExp + Sync + Send + std::iter::Sum>(mut knn_graph: KGraph<F>) -> Result<KGraph<F>> {
    let mutual_nodes = knn_graph.get_neighbours()
        .into_par_iter()
        .enumerate()
        .map(|(node, neighbours)| {
            let mut mutual_neighbours = Vec::new();
            for neighbour in neighbours {
                if knn_graph.get_neighbours()[neighbour.node].iter().any(|edge| edge.node == node) {
                    mutual_neighbours.push(neighbour.clone());
                } else {
                    debug!("Node {} is a neighbour of node {}, but node {} is not a neighbour of node {}", neighbour.node, node, node, neighbour.node)
                }
            }
            Ok(mutual_neighbours)
        })
        .collect::<Result<Vec<_>>>()?;
    
    
    knn_graph.neighbours = mutual_nodes;
    Ok(knn_graph)
}
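
For comparison, the same mutual filter can be sketched on plain adjacency lists, independent of annembed. The `Edge` struct and this standalone `mutual_knn` are hypothetical stand-ins, not annembed types, and the sketch simply assumes the very thing in question: that the position of a node in the outer vector is the same index that is stored in each edge's `node` field.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an outgoing edge (not annembed's OutEdge):
/// the index of the target node plus the distance to it.
#[derive(Clone, Debug, PartialEq)]
struct Edge {
    node: usize,
    dist: f32,
}

/// Keep an edge i -> j only if the reverse edge j -> i is also present.
///
/// Assumption: `neighbours[i]` holds the edges leaving node `i`, and
/// `Edge::node` indexes into that same outer slice. An out-of-range
/// `node` value would panic on the `sets[e.node]` lookup.
fn mutual_knn(neighbours: &[Vec<Edge>]) -> Vec<Vec<Edge>> {
    // Precompute each node's set of targets so the reverse lookup
    // is a hash lookup instead of a linear scan per edge.
    let sets: Vec<HashSet<usize>> = neighbours
        .iter()
        .map(|edges| edges.iter().map(|e| e.node).collect())
        .collect();

    neighbours
        .iter()
        .enumerate()
        .map(|(i, edges)| {
            edges
                .iter()
                .filter(|e| sets[e.node].contains(&i))
                .cloned()
                .collect()
        })
        .collect()
}
```

On a three-node graph with edges 0 -> {1, 2}, 1 -> {0} and 2 -> {1}, only the pair 0 <-> 1 is mutual, so the result keeps 0 -> {1} and 1 -> {0} and leaves node 2 with no neighbours. If the outer index and the stored index turn out to be different things in KGraph, the lookup would need a translation step first.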

My main confusion is: does the index of a node in the neighbours vec of the KGraph correspond to the node index stored within the OutEdge object? Or do I have to retrieve the index using the built-in functions of KGraph? Or do these functions, namely get_data_id_from_idx, only apply when you are trying to convert the KGraph index back to the index used in the original data?

Sorry for the bombardment of questions, hopefully I've made myself clear enough!

Cheers,
Rhys
