This project forked from zdebruine/rcppml

Rcpp Machine Learning: Fast robust NMF, divisive clustering, and more

License: GNU General Public License v2.0

C++ 97.32% C 0.35% R 2.33%

Rcpp Machine Learning Library

RcppML is an R package for fast non-negative matrix factorization (NMF) and divisive clustering of large sparse matrices. For a single-cell analysis version of this functionality, see zdebruine/singlet.

Check out the RcppML pkgdown site!

RcppML NMF is:

  • The fastest NMF implementation in any language for sparse and dense matrices
  • More interpretable than other implementations due to diagonal scaling
  • Easy to regularize with an L1 penalty

Installation

Install from CRAN or the development version from GitHub:

install.packages('RcppML')                       # install CRAN version
devtools::install_github("zdebruine/RcppML")     # compile dev version

NOTE: RcppML is being actively developed. Please check that your packageVersion("RcppML") is current before raising issues.

Check out the CRAN manual.

Once installed and loaded, RcppML C++ headers defining classes can be used in C++ files for any R package using #include <RcppML.hpp>.

Matrix Factorization

Sparse matrix factorization by alternating least squares:

  • Non-negativity constraints
  • L1 regularization
  • Diagonal scaling
  • Rank-1 and Rank-2 specializations (~2x faster than irlba SVD equivalents)

Read (and cite) our bioRxiv manuscript on NMF for single-cell experiments.

R functions

The nmf function runs matrix factorization by alternating least squares in the form A ≈ WDH, where D is a diagonal scaling matrix. The project function updates w or h given the other, while the mse function calculates the mean squared error of the factor model.

library(RcppML)
A <- Matrix::rsparsematrix(1000, 100, 0.1) # sparse Matrix::dgCMatrix
model <- RcppML::nmf(A, k = 10)
h0 <- predict(model, A)
evaluate(model, A) # calculate mean squared error
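
To make the A ≈ WDH form more concrete, here is a minimal base-R sketch of alternating least squares with diagonal scaling and a mean squared error calculation. It does not call RcppML and is not the package's implementation: the helper ls_update is hypothetical, and clipping negative values to zero is a crude stand-in for a proper non-negative least squares solver. The scaling step assumes that each factor is normalized to sum to 1, with the removed scale collected in d.

```r
# Toy ALS-style factorization A ~ W diag(d) H in base R (illustration only).
set.seed(42)
A <- matrix(runif(20 * 12), nrow = 20, ncol = 12)  # small dense non-negative matrix
k <- 3                                             # factorization rank
w <- matrix(runif(nrow(A) * k), nrow(A), k)        # random non-negative start

# Hypothetical helper: least-squares update of h given w, i.e. solve
# (w'w) h = w'A, then clip negatives to zero (NOT a true NNLS solve).
ls_update <- function(A, w) {
  h <- solve(crossprod(w) + diag(1e-10, ncol(w)), crossprod(w, A))
  pmax(h, 0)
}

for (iter in 1:50) {
  h <- ls_update(A, w)            # update h given w
  w <- t(ls_update(t(A), t(h)))   # update w given h (same solve on the transpose)
}

# Diagonal scaling: make each column of w and each row of h sum to 1,
# and move the removed scale into the vector d.
d_w <- colSums(w)
d_h <- rowSums(h)
w <- sweep(w, 2, d_w, "/")
h <- sweep(h, 1, d_h, "/")
d <- d_w * d_h

# Mean squared error of the reconstruction W diag(d) H.
A_hat <- w %*% diag(d) %*% h
mse <- mean((A - A_hat)^2)
```

With the factors scaled this way, the entries of d show how much each factor contributes to the reconstruction, so factors can be compared on a common scale.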

Divisive Clustering

Divisive clustering by rank-2 spectral bipartitioning.

  • The second SVD vector is linearly related to the difference between the two factors in a rank-2 matrix factorization.
  • Rank-2 matrix factorization (with optional non-negativity constraints) performs spectral bipartitioning ~2x faster than irlba SVD
  • Sensitive distance-based stopping criteria similar to Newman-Girvan modularity, but orders of magnitude faster
  • Stopping criteria based on minimum number of samples

R functions

The dclust function runs divisive clustering by recursive spectral bipartitioning, while the bipartition function exposes the rank-2 NMF specialization and returns statistics of the bipartition.

library(RcppML)
A <- Matrix::rsparsematrix(1000, 1000, 0.1) # sparse Matrix::dgCMatrix
clusters <- dclust(A, min_dist = 0.001, min_samples = 5)
cluster0 <- bipartition(A)
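
The idea behind a single spectral bipartition can be sketched in base R. The example below uses svd() rather than RcppML's rank-2 factorization, and the function spectral_bipartition is a hypothetical helper, not part of the package. It splits the samples (columns) by the sign of the second right singular vector on a small synthetic matrix with two known groups.

```r
# Toy spectral bipartition by the sign of the second right singular vector.
set.seed(1)
n_feat <- 10
n_samp <- 8

# Synthetic data: a shared baseline plus two blocks, so that samples 1-4
# and samples 5-8 form two groups with different dominant features.
A <- matrix(1, n_feat, n_samp)
A[1:5, 1:4] <- 3
A[6:10, 5:8] <- 3
A <- A + matrix(rnorm(n_feat * n_samp, sd = 0.01), n_feat, n_samp)

# Hypothetical helper: assign each column to a side based on the sign of
# its entry in the second right singular vector.
spectral_bipartition <- function(A) {
  v2 <- svd(A, nu = 0, nv = 2)$v[, 2]
  list(left = which(v2 < 0), right = which(v2 >= 0), v = v2)
}

parts <- spectral_bipartition(A)
parts$left   # indices of samples on one side of the split
parts$right  # indices of samples on the other side
```

The sign of a singular vector is arbitrary, so which group ends up as left or right can differ between platforms. Divisive clustering applies a split like this recursively to each side until a stopping criterion, such as a minimum number of samples, is met.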

Contributors

zdebruine, ttriche, wainberg
