microsoft / graspologic-native

graspologic-native is a library of Rust components that adds capability to graspologic (https://github.com/microsoft/graspologic), a Python library for intelligently building networks and network embeddings, and for analyzing connected data.

License: MIT License

Python 1.97% Rust 98.03%

graspologic-native's Introduction

graspologic-native

graspologic is a Python package for graph statistics.

Some functionality is best served by compiling it into a Python native module, both for performance and to share that functionality with WebAssembly.

graspologic-native is a repository that holds Rust packages. The core packages will be published as crate libraries, and a package using pyo3 will expose the functionality of that library to Python.

Requirements

  • Rust nightly 1.37+ (we are currently using 1.40)
  • Python 3.5+ (we are currently using 3.8)
  • 64 bit operating system
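
The Python-side requirements above can be checked from a script before attempting a build. This is a minimal sketch, not part of the project: the function name `check_python_requirements` is hypothetical, and it inspects the bitness of the running interpreter (via the size of a C pointer) as a stand-in for the 64 bit operating system requirement. It does not check the Rust toolchain.

```python
import struct
import sys


def check_python_requirements(min_version=(3, 5)):
    """Return a list of problems; an empty list means the interpreter looks fine.

    Hypothetical helper, not part of graspologic-native.
    """
    problems = []
    if sys.version_info[:2] < min_version:
        problems.append("Python %d.%d+ is required" % min_version)
    # A C pointer is 8 bytes on a 64-bit interpreter build.
    if struct.calcsize("P") * 8 != 64:
        problems.append("a 64-bit Python build is required")
    return problems


if __name__ == "__main__":
    for problem in check_python_requirements():
        print("requirement not met:", problem)
```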

Published Versions

We currently build wheels for x86_64 platforms only (Windows, macOS, and Ubuntu), for Python versions 3.6 through 3.11.

Building

If, for any reason, the published wheels do not match your architecture, or you have a particularly old version of glibc that isn't sufficiently accounted for in our current build matrix, or you just want to build it yourself, the following build instructions should help.

Note that these instructions are written for Linux, though they should also work on macOS. Unfortunately, the instructions for Windows are a bit more convoluted; I have commented the steps that deviate between the three where I am aware of issues.

Before running these instructions, ensure you have installed Rust on your system and you have the Python development headers (e.g. python3.8-dev) for your system.

rustup default nightly
git clone git@github.com:microsoft/graspologic-native.git
cd graspologic-native
python3.8 -m venv venv
source venv/bin/activate  # activate the venv so the pip installs below land inside it
pip install -U pip setuptools wheel
pip install pyo3 maturin
cd packages/pyo3
# This is where things break on Windows. Instead of `python3.8` here, you will need the full
# path to the correct python.exe on your Windows machine, something like `-i "C:\python38\bin\python.exe"`
maturin build --release -i python3.8

Presuming a successful build, your output wheel should be in: graspologic-native/target/wheels/
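
To confirm that the build produced something, the output directory can be listed from Python. This is a small sketch under the assumption that the wheel lands in target/wheels/ under the repository root, as stated above; the helper name `find_built_wheels` is hypothetical.

```python
from pathlib import Path


def find_built_wheels(repo_root):
    """Return the sorted .whl files under <repo_root>/target/wheels, or [] if none exist.

    Hypothetical helper; the directory layout is taken from the build notes above.
    """
    wheel_dir = Path(repo_root) / "target" / "wheels"
    if not wheel_dir.is_dir():
        return []
    return sorted(wheel_dir.glob("*.whl"))


if __name__ == "__main__":
    for wheel in find_built_wheels("."):
        print(wheel)
```

Any wheel it prints can then be installed into the active environment with pip install followed by the path.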

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Privacy

graspologic-native does not collect, store, or transmit any information of any kind back to Microsoft.

For your convenience, here is the link to the general Microsoft Privacy Statement.

graspologic-native's People

Contributors

bdpedigo, carolyncb, daxpryce, jonmclean, nicaurvi

graspologic-native's Issues

"Invalid" starting communities fails

Our Leiden implementation allows you to provide a starting dictionary of node -> communities.

The problem that @JonathanLarson found is that if you provide a community dictionary containing nodes that have no direct edge to anything else in their community, the algorithm as-is falls apart. Leiden's enhancement to Louvain guarantees that partitions/clusters will only be made up of nodes that have some connection to each other, meaning that every node in a partition MUST have an edge to another node in that partition, with the noted exception of clusters with only a single node.

We presumed that this would always be true within our Leiden implementation, but we didn't guarantee it when a user provides us with a starting community mapping. By happenstance, @JonathanLarson was running Leiden repeatedly over monthly snapshots of nodes, using the previous month's partitions as a starting point for generating the new partitions. We ran into the one and only panic! at the macro in graspologic-native because, while two nodes were in the same partition, they did not share an edge, and thus the subnetwork was "empty".
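
The condition described in this issue can be checked on the caller's side before a starting mapping is handed to Leiden. The following is a rough sketch, not the library's code: `find_isolated_members` is a hypothetical name, edges are assumed to be (source, target) pairs, and the mapping is assumed to be a node -> community dictionary. It only tests the property stated above (each node in a multi-node community has at least one edge into that community), which is weaker than checking that the whole community is connected.

```python
from collections import defaultdict


def find_isolated_members(edges, starting_communities):
    """Return nodes that share no edge with any other node in their own community.

    Hypothetical helper. Single-node communities are allowed, so their lone
    member is never reported.
    """
    members = defaultdict(set)
    for node, community in starting_communities.items():
        members[community].add(node)

    # Undirected adjacency, ignoring self-loops.
    neighbors = defaultdict(set)
    for source, target in edges:
        if source != target:
            neighbors[source].add(target)
            neighbors[target].add(source)

    isolated = []
    for node, community in starting_communities.items():
        peers = members[community]
        if len(peers) > 1 and not (neighbors[node] & peers):
            isolated.append(node)
    return sorted(isolated)
```

In the monthly-snapshot scenario, any node this returns could be moved into its own single-node community before the previous month's partition is reused, though whether that is the right repair is a judgment call for the caller.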

f64 and .exp() overflow very easily on large graphs

In our tests, the code derived from CWTS' reference implementation (LocalMergingAlgorithm/subnetwork_clustering) ends up with f64/double overflows pretty easily. I want to go back and rig up a test to show just how frequently it can happen, so we have an idea of the scale of the problem, and then also revisit whether we can apply a different function than exp() to the qvi for a given join/merge.

Another option would be to normalize our weights to between 0 and 1; our big problem now is that our weights are often very large numbers, in the hundreds of thousands.
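
To give a feel for the scale of the problem: a double can only hold exp(x) for x up to roughly 709.78, so an argument in the hundreds of thousands is far out of range. The sketch below illustrates that limit and the normalization idea in Python, whose float is the same IEEE 754 double as Rust's f64. One difference to keep in mind is that Python's math.exp raises OverflowError, whereas Rust's f64::exp returns infinity. The helper names are hypothetical, and dividing by the maximum is only one possible way to map weights into [0, 1].

```python
import math
import sys

# Approximate largest x for which exp(x) still fits in a double (about 709.78).
EXP_LIMIT = math.log(sys.float_info.max)


def exp_overflows(x):
    """Return True if exp(x) would exceed the range of a double (approximate at the boundary)."""
    return x > EXP_LIMIT


def normalize_weights(weights):
    """Scale non-negative weights into [0, 1] by dividing by the largest weight.

    Hypothetical helper; other normalizations are possible.
    """
    largest = max(weights)
    if largest <= 0:
        return list(weights)
    return [w / largest for w in weights]


if __name__ == "__main__":
    raw = [100000.0, 250000.0, 500000.0]
    print([exp_overflows(w) for w in raw])
    print([exp_overflows(w) for w in normalize_weights(raw)])
```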

Figure out how to package and publish an sdist for a maturin built project

3.10 is not yet available outside of an rc for the Python GitHub Actions, and 3.11 isn't even mentioned. On top of that, we have the M1 build issues, which may be solvable through GitHub workflows; but even if they aren't, we may as well package an sdist and make THAT work for these configurations.

MacOS M1 Builds?

Graspologic #822 (microsoft/graspologic#822) is nominally about having graspologic work nicely with conda. That may be true, but the real problem @adam2392 was experiencing when doing a pip install is that there is no valid wheel built and distributed for the M1 CPU.

As of right now, GitHub-hosted runners don't seem to support M1, so the question is "can we figure out an sdist that will build for non-GitHub-hosted-runner systems", so that this can be somewhat self-serve?

https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners

The build process isn't overly complex. We just have to figure out how to turn https://github.com/microsoft/graspologic-native#building into a codified way of building it from a source distribution with the Python tooling, something we haven't really attempted before. If anyone else has any experience doing this, I'd LOVE the help!

what's the deal with this package?

and is there something I can do to help (over the summer)? My housemate and I were planning to learn Rust, because we thought it looked fun, and this seems like a good way to learn it while also contributing to something, if stuff still needs to be done on it (I'm not really familiar with the part of graspologic that uses this, or what it's for)

Add `n_init` to Leiden

Cross-post from microsoft/graspologic#757

Would like to have a way to set the number of initializations of Leiden, and return the final partition with the best modularity.

As discussed in the thread above, converting to and from Python is costly, so we thought it would make sense to do this in Rust.
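
The requested behavior amounts to a best-of-n loop. Below is a minimal sketch of that loop, written in Python only to illustrate the logic; the issue itself proposes doing it in Rust to avoid the conversion cost. `best_of_n` and `run_once` are hypothetical names: `run_once(seed)` stands in for a single Leiden run and is assumed to return a (modularity, partition) pair.

```python
import random


def best_of_n(run_once, n_init, seed=None):
    """Call run_once(seed) n_init times and keep the result with the highest modularity.

    Hypothetical helper; run_once is assumed to return (modularity, partition).
    """
    if n_init < 1:
        raise ValueError("n_init must be at least 1")
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        # Derive a fresh seed per run so the whole search is reproducible from one seed.
        run_seed = rng.randrange(2**32)
        modularity, partition = run_once(run_seed)
        if best is None or modularity > best[0]:
            best = (modularity, partition)
    return best
```

Ties keep the earliest run, which is an arbitrary choice made for this sketch.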

cc: @dwaynepryce @loftusa

CICD releases aren't setting the pre-release flag correctly in Github's release action

We release on merge/push to dev or main. If on dev, we want the prerelease flag to be set - we also pick a version number that indicates a pre-release to PyPI.

I thought we were doing that, but it seems to be not working. As it is now I'm constantly marking it as pre-release after the fact and I'd prefer I didn't have to do that. At the least set it to true at all times and I'll manually set it to false later.
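
The intended rule is simple enough to state as a function. This is only a sketch of the decision, not the workflow's actual configuration: `is_prerelease` is a hypothetical name, and it assumes the release step has the triggering Git ref available as a string such as refs/heads/dev or a bare branch name.

```python
def is_prerelease(git_ref):
    """Return True for releases cut from dev and False for releases cut from main.

    Hypothetical helper; any other ref is rejected, since only dev and main release.
    """
    prefix = "refs/heads/"
    branch = git_ref[len(prefix):] if git_ref.startswith(prefix) else git_ref
    if branch == "dev":
        return True
    if branch == "main":
        return False
    raise ValueError("releases are only expected from dev or main, got %r" % git_ref)
```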

Create README for Pyo3 only

The overall README has references and requirements to Python that aren't entirely true (when we publish the crate, for instance, Python won't even matter in that context).

Further, we need to explain the purpose behind the companion library for it. We also want to make sure we give credit to and reference CWTS' paper and reference implementation, which we used as the basis for our own Leiden.
