
hyperbolics's Introduction

hyperbolics

Hyperbolic embedding implementations of Representation Tradeoffs for Hyperbolic Embeddings + product embedding implementations of Learning Mixed-Curvature Representations in Product Spaces

Hyperbolic embedding of binary tree

Setup

We use Docker to set up the environment for our code. See Docker/README.md for installation and launch instructions.

In this README, all instructions are assumed to be run inside the Docker container. All paths are relative to the /hyperbolics directory, and all commands are expected to be run from this directory.

Usage

The following programs and scripts expect the input graphs to exist in the /data/edges folder, e.g. /data/edges/phylo_tree.edges. All graphs that we report results on have been prepared and saved here.
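Each .edges file is a plain-text edge list. As a minimal sketch (assuming the loaders accept one whitespace-separated pair of integer node IDs per line; compare with the prepared files in data/edges to confirm the exact format), a small tree could be generated like this:

```python
# Sketch: write a small binary tree as an edge list.
# Assumption: one whitespace-separated "u v" pair of integer node IDs per line;
# check the prepared files in data/edges for the exact format the loaders expect.
edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6)]
with open("my_tree.edges", "w") as f:
    for u, v in edges:
        f.write(f"{u} {v}\n")
```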

Combinatorial construction

Run julia combinatorial/comb.jl --help to see the options. Example usage (for better results on this dataset, raise the precision):

julia combinatorial/comb.jl -d data/edges/phylo_tree.edges -m phylo_tree.r10.emb -e 1.0 -p 64 -r 10 -a -s
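For intuition, the heart of the 2-dimensional construction (Sarkar's algorithm) is: recursively move each node to the origin of the Poincaré disk, spread its children on a circle around it, and map back. The following is an illustrative Python sketch of that idea, not the repository's Julia implementation; it uses a fixed edge length tau and omits the precision handling behind the -e and -p flags:

```python
import cmath
import math

def sarkar_embed(children, root, tau=1.0):
    """Embed a rooted tree in the Poincare disk (complex coordinates).

    children: dict mapping each node to a list of its child nodes.
    Every tree edge is placed with hyperbolic length exactly tau.
    """
    r = math.tanh(tau / 2)  # Euclidean radius at hyperbolic distance tau from 0
    emb = {root: 0j}
    stack = [(root, None)]
    while stack:
        v, p = stack.pop()
        a = emb[v]
        # Disk isometry moving v to the origin, and its inverse.
        to_origin = lambda z: (z - a) / (1 - a.conjugate() * z)
        from_origin = lambda w: (w + a) / (1 + a.conjugate() * w)
        # Angle of the parent as seen from v (children avoid this direction).
        theta0 = cmath.phase(to_origin(emb[p])) if p is not None else 0.0
        kids = children.get(v, [])
        k = len(kids) + (1 if p is not None else 0)
        for i, c in enumerate(kids, start=1):
            w = r * cmath.exp(1j * (theta0 + 2 * math.pi * i / k))
            emb[c] = from_origin(w)
            stack.append((c, v))
    return emb
```

Because disk isometries preserve hyperbolic distance, every parent-child pair ends up exactly tau apart; distortion then comes only from non-adjacent pairs, which is what raising the scale and precision controls.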

PyTorch optimizer

Run python pytorch/pytorch_hyperbolic.py learn --help to see the options. The optimizer requires torch >= 0.4.1. Example usage:

python pytorch/pytorch_hyperbolic.py learn data/edges/phylo_tree.edges --batch-size 64 --dim 10 -l 5.0 --epochs 100 --checkpoint-freq 10 --subsample 16

Products of hyperbolic spaces with Euclidean and spherical spaces are also supported. For example, adding the flags -euc 1 -edim 20 -sph 2 -sdim 10 embeds into the product of a 20-dimensional Euclidean space with two copies of a 10-dimensional spherical space.
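The distance in such a product space is the l2 combination of the component distances, d = sqrt(d_1^2 + ... + d_k^2). A numerical sketch using the standard closed-form distances (an illustration, not code from this repository):

```python
import numpy as np

def poincare_dist(x, y):
    # Hyperbolic distance between points of the Poincare ball (norms < 1).
    num = 2 * np.sum((x - y) ** 2)
    den = (1 - np.sum(x ** 2)) * (1 - np.sum(y ** 2))
    return np.arccosh(1 + num / den)

def spherical_dist(x, y):
    # Geodesic distance between unit vectors on the sphere.
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def product_dist(component_dists):
    # In a product manifold, squared distances add across component spaces.
    return np.sqrt(sum(d ** 2 for d in component_dists))
```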

Experiment scripts

  • scripts/run_exps.py runs a full set of experiments for a list of datasets. Example usage (note: the default run settings take a long time to finish):

    python scripts/run_exps.py phylo -d phylo_tree --epochs 20
    

    Currently, it executes the following experiments:

    1. The combinatorial construction with fixed precision in varying dimensions
    2. The combinatorial construction in dimension 2 (Sarkar's algorithm), with very high precision
    3. PyTorch optimizer in varying dimensions, random initialization
    4. PyTorch optimizer in varying dimensions, using the embedding produced by the combinatorial construction as initialization
  • The combinatorial constructor combinatorial/comb.jl has an option for reporting the MAP and distortion statistics. However, this can be slow on larger datasets such as wordnet.

    • scripts/comb_stats.py provides an alternate method for computing these statistics that can leverage multiprocessing. Example usage: python scripts/comb_stats.py phylo_tree -e 1.0 -r 2 -p 1024 -q 4 to run on 4 cores.
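For reference, the MAP statistic measures how well nearest neighbors in the embedding recover the true graph neighbors. A sketch of one common definition (the repository's exact computation may differ):

```python
import numpy as np

def mean_average_precision(dist, neighbors):
    """Sketch of MAP for graph embeddings.

    dist: (n, n) matrix of pairwise embedding distances.
    neighbors: dict mapping each node to the set of its true graph neighbors.
    For each node, rank all other nodes by distance and average the
    precision at the rank of each true neighbor.
    """
    n = dist.shape[0]
    ap_sum = 0.0
    for a in range(n):
        order = [b for b in np.argsort(dist[a]) if b != a]
        ranks = sorted(order.index(b) + 1 for b in neighbors[a])
        precisions = [(i + 1) / rank for i, rank in enumerate(ranks)]
        ap_sum += sum(precisions) / len(precisions)
    return ap_sum / n
```

A perfect embedding of a path graph, for instance, ranks every graph neighbor first and scores a MAP of 1.0.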

hyperbolics's People

Contributors

albertfgu, belizgunel, chrismre, fredsala


hyperbolics's Issues

h-MDS Dimensions

I'm trying the code in the hMDS folder, but I encountered some trouble in the proposed example:
julia hMDS/hmds-simple.jl -d data/edges/phylo_tree.edges -r 100 -t 0.1 -m savetest.csv

The output is as follows:
h-MDS. Info:
Data set = data/edges/phylo_tree.edges
Dimensions = 100
Save embedding to savetest
Scaling = 0.1

Number of nodes is 344
Time elapsed = 0.17120695114135742
Doing h-MDS...
elapsed time: 2.771179635 seconds
elapsed time: 4.263825574 seconds
Building recovered graph...
elapsed time: 0.297506376 seconds
Getting metrics...

Distortion avg, max, bad = 0.032613823778721615, 4.335859234069171, 762.0
MAP = 0.6170225406053894

but looking at savetest.csv I noticed that the number of dimensions is less than 100.
In particular, the savetest.csv file has 345 columns (the 344 nodes of phylo_tree.edges plus the scaling factor) and 81 rows (and consequently 81 dimensions).

Why am I getting fewer than the 100 dimensions I requested?

I've tried with fewer dimensions and got the following results:
3 requested dimensions -> 2 output rows
4 requested dimensions -> 3 output rows
10 requested dimensions -> 7 output rows

Thank you in advance.

wrong Embedding dimensions

When dim is set to 2, the dimension of the embeddings is actually 3, and --visualize doesn't work. But when dim is set to 1, the dimension becomes 2 and visualization works, though the resulting graph is completely wrong.

I just used the example command line from the README file. What is wrong with the code?

Embeddings extraction from PyTorch model.

Thank you for this awesome project!

I am trying to use your PyTorch implementation to train a model and extract the embedding matrix from it. As a result, I am getting values that are not between 0 and 1, as I believe they should be for hyperbolic embeddings, but I am not sure whether this is normal or whether I need to do some post-processing to get the right embeddings.

The way I'm trying to extract the matrix from the model is the following:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load(model_load_file).to(device)
embedding_flatten = model.embedding().data              # flattened embedding tensor
embedding_dim = embedding_flatten.shape[0] // model.n   # model.n is the number of nodes
num_nodes = model.n
embedding_matrix = embedding_flatten.view(num_nodes, embedding_dim)

Any help would be much appreciated. Thanks in advance!

Julia Version Problems

Hello,

What version of Julia do I need to run comb.jl? It seems like a lot of the commands are now outdated.

Thanks

Example of embedding non-tree graph

Do you have any examples of running your algorithm on non-tree graphs? From the paper, it sounds like sometimes you use a Steiner tree and sometimes a BFS tree -- are you able to give some details on how you got the reported numbers for the diseases, Gr-QC and wordnet datasets?

Thanks!

Parameters to reproduce pytorch results from paper

In Tables 3 and 4 in the paper, you report 0.237 distortion and 0.951 MAP for the phylo tree using the pytorch implementation. Are you able to share the parameters you used to get those results?

Running the parameters from the README:

python pytorch/pytorch_hyperbolic.py learn data/edges/phylo_tree.edges --batch-size 64 -r 10 -l 5.0 --epochs 100 --checkpoint-freq 10

yields

...
2018-12-04T22:47:47 99 loss=94.07083774585367
2018-12-04T22:47:47 final loss=94.07083774585367
2018-12-04T22:47:48 Compare matrices built
2018-12-04T22:47:49 Distortion avg=0.6756635903819984 wc=271.74577409770677 me=6.7022873965209975 mc=40.54522851985782 nan_elements=0.0
2018-12-04T22:47:49 MAP = 0.3775067551581052
2018-12-04T22:47:49 data_scale=1.0 scale=0.0

(I removed the -w phylo_tree.r10.emb because I didn't have the warmstart file, so maybe that's changing the results.)

Thanks!
~ Ben

h-MDS from distance matrix

Hi, am I correct in understanding that hmds-simple.jl with the -k argument implements the h-MDS algorithm (Algorithm 2 in Section 4 of the paper)?

I'm running into some difficulty using this function. I've tried running the command julia hMDS/hmds-simple.jl -k data/test.pickle -r 10 -t 0.1 -m savetest.csv both in my own setup and in the Docker container, but I get errors in both. The test.pickle file is a pickled torch 2D tensor, but I also tried running the same command with a space-separated distance matrix and a pickled numpy file (in case I had misunderstood).

In my setup I get the error:

ERROR: LoadError: PyError ($(Expr(:escape, :(ccall(#= /home/jacob/.julia/packages/PyCall/RQjD7/src/pyfncall.jl:44 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'RuntimeError'>
RuntimeError('Overflow when unpacking long',)
  File "/home/jacob/hyperbolics/utils/load_dist.py", line 44, in load_emb_dm
    m = torch.load(file).to(device)
  File "/home/jacob/miniconda3/lib/python3.6/site-packages/torch/serialization.py", line 368, in load
    return _load(f, map_location, pickle_module)
  File "/home/jacob/miniconda3/lib/python3.6/site-packages/torch/serialization.py", line 533, in _load
    if magic_number != MAGIC_NUMBER:

In docker I get the error:

ERROR: LoadError: ArgumentError: Module LinearAlgebra not found in current path.
Run `Pkg.add("LinearAlgebra")` to install the LinearAlgebra package.
Stacktrace:
 [1] _require(::Symbol) at ./loading.jl:435
 [2] require(::Symbol) at ./loading.jl:405
 [3] include_from_node1(::String) at ./loading.jl:576
 [4] include(::String) at ./sysimg.jl:14
 [5] process_options(::Base.JLOptions) at ./client.jl:305
 [6] _start() at ./client.jl:371
while loading /root/hyperbolics/hMDS/hmds-simple.jl, in expression starting on line 4

Thanks for the help!

Combinatorial example error: delt_idx not defined

I'm trying the example proposed in the README, but I'm having some trouble with the comb.jl file.

While running the command:
julia combinatorial/comb.jl -d data/edges/phylo_tree.edges -m phylo_tree.r10.emb -e 1.0 -p 64 -r 10 -a -s

with a dimension parameter greater than 2, I get the following error:
ERROR: LoadError: UndefVarError: delt_idx not defined

which originates in the rdim.jl file, at line 249, in the function place_on_sphere().
It seems that in this function the variable is "forgotten" when the points_idx parameter is increased.
When the dimension is set to 2, the embedding is successfully created and all the measurements complete.
I'm running Julia 0.7.

Thank you in advance.

--use-svrg doesn't work

So I tried the --use-svrg option and it didn't work.

First it told me it couldn't import Hyperbolic_Parameter from hyperbolic_parameter, so I changed this to import HyperboloidParameter instead. That line then went fine until line 94 of the svrg.py file, where I get the following error:

File "/hyperbolics-master/pytorch/svrg.py", line 94, in step
    for i, (data, target) in enumerate(self.data_loader):
ValueError: too many values to unpack (expected 2)

I was wondering if there was a quick fix for this.

keyword incorrect

Traceback (most recent call last):
  File "pytorch/pytorch_hyperbolic.py", line 393, in <module>
    _parser.dispatch()
  File "/opt/conda/lib/python3.6/site-packages/argh/helpers.py", line 55, in dispatch
    return dispatch(self, *args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/argh/dispatching.py", line 174, in dispatch
    for line in lines:
  File "/opt/conda/lib/python3.6/site-packages/argh/dispatching.py", line 277, in _execute_command
    for line in result:
  File "/opt/conda/lib/python3.6/site-packages/argh/dispatching.py", line 260, in _call
    result = function(*positional, **keywords)
  File "pytorch/pytorch_hyperbolic.py", line 302, in learn
    m = cudaify( Hyperbolic_Emb(G.order(), rank, initialize=m_init, learn_scale=learn_scale, exponential_rescale=exponential_rescale) )
  File "/root/hyperbolics/pytorch/hyperbolic_models.py", line 84, in __init__
    self.w = Hyperbolic_Parameter(x)
  File "/root/hyperbolics/pytorch/hyperbolic_parameter.py", line 12, in __new__
    ret = super(nn.Parameter, cls).__new__(cls, data, requires_grad=requires_grad)
TypeError: __new__() received an invalid combination of arguments - got (Tensor, requires_grad=bool), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
    didn't match because some of the keywords were incorrect: requires_grad
  • (object data, torch.device device)
    didn't match because some of the keywords were incorrect: requires_grad
