Coder Social home page Coder Social logo

parmat's Introduction

PaRMAT

PaRMAT is a multi-threaded RMAT graph generator. Using PaRMAT, you can create very large directed or undirected RMAT graphs. PaRMAT is designed to exploit multiple threads and to avoid failing when there is not a lot of memory (RAM) available.

A little bit of background

Those who work on graph processing solutions clearly know that RMAT graphs usually imitate the structure of graphs that are extracted from real-world origins hence are of great importance. AFAIK, there are only two publicly avaiable RMAT graph generators out there: GTgraph and SNAP. GTGraph, which is especifically built for RMAT graph generation purpose, fails during creation of very very large graphs (half a biilion edges on a machine with 4 GB of RAM was the furthest it could go). In addition, it only creates directed graphs and cannot avoid existence of duplicate edges in the graph. How about SNAP? SNAP is on the other hand a really big graph analysis and mining library. To generate RMAT graphs in SNAP, you've got to open up the source code (snap/examples/graphgen/) and add the capability of RMAT graph generation (well, that's the way I know, there may be a nicer way). It only creates undirected graphs and again fails to create very large RMAT graphs (with explicitly specified number of edges and vertices).

PaRMAT

PaRMAT is a piece of software designed to create large RMAT graphs, even on a machine with limited amount of available memory. PaRMAT divides the adjacency matrix into squares and the workload between multiple threads. PaRMAT provides a number of options for the RMAT graph: being directed or non-directed, disallowing duplicate edges, sorting the output, varying RMAT graph parameters, etc. Directecd and non-directed graphs generated by PaRMAT show the same degree distribution as GTGraph's and SNAP's. PaRMAT does not fail when the specified graph is very large and creates the graphs faster than GTGraph and SNAP.

PaRMAT Requirement

To compile PaRMAT, you'll need to have a C++ compiler that supports C++11 features.

Making PaRMAT

I personally could create PaRMAT on Ubuntu 12.04 and Ubuntu 14.04 (g++ versions 4.8.1 and 4.8.2 respectively). Run make in the Release folder, and everything will be taken care of. I don't think it would be hard to make and run PaRMAT in a machine with a different operating system.

Running PaRMAT

PaRMAT needs two required command-line arguments: -nVertices and -nEdges which provide the number of vertices and number of edges in the specified graph respectively. For example, running:

./PaRMAT -nVertices 100000 -nEdges 1000000

creates a graph with a million edges and a hundred thousand vertices. Other accepted command-line arguments can be found by executing the program with no argument. I repeat them in below:

Usage: 	Required command line arguments:
	-Number of edges. E.g., -nEdges 1021
	-NUmber of vertices. E.g., -nVertices 51
Additional arguments:
	-Output file (default: out.txt). E.g., -output myout.txt
	-RMAT a parameter (default: 0.45). E.g., -a 0.42
	-RMAT b parameter (default: 0.22). E.g., -b 0.42
	-RMAT c parameter (default: 0.22). E.g., -c 0.42
	-Number of worker CPU threads (default: queried/1). E.g., -threads 4
	-Output should be sorted based on source index (default: not sorted). To sort: -sorted
	-Allow edge to self (default:yes). To disable: -noEdgeToSelf
	-Allow duplicate edges (default:yes). To disable: -noDuplicateEdges
	-Will the graph be directed (default:yes). To make it undirected: -undirected
	-Usage of available system memory (default: 0.5 which means up to half of available RAM may be requested). E.g., -memUsage 0.9

In addition to above arguments, there are a number of parameters internal to the program itself. They are accessible in internal_config.hpp. After applying modifications to this file, you obviously need to re-compile the program to see the effects.

Citing

@inproceedings{wsvr,
 author = {Khorasani, Farzad and Gupta, Rajiv and Bhuyan, Laxmi N.},
 title = {Scalable SIMD-Efficient Graph Processing on GPUs},
 booktitle = {Proceedings of the 24th International Conference on Parallel Architectures and Compilation Techniques},
 series = {PACT '15},
 pages = {39--50},
 year = {2015}
}

parmat's People

Contributors

farkhor avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

parmat's Issues

Failed to create rmat30

Hi,

I tried using PaRMAT to generate rmat30, but even after running for 48 hours it didn't write anything to the output file.

What could be wrong?

Thanks

One vertex too many created

Hi,

first of all thanks for this very useful R-MAT generator! I stumbled upon the following issue:

When creating a graph it seems that one more vertex than specified is created. Using the following command:

./PaRMAT -nEdges 1000 -nVertices 100 -sorted -a 0.25 -b 0.25 -c 0.25

(i.e. with equal edge distribution) will create edges containing vertices from 0 to 100 (both inclusive). Those are 101 vertices and not 100 as specified.

I hope this description is useful.

generate_edges bug

If i understand your program correctly you are first electing the first vertex of the edge and then selecting the second vertex of the edge. Say we have a 2x2 matrix with RMAT parameters as a,b,c,d. where a+b+c+d=1. Then probability of edge (1,1) occurring should be a according to the definition of RMAT graphs. But in your program it would be (a+b)(a+c) which is not the same as a. Have I misunderstood something?
I think you can't break down the probability of an event occurring into two independent events like this.

Reg PaRMAT usage in Windows/MAC

Hi,

I am sorry to trouble you. I tried installing it in windows. Its showing Dll error and it is not a valid win32 application. I tried this a few times but of no use. I also tried downloading Xcode in my other MAC system and tried running it. It shows build successfully but I do not know what to do after that as there are no steps to follow kind of thing for ParRMAT to use in MAC. It will be helpful for me if you can provide me the detail steps for installation of PaRMAT in MAC os. Please let me know if you require any other details of my system.

Thanks,
Narendra.

Outputting Matrix Market Format

Matrix Market is a standard format for various workloads.

  1. Should add a flag that outputs data in Matrix Market Format to easily load in workloads.
  2. Also Matrix market requires indices to start from zero instead of one. Would be great to have a feature that offsets all the indices by 1

-- Kaustubh Shivdikar

Compile Problem

I use make in Release folder, but it has the following errors:

g++ -O3 -Wall -c -fmessage-length=0 -pthread -std=c++0x -MMD -MP -MF"src/Edge.d" -MT"src/Edge.d" -o "src/Edge.o" "../src/Edge.cpp"
In file included from ../src/Edge.hpp:6:0,
from ../src/Edge.cpp:1:
../src/internal_config.hpp:9:7: error: expected nested-name-specifier before ‘EdgeIndexType’
../src/internal_config.hpp:9:7: error: ‘EdgeIndexType’ has not been declared
../src/internal_config.hpp:9:21: error: expected ‘;’ before ‘=’ token
../src/internal_config.hpp:9:21: error: expected unqualified-id before ‘=’ token
In file included from ../src/Edge.cpp:1:0:
../src/Edge.hpp:14:2: error: ‘EdgeIndexType’ does not name a type
../src/Edge.hpp:20:21: error: expected ‘)’ before ‘rec_src’
../src/Edge.cpp:4:11: error: expected constructor, destructor, or type conversion before ‘(’ token
../src/Edge.cpp: In copy constructor ‘Edge::Edge(const Edge&)’:
../src/Edge.cpp:9:2: error: ‘src’ was not declared in this scope
../src/Edge.cpp:9:16: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:10:2: error: ‘dst’ was not declared in this scope
../src/Edge.cpp:10:16: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp: In member function ‘Edge& Edge::operator=(const Edge&)’:
../src/Edge.cpp:14:2: error: ‘src’ was not declared in this scope
../src/Edge.cpp:14:16: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:15:2: error: ‘dst’ was not declared in this scope
../src/Edge.cpp:15:16: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp: In member function ‘bool Edge::selfEdge()’:
../src/Edge.cpp:20:11: error: ‘src’ was not declared in this scope
../src/Edge.cpp:20:18: error: ‘dst’ was not declared in this scope
../src/Edge.cpp: In function ‘bool operator<(const Edge&, const Edge&)’:
../src/Edge.cpp:24:10: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:24:20: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:26:15: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:26:25: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:28:15: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp:28:25: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp: In function ‘bool operator==(const Edge&, const Edge&)’:
../src/Edge.cpp:35:15: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp:35:26: error: ‘const class Edge’ has no member named ‘dst’
../src/Edge.cpp:35:37: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp:35:48: error: ‘const class Edge’ has no member named ‘src’
../src/Edge.cpp: In function ‘std::ostream& operator<<(std::ostream&, Edge&)’:
../src/Edge.cpp:39:15: error: ‘class Edge’ has no member named ‘src’
../src/Edge.cpp:39:36: error: ‘class Edge’ has no member named ‘dst’
../src/Edge.cpp: In function ‘bool operator==(const Edge&, const Edge&)’:
../src/Edge.cpp:36:1: warning: control reaches end of non-void function [-Wreturn-type]
../src/Edge.cpp: In function ‘bool operator<(const Edge&, const Edge&)’:
../src/Edge.cpp:32:1: warning: control reaches end of non-void function [-Wreturn-type]
../src/Edge.cpp: In member function ‘bool Edge::selfEdge()’:
../src/Edge.cpp:21:1: warning: control reaches end of non-void function [-Wreturn-type]
make: *** [src/Edge.o] Error 1

Thanks

How can I choose the power-law degree exponent?

To generate a power law graph, SNAP provide a parameter p, standing for the power-law degree exponent.
So how can I choose this? Use different a,b,c and d ? If like this, what formula between a,b,c,d and power-law degree exponent?

Generating 20 Billion Edges Data

Hi,

Have you tried using this tool to generate a graph with 20 billion edges? How long it will take? I am generating it using my single PC, it is still working after 36 hours.

Number of vertices doesn't always match input

Hi. Thanks for the app. Not sure if you're still working on this but is there a specific calculation done to get the right number of vertices back? For example, the following...

./PaRMAT -nVertices 32 -nEdges 128 -noDuplicateEdges -noEdgeToSelf

Returns 31 vertices instead of 32

[feature suggestion] Possible estimation of the runtime?

I hope you can add function like dynamic estimation of the time needed to complete.
I ran the

./PaRMAT -nVertices 500000000 -nEdges 25000000000 -output 3.txt  -threads 10  -sorted

for 7hrs, and I do wish to know how long should I expect.

Explaining the output data

Hi,

I was able to run the code. Could you please explain what the output RMAT format is?
I tried for nVertices = 10 and nEdges = 5,
output:
0 5
2 3
7 3
1 5
8 3

How to run PaRMAT in windows

Hi,

I am new to programming but i had downloaded all the files but do not know how to use them in windows to generate the RMAT graph. Please suggest me the how to run PaRMAT in windows

Meaning of "unirected"

Hi @farkhor ,

Is it correct that the -undirected flag has an effect only when -noDuplicateEdges was also used, in which case it prevents generating both x y and y x edges?

Thanks!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.