fgssjoin_2_files, from rafaelquirino

fgssjoin: Filtering GPU-based Set Similarity Join

Parallel Set Similarity Join algorithms for CUDA.

The data/ directory contains test data from the DBLP dataset.

To compile the project, run the compile.sh script. The CUDA environment must be installed, and the first variable in the Makefile inside the src/ directory (CUDA_INSTALL_PATH) must be properly configured.

To run the project, execute the bin/fgssjoin executable created by the compilation process, with the options -f (data file, with one record per line), -q (size of the q-grams; 3 is a good value) and -t (similarity threshold, between 0.0 and 1.0).
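To make the -q and -t options concrete, here is a minimal CPU sketch in Python of what a set similarity self-join over q-grams computes. It is an illustration under stated assumptions, not the project's code: it assumes Jaccard similarity (this README does not name the similarity function), it applies no padding or normalization to records, and the function names qgrams, jaccard and naive_self_join are hypothetical. It compares every pair of records, so it shows only the expected kind of result, not the filtering or GPU parallelism that fgssjoin provides.

```python
def qgrams(record: str, q: int) -> set[str]:
    """Return the set of overlapping q-character substrings of a record."""
    if len(record) < q:
        # Assumption: a record shorter than q is kept as a single token.
        return {record} if record else set()
    return {record[i:i + q] for i in range(len(record) - q + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b| (assumed similarity function)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def naive_self_join(records: list[str], q: int, threshold: float):
    """Return (i, j, similarity) for every record pair at or above threshold.

    Quadratic baseline: every pair is compared, with no filtering.
    """
    token_sets = [qgrams(r, q) for r in records]
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            sim = jaccard(token_sets[i], token_sets[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs
```

With q = 3, the record "abcd" becomes the token set {"abc", "bcd"}. Raising the threshold toward 1.0 keeps only near-identical records, while lowering it admits looser matches and produces more pairs.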

Examples

Compilation

Standard compilation:

user@host:~$ ./compile.sh

Compile/recompile the whole project:

user@host:~$ ./compile.sh all

Compile/recompile specific files:

user@host:~$ ./compile.sh file1 file2 etc...

Clean executable and object files:

user@host:~$ ./compile.sh clean

Execution

Execution example, printing the result to STDOUT:

user@host:~$ bin/fgssjoin -f data/dblp_t_18k.txt -q 3 -t 0.9

Execution example, printing the result to an output file:

user@host:~$ bin/fgssjoin -f data/dblp_t_18k.txt -q 3 -t 0.9 > output

Reference:

Quirino R., Junior S., Ribeiro L. and Martins W. (2017). fgssjoin: A GPU-based Algorithm for Set Similarity Joins. In Proceedings of the 19th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-247-9, pages 152-161. DOI: 10.5220/0006339001520161
