PIM-tree

PIM-tree is a theoretically and practically efficient comparison-based ordered index for Processing-In-Memory (PIM). It achieves high performance against data skew by providing a sweet spot for both data locality and load-balance of PIM modules. This repository contains the implementation of PIM-tree index used in our paper.

If you use PIM-tree, please cite our paper:

[1] PIM-tree: A Skew-resistant Index for Processing-in-Memory. Hongbo Kang, Yiwei Zhao, Guy E. Blelloch, Laxman Dhulipala, Yan Gu, Charles McGuffey, and Phillip B. Gibbons. In Proceedings of the VLDB Endowment (PVLDB), 16(4): 946-958, December 2022. [doi][arXiv].

[2] The Processing-in-Memory Model. Hongbo Kang, Phillip B Gibbons, Guy E Blelloch, Laxman Dhulipala, Yan Gu, Charles McGuffey. 2021. In Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures. 295–306. [doi].

Related Repositories:

The codes for range-partitioning PIM-based indexes serving as PIM-tree's competitors can be found in this repo at branch range_partition.
The codes for baseline "Jump-Push Search" and "Push-Pull Search" used in the study of the impact of different optimizations can be found in this repo at branch jumppush_pushpull.
Implementation of shared-memory competitors is referred to SetBench.

Requirements

This implementation was created to run the experiments in the paper. Current implementation of PIM-tree can only run on UPMEM machines. This codeset is built on UPMEM SDK.

Building

To build everything, enter the root of the cloned repository. You will need to change NR_DPUS in Makefile to the number of DPU modules on your machine before you start building. Then run your desired command listed below.

Command	Description
make	Compiles and links
make test	Compiles and start a test case
make energy	Compiles a version used for energy data collection
make debug	Compiles a version for debugging
make clean	Cleans the previous compiled files

The build produces files in build directory, which can be classfied into two types:

Host program(pim_tree_host): The host application used to drive the system.
DPU program(pim_tree_dpu): Application run on the DPUs.

DPU applications ending with query types (insert, delete, scan, predecessor, get_update, build, init) are produced due to limited space of DPUs' instruction memories. A single DPU application handling all the query requests are too large to fit in the IRAM, and thus has to split.

Host and DPU applications ending with no_shadow_subtree serve as a competitor of Jump-Push Baseline introduced in the paper.

Host and DPU applications ending with energy are used to collect energy consumption data.

Note: Our codes uses PAPI as performance collectors. Please install this codebase before you run.

Running

The basic structure for a running command is:

./build/pim_tree_host [--arguments]

Please refer to the following list to set the arguments:

Argument	Abbreviation	Used Scenario*	Usage Description
`--file`	`-f`	F	Use `--file [init_file] [test_file]` to include the path of dataset files
`--length`	`-l`	TG	Use `-l [init_length] [test_length]` to set the number of initializing and testing queries
`--init_batch_size`	NA	TG	Use `--init_batch_size [init batch size]` to set the batch size of initializing queries
`--test_batch_size`	NA	TG	Use `--test_batch_size [test batch size]` to set the batch size of testing queries
`--output_batch_size`	NA	G	Use `--output_batch_size [batch size for output file]` to set the batch size of the output file
`--get`	`-g`	TG	Use `-g [get_ratio]` to set the ratio of Get queries
`--predecessor`	`-p`	TG	Use `-p [predecessor_ratio]` to set the ratio of Predecessor queries
`--insert`	`-i`	TG	Use `-i [insert_ratio]` to set the ratio of Insert queries
`--remove`	`-r`	TG	Use `-r [remove_ratio]` to set the ratio of Delete queries
`--scan`	`-s`	TG	Use `-s [scan_ratio]` to set the ratio of Scan queries
`--alpha`	NA	TG	Use `--alpha [?x]` to set the skew of testing queries
`--nocheck`	`-c`	A	Stop checking the correctness of the tested data structure
`--noprint`	`-t`	A	Do not print timer name when timing
`--nodetail`	`-d`	A	Do not print detail
`--top_level_threads`	NA	FT	Use `--top_level_threads [#threads]` to set the number of threads for CPU side pipelining
`--push_pull_limit_dynamic`	NA	FT	Use `--push_pull_limit_dynamic [limit]` to set the push-pull limit
`--output`	`-o`	G	Use `--output [init_file] [test_file]` to set paths of output datasets
`--generate_all_test_cases`	NA	G	Generate the output dataset with function `generate_all_test()`. Ignore other arguments.
`--init_state`	NA	FG	Enables sequential writing of initializing dataset

Please refer to the following list for used scenarios*:

Used Scenario Abbreviation	Scenario Description
F	Testing with existing dataset files
T	Directly testing with self-set Zipfian workloads
G	Generating new dataset files with Zipfian workloads
A	Any time

For detailed information about workload generation and test execution, check function generate_all_test and exec in ./pim_base/include/host/driver.hpp for more detail.

Code Structure

This section will introduce the contents in each code file and how the codes are organized.

`/common`

This directory contains the configurations of the hardware used and the data structure.

common.h: Numeric configurations of the data structure.

settings.h: Configuration for program modes. (whether to turn on (1) debugging (2) tracking of various metrics)

task_base.h: Macros defining the interface of DPU function calls (tasks sent between CPU and DPUs).

`/dpu`

This directory contains source codes used to build the DPU applications.

bnode.h: Functions related to "distributed chunked skip list nodes" in layer 2 and 1.

cache.h: Definitions and functions related to "cache_init_record", a request generated when building shadow subtrees for new layer 2 nodes.

data_block.h: Definitions and functions related to "data_block", a variable length vector on MRAM, used as the storage unit for Bnodes.

dpu_buffer.h: Auxiliary function for procession of variable length data on DPUs.

dpu.c: Main function for the DPU applications. Handling function calls (tasks) from the CPU host application.

gc.h: DPU garbage collection.

hashtable_l3size.h: Definition and implementation of the local linear-probing hash table on each DPU, used in GET, INSERT, and DELETE operations.

l3_ab_tree.h: Implementation of replicated upper part (L3 in our paper) with ab-tree.

l3_skip_list.h: Implementation of replicated upper part (L3 in our paper) with skip list. (abandoned)

node_dpu.h: Defining node classes used in Layer 3, 2, 1, and data nodes.

pnode.h: Implementation of data nodes.

statistics.h: Stats collection.

storage.h: Maintaining consistency of WRAM heap.

`/host`

This directory contains source codes used to build the CPU host application.

compile.hpp: Handling the potential cases where DPU programs overflow the IRAM of the DPUs.

host.cpp: Main function for the CPU host application.

operation.hpp: Functions for database queries on the CPU host side.

papi_counters.hpp: Experiment stats collection.

`/pim_base/include`

This submodule contains efficient low-level function calls we implemented, including efficient CPU-DPU communication, pipelining, performance stats collection, database correctness testing, argument parsing, random distribution generators and other miscs.

We wrap this up into high-level function calls, so that readers could get a clarified overview of our data structure design without getting too involved in the details of UPMEM programming.

ljqcodelove / pim-tree Goto Github PK

pim-tree's Introduction

PIM-tree

Related Repositories:

Requirements

Building

Running

Code Structure

`/common`

`/dpu`

`/host`

`/pim_base/include`

pim-tree's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent