
cirrus-kv's Introduction

Cirrus


Cirrus is a remote data access system for interacting with disaggregated memory from uInstances in a performant fashion.

Requirements

This library has been tested on Ubuntu >= 14.04 as well as MacOS 10.12.5. Additional Mac requirements and instructions are listed below.

It has been tested with the following environment / dependencies:

  • Ubuntu 14.04
  • g++ 5.4
  • Boost
  • autotools
  • Mellanox OFED 3.4 (optional)
  • cmake
  • cpplint
  • snappy
  • bzip2
  • zlib

You can install these with

$ sudo apt-get update && sudo apt-get install build-essential autoconf libtool g++-6 libboost-all-dev cmake libsnappy-dev zlib1g-dev libbz2-dev && sudo pip install cpplint

MacOS Requirements

Building on MacOS requires the installation of gettext, boost, and wget. Please ensure that automake, autoconf, the Xcode command line tools, and gcc/g++ are installed. gcc/g++ can be installed using MacPorts, and the port select command allows you to set the new version of gcc/g++ as the one you want to use. The remaining programs can be installed using Homebrew. cpplint can be installed via pip.

gettext can be installed as follows using homebrew:

$ brew install gettext
$ brew link --force gettext

To install wget do:

$ brew install wget

To install gcc/g++ do:

$ port install gcc5
$ port select --list gcc

To install boost do:

$ sudo port install boost

To install cpplint do:

$ pip install cpplint

Building

$ ./bootstrap.sh
$ make

Running Tests

To run tests, run the following command from the top level of the project:

$ make check

To create additional tests, add them to the TESTS variable in the top level Makefile.am. Tests are currently located in the tests directory. To change the IP address (necessary for RDMA), change the address in tests/test_runner.py.

Benchmarks

To run benchmarks, execute the following command from the top of the project directory:

$ make benchmark

This will leave log files for each benchmark run in the top directory. To add additional benchmarks, modify the script run_benchmarks.py, located in the benchmarks directory. The benchmarks are currently set to run locally, but may be set to run against a remote server by manually changing the IP address in the benchmark files. In that case, the benchmarks must be launched manually from the command line after starting the server remotely, and the log files will be left in the benchmarks directory.

Benchmark Results (outdated)

  • Single node burst of 128 byte put (synchronous) - latencies
    msg/s: 166427
    min: 5
    avg: 5.34385
    max: 93
    sd: 0.993202
    99%: 6
  • Single node burst of 128 byte put (async) - latencies
    min: 50 us
    avg: 261.7 us
    max: 460 us
    sd: 118.149 us
    99%: 459 us
  • Single node contention 10 source clients 128 byte put (sync)
    Average (us) per msg: 16
    MSG/s: 61715.9
    Average (us) per msg: 9 # with 6 clients
    MSG/s: 105298
  • Single node contention 6 source clients 128 byte put (async)
    Average (us) per msg: 11
    MSG/s: 87242.9

Static analysis with Coverity

Download the Coverity scripts (e.g., cov-analysis-linux64-8.5.0.5.tar.gz) and extract them:

$ tar -xzf cov-analysis-linux64-8.5.0.5.tar.gz

Make sure all configure.ac files are set up to use C++11, then build under cov-build and package the results:

$ cov-build --dir cov-int make -j 10
$ tar czvf cirrus_cov.tgz cov-int

Finally, upload the resulting file to the Coverity website.

cirrus-kv's People

Contributors

devloop0, jcarreira, tyleradavis


cirrus-kv's Issues

Add benchmark suite

We should be able to do

make benchmark

and get some numbers about latency and throughput.

The user probably has to specify the name/ip of another server where we can run the Cirrus server.

Look into the benchmarks folder to see an example of a benchmark.

Make -j may not work

This is due to a dependency on a local .a library that is not specified in the Makefile.

For example, in src/server:

g++ -Wall -Wextra -ansi -fPIC -std=c++1z -pthread  -o bladepoolmain bladepoolmain-bladepoolmain.o -L. -lserver -lrdmacm -libverbs -L../authentication/ -lauthentication -L../utils/ -lutils -L../common/ -lcommon 
/usr/bin/ld: cannot find -lserver
collect2: error: ld returned 1 exit status
make: *** [bladepoolmain] Error 1
make: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lserver
collect2: error: ld returned 1 exit status
make: *** [allocmain] Error 1

Error Message When TCPClient is destructed

When the client destructor is called as the client goes out of scope, the following message is shown:
terminate called without an active exception

This may be due to not detaching or joining the two loose threads in the destructor.
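
If the client owns the threads as std::thread members, a minimal sketch of the likely fix (assuming hypothetical member names receiver_thread_ and sender_thread_; the real ones may differ) is to join or detach them before they are destroyed:

#include <thread>

class TCPClient {
 public:
    ~TCPClient() {
        // A std::thread that is still joinable when its destructor
        // runs calls std::terminate(), producing exactly the message
        // above. Joining (or detaching) the threads first avoids it.
        if (receiver_thread_.joinable())
            receiver_thread_.join();
        if (sender_thread_.joinable())
            sender_thread_.join();
    }

 private:
    std::thread receiver_thread_;  // hypothetical member names;
    std::thread sender_thread_;    // the real ones may differ
};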

Make Ethernet+RDMA work seamlessly

We should be able to easily change between RDMA or Ethernet.

Additionally, we should be able to disable all the RDMA dependencies when compiling in an Ethernet-only environment (e.g., EC2).

Add better documentation

We should have automatic generation of code documentation.

Doxygen seems like a good tool to do this.

Segfault in Throughput.cpp

The Throughput.cpp benchmark crashes due to a segfault when attempting the 10 MB put test.

The offending line is

test_throughput<10 * 1024 * 1024>(num_runs / 100);

GDB output: (screenshot not reproduced here)
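
One plausible cause, though this is an assumption and not confirmed from the issue: the benchmark may allocate its test buffer on the stack, which overflows for a 10 MB object. A minimal sketch of a heap-based alternative:

#include <cstdint>
#include <cstring>
#include <memory>

template <uint64_t SIZE>
void test_throughput(uint64_t num_runs) {
    // A SIZE-byte stack array overflows the default ~8 MB stack when
    // SIZE is 10 MB; a heap allocation sidesteps the problem.
    auto data = std::make_unique<char[]>(SIZE);
    std::memset(data.get(), 42, SIZE);
    // ... perform num_runs puts of data.get() against the store ...
    (void)num_runs;
}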

Data Replication

Should we have a mechanism to replicate data to multiple servers?

If so, should it be strongly consistent, or would eventual consistency be fine? Or both?

It seems to me that eventual consistency would be enough for a large class of applications, namely machine learning.

Throughput at 128 byte level in benchmarks low

Currently, we are only seeing throughput of about 20 MB/s on 128 byte puts (before the introduction of the new interface). We should be seeing speeds of about 1 GB/s.

Current speeds: (MB/s, messages/s)
128 bytes: 20.7 MB/s, 162072
4K bytes: 556.371 MB/s, 135833
50K bytes: 2445.7 MB/s, 47767.9
1M bytes: 4442e MB/s, 4236.22
10M bytes: 4369.74 MB/s, 416.731

Make test of Cirrus in any cluster

Right now some of our tests make some assumptions specific to our development environment (e.g., IPs of servers).

We should allow make test to run anywhere.

Concurrency bug when doing async RDMA reads

Successive async reads can end up writing to the same memory address. This can lead to buggy reads.

The same bug also exists with async writes.

This bug can be replicated by running the iterator test in the iterator branch.
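
A sketch of one possible fix, under the assumption that the client currently reuses a single buffer for all in-flight operations: give each outstanding async operation its own buffer, released only when that operation completes.

#include <cstdint>
#include <memory>

// Hypothetical per-operation context: each outstanding async read
// gets a private destination buffer instead of sharing one
// client-wide buffer, so overlapping completions cannot clobber
// each other. (For RDMA the buffer would also need to be registered
// with the device; that step is omitted here.)
struct AsyncOpContext {
    std::unique_ptr<char[]> buffer;
    uint64_t size;

    explicit AsyncOpContext(uint64_t s)
        : buffer(std::make_unique<char[]>(s)), size(s) {}
};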

Make TCPServer and BladeAllocServer consistent

Right now TCPServer has the attribute max_objects, which is inconsistent with the interface of the RDMAServer (which uses pool_size).

Both should use a raw count of bytes: pool_size.

In the same way, tcpservermain should allow setting the size of the server pool as a command line argument (but have default size).

Things to do:

  1. Make TCPServer use pool_size (the number of bytes in the pool) instead of the number of objects, and make it a constructor argument as in the BladeAllocServer (see the sketch after this list).
  2. Make tcpservermain and bladeallocmain accept the size of the pool as a command line argument (both should have a default size of 10GB).
  3. Fix the memory exhaustion test to use small values for the server pools so that the servers throw an exception earlier; tests should be quick.
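
A minimal sketch of what item 1 could look like. The signature is an assumption modeled on the description above, not the actual BladeAllocServer interface:

#include <cstdint>

class TCPServer {
 public:
    // pool_size is a raw count of bytes, matching the RDMAServer
    // interface; the 10GB default comes from item 2 above.
    explicit TCPServer(int port,
                       uint64_t pool_size = 10ULL * 1024 * 1024 * 1024);
    // ...
};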

Fix ip/port hardcoded values

We should think of getting rid of the IP/port hardcoded values scattered throughout the tests.

A benefit of the hardcoded values is that they simplify testing because we only need to call the binary to run the test -- no need to create a custom launch script.

We will need to create a python script to launch these tests.

The IP/port values can come from a few places:

  1. ./configure --test_IP=127.0.0.1 --port=18723
  2. make IP=127.0.0.1 PORT=18723 test

@TylerADavis What do you think?

Fix test_mt.cpp test

The test is designed to ensure the system works with multiple threads on one client. However, a concurrency issue exists: all threads read and write oid 1, overwriting one another. Additionally, a memory leak exists, as d2 is never freed. As it stands now, the test will likely never pass.
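
A hedged sketch of the fix, with the store calls only indicated in comments since the actual Cirrus API may differ: give each thread a distinct object ID, and hold received data in an owning handle so it is freed automatically.

#include <cstdint>
#include <thread>
#include <vector>

void worker(int thread_id) {
    // Each thread uses its own oid instead of the shared oid 1, so
    // threads no longer overwrite each other's objects.
    const uint64_t oid = 1 + static_cast<uint64_t>(thread_id);
    // Holding the result in an owning handle (smart pointer or stack
    // object) instead of a raw new'd d2 fixes the leak.
    (void)oid;  // store.put(oid, ...); auto d2 = store.get(oid);
}

int main() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back(worker, i);
    for (auto& t : threads)
        t.join();
    return 0;
}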

Benchmark 'Throughput.cpp' Hangs at 10MB Test

After installing Cirrus via the on-site instructions on two servers, I modified throughput.cpp on both servers to point to the IP address of the server running tcpservermain. Then I ran the following commands:

On one server:
nohup ./tcpservermain & disown

On the other:
nohup ./throughput & disown

throughput.cpp began running tests, outputting the corresponding information:

throughput_128.log:

throughput 128 test
msg/s: 13.0619
bytes/s: 1671.92

throughput_4096.log:

throughput 4096 test
msg/s: 13.0543
bytes/s: 53470.2

throughput_51200.log:

throughput 51200 test
msg/s: 24.9987
bytes/s: 1.27993e+06

throughput_1048576.log:

throughput 1048576 test
msg/s: 24.9772
bytes/s: 2.61905e+07

The benchmark hung after this point. nohup.out reads:

Warming up
Warm up done
size is 128
Measuring msgs/s..
Warming up
Warm up done
size is 4096
Measuring msgs/s..
Warming up
Warm up done
size is 51200
Measuring msgs/s..
Warming up
Warm up done
size is 1048576
Measuring msgs/s..

Environment:
Ubuntu 14.04 LTS (Amazon EC2 m4.large)
g++ 6.3

I also ran automake --add-missing in order to make ./bootstrap.sh work without error.

Implement CacheManager eviction policy

Ideally, we would like to give developers the ability to provide their own eviction policy.

This might not be the way to go (at least for now) because:

  1. it is hard to abstract away the internals of the cache from the eviction policy
  2. an eviction policy that is not tightly integrated with the cache becomes inefficient

References:
Redis eviction policies: https://redis.io/topics/lru-cache
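
For concreteness, a minimal sketch of what a pluggable policy interface might look like; the names are hypothetical and this is not an existing Cirrus API. It also illustrates why point 1 is hard: a fair amount of cache-internal state has to cross the boundary.

#include <cstdint>
#include <vector>

// Hypothetical interface: the cache notifies the policy on every
// access and asks it which ObjectIDs to evict when space is needed.
// Access recency, object sizes, and similar internals all have to be
// exposed to the policy -- the abstraction difficulty of point 1.
class EvictionPolicy {
 public:
    virtual ~EvictionPolicy() = default;
    virtual void onAccess(uint64_t oid) = 0;
    virtual std::vector<uint64_t> evict(uint64_t bytes_needed) = 0;
};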

Statically link against cityhash/libcuckoo or remove dependency

Currently we have a dependency on cityhash/libcuckoo. This means we need to set LD_LIBRARY_PATH to run any binary that uses Cirrus. For instance:

[joao@havoc:/data/joao/ddc/tests/object_store]% LD_LIBRARY_PATH=/data/joao/ddc/third_party/libcuckoo/cityhash-1.1.1/src/.libs ./test_fullblade_store

We should remove this dependency or find a way to statically link this library.

Add async operations

The current interface does not support asynchronous gets/puts, so the corresponding code was commented out in some tests and removed from test_fullblade_store. These tests should be added back once the interface supports asynchronous operations.

Add back the asynchronous operations, as well as true prefetching.
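
A sketch of what the asynchronous interface could look like; this is an assumption, with std::future as one possible return type, not necessarily the design Cirrus will adopt:

#include <cstdint>
#include <future>

// Hypothetical async API: put_async/get_async return immediately,
// and the caller blocks on the future only when the result is
// actually needed.
template <typename T>
class AsyncStore {
 public:
    std::future<bool> put_async(uint64_t oid, const T& obj);
    std::future<T> get_async(uint64_t oid);
};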

Look into coveralls.io

Coverage statistics can be useful in driving the development of tests.

We may want to use gcov to collect these stats and publish them to coveralls.io.

Bandwidth Benchmark sometimes stalls on >=1MB puts

The test benchmarks/throughput.cpp runs well on object sizes up to 50 kilobytes, but occasionally stalls on larger objects. This is present in the bandwidth_benchmark branch. Because logging must be disabled to get accurate speeds, the cause of the stalls is not readily apparent. The benchmark had been run without resetting the server in between; could this cause issues? Errors about "pthread_setaffinity_np error 22" were also thrown on occasion, and only in later revisions of the test.

Current speeds: (MB/s, messages/s) (at time of issue creation)
128 bytes: 20.7 MB/s, 162072
4K bytes: 556.371 MB/s, 135833
50K bytes: 2445.7 MB/s, 47767.9
1M bytes: 4442e MB/s, 4236.22
10M bytes: 4369.74 MB/s, 416.731
100M bytes: stalled entirely

Edit: I ran the benchmark once more after resetting the remote server, and all tests ran, albeit after a long delay. Strangely, despite the tests taking so long, the reported transfer speeds are still rather high. This almost makes me think that the stall is happening outside of the timed section.

100M bytes: msg/s: 42.8607 bytes/s: 4494.27MB/s

~4.5 gigabytes/s is the highest I've seen any benchmark run

Behavior for when connecting to already populated store

At the moment, all state about the store is kept locally. If a client connects to a remote store that already contains objects, it will have no knowledge of the ObjectIDs in the store or the mem_addr / peer_rkey and other information associated with them.

Should we implement some way for the client to get state from the server, and if so, how?

Coverity results

Coverity issues a sizeable set of alarms on the existing code. We should check the report.

Cirrus/Disaggregation for GPUs

Opening the discussion on GPU disaggregation.

Two things come to mind:

  1. Attaching GPUs to uInstances

This allows us to pay for a cheaper instance. However, GPUs are so much more expensive than any instance that the savings here are likely to be negligible.

  2. GPU as a Service model

GPUs are expensive and are exclusively allocated to a single user. However, they are likely not to be fully utilized at all times. This means they could be shared among concurrent users.

We could build a service that provides high levels of GPU virtualization by keeping the dataset remote. Isolation between concurrent tasks could be enforced in software (this has been shown to work, e.g., in Singularity, but I am not sure about this adversarial context).

Change license to Apache 2

We should make Cirrus compatible with the Apache 2 license.

This entails removing the copyright messages from the source files.
