eth-asl's Issues
Logging: Be careful of logging in hotloops
Critical sections shouldn't contain expensive logging lines.
Either only print every x-thousand iterations or save the data to main memory and log it afterwards.
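A minimal sketch of the second option (keep data in memory, write it out afterwards), combined with sampling every N-th iteration. The class name `SampledRecorder` and its methods are hypothetical and not part of the existing code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: keeps every N-th measurement in memory, formats it later. */
final class SampledRecorder {
    private final int sampleEvery;
    private final List<Long> samples = new ArrayList<>();
    private long count = 0;

    SampledRecorder(int sampleEvery) {
        if (sampleEvery < 1) {
            throw new IllegalArgumentException("sampleEvery must be >= 1");
        }
        this.sampleEvery = sampleEvery;
    }

    /** Called inside the hot loop: no I/O, only a counter and an occasional append. */
    void record(long value) {
        count++;
        if (count % sampleEvery == 0) {
            samples.add(value);
        }
    }

    /** Called after the loop or at shutdown: the expensive formatting happens here. */
    String dump() {
        StringBuilder sb = new StringBuilder();
        for (long s : samples) {
            sb.append(s).append('\n');
        }
        return sb.toString();
    }

    int sampleCount() {
        return samples.size();
    }
}
```

The instance is not thread-safe, so the assumption is one recorder per worker thread, with the result of `dump()` handed to the logger once the critical section is over.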
Handle address already in use
Add support for memcached evictions and empty messages
The middleware has to support the fact that previously stored items can be evicted from the
memcached servers over time, in which case the answer to a GET is an "empty" message (see
protocol description 7). This behavior can happen both with single-key and multi-key GETs.
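In the memcached text protocol, a GET that finds nothing is answered with just `END\r\n`, and a multi-key GET simply omits the missing keys. A rough sketch of detecting the fully empty case; `GetResponses` and `isEmpty` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for recognizing an "empty" GET answer. */
final class GetResponses {
    private static final byte[] END = "END\r\n".getBytes(StandardCharsets.US_ASCII);

    private GetResponses() {}

    /** True if the first {@code length} bytes of the response are exactly "END\r\n". */
    static boolean isEmpty(byte[] response, int length) {
        return length == END.length
                && Arrays.equals(Arrays.copyOf(response, length), END);
    }
}
```

For a multi-key GET with partial misses, the number of misses would have to be derived from the number of `VALUE` headers actually returned, which this sketch does not cover.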
Measure factory parsing time
Invalid requests: Read until next newline
Implement key and value length constraints
Support arbitrary key and value sizes smaller than or equal to 1024 B.
Section 7.1: Build queuing model M/M/1
Run experiment Section 4: Throughput for Writes - Full System
NOT_STORED answer from memcached
The memcached servers can also send a NOT_STORED answer. We should treat this as an error. (Check the protocol specifications for details).
Setup worker threads
Each worker thread has a dedicated TCP connection to every memcached server. These
connections are set up when the middleware is started and they are only closed when the
middleware shuts down.
Requests are sent by the worker threads to the servers according to the specifications
mentioned above. Worker threads wait until an operation has been completed and send
the answer from the server to the client before handling the next request.
Section 7.2: Build queuing model M/M/m
Make sure write() always writes everything
This was a hint from the TAs. Make sure we always write all the bytes from our buffer. Do it in a loop. Don't rely only on a single write.
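A sketch of such a loop, written against the standard `WritableByteChannel` interface (which `SocketChannel` implements). `ChannelWriter` and `writeFully` are hypothetical names.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Hypothetical helper: drains a buffer into a channel with repeated write() calls. */
final class ChannelWriter {
    private ChannelWriter() {}

    /** Calls write() until the buffer has no bytes left; returns the total bytes written. */
    static int writeFully(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
        int total = 0;
        while (buffer.hasRemaining()) {
            // A single write() may transfer only part of the remaining bytes.
            total += channel.write(buffer);
        }
        return total;
    }
}
```

On a non-blocking channel `write()` can return 0, in which case this loop spins; there it may be preferable to register for `OP_WRITE` and continue once the channel is writable again.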
Run experiment Section 2.2: Baseline without Middleware - Two Servers
Avoid copying requests around
Try to work with the same ByteBuffer as long as possible.
Try out Azure platform and develop automated test scripts
Implement instrumentation / statistics
see project description document from GDrive
Log metrics for experiments
The middleware code needs to be instrumented and report several metrics. Note that due to the
multi-threaded architecture of the middleware, statistics should be collected per thread
and merged together when the middleware is stopped.
The middleware has to collect aggregate statistics per window of at most five seconds. These
statistics must include:
- Average throughput
- Average queue length
- Average waiting time in the queue
- Average service time of the memcached servers
- Number of GET, SET, and multi-GET operations
At the end of the experiment, the middleware has to output, in addition to the previously
described statistics, the following:
- Histogram of the response times in tenth-of-a-millisecond (100 µs) steps
- Cache miss ratio (i.e., "empty" responses returned by the memcached servers)
- Any error message or exception that occurred during the experiment
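One possible shape for the per-thread histogram with 100 µs buckets, merged at shutdown. This is only a sketch: `ResponseTimeHistogram`, the nanosecond input, and the overflow handling (everything beyond the last bucket is counted in the last bucket) are assumptions.

```java
/** Hypothetical per-thread histogram with 100 µs buckets; merged when the middleware stops. */
final class ResponseTimeHistogram {
    private static final long BUCKET_NANOS = 100_000L; // 100 µs expressed in nanoseconds
    private final long[] buckets;

    ResponseTimeHistogram(int bucketCount) {
        if (bucketCount < 1) {
            throw new IllegalArgumentException("bucketCount must be >= 1");
        }
        this.buckets = new long[bucketCount];
    }

    /** Records one response time; values past the last bucket are clamped into it. */
    void record(long responseTimeNanos) {
        long idx = Math.max(0L, responseTimeNanos / BUCKET_NANOS);
        buckets[(int) Math.min(idx, buckets.length - 1)]++;
    }

    /** Adds the counts of another thread's histogram to this one. */
    void mergeFrom(ResponseTimeHistogram other) {
        if (other.buckets.length != buckets.length) {
            throw new IllegalArgumentException("bucket counts differ");
        }
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] += other.buckets[i];
        }
    }

    long count(int bucket) {
        return buckets[bucket];
    }
}
```

Each worker would own one instance and only touch it from its own thread, so no synchronization is needed until the final merge.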
Implement SET-operation for worker threads
The middleware forwards the requests to all servers in order
to replicate the data, and waits for a response from all of them. For successfully replicating
data, your middleware needs to perform the following:
- Recognize a SET operation encoded possibly in multiple network packets.
- The middleware has to forward the complete SET request to all memcached servers.
It sends it to all memcached servers before checking for an answer.
- Wait until all memcached servers report successful execution, and send a single success
message to the client (in the same format as memcached would).
- In case of an error on one or multiple servers, relay one of the error messages.
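The last two points could be sketched as follows, assuming the responses of all servers have already been read completely as strings. `SetResponseAggregator` is a hypothetical name, and relaying the first non-`STORED` answer is just one valid choice of "one of the error messages".

```java
import java.util.List;

/** Hypothetical helper: collapses the answers of all servers into one client answer. */
final class SetResponseAggregator {
    static final String STORED = "STORED\r\n";

    private SetResponseAggregator() {}

    /** Returns STORED if every server stored the item, otherwise the first other answer. */
    static String aggregate(List<String> serverResponses) {
        if (serverResponses.isEmpty()) {
            throw new IllegalArgumentException("no server responses");
        }
        for (String response : serverResponses) {
            if (!STORED.equals(response)) {
                return response; // e.g. NOT_STORED, ERROR, SERVER_ERROR ...
            }
        }
        return STORED;
    }
}
```

With this rule a `NOT_STORED` answer from any server is also relayed to the client and can be counted as an error.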
Write up Section 5
Write up Section 3
Implement net-thread
Opens a TCP socket, takes requests, and puts them into the queue.
The middleware accepts connections and requests from clients on a TCP port of your
choice, specified when you launch the middleware. There is a single thread (net-thread)
that listens for incoming requests on the specified port and enqueues them into the request
queue.
Write up Section 2
Implement Startup & graceful Shutdown
The middleware has to be parameterizable at startup using the following parameters:
L: Address to listen on.
P: Port to listen on.
T: Number of memcached servers.
M: IP addresses of the memcached servers.
T: Number of threads in the thread pool.
S: If multi-GETs are sharded.
At startup time, the middleware uses command line arguments to find the set of servers
(at most 3) to connect to, and how many worker threads it should have. The net-thread
is always present.
There should be either a command line interface that allows stopping the middleware, or a
hook to catch a "kill" from Linux.
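For the "kill" variant, the JVM runs registered shutdown hooks on SIGTERM (the default signal of `kill`) and on Ctrl-C, but not on `kill -9`. A minimal sketch, with `ShutdownSupport` and the thread name as hypothetical names:

```java
/** Hypothetical helper: registers cleanup code to run when the JVM is asked to exit. */
final class ShutdownSupport {
    private ShutdownSupport() {}

    /** Installs the hook and returns its thread so that it can be removed again. */
    static Thread installHook(Runnable onShutdown) {
        Thread hook = new Thread(onShutdown, "middleware-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

The `Runnable` would typically stop the net-thread, interrupt the workers, merge the per-thread statistics, and flush the logs.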
Implement sharded Multi-GET
MULTI-GET: In the case of GET operations with multiple keys, the middleware has to
support two modes:
- In the second mode (sharded read), the middleware splits the multi-GET into a set of
smaller multi-GET requests (not single GET requests), one request for each server.
The middleware should send out all requests before attempting to read answers
from the servers. The middleware is responsible for assembling the responses and
reordering values, if necessary, from all servers before sending the complete response
of the initial multi-GET back to the client.
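A sketch of the splitting step. It assumes the keys are divided into contiguous chunks (sizes differ by at most one), so that concatenating the server answers in server order preserves the original key order. `MultiGetSharder` is a hypothetical name, and the chunking rule is an assumption, not something the project description prescribes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits the keys of a multi-GET into one smaller multi-GET per server. */
final class MultiGetSharder {
    private MultiGetSharder() {}

    /** Splits keys into serverCount contiguous chunks whose sizes differ by at most one. */
    static List<List<String>> shard(List<String> keys, int serverCount) {
        List<List<String>> shards = new ArrayList<>();
        int base = keys.size() / serverCount;
        int extra = keys.size() % serverCount;
        int from = 0;
        for (int s = 0; s < serverCount; s++) {
            int size = base + (s < extra ? 1 : 0);
            shards.add(new ArrayList<>(keys.subList(from, from + size)));
            from += size;
        }
        return shards;
    }

    /** Builds the request line for one shard, e.g. "get a b\r\n". */
    static String toRequest(List<String> shardKeys) {
        return "get " + String.join(" ", shardKeys) + "\r\n";
    }
}
```

If there are fewer keys than servers, some shards are empty and no request should be sent for them.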
Handle ClosedSelectorException differently
In SocketsHandler, at the selector.select() method call.
Currently we use a catch-all for any type of exception and shut down the system.
Maybe we should handle ClosedSelectorException differently, i.e. keep a set of all sockets and recover by reinitializing the selector.
Analogously, we could handle ClosedSocket exceptions on the ServerSocketChannel by reinitializing the ServerSocket.
ClosedSocket exceptions on the SocketChannels can be ignored, since they may just mean that the client has closed the connection.
Implement GET-operation for workers
For GET operations with a single key, the middleware acts as a load balancer and forwards
each request to only one of the servers. To distribute load equally you can choose to
implement either (a) a round-robin load balancer or (b) parse the keys and hash them to
determine the index of the server to read from. No load-dependent schemes are required.
We do expect, however, some proof that on average all memcached servers are subject to
the same load.
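Option (a) might look like the sketch below. `RoundRobinBalancer` is a hypothetical name; the shared atomic counter assumes one balancer used by all worker threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selector shared by all worker threads. */
final class RoundRobinBalancer {
    private final int serverCount;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(int serverCount) {
        if (serverCount < 1) {
            throw new IllegalArgumentException("serverCount must be >= 1");
        }
        this.serverCount = serverCount;
    }

    /** Returns the index of the server for the next GET; floorMod survives int overflow. */
    int nextServer() {
        return Math.floorMod(next.getAndIncrement(), serverCount);
    }
}
```

Counting the requests sent to each server index and logging the totals at shutdown would give the requested evidence that the load is spread evenly.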
Run experiment Section 5.3: Gets and Multi-gets - Histogram
Write up Section 6
Run experiment Section 6: 2K Analysis
Run experiment Section 3.2: Baseline with Middleware - Two Middlewares
Run experiment Section 5.2: Gets and Multi-gets - Non-sharded Case
Run experiment Section 3.1: Baseline with Middleware - One Middleware
Run experiment Section 5.1: Gets and Multi-gets - Sharded Case
Does memtier parse errors correctly?
Write up Section 4
Run experiment Section 2.1: Baseline without Middleware - One Server
see Section 2 of the report outline for a description of what is asked.
Write up the section afterwards
RequestFactory: Fix newline handling
Currently we are splitting up lines on \r\n. This could be unreliable when data blocks contain \r\n by chance.
For set commands, the size of the data block is known. When implementing set, keep that in mind and don't just check each line split by \r\n.
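A sketch of using the declared size instead of splitting on every `\r\n`. It relies on the header layout `set <key> <flags> <exptime> <bytes>` from the memcached text protocol; `SetRequestParser` and `completeSetLength` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: finds the end of a SET request via the declared data length. */
final class SetRequestParser {
    private SetRequestParser() {}

    /**
     * Returns the total length in bytes of the first complete SET request
     * (header line + data block + trailing \r\n), or -1 if more bytes are needed.
     */
    static int completeSetLength(byte[] buf, int len) {
        int headerEnd = -1;
        for (int i = 0; i + 1 < len; i++) {
            if (buf[i] == '\r' && buf[i + 1] == '\n') {
                headerEnd = i; // only the first \r\n terminates the header
                break;
            }
        }
        if (headerEnd < 0) {
            return -1;
        }
        String header = new String(buf, 0, headerEnd, StandardCharsets.US_ASCII);
        String[] parts = header.split(" ");
        if (parts.length < 5 || !parts[0].equals("set")) {
            throw new IllegalArgumentException("not a SET header: " + header);
        }
        int dataLen = Integer.parseInt(parts[4]);
        // Skip the data block by length, so \r\n inside the data is never inspected.
        int total = headerEnd + 2 + dataLen + 2;
        return total <= len ? total : -1;
    }
}
```

A return value of -1 also covers the case where the SET arrives in several network packets: the caller keeps reading into the same buffer and retries.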
Graceful shutdown if connection to Memcached servers drop
TA said on Oct 12: We don't have to guarantee fault tolerance. If errors occur, it is best to keep the code wrapped in try-catch, log the errors, and then shut down gracefully.
Implement non-sharded Multi-GET
MULTI-GET: In the case of GET operations with multiple keys, the middleware has to
support two modes:
- In the first mode (non-sharded read), multi-GET operations are treated as regular
requests and are forwarded to one server which handles the entire request. In this
case, the answer from the server will contain multiple values and will be significantly
larger than a single value. Your middleware has to be able to handle up to 10 values
in the same response.
Create schedule for project
based on issues
Setup and log optional statistics
In addition to the metrics mentioned above, we encourage adding instrumentation to any
part of the project that is deemed important. Furthermore, we highly recommend using additional
statistics and diagnostic tools (like dstat) to get additional insight (e.g., CPU utilization,
network bandwidth, etc.).
Devise storage solution for experiment data
Make sure ant works
Writeup Section 1: System Overview
Setup experiment plan
of all the experiments that need to be done
Add log4j to ant build
Section 7.3: Build network of queues
Revert ThreadPool design to running threads in a loop
At the moment we create a new worker object for each request.
Reverting to long-running worker threads should speed things up.
Handle invalid requests
Each worker thread in the thread pool must be able to handle all three types of requests
(SET, GET, multi-GET). If the clients send a request that does not belong to one of these
types, the worker thread has to record this event and discard data until a newline character
is encountered.
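The discard step could be sketched like this on a `ByteBuffer` that is in read mode. `InvalidRequestSkipper` is a hypothetical name; recording the event is left to the caller.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: drops bytes of an invalid request up to and including '\n'. */
final class InvalidRequestSkipper {
    private InvalidRequestSkipper() {}

    /**
     * Advances the buffer position past the next newline.
     * Returns false if the buffer ran out before a newline was seen.
     */
    static boolean skipPastNewline(ByteBuffer buffer) {
        while (buffer.hasRemaining()) {
            if (buffer.get() == '\n') {
                return true;
            }
        }
        return false;
    }
}
```

When the method returns false, the newline has not arrived yet and the skipping has to continue with the next bytes read from the same client.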
Implement threadpool for workers
The net-thread puts all incoming requests into the internal request queue from which
requests are dequeued by worker-threads residing in a thread pool. The thread pool can
be fixed size, but its size has to be a parameter given to the middleware at startup time.
A maximum number of 128 threads in this thread pool has to be supported.
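A rough sketch of a fixed-size pool whose threads loop over a shared queue. `WorkerPool` is a hypothetical name, and plain `Runnable` tasks stand in for the real request objects.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical fixed-size pool: long-running threads take requests from one queue. */
final class WorkerPool {
    private static final int MAX_THREADS = 128;
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread[] workers;

    WorkerPool(int size) {
        if (size < 1 || size > MAX_THREADS) {
            throw new IllegalArgumentException("size must be between 1 and " + MAX_THREADS);
        }
        workers = new Thread[size];
        for (int i = 0; i < size; i++) {
            workers[i] = new Thread(this::runLoop, "worker-" + i);
            workers[i].start();
        }
    }

    /** Called by the net-thread for every incoming request. */
    void enqueue(Runnable request) {
        queue.add(request);
    }

    /** Each worker blocks on the queue and handles one request at a time. */
    private void runLoop() {
        try {
            while (true) {
                queue.take().run();
            }
        } catch (InterruptedException e) {
            // Interrupt is used as the shutdown signal; the thread simply exits.
        }
    }

    /** Interrupts all workers and waits for them to finish. */
    void shutdown() throws InterruptedException {
        for (Thread w : workers) {
            w.interrupt();
        }
        for (Thread w : workers) {
            w.join();
        }
    }
}
```

Because the threads are created once and reused, no per-request worker objects are allocated, and each thread can keep its dedicated server connections for its whole lifetime.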