cw75 / tiered-storage
An Elastic, Tiered-Storage Service
License: Apache License 2.0
There are a few places where we're needlessly passing around references to things like integers. Change those to pass by value and make them const when possible.
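As a rough illustration of the change being asked for, here is a minimal sketch. The function names (ClampReplication, KeyLength) are hypothetical and do not come from the codebase; the point is only the parameter style.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Small, cheap-to-copy arguments: pass by value. Marking them const
// documents that the body does not modify its local copy.
// (Previously this might have been written as `const int& requested`.)
int ClampReplication(const int requested, const int max_allowed) {
  return requested > max_allowed ? max_allowed : requested;
}

// Potentially large arguments: a const reference is still the right choice,
// because copying a std::string may allocate.
std::size_t KeyLength(const std::string& key) {
  return key.size();
}
```

For an int, a reference is typically no cheaper than a copy and adds an indirection, which is why the distinction is drawn by type size rather than applied everywhere.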
use heavy hitters
Currently, if you send a malformed request from the user to the proxy, the proxy segfaults. e.g., if you type PUT a 1 instead of PUT a:1 ... that should throw an error, not crash. :-)
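One way to avoid the crash is to validate the token count and the key:value shape before touching any token. The sketch below is not the proxy's actual parser: the Request struct, the ParseRequest name, the error strings, and the GET form are all assumptions made for illustration. Only the PUT a:1 syntax comes from the report above.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical request type; the real proxy's types are not shown here.
struct Request {
  std::string op;
  std::string key;
  std::string value;  // empty for GET
};

// Returns the parsed request, or std::nullopt with a message in *error when
// the line is malformed. Every token access is preceded by a size check.
std::optional<Request> ParseRequest(const std::string& line, std::string* error) {
  std::istringstream in(line);
  std::vector<std::string> tokens;
  for (std::string tok; in >> tok;) tokens.push_back(tok);

  if (tokens.empty()) {
    *error = "empty request";
    return std::nullopt;
  }
  const std::string& op = tokens[0];
  if (op == "GET") {  // assumed syntax: GET <key>
    if (tokens.size() != 2) {
      *error = "usage: GET <key>";
      return std::nullopt;
    }
    return Request{op, tokens[1], ""};
  }
  if (op == "PUT") {
    const char* usage = "usage: PUT <key>:<value>";
    if (tokens.size() != 2) {  // catches "PUT a 1"
      *error = usage;
      return std::nullopt;
    }
    const std::size_t colon = tokens[1].find(':');
    if (colon == std::string::npos || colon == 0 ||
        colon + 1 == tokens[1].size()) {
      *error = usage;
      return std::nullopt;
    }
    return Request{op, tokens[1].substr(0, colon), tokens[1].substr(colon + 1)};
  }
  *error = "unknown operation: " + op;
  return std::nullopt;
}
```

With this shape, the proxy can print the error string and keep reading input rather than dereferencing a token that does not exist.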
Let Indy client proxy accept a script that contains the commands (rather than requiring us to type them in...)
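A minimal sketch of what this could look like, assuming the proxy already has some per-command handler: read from any std::istream, so the same loop serves both an interactive terminal (std::cin) and a script file (std::ifstream). RunCommands and the handler signature are hypothetical names, not existing code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Feeds each non-empty line of `in` to `handle` and returns how many
// commands were dispatched. `handle` stands in for whatever the proxy
// currently does with one typed command.
std::size_t RunCommands(std::istream& in,
                        const std::function<void(const std::string&)>& handle) {
  std::size_t count = 0;
  for (std::string line; std::getline(in, line);) {
    if (line.empty()) continue;  // skip blank lines in the script
    handle(line);
    ++count;
  }
  return count;
}
```

The caller would pass std::cin when no script path is given, and an opened std::ifstream otherwise.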
Currently, we recompile a few things for each executable.
A few months back the request lifecycle looked something like this:
We concluded that this was bad because this meant that the (potentially large) value now made two hops: Server to routing and routing to user. The simple solution at the time was to make the routing layer simply respond with the addresses of the correct server and have the user communicate directly with the server. The result was a request lifecycle that looks like this:
However, we’ve since changed the architecture of the user & routing components to make them asynchronous. When a user sends a request, it includes its IP address in the request so that the routing layer knows where to respond, because the routing layer might have to make a request to determine the correct replication factor for the key. In light of that change, we should change the request pattern:
Pros:
Cons:
cc @cw75
These are ostensibly separated because there's much more serving traffic than routing traffic, and we can deploy many fewer routing nodes. Still, it doesn't seem very harmful to have that lightly-used service embedded in the server process.
#60 made me realize that we have inconsistent styles, for example, around #define guards. By default, I usually go with Google's style guides, but I don't know a ton about "good" C++ development, so I'm happy to defer to anyone else with stronger opinions on the subject. Either way, we are inconsistent right now, so we should fix that.
Once we finish this refactor, we should also probably run clang-format on the code because I'm pretty sure we have some weird indentation stuff going on, imports are not alphabetical, etc. I'll eventually put this into the Travis build as well.
@cw75 @jhellerstein: Tagging you guys, so that we can come to a decision on this soon and move forward.
Will be easier to read/maintain/test
Global and Local in ReplicationFactorRequest and ReplicationFactor)
Which Protocol Buffer version should I use? proto2? proto3? Doesn't matter?
When there are no storage servers, the routing node fails with a floating point exception because it tries to hash into the global hash ring when the ring is empty. It just prints out error!, and the routing process fails silently. This is obviously bad, and we should instead return a more useful error message to the user.
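A plausible cause of the floating point exception is an integer modulo by the ring size, which is zero for an empty ring; that is an assumption about the code, not something stated above. The sketch below shows the guard using a deliberately simplified stand-in (modulo placement over a vector of addresses, not a real consistent hash ring). PickServer is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Simplified stand-in for a ring lookup. Returns std::nullopt when there are
// no servers, so the caller can send a meaningful error back to the user.
std::optional<std::string> PickServer(const std::vector<std::string>& ring,
                                      const std::string& key) {
  if (ring.empty()) {
    // Without this check, `hash % ring.size()` would divide by zero, which
    // is undefined behavior and commonly surfaces as SIGFPE.
    return std::nullopt;
  }
  const std::size_t index = std::hash<std::string>{}(key) % ring.size();
  return ring[index];
}
```

The routing node can then translate the empty result into an explicit "no storage servers available" response instead of exiting.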
Why are some external dependencies like Google Test and Google Benchmark put in the vendor directory but others like zeromq and Protocol Buffers aren't?
Can't actually have a project that builds executables called "server" and "user", etc.
If we want to have multiple client proxies, there exists a race condition where a server joins and a client proxy is added immediately after and therefore doesn't get the newest server in the list of servers that it uses to construct a hash ring. We can avoid this by having the clients gossip their lists of servers every time the list is updated or a new client joins.
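The merge step of such a gossip exchange could be as small as a set union. This is a sketch under assumptions: server lists are modeled as sets of address strings, and MergeServerList is a hypothetical helper. How and when proxies exchange lists is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Merges a peer's server list into the local one. Returns true when the
// local list changed, which is the signal to rebuild the hash ring (and,
// in a gossip scheme, to forward the updated list to other proxies).
bool MergeServerList(std::set<std::string>& local,
                     const std::set<std::string>& remote) {
  const std::size_t before = local.size();
  local.insert(remote.begin(), remote.end());
  return local.size() != before;
}
```

A pure union only handles joins. Server departures would need extra information, such as versioned entries or tombstones, which this sketch does not attempt.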
common.h has a lot of stuff that seems like it should be configurable at launch time, not compile time.
In the Kubernetes deployment, we rely on an environment variable called SERVER_TYPE to tell the storage server process whether it should start as a memory or disk tier node. If no such variable is set when the server process is started (with ./build/src/bedrock/server), the server segfaults and fails to start. We should either print a more graceful error message or have a default (it probably makes sense to default to being a memory tier server?) instead of failing silently.
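std::getenv returns a null pointer when the variable is unset, and constructing a std::string from a null pointer is undefined behavior, which would be consistent with the crash described here (an assumption; the failing line is not shown). A minimal sketch of the defaulting option follows. The helper name ServerTypeFromEnv and the literal value "memory" are assumptions; the actual accepted values may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Reads SERVER_TYPE, falling back to `default_type` when the variable is
// unset or empty. The raw pointer is checked before any std::string is
// built from it.
std::string ServerTypeFromEnv(const char* default_type = "memory") {
  const char* raw = std::getenv("SERVER_TYPE");
  if (raw == nullptr || *raw == '\0') {
    return default_type;
  }
  return raw;
}
```

If a default is considered too implicit, the same null check can instead print a message naming the missing variable and exit with a non-zero status.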
Currently, when a key is queried during key redistribution, the server might say "key does not exist" because the key may not have arrived at the new node yet.